VMwareAPI-Team

Activity log for bug #1230047

Date	Who	What changed	Old value	New value	Message
2013-09-25 03:35:13	Ryan Hsu	bug			added bug
2013-09-25 03:47:22	Ryan Hsu	description
Old value:
BUG-DESCRIPTION: When using the VMwareVCDriver, spawning large amounts of virtual machines concurrently causes some instances to spawn with status ERROR. The number of machines that fail to build is unpredictable and sometimes all instances do end up spawning successfully.

The issue can be reproduced by running:

    nova boot --image debian-2.6.32-i686 --flavor 1 --num-instances 32 nameless

The number of instances that causes the errors differs from environment to environment. Start with 30-40.

Either of the 2 following error messages can be seen in the logs when an instance fails to build.

Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 1408, in _spawn
    block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 609, in spawn
    admin_password, network_info, block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 440, in spawn
    vmdk_file_size_in_kb, linked_clone)
  File "/opt/stack/nova/nova/virt/vmwareapi/volumeops.py", line 71, in attach_disk_to_vm
    self._session._wait_for_task(instance_uuid, reconfig_task)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 795, in _wait_for_task
    ret_val = done.wait()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
    return hubs.get_hub().switch()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
    return self.greenlet.switch()
NovaException: File [ryan-nfs] vmware_base/e8c42ed8-05e7-45bc-90c3-49a34e5a37c6.vmdk was not found

Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 1228, in _allocate_network_async
    dhcp_options=dhcp_options)
  File "/opt/stack/nova/nova/network/api.py", line 93, in wrapped
    return func(self, context, args, kwargs)
  File "/opt/stack/nova/nova/network/api.py", line 49, in wrapper
    res = f(self, context, args, kwargs)
  File "/opt/stack/nova/nova/network/api.py", line 300, in allocate_for_instance
    nw_info = self.network_rpcapi.allocate_for_instance(context, args)
  File "/opt/stack/nova/nova/network/rpcapi.py", line 184, in allocate_for_instance
    macs=jsonutils.to_primitive(macs))
  File "/opt/stack/nova/nova/rpcclient.py", line 85, in call
    return self._invoke(self.proxy.call, ctxt, method, kwargs)
  File "/opt/stack/nova/nova/rpcclient.py", line 63, in _invoke
    return cast_or_call(ctxt, msg, self.kwargs)
  File "/opt/stack/nova/nova/openstack/common/rpc/proxy.py", line 130, in call
    exc.info, real_topic, msg.get('method'))

Here is information from the 2 environments where the issue was observed:

Environment 1:
- 1 datacenter, 1 cluster, 7 hosts
- NFS shared datastore
- was able to spawn 7 instances before errors appeared
- screen log with tracebacks: http://paste.openstack.org/show/47410/

Environment 2:
- 1 datacenter, 1 cluster, 2 hosts
- iSCSI shared datastore
- was able to spawn ~30 instances before errors appeared
- screen log with tracebacks: http://paste.openstack.org/show/47467/

New value:
When using the VMwareVCDriver, spawning large amounts of virtual machines concurrently causes some instances to spawn with status ERROR. The number of machines that fail to build is unpredictable and sometimes all instances do end up spawning successfully.

The issue can be reproduced by running:

    nova boot --image debian-2.6.32-i686 --flavor 1 --num-instances 32 nameless

The number of instances that causes the errors differs from environment to environment. Start with 30-40.

Either of the 2 following error messages can be seen in the logs when an instance fails to build.

Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 1408, in _spawn
    block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 609, in spawn
    admin_password, network_info, block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 440, in spawn
    vmdk_file_size_in_kb, linked_clone)
  File "/opt/stack/nova/nova/virt/vmwareapi/volumeops.py", line 71, in attach_disk_to_vm
    self._session._wait_for_task(instance_uuid, reconfig_task)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 795, in _wait_for_task
    ret_val = done.wait()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
    return hubs.get_hub().switch()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
    return self.greenlet.switch()
NovaException: File [ryan-nfs] vmware_base/e8c42ed8-05e7-45bc-90c3-49a34e5a37c6.vmdk was not found

Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 1228, in _allocate_network_async
    dhcp_options=dhcp_options)
  File "/opt/stack/nova/nova/network/api.py", line 93, in wrapped
    return func(self, context, args, kwargs)
  File "/opt/stack/nova/nova/network/api.py", line 49, in wrapper
    res = f(self, context, args, kwargs)
  File "/opt/stack/nova/nova/network/api.py", line 300, in allocate_for_instance
    nw_info = self.network_rpcapi.allocate_for_instance(context, args)
  File "/opt/stack/nova/nova/network/rpcapi.py", line 184, in allocate_for_instance
    macs=jsonutils.to_primitive(macs))
  File "/opt/stack/nova/nova/rpcclient.py", line 85, in call
    return self._invoke(self.proxy.call, ctxt, method, kwargs)
  File "/opt/stack/nova/nova/rpcclient.py", line 63, in _invoke
    return cast_or_call(ctxt, msg, self.kwargs)
  File "/opt/stack/nova/nova/openstack/common/rpc/proxy.py", line 130, in call
    exc.info, real_topic, msg.get('method'))

Here is information from the 2 environments where the issue was observed:

Environment 1:
- 1 datacenter, 1 cluster, 7 hosts
- NFS shared datastore
- was able to spawn 7 instances before errors appeared
- screen log with tracebacks: http://paste.openstack.org/show/47410/

Environment 2:
- 1 datacenter, 1 cluster, 2 hosts
- iSCSI shared datastore
- was able to spawn ~30 instances before errors appeared
- screen log with tracebacks: http://paste.openstack.org/show/47467/

2013-09-25 23:05:36	Ryan Hsu	description
Old value:
When using the VMwareVCDriver, spawning large amounts of virtual machines concurrently causes some instances to spawn with status ERROR. The number of machines that fail to build is unpredictable and sometimes all instances do end up spawning successfully.

The issue can be reproduced by running:

    nova boot --image debian-2.6.32-i686 --flavor 1 --num-instances 32 nameless

The number of instances that causes the errors differs from environment to environment. Start with 30-40.

Either of the 2 following error messages can be seen in the logs when an instance fails to build.

Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 1408, in _spawn
    block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 609, in spawn
    admin_password, network_info, block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 440, in spawn
    vmdk_file_size_in_kb, linked_clone)
  File "/opt/stack/nova/nova/virt/vmwareapi/volumeops.py", line 71, in attach_disk_to_vm
    self._session._wait_for_task(instance_uuid, reconfig_task)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 795, in _wait_for_task
    ret_val = done.wait()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
    return hubs.get_hub().switch()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
    return self.greenlet.switch()
NovaException: File [ryan-nfs] vmware_base/e8c42ed8-05e7-45bc-90c3-49a34e5a37c6.vmdk was not found

Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 1228, in _allocate_network_async
    dhcp_options=dhcp_options)
  File "/opt/stack/nova/nova/network/api.py", line 93, in wrapped
    return func(self, context, args, kwargs)
  File "/opt/stack/nova/nova/network/api.py", line 49, in wrapper
    res = f(self, context, args, kwargs)
  File "/opt/stack/nova/nova/network/api.py", line 300, in allocate_for_instance
    nw_info = self.network_rpcapi.allocate_for_instance(context, args)
  File "/opt/stack/nova/nova/network/rpcapi.py", line 184, in allocate_for_instance
    macs=jsonutils.to_primitive(macs))
  File "/opt/stack/nova/nova/rpcclient.py", line 85, in call
    return self._invoke(self.proxy.call, ctxt, method, kwargs)
  File "/opt/stack/nova/nova/rpcclient.py", line 63, in _invoke
    return cast_or_call(ctxt, msg, self.kwargs)
  File "/opt/stack/nova/nova/openstack/common/rpc/proxy.py", line 130, in call
    exc.info, real_topic, msg.get('method'))

Here is information from the 2 environments where the issue was observed:

Environment 1:
- 1 datacenter, 1 cluster, 7 hosts
- NFS shared datastore
- was able to spawn 7 instances before errors appeared
- screen log with tracebacks: http://paste.openstack.org/show/47410/

Environment 2:
- 1 datacenter, 1 cluster, 2 hosts
- iSCSI shared datastore
- was able to spawn ~30 instances before errors appeared
- screen log with tracebacks: http://paste.openstack.org/show/47467/

New value:
UPDATE: Removed information related to the iSCSI environment, as the problem there was due to testing with an OpenStack server that had very little CPU and memory. The issue remains on the NFS server.

When using the VMwareVCDriver, spawning large amounts of virtual machines concurrently causes some instances to spawn with status ERROR. The number of machines that fail to build is unpredictable and sometimes all instances do end up spawning successfully.

The issue can be reproduced by running:

    nova boot --image debian-2.6.32-i686 --flavor 1 --num-instances 32 nameless

The number of instances that causes the errors differs from environment to environment. Start with 30-40.

The following error message can be seen in the logs when an instance fails to build.

Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 1408, in _spawn
    block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 609, in spawn
    admin_password, network_info, block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 440, in spawn
    vmdk_file_size_in_kb, linked_clone)
  File "/opt/stack/nova/nova/virt/vmwareapi/volumeops.py", line 71, in attach_disk_to_vm
    self._session._wait_for_task(instance_uuid, reconfig_task)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 795, in _wait_for_task
    ret_val = done.wait()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
    return hubs.get_hub().switch()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
    return self.greenlet.switch()
NovaException: File [ryan-nfs] vmware_base/e8c42ed8-05e7-45bc-90c3-49a34e5a37c6.vmdk was not found

Environment information:
- 1 datacenter, 1 cluster, 7 hosts
- NFS shared datastore
- was able to spawn 7 instances before errors appeared
- screen log with tracebacks: http://paste.openstack.org/show/47410/

2013-09-25 23:06:07	Ryan Hsu	summary	VMware: errors spawning large amounts of VMs	VMware: spawning large amounts of VMs sometimes causes errors
2013-10-06 02:56:14	Ryan Hsu	description
Old value:
UPDATE: Removed information related to the iSCSI environment, as the problem there was due to testing with an OpenStack server that had very little CPU and memory. The issue remains on the NFS server.

When using the VMwareVCDriver, spawning large amounts of virtual machines concurrently causes some instances to spawn with status ERROR. The number of machines that fail to build is unpredictable and sometimes all instances do end up spawning successfully.

The issue can be reproduced by running:

    nova boot --image debian-2.6.32-i686 --flavor 1 --num-instances 32 nameless

The number of instances that causes the errors differs from environment to environment. Start with 30-40.

The following error message can be seen in the logs when an instance fails to build.

Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 1408, in _spawn
    block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 609, in spawn
    admin_password, network_info, block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 440, in spawn
    vmdk_file_size_in_kb, linked_clone)
  File "/opt/stack/nova/nova/virt/vmwareapi/volumeops.py", line 71, in attach_disk_to_vm
    self._session._wait_for_task(instance_uuid, reconfig_task)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 795, in _wait_for_task
    ret_val = done.wait()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
    return hubs.get_hub().switch()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
    return self.greenlet.switch()
NovaException: File [ryan-nfs] vmware_base/e8c42ed8-05e7-45bc-90c3-49a34e5a37c6.vmdk was not found

Environment information:
- 1 datacenter, 1 cluster, 7 hosts
- NFS shared datastore
- was able to spawn 7 instances before errors appeared
- screen log with tracebacks: http://paste.openstack.org/show/47410/

New value:
When using the VMwareVCDriver, spawning large amounts of virtual machines concurrently causes some instances to spawn with status ERROR. The number of machines that fail to build is unpredictable and sometimes all instances do end up spawning successfully.

The issue can be reproduced by running:

    nova boot --image debian-2.6.32-i686 --flavor 1 --num-instances 32 nameless

The number of instances that causes the errors differs from environment to environment. Start with 30-40.

Two errors seen in the logs are causing the instance spawn failures. The first is the ESX host not finding the image in the NFS datastore (even though it is there; otherwise other instances couldn't be spawned). The second is the ESX host not being able to access the vmdk image because it is locked.

Image not found error:

Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 1408, in _spawn
    block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 609, in spawn
    admin_password, network_info, block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 440, in spawn
    vmdk_file_size_in_kb, linked_clone)
  File "/opt/stack/nova/nova/virt/vmwareapi/volumeops.py", line 71, in attach_disk_to_vm
    self._session._wait_for_task(instance_uuid, reconfig_task)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 795, in _wait_for_task
    ret_val = done.wait()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
    return hubs.get_hub().switch()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
    return self.greenlet.switch()
NovaException: File [ryan-nfs] vmware_base/e8c42ed8-05e7-45bc-90c3-49a34e5a37c6.vmdk was not found

Image locked error:

Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 1407, in _spawn
    block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 623, in spawn
    admin_password, network_info, block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 504, in spawn
    root_gb_in_kb, linked_clone)
  File "/opt/stack/nova/nova/virt/vmwareapi/volumeops.py", line 71, in attach_disk_to_vm
    self._session._wait_for_task(instance_uuid, reconfig_task)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 900, in _wait_for_task
    ret_val = done.wait()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
    return hubs.get_hub().switch()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
    return self.greenlet.switch()
NovaException: Unable to access file [ryan-nfs] vmware_base/f110bb94-2170-4a3a-ae0d-760f95eb8b47.0.

Environment information:
- 1 datacenter, 1 cluster, 7 hosts
- NFS shared datastore
- was able to spawn 7 instances before errors appeared
- screen log with tracebacks: http://paste.openstack.org/show/47410/

2013-10-09 17:33:28	Vui Lam	nova: status	New	Confirmed
2013-10-09 17:34:23	Vui Lam	nova: assignee		Vui Lam (vui)
2013-10-09 17:41:25	Tracy Jones	nova: importance	Undecided	High
2013-11-19 18:37:19	Tracy Jones	tags	vmware	havana-backport-potential vmware
2013-11-19 18:37:39	Shawn Hartsock	bug task added		openstack-vmwareapi-team
2013-11-19 18:37:46	Shawn Hartsock	openstack-vmwareapi-team: status	New	Confirmed
2013-11-19 18:37:49	Shawn Hartsock	openstack-vmwareapi-team: importance	Undecided	High
2013-11-19 18:37:59	Shawn Hartsock	openstack-vmwareapi-team: assignee		Vui Lam (vui)
2013-11-26 22:54:17	Shawn Hartsock	summary	VMware: spawning large amounts of VMs sometimes causes errors	VMware: spawning large amounts of VMs concurrently sometimes causes errors
2013-12-01 07:31:33	Gary Kotton	nova: assignee	Vui Lam (vui)	Gary Kotton (garyk)
2013-12-01 07:31:38	Gary Kotton	openstack-vmwareapi-team: assignee	Vui Lam (vui)	Gary Kotton (garyk)
2013-12-01 07:31:39	Gary Kotton	nova: milestone		icehouse-1
2013-12-02 20:44:44	dan wendlandt	summary	VMware: spawning large amounts of VMs concurrently sometimes causes errors	VMware: spawning large amounts of VMs concurrently sometimes causes "VMDK lock" error
2013-12-03 22:56:11	Russell Bryant	nova: milestone	icehouse-1	icehouse-2
2013-12-05 09:53:47	OpenStack Infra	nova: status	Confirmed	In Progress
2014-01-22 20:23:59	Thierry Carrez	nova: milestone	icehouse-2	icehouse-3
2014-03-05 12:34:36	Thierry Carrez	nova: milestone	icehouse-3	icehouse-rc1
2014-03-06 13:50:08	OpenStack Infra	nova: status	In Progress	Fix Committed
2014-03-31 19:02:56	Thierry Carrez	nova: status	Fix Committed	Fix Released
2014-04-17 09:12:55	Thierry Carrez	nova: milestone	icehouse-rc1	2014.1
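
The log records that a fix was committed and released, but not how it works. As a purely illustrative sketch of the class of problem the description reports (many concurrent spawns touching the same cached base image under vmware_base), the following Python serializes cache preparation per image so that only one worker fetches it while the rest wait. Every name here (`ensure_cached`, `spawn_many`, the fake datastore path) is hypothetical; this is not Nova code and is not claimed to be the change that closed this bug.

```python
import threading
from collections import defaultdict

# Hypothetical helpers for illustration only; none of this is Nova code.
_registry_guard = threading.Lock()
_image_locks = defaultdict(threading.Lock)


def _lock_for(image_id):
    """Return the lock dedicated to one cached image."""
    with _registry_guard:
        return _image_locks[image_id]


def ensure_cached(image_id, cache, fetch):
    """Populate cache[image_id] at most once, even with concurrent callers."""
    with _lock_for(image_id):
        if image_id not in cache:
            cache[image_id] = fetch(image_id)
        return cache[image_id]


def spawn_many(image_id, count):
    """Simulate `count` concurrent spawns that share one base image."""
    cache = {}
    fetch_calls = []

    def fetch(name):
        # Stand-in for copying the image into the shared datastore.
        fetch_calls.append(name)
        return "[datastore] vmware_base/%s.vmdk" % name

    results = []
    results_guard = threading.Lock()

    def worker():
        path = ensure_cached(image_id, cache, fetch)
        with results_guard:
            results.append(path)

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, fetch_calls


if __name__ == "__main__":
    paths, calls = spawn_many("debian-2.6.32-i686", 32)
    print(len(paths), "spawns completed;", len(calls), "image fetch(es)")
```

Locking per image rather than globally lets spawns from different images proceed in parallel, while spawns that share an image never observe it half-prepared.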