Activity log for bug #1230047

Date Who What changed Old value New value Message
2013-09-25 03:35:13 Ryan Hsu bug added bug
2013-09-25 03:47:22 Ryan Hsu description

Old value:

BUG-DESCRIPTION: When using the VMwareVCDriver, spawning large amounts of virtual machines concurrently causes some instances to spawn with status ERROR. The number of machines that fail to build is unpredictable and sometimes all instances do end up spawning successfully.

The issue can be reproduced by running:

nova boot --image debian-2.6.32-i686 --flavor 1 --num-instances 32 nameless

The number of instances that causes the errors differ from environment to environment. Start with 30-40.

Either of the 2 following error messages can be seen in the logs when an instance fails to build.

Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 1408, in _spawn
block_device_info)
File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 609, in spawn
admin_password, network_info, block_device_info)
File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 440, in spawn
vmdk_file_size_in_kb, linked_clone)
File "/opt/stack/nova/nova/virt/vmwareapi/volumeops.py", line 71, in attach_disk_to_vm
self._session._wait_for_task(instance_uuid, reconfig_task)
File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 795, in _wait_for_task
ret_val = done.wait()
File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
return hubs.get_hub().switch()
File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
return self.greenlet.switch()
NovaException: File [ryan-nfs] vmware_base/e8c42ed8-05e7-45bc-90c3-49a34e5a37c6.vmdk was not found

Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 1228, in _allocate_network_async
dhcp_options=dhcp_options)
File "/opt/stack/nova/nova/network/api.py", line 93, in wrapped
return func(self, context, *args, **kwargs)
File "/opt/stack/nova/nova/network/api.py", line 49, in wrapper
res = f(self, context, *args, **kwargs)
File "/opt/stack/nova/nova/network/api.py", line 300, in allocate_for_instance
nw_info = self.network_rpcapi.allocate_for_instance(context, **args)
File "/opt/stack/nova/nova/network/rpcapi.py", line 184, in allocate_for_instance
macs=jsonutils.to_primitive(macs))
File "/opt/stack/nova/nova/rpcclient.py", line 85, in call
return self._invoke(self.proxy.call, ctxt, method, **kwargs)
File "/opt/stack/nova/nova/rpcclient.py", line 63, in _invoke
return cast_or_call(ctxt, msg, **self.kwargs)
File "/opt/stack/nova/nova/openstack/common/rpc/proxy.py", line 130, in call
exc.info, real_topic, msg.get('method'))

Here information from the 2 environments where the issue was observed:

Environment 1:
- 1 datacenter, 1 cluster, 7 hosts
- NFS shared datastore
- was able to spawn 7 instances before errors appeared
- screen log with tracebacks: http://paste.openstack.org/show/47410/

Environment 2:
- 1 datacenter, 1 cluster, 2 hosts
- iSCSI shared datastore
- was able to spawn ~30 instances before errors appeared
- screen log with tracebacks: http://paste.openstack.org/show/47467/

New value:

When using the VMwareVCDriver, spawning large amounts of virtual machines concurrently causes some instances to spawn with status ERROR. The number of machines that fail to build is unpredictable and sometimes all instances do end up spawning successfully.

The issue can be reproduced by running:

    nova boot --image debian-2.6.32-i686 --flavor 1 --num-instances 32 nameless

The number of instances that causes the errors differ from environment to environment. Start with 30-40.

Either of the 2 following error messages can be seen in the logs when an instance fails to build.

Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 1408, in _spawn
    block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 609, in spawn
    admin_password, network_info, block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 440, in spawn
    vmdk_file_size_in_kb, linked_clone)
  File "/opt/stack/nova/nova/virt/vmwareapi/volumeops.py", line 71, in attach_disk_to_vm
    self._session._wait_for_task(instance_uuid, reconfig_task)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 795, in _wait_for_task
    ret_val = done.wait()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
    return hubs.get_hub().switch()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
    return self.greenlet.switch()
NovaException: File [ryan-nfs] vmware_base/e8c42ed8-05e7-45bc-90c3-49a34e5a37c6.vmdk was not found

Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 1228, in _allocate_network_async
    dhcp_options=dhcp_options)
  File "/opt/stack/nova/nova/network/api.py", line 93, in wrapped
    return func(self, context, *args, **kwargs)
  File "/opt/stack/nova/nova/network/api.py", line 49, in wrapper
    res = f(self, context, *args, **kwargs)
  File "/opt/stack/nova/nova/network/api.py", line 300, in allocate_for_instance
    nw_info = self.network_rpcapi.allocate_for_instance(context, **args)
  File "/opt/stack/nova/nova/network/rpcapi.py", line 184, in allocate_for_instance
    macs=jsonutils.to_primitive(macs))
  File "/opt/stack/nova/nova/rpcclient.py", line 85, in call
    return self._invoke(self.proxy.call, ctxt, method, **kwargs)
  File "/opt/stack/nova/nova/rpcclient.py", line 63, in _invoke
    return cast_or_call(ctxt, msg, **self.kwargs)
  File "/opt/stack/nova/nova/openstack/common/rpc/proxy.py", line 130, in call
    exc.info, real_topic, msg.get('method'))

Here information from the 2 environments where the issue was observed:

Environment 1:
- 1 datacenter, 1 cluster, 7 hosts
- NFS shared datastore
- was able to spawn 7 instances before errors appeared
- screen log with tracebacks: http://paste.openstack.org/show/47410/

Environment 2:
- 1 datacenter, 1 cluster, 2 hosts
- iSCSI shared datastore
- was able to spawn ~30 instances before errors appeared
- screen log with tracebacks: http://paste.openstack.org/show/47467/
2013-09-25 23:05:36 Ryan Hsu description

Old value:

When using the VMwareVCDriver, spawning large amounts of virtual machines concurrently causes some instances to spawn with status ERROR. The number of machines that fail to build is unpredictable and sometimes all instances do end up spawning successfully.

The issue can be reproduced by running:

    nova boot --image debian-2.6.32-i686 --flavor 1 --num-instances 32 nameless

The number of instances that causes the errors differ from environment to environment. Start with 30-40.

Either of the 2 following error messages can be seen in the logs when an instance fails to build.

Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 1408, in _spawn
    block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 609, in spawn
    admin_password, network_info, block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 440, in spawn
    vmdk_file_size_in_kb, linked_clone)
  File "/opt/stack/nova/nova/virt/vmwareapi/volumeops.py", line 71, in attach_disk_to_vm
    self._session._wait_for_task(instance_uuid, reconfig_task)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 795, in _wait_for_task
    ret_val = done.wait()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
    return hubs.get_hub().switch()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
    return self.greenlet.switch()
NovaException: File [ryan-nfs] vmware_base/e8c42ed8-05e7-45bc-90c3-49a34e5a37c6.vmdk was not found

Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 1228, in _allocate_network_async
    dhcp_options=dhcp_options)
  File "/opt/stack/nova/nova/network/api.py", line 93, in wrapped
    return func(self, context, *args, **kwargs)
  File "/opt/stack/nova/nova/network/api.py", line 49, in wrapper
    res = f(self, context, *args, **kwargs)
  File "/opt/stack/nova/nova/network/api.py", line 300, in allocate_for_instance
    nw_info = self.network_rpcapi.allocate_for_instance(context, **args)
  File "/opt/stack/nova/nova/network/rpcapi.py", line 184, in allocate_for_instance
    macs=jsonutils.to_primitive(macs))
  File "/opt/stack/nova/nova/rpcclient.py", line 85, in call
    return self._invoke(self.proxy.call, ctxt, method, **kwargs)
  File "/opt/stack/nova/nova/rpcclient.py", line 63, in _invoke
    return cast_or_call(ctxt, msg, **self.kwargs)
  File "/opt/stack/nova/nova/openstack/common/rpc/proxy.py", line 130, in call
    exc.info, real_topic, msg.get('method'))

Here information from the 2 environments where the issue was observed:

Environment 1:
- 1 datacenter, 1 cluster, 7 hosts
- NFS shared datastore
- was able to spawn 7 instances before errors appeared
- screen log with tracebacks: http://paste.openstack.org/show/47410/

Environment 2:
- 1 datacenter, 1 cluster, 2 hosts
- iSCSI shared datastore
- was able to spawn ~30 instances before errors appeared
- screen log with tracebacks: http://paste.openstack.org/show/47467/

New value:

UPDATE: Removed information related to the iSCSI environment as the problem was due to testing using an Openstack server that had very little CPU and memory. The issue remains on the NFS server.

When using the VMwareVCDriver, spawning large amounts of virtual machines concurrently causes some instances to spawn with status ERROR. The number of machines that fail to build is unpredictable and sometimes all instances do end up spawning successfully.

The issue can be reproduced by running:

    nova boot --image debian-2.6.32-i686 --flavor 1 --num-instances 32 nameless

The number of instances that causes the errors differ from environment to environment. Start with 30-40.

The following error message can be seen in the logs when an instance fails to build.

Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 1408, in _spawn
    block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 609, in spawn
    admin_password, network_info, block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 440, in spawn
    vmdk_file_size_in_kb, linked_clone)
  File "/opt/stack/nova/nova/virt/vmwareapi/volumeops.py", line 71, in attach_disk_to_vm
    self._session._wait_for_task(instance_uuid, reconfig_task)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 795, in _wait_for_task
    ret_val = done.wait()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
    return hubs.get_hub().switch()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
    return self.greenlet.switch()
NovaException: File [ryan-nfs] vmware_base/e8c42ed8-05e7-45bc-90c3-49a34e5a37c6.vmdk was not found

Environment information:
- 1 datacenter, 1 cluster, 7 hosts
- NFS shared datastore
- was able to spawn 7 instances before errors appeared
- screen log with tracebacks: http://paste.openstack.org/show/47410/
2013-09-25 23:06:07 Ryan Hsu summary VMware: errors spawning large amounts of VMs VMware: spawning large amounts of VMs sometimes causes errors
2013-10-06 02:56:14 Ryan Hsu description

Old value:

UPDATE: Removed information related to the iSCSI environment as the problem was due to testing using an Openstack server that had very little CPU and memory. The issue remains on the NFS server.

When using the VMwareVCDriver, spawning large amounts of virtual machines concurrently causes some instances to spawn with status ERROR. The number of machines that fail to build is unpredictable and sometimes all instances do end up spawning successfully.

The issue can be reproduced by running:

    nova boot --image debian-2.6.32-i686 --flavor 1 --num-instances 32 nameless

The number of instances that causes the errors differ from environment to environment. Start with 30-40.

The following error message can be seen in the logs when an instance fails to build.

Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 1408, in _spawn
    block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 609, in spawn
    admin_password, network_info, block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 440, in spawn
    vmdk_file_size_in_kb, linked_clone)
  File "/opt/stack/nova/nova/virt/vmwareapi/volumeops.py", line 71, in attach_disk_to_vm
    self._session._wait_for_task(instance_uuid, reconfig_task)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 795, in _wait_for_task
    ret_val = done.wait()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
    return hubs.get_hub().switch()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
    return self.greenlet.switch()
NovaException: File [ryan-nfs] vmware_base/e8c42ed8-05e7-45bc-90c3-49a34e5a37c6.vmdk was not found

Environment information:
- 1 datacenter, 1 cluster, 7 hosts
- NFS shared datastore
- was able to spawn 7 instances before errors appeared
- screen log with tracebacks: http://paste.openstack.org/show/47410/

New value:

When using the VMwareVCDriver, spawning large amounts of virtual machines concurrently causes some instances to spawn with status ERROR. The number of machines that fail to build is unpredictable and sometimes all instances do end up spawning successfully.

The issue can be reproduced by running:

    nova boot --image debian-2.6.32-i686 --flavor 1 --num-instances 32 nameless

The number of instances that causes the errors differ from environment to environment. Start with 30-40.

There are two errors seen in the logs that are causing the instance spawn failures. The first is the ESX host not finding the image in the nfs datastore (even though it is there, otherwise other instances couldn't be spawned). The second is the ESX host not being able to access the vmdk image because it is locked.

Image not found error:

Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 1408, in _spawn
    block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 609, in spawn
    admin_password, network_info, block_device_info)
  File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 440, in spawn
    vmdk_file_size_in_kb, linked_clone)
  File "/opt/stack/nova/nova/virt/vmwareapi/volumeops.py", line 71, in attach_disk_to_vm
    self._session._wait_for_task(instance_uuid, reconfig_task)
  File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 795, in _wait_for_task
    ret_val = done.wait()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
    return hubs.get_hub().switch()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
    return self.greenlet.switch()
NovaException: File [ryan-nfs] vmware_base/e8c42ed8-05e7-45bc-90c3-49a34e5a37c6.vmdk was not found

Image locked error:

Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 1407, in _spawn
block_device_info)
File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 623, in spawn
admin_password, network_info, block_device_info)
File "/opt/stack/nova/nova/virt/vmwareapi/vmops.py", line 504, in spawn
root_gb_in_kb, linked_clone)
File "/opt/stack/nova/nova/virt/vmwareapi/volumeops.py", line 71, in attach_disk_to_vm
self._session._wait_for_task(instance_uuid, reconfig_task)
File "/opt/stack/nova/nova/virt/vmwareapi/driver.py", line 900, in _wait_for_task
ret_val = done.wait()
File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
return hubs.get_hub().switch()
File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
return self.greenlet.switch()
NovaException: Unable to access file [ryan-nfs] vmware_base/f110bb94-2170-4a3a-ae0d-760f95eb8b47.0.

Environment information:
- 1 datacenter, 1 cluster, 7 hosts
- NFS shared datastore
- was able to spawn 7 instances before errors appeared
- screen log with tracebacks: http://paste.openstack.org/show/47410/
2013-10-09 17:33:28 Vui Lam nova: status New Confirmed
2013-10-09 17:34:23 Vui Lam nova: assignee Vui Lam (vui)
2013-10-09 17:41:25 Tracy Jones nova: importance Undecided High
2013-11-19 18:37:19 Tracy Jones tags vmware havana-backport-potential vmware
2013-11-19 18:37:39 Shawn Hartsock bug task added openstack-vmwareapi-team
2013-11-19 18:37:46 Shawn Hartsock openstack-vmwareapi-team: status New Confirmed
2013-11-19 18:37:49 Shawn Hartsock openstack-vmwareapi-team: importance Undecided High
2013-11-19 18:37:59 Shawn Hartsock openstack-vmwareapi-team: assignee Vui Lam (vui)
2013-11-26 22:54:17 Shawn Hartsock summary VMware: spawning large amounts of VMs sometimes causes errors VMware: spawning large amounts of VMs concurrently sometimes causes errors
2013-12-01 07:31:33 Gary Kotton nova: assignee Vui Lam (vui) Gary Kotton (garyk)
2013-12-01 07:31:38 Gary Kotton openstack-vmwareapi-team: assignee Vui Lam (vui) Gary Kotton (garyk)
2013-12-01 07:31:39 Gary Kotton nova: milestone icehouse-1
2013-12-02 20:44:44 dan wendlandt summary VMware: spawning large amounts of VMs concurrently sometimes causes errors VMware: spawning large amounts of VMs concurrently sometimes causes "VMDK lock" error
2013-12-03 22:56:11 Russell Bryant nova: milestone icehouse-1 icehouse-2
2013-12-05 09:53:47 OpenStack Infra nova: status Confirmed In Progress
2014-01-22 20:23:59 Thierry Carrez nova: milestone icehouse-2 icehouse-3
2014-03-05 12:34:36 Thierry Carrez nova: milestone icehouse-3 icehouse-rc1
2014-03-06 13:50:08 OpenStack Infra nova: status In Progress Fix Committed
2014-03-31 19:02:56 Thierry Carrez nova: status Fix Committed Fix Released
2014-04-17 09:12:55 Thierry Carrez nova: milestone icehouse-rc1 2014.1