libvirt: blockCommit fails if domain is not running, for attached cinder volumes

Bug #1471726 reported by Deepak C Shetty
Affects: OpenStack Compute (nova)
Status: Confirmed
Importance: Low
Assigned to: Unassigned

Bug Description

Using a fairly recent devstack setup:

1) Create a cinder volume (using GlusterFS as the cinder backend) - cv1
2) Attach cv1 to vm1 (vm1 is a nova VM in running state)
3) Create 2 snapshots of cv1 using cinder snapshot-create ... cv1-snap1, cv1-snap2
4) Stop the nova vm vm1 (Note that cinder still reports the volume cv1 as 'in-use')
5) From cinder, delete cv1-snap1. Since cv1-snap1 is _not_ the active file, nova attempts a blockCommit, which fails with the exception below (see also the sketch after the traceback):

2015-07-06 09:33:00.479 ERROR oslo_messaging.rpc.dispatcher [req-695dd8c5-2722-4cf2-ab0a-583b0dacd388 nova service] Exception during message handling: Requested operation is not valid: domain is not running
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher executor_callback))
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher executor_callback)
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher result = func(ctxt, **new_args)
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 142, in inner
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher return func(*args, **kwargs)
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/exception.py", line 89, in wrapped
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher payload)
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 119, in __exit__
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/exception.py", line 72, in wrapped
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher return f(self, context, *args, **kw)
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 2954, in volume_snapshot_delete
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher snapshot_id, delete_info)
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2024, in volume_snapshot_delete
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher base_file = delete_info['base_file']
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 119, in __exit__
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2017, in volume_snapshot_delete
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher "snapshots.") % ver
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2003, in _volume_snapshot_delete
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher # paths are maintained relative by qemu.
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher result = proxy_call(self._autowrap, f, *args, **kwargs)
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher rv = execute(f, *args, **kwargs)
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher six.reraise(c, e, tb)
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher rv = meth(*args, **kwargs)
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib64/python2.7/site-packages/libvirt.py", line 642, in blockCommit
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher if ret == -1: raise libvirtError ('virDomainBlockCommit() failed', dom=self)
2015-07-06 09:33:00.479 TRACE oslo_messaging.rpc.dispatcher libvirtError: Requested operation is not valid: domain is not running
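For context, the same failure can be reproduced outside of nova with libvirt-python directly. A minimal sketch, assuming a qemu:///system connection and a shut-off domain 'vm1' whose disk target is 'vda' (both names are illustrative):

import libvirt

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('vm1')   # vm1 is shut off at this point

print(dom.isActive())   # 0 -- the domain is not running

# virDomainBlockCommit() is only valid on a running domain, so this
# raises the same error as in the traceback above:
# libvirt.libvirtError: Requested operation is not valid: domain is not running
dom.blockCommit('vda', None, None, 0, 0)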

Deepak C Shetty (dpkshetty) wrote:

This bug is similar to:

https://bugs.launchpad.net/cinder/+bug/1444806
(test_volume_boot_pattern tempest test failure for glusterfs backend - Part 2)

https://bugs.launchpad.net/nova/+bug/1465416
(os-assisted-volume-snapshots:delete doesn't work if instance is SHUTOFF)

except that those are for blockRebase, while this bug covers the blockCommit case.

In talking with folks on IRC, it seemed a failing testcase for blockCommit was needed, hence this bug report. The contrast between the two calls is sketched below.
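For contrast, a minimal sketch of the two libvirt-python calls involved (the connection URI, domain name, and file names are illustrative; this is not nova's actual code):

import libvirt

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('vm1')

# Deleting the *active* snapshot file pulls data up into it from its
# backing chain -- the blockRebase path covered by the two bugs above.
dom.blockRebase('vda', None, 0, 0)

# Deleting a *non-active* (intermediate) snapshot merges it down into
# its base -- the blockCommit path this bug covers. Both operations
# require a running domain.
dom.blockCommit('vda', 'base.img', 'intermediate.img', 0, 0)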

Deepak C Shetty (dpkshetty) wrote:

Here are the steps used to reproduce the failure:

[stack@devstack-f21 ~]$ [admin] cinder list
+--------------------------------------+--------+--------------+------+-------------+----------+--------------------------------------+
| ID | Status | Display Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+--------+--------------+------+-------------+----------+--------------------------------------+
| 1d7ce8ac-b440-44aa-80a7-932a938186a9 | in-use | cv1 | 1 | glusterfs | false | f886d5e8-257e-47b0-8b51-155d8a927af5 |
+--------------------------------------+--------+--------------+------+-------------+----------+--------------------------------------+

[stack@devstack-f21 ~]$ [admin] cinder snapshot-list
+--------------------------------------+--------------------------------------+-----------+--------------+------+
| ID | Volume ID | Status | Display Name | Size |
+--------------------------------------+--------------------------------------+-----------+--------------+------+
| 22387337-8583-4d68-96a3-fa851056d0cd | 1d7ce8ac-b440-44aa-80a7-932a938186a9 | available | cv1-snap2 | 1 |
| ca7646b9-ca33-48e2-b27f-ea55010f7354 | 1d7ce8ac-b440-44aa-80a7-932a938186a9 | available | cv1-snap1 | 1 |
+--------------------------------------+--------------------------------------+-----------+--------------+------+

[stack@devstack-f21 ~]$ [admin] nova list
+--------------------------------------+------+--------+------------+-------------+--------------------------------------------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+------+--------+------------+-------------+--------------------------------------------------------+
| f886d5e8-257e-47b0-8b51-155d8a927af5 | vm1 | ACTIVE | - | Running | private=10.0.0.3, fdaf:7a10:7b10:0:f816:3eff:fe4b:69ae |
+--------------------------------------+------+--------+------------+-------------+--------------------------------------------------------+

[stack@devstack-f21 ~]$ [admin] nova stop vm1
Request to stop server vm1 has been accepted.

[stack@devstack-f21 ~]$ [admin] nova list
+--------------------------------------+------+---------+------------+-------------+--------------------------------------------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+------+---------+------------+-------------+--------------------------------------------------------+
| f886d5e8-257e-47b0-8b51-155d8a927af5 | vm1 | SHUTOFF | - | Shutdown | private=10.0.0.3, fdaf:7a10:7b10:0:f816:3eff:fe4b:69ae |
+--------------------------------------+------+---------+------------+-------------+--------------------------------------------------------+

[stack@devstack-f21 ~]$ [admin] cinder snapshot-delete cv1-snap1

[stack@devstack-f21 ~]$ [admi...

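To confirm that cv1-snap1 is an intermediate (non-active) file in the volume's qcow2 backing chain, the chain on the GlusterFS mount can be inspected with qemu-img. A minimal sketch; the path is illustrative (the real files live under cinder's glusterfs mount point):

import json
import subprocess

def backing_chain(path):
    # qemu-img emits one info dict per image in the chain, active image first.
    out = subprocess.check_output(
        ['qemu-img', 'info', '--backing-chain', '--output=json', path])
    return [img['filename'] for img in json.loads(out.decode())]

# e.g. backing_chain('/var/lib/cinder/mnt/<hash>/volume-1d7ce8ac-...')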

Markus Zoeller (markus_z) (mzoeller) wrote:

I have removed the "nova" tag because there is no subteam which watches this tag.

tags: removed: nova
tags: added: volumes removed: blockcommit
Changed in nova:
status: New → Confirmed
John Garbutt (johngarbutt) wrote:

To clarify, the volume should stay in-use when a VM is shut off.

It certainly seems like the libvirt driver needs to check whether the instance is running before making the above call. We probably also need to lock on the instance for the duration of the operation, to make sure we don't hit races where the VM is powered on halfway through, etc. (A sketch of such a guard follows below.)

Anyway, assuming this still reproduces, it's worth a look.
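A minimal sketch of that check with libvirt-python (not nova's actual driver code; the function and its parameters are illustrative):

import libvirt

def commit_intermediate_snapshot(conn, instance_name, disk, base, top):
    dom = conn.lookupByName(instance_name)
    if not dom.isActive():
        # blockCommit is only valid on a running domain; fail cleanly
        # (or take an offline path, e.g. 'qemu-img commit' on the file)
        # instead of letting libvirt raise OPERATION_INVALID.
        raise RuntimeError('%s is not running; cannot blockCommit'
                           % instance_name)
    # Note: without a lock on the instance, the domain could still be
    # powered on or off between the check above and the call below.
    dom.blockCommit(disk, base, top, 0, 0)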

tags: added: gluster
Changed in nova:
importance: Undecided → Low