libvirt live_snapshot periodically explodes on libvirt 1.2.2 in the gate

Bug #1334398 reported by Matt Riedemann
This bug affects 13 people
Affects: OpenStack Compute (nova)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

Seeing this here:

http://logs.openstack.org/70/97670/5/check/check-tempest-dsvm-postgres-full/7d4c7cf/console.html

2014-06-24 23:15:41.714 | tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestJSON.test_create_image_specify_multibyte_character_image_name[gate]
2014-06-24 23:15:41.714 | ---------------------------------------------------------------------------------------------------------------------------------------
2014-06-24 23:15:41.714 |
2014-06-24 23:15:41.714 | Captured traceback-1:
2014-06-24 23:15:41.714 | ~~~~~~~~~~~~~~~~~~~~~
2014-06-24 23:15:41.715 | Traceback (most recent call last):
2014-06-24 23:15:41.715 | File "tempest/services/compute/json/images_client.py", line 86, in delete_image
2014-06-24 23:15:41.715 | resp, body = self.delete("images/%s" % str(image_id))
2014-06-24 23:15:41.715 | File "tempest/common/rest_client.py", line 224, in delete
2014-06-24 23:15:41.715 | return self.request('DELETE', url, extra_headers, headers, body)
2014-06-24 23:15:41.715 | File "tempest/common/rest_client.py", line 430, in request
2014-06-24 23:15:41.715 | resp, resp_body)
2014-06-24 23:15:41.715 | File "tempest/common/rest_client.py", line 474, in _error_checker
2014-06-24 23:15:41.715 | raise exceptions.NotFound(resp_body)
2014-06-24 23:15:41.715 | NotFound: Object not found
2014-06-24 23:15:41.715 | Details: {"itemNotFound": {"message": "Image not found.", "code": 404}}
2014-06-24 23:15:41.716 |
2014-06-24 23:15:41.716 |
2014-06-24 23:15:41.716 | Captured traceback:
2014-06-24 23:15:41.716 | ~~~~~~~~~~~~~~~~~~~
2014-06-24 23:15:41.716 | Traceback (most recent call last):
2014-06-24 23:15:41.716 | File "tempest/api/compute/images/test_images_oneserver.py", line 31, in tearDown
2014-06-24 23:15:41.716 | self.server_check_teardown()
2014-06-24 23:15:41.716 | File "tempest/api/compute/base.py", line 161, in server_check_teardown
2014-06-24 23:15:41.716 | 'ACTIVE')
2014-06-24 23:15:41.716 | File "tempest/services/compute/json/servers_client.py", line 173, in wait_for_server_status
2014-06-24 23:15:41.716 | raise_on_error=raise_on_error)
2014-06-24 23:15:41.717 | File "tempest/common/waiters.py", line 107, in wait_for_server_status
2014-06-24 23:15:41.717 | raise exceptions.TimeoutException(message)
2014-06-24 23:15:41.717 | TimeoutException: Request timed out
2014-06-24 23:15:41.717 | Details: (ImagesOneServerTestJSON:tearDown) Server 90c79adf-4df1-497c-a786-13bdc5cca98d failed to reach ACTIVE status and task state "None" within the required time (196 s). Current status: ACTIVE. Current task state: image_pending_upload.

Looks like it's trying to delete the image with UUID 518a32d0-f323-413c-95c2-dd8299716c19, which doesn't exist, presumably because it's still uploading?

This may be related to bug 1320617, which covers a general performance issue with Glance.

Looking in the glance registry log, the image is created here:

2014-06-24 22:51:23.538 15740 INFO glance.registry.api.v1.images [13c1b477-cd22-44ca-ba0d-bf1b19202df6 d01d4977b5cc4e20a99e1d7ca58ce444 207d083a31944716b9cd2ecda0f09ce7 - - -] Successfully created image 518a32d0-f323-413c-95c2-dd8299716c19

The image is deleted here:

2014-06-24 22:54:53.146 15740 INFO glance.registry.api.v1.images [7c29f253-acef-41a0-b62b-c3087f7617ef d01d4977b5cc4e20a99e1d7ca58ce444 207d083a31944716b9cd2ecda0f09ce7 - - -] Successfully deleted image 518a32d0-f323-413c-95c2-dd8299716c19

And the 'not found' is here:

2014-06-24 22:54:56.508 15740 INFO glance.registry.api.v1.images [c708cf1f-27a8-4003-9c29-6afca7dd9bb8 d01d4977b5cc4e20a99e1d7ca58ce444 207d083a31944716b9cd2ecda0f09ce7 - - -] Image 518a32d0-f323-413c-95c2-dd8299716c19 not found

Tags: libvirt
Revision history for this message
Matt Riedemann (mriedem) wrote :

The n-cpu logs have several errors for the libvirt connection being reset:

http://logs.openstack.org/70/97670/5/check/check-tempest-dsvm-postgres-full/7d4c7cf/logs/screen-n-cpu.txt.gz?level=TRACE#_2014-06-24_22_54_52_973

2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] Traceback (most recent call last):
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] File "/opt/stack/new/nova/nova/compute/manager.py", line 352, in decorated_function
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] *args, **kwargs)
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] File "/opt/stack/new/nova/nova/compute/manager.py", line 2788, in snapshot_instance
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] task_states.IMAGE_SNAPSHOT)
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] File "/opt/stack/new/nova/nova/compute/manager.py", line 2819, in _snapshot_instance
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] update_task_state)
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 1532, in snapshot
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] image_format)
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 1631, in _live_snapshot
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] domain.blockJobAbort(disk_path, 0)
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 179, in doit
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] result = proxy_call(self._autowrap, f, *args, **kwargs)
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 139, in proxy_call
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] rv = execute(f,*args,**kwargs)
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 77, in tworker
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] rv = meth(*args,**kwargs)
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] File "/usr/lib/python2.7/dist-packages/libvirt.py", line 646, in blockJobAbort
2014-06-24 22:54:52.973...

Revision history for this message
Matt Riedemann (mriedem) wrote :

Nova bug 1255624 is tracking libvirt connection reset errors; in that case it failed during virDomainSuspend, while here it fails during virDomainBlockJobAbort.

tags: added: libvirt
Revision history for this message
Matt Riedemann (mriedem) wrote :
Changed in nova:
importance: Undecided → High
no longer affects: glance
summary: - test_images_oneserver times out in tearDown during task_state
- "image_pending_upload"
+ snapshot hangs when libvirt connection is reset
Revision history for this message
Matt Riedemann (mriedem) wrote : Re: snapshot hangs when libvirt connection is reset

e-r patch: https://review.openstack.org/#/c/102608/

Looks like this really spiked on 6/24 and went down again on 6/25.

Revision history for this message
Sean Dague (sdague) wrote :

Going down on 6/25 is a mirage; we're just backed up on ES data.

Revision history for this message
Matt Riedemann (mriedem) wrote :

Wondering if bug 1193146 has any interesting historical information.

Revision history for this message
Ken'ichi Ohmichi (oomichi) wrote :

I also faced this problem many times today, and most failures happened in ListImageFiltersTestXML.
I'm not sure why it does not happen in ListImageFiltersTestJSON.

Revision history for this message
Sean Dague (sdague) wrote :

It looks like the failure is very specifically tied to _live_snapshot. The bug only seems to trigger when we are in _live_snapshot, and not in any other case. (Modifying the elastic-recheck search string to filter on _live_snapshot returns the same number of hits.)

Changed in nova:
assignee: nobody → Sean Dague (sdague)
summary: - snapshot hangs when libvirt connection is reset
+ libvirt live_snapshot periodically explodes on libvirt 1.2.2 in the gate
Revision history for this message
Sean Dague (sdague) wrote :

It's worth noting that the _live_snapshot code path was never tested by us until the trusty update, as it was hidden behind a version check that meant we never ran it in the gate before.

Revision history for this message
Matt Riedemann (mriedem) wrote :

This should get us around the gate failures for now https://review.openstack.org/#/c/102643/

Revision history for this message
Matt Riedemann (mriedem) wrote :

@Ken'ichi, per comment 8, it's a pretty intensive setup:

http://git.openstack.org/cgit/openstack/tempest/tree/tempest/api/compute/images/test_list_image_filters.py#n29

The setup creates 2 servers and then 3 snapshots from those 2 servers. The JSON and XML test classes run concurrently, so we could be creating multiple snapshots at the same time, which in theory overloads libvirt/qemu and causes the libvirt connection reset.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/102643
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c1c159460de376a06b8479f90f42d9f62eace961
Submitter: Jenkins
Branch: master

commit c1c159460de376a06b8479f90f42d9f62eace961
Author: Sean Dague <email address hidden>
Date: Wed Jun 25 16:56:04 2014 -0400

    effectively disable libvirt live snapshotting

    As being seen in the gate, libvirt 1.2.2 doesn't appear to actually
    handle live snapshotting under any appreciable load (possibly
    related to parallel operations). It isn't a 100% failure, but it's
    currently being hit in a large number of runs.

    Effectively turn this off by increasing the
    MIN_LIBVIRT_LIVESNAPSHOT_VERSION to something that doesn't yet
    exist. This can get us back to a working state, then we can decide
    if live snapshotting is something that can be made to actually work
    under load.

    DocImpact

    Related-Bug: #1334398

    Change-Id: I9908b743df2093c8dd2723af89be51630eafc99f

Revision history for this message
Kashyap Chamarthy (kashyapc) wrote :

Thought I'd add the below.

I just created a simple test[1] which creates an external live snapshot[2] of a libvirt guest (with the versions affecting the gate -- libvirt 1.2.2 and QEMU 2.0 on Fedora 20), by executing the below command in a loop of 100 iterations (I also tested 1000 iterations; it ran just fine too). I ran the script against 3 virtual machines in parallel.

$ virsh snapshot-create-as --domain $DOMAIN \
    --name snap-$i \
    --description snap$i-desc \
    --disk-only \
    --diskspec hda,snapshot=external,file=$SNAPSHOTS_DIR/$DOMAIN-snap$i.qcow2 \
    --atomic

The result of a 100-iteration run is an image with a backing chain of 100 qcow2 images[3].

The above script just creates a snapshot, nothing more. Matt Riedemann pointed out on #openstack-nova that in the gate there could be other tests running concurrently that could be doing things like suspend/resume/rescue, etc.

  [1] https://github.com/kashyapc/ostack-misc/blob/master/libvirt-live-snapshots-stress.sh
  [2] "external live snapshot "meaning: Every time you take a snapshot, the current disk becomes a (read-only) 'backing file' and a new qcow2 overlay is created to track the current 'delta'.
  [3] http://kashyapc.fedorapeople.org/virt/guest-with-backing-chain-of-100-qcow2-images.txt
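
For reference, here is a rough Python equivalent of that stress loop (a sketch only -- the actual script [1] is shell; the domain name, snapshot directory and iteration count below are placeholders, not values from the original script):

    import subprocess

    DOMAIN = "vm1"                     # placeholder guest name
    SNAPSHOTS_DIR = "/var/tmp/snaps"   # placeholder target directory

    for i in range(1, 101):
        # Each iteration adds one external qcow2 overlay on top of the
        # previous one, growing the backing chain by one image.
        subprocess.check_call([
            "virsh", "snapshot-create-as",
            "--domain", DOMAIN,
            "--name", "snap-%d" % i,
            "--description", "snap%d-desc" % i,
            "--disk-only",
            "--diskspec",
            "hda,snapshot=external,file=%s/%s-snap%d.qcow2"
            % (SNAPSHOTS_DIR, DOMAIN, i),
            "--atomic",
        ])

Running one such process per guest reproduces the "3 virtual machines in parallel" part of the test.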

Revision history for this message
Daniel Berrange (berrange) wrote :

The interesting thing in the logs is the stack trace about virDomainBlockJobAbort failing.

Nova issues this API call right at the start of the snapshot function to validate that there's no old stale job left over:

            # Abort is an idempotent operation, so make sure any block
            # jobs which may have failed are ended. This operation also
            # confirms the running instance, as opposed to the system as a
            # whole, has a new enough version of the hypervisor (bug 1193146).
            try:
                virt_dom.blockJobAbort(disk_path, 0)

As the comment says, aborting the job is supposed to be a completely safe thing to do. We don't even expect any existing job to be running, so it should basically end up as a no-op inside QEMU.

Now, the error message libvirt reports when virDomainBlockJobAbort fails is:

libvirtError: Unable to read from monitor: Connection reset by peer

This is a generic message you get when QEMU crashes & burns unexpectedly, causing the monitor connection to be dropped.

We've not even got as far as running the libvirt snapshot API at this point when QEMU crashes & burns. This likely explains why Kashyap doesn't see the error in his test script, which just invokes the snapshot.

This all points the finger towards a flaw in QEMU of some kind, but there's no easy way to figure out what this might be from the libvirtd logs.

What we need here is the /var/log/libvirt/qemu/instanceNNNNNN.log files corresponding to the failed test run. If we are lucky there might be some assertion error printed from QEMU before it crashed & burned. Failing that, what we need is a stack trace from QEMU when it crashes & burns. IIUC, ubuntu has something similar to Fedora's abrt daemon which captures stack traces of any process which SEGVs or core dumps. We really need to try to get a stack trace of QEMU from Ubuntu's crash handler too.

Revision history for this message
Daniel Berrange (berrange) wrote :

This is the service I was talking about:

  https://wiki.ubuntu.com/Apport

We need to re-configure it to collect core dumps as it is disabled by default. Then capture any crashes in

/var/crash/

Revision history for this message
Daniel Berrange (berrange) wrote :

Sorry, I was looking at the wrong blockJobAbort call in the code earlier. The actual one that is failing is in this code:

            # NOTE (rmk): Establish a temporary mirror of our root disk and
            # issue an abort once we have a complete copy.
            domain.blockRebase(disk_path, disk_delta, 0,
                               libvirt.VIR_DOMAIN_BLOCK_REBASE_COPY |
                               libvirt.VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT |
                               libvirt.VIR_DOMAIN_BLOCK_REBASE_SHALLOW)

            while self._wait_for_block_job(domain, disk_path):
                time.sleep(0.5)

            domain.blockJobAbort(disk_path, 0)

So we've done a block rebase, and once we finish waiting for it, we abort the job, and at that point we see the crashed QEMU. The QEMU logs are still something useful to get.
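
For context, the _wait_for_block_job helper in that loop boils down to a check against virDomainGetBlockJobInfo (see the cur/end debug output later in this bug); a minimal sketch of that shape -- name and return convention assumed, not Nova's exact code -- is:

    def block_job_running(domain, disk_path):
        """Return True while the block job on disk_path is still copying.

        blockJobInfo() returns an empty result when no job exists,
        otherwise a dict including 'cur' and 'end' byte counters.
        """
        status = domain.blockJobInfo(disk_path, 0)
        if not status:
            # No job found: it never started or has already ended.
            return False
        # Still running until the mirror catches up with the source.
        return status.get('cur') != status.get('end')

Once that loop exits, the driver calls domain.blockJobAbort(disk_path, 0) to end the mirroring phase, and that is the call failing here.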

Revision history for this message
Daniel Berrange (berrange) wrote :

I'm actually beginning to wonder if there is a flaw in the tempest tests rather than in QEMU. The "Unable to read from monitor: Connection reset by peer" error message can actually indicate that a second thread has killed QEMU while the first thread is talking to it, so this is a potential alternative idea to explore versus my previous QEMU-SEGV bug theory.

I've been examining the screen-n-cpu.log file to see what happens with instance 90c79adf-4df1-497c-a786-13bdc5cca98d, which is the one with the virDomainBlockJobAbort error trace.

First I see the snapshot process starting

2014-06-24 22:51:24.314 INFO nova.virt.libvirt.driver [req-e4651efe-7c84-4a57-bbb1-88b107d4a282 ImagesOneServerTestJSON-967160715 ImagesOneServerTestJSON-32972017] [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] Beginning live snapshot process

Then I see something killing this very same instance:

2014-06-24 22:54:40.255 AUDIT nova.compute.manager [req-218dba14-516c-4805-9908-b55cd73a00e5 ImagesOneServerTestJSON-967160715 ImagesOneServerTestJSON-32972017] [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] Terminating instance

And a lifecycle event to show that it was killed

2014-06-24 22:54:51.033 16186 INFO nova.compute.manager [-] Lifecycle event 1 on VM 90c79adf-4df1-497c-a786-13bdc5cca98d

Then we see the snapshot process crash & burn:

2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] File "/usr/lib/python2.7/dist-packages/libvirt.py", line 646, in blockJobAbort
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] if ret == -1: raise libvirtError ('virDomainBlockJobAbort() failed', dom=self)
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d] libvirtError: Unable to read from monitor: Connection reset by peer
2014-06-24 22:54:52.973 16186 TRACE nova.compute.manager [instance: 90c79adf-4df1-497c-a786-13bdc5cca98d]

So this looks very much to me like something in the test is killing the instance while the snapshot is still being done.

Now, as for why this doesn't affect non-live snapshots we were testing before...

For non-live snapshots, we issue a 'managedSave' call, which terminates the guest. Then we do the snapshot process. Then we start the guest up again from the managed-save image. My guess is that this racing 'Terminate instance' call happens while the guest is already shut down and hence does not cause a failure of the test suite when doing a non-live snapshot (or at least the window in which the race could hit is dramatically smaller).

So based on the sequence in the screen-n-cpu.log file, my money is currently on a race in the test scripts where something explicitly kills the instance while the snapshot is being taken, and the non-live snapshot code is simply not exposed to the race.
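
To make the comparison concrete, the cold (non-live) path described above is roughly this shape in libvirt-python terms (a sketch under the assumption of a plain file copy; Nova's real code goes through its image backend and qemu-img, and the helper name here is illustrative):

    import shutil

    def cold_snapshot(dom, disk_path, snapshot_path):
        # managedSave stops the guest and stashes its memory state, so
        # there is no running QEMU for a racing 'terminate' to upset.
        dom.managedSave(0)
        # Copy the now-quiescent disk (illustrative; Nova uses qemu-img).
        shutil.copyfile(disk_path, snapshot_path)
        # create() restarts the guest from the managed-save image.
        dom.create()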

Revision history for this message
Sean Dague (sdague) wrote :

We do kill the snapshot if it exceeds the timeout, which is currently 196s, because at some point we need to actually move on. When these are successful, they typically succeed in about 10s.
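
That timeout is just a poll loop; a simplified sketch of the tempest-style waiter (the get_status callable, the interval and the 196 s figure are illustrative, not tempest's exact code):

    import time

    class TimeoutException(Exception):
        pass

    def wait_for_server_status(get_status, timeout=196, interval=1):
        """Poll until the server is ACTIVE with no task state, else time out."""
        start = time.time()
        while time.time() - start < timeout:
            status, task_state = get_status()
            if status == "ACTIVE" and task_state is None:
                return
            time.sleep(interval)
        raise TimeoutException("server did not reach ACTIVE / task_state None "
                               "within %ds" % timeout)

When this raises, the test cleanup goes on to delete the server, which matches the virDomainDestroy call Daniel observes in the libvirtd log further down.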

Revision history for this message
Vish Ishaya (vishvananda) wrote :

OK, so it looks like the problem is that the snapshot is not completing in a reasonable amount of time. The timestamps suggest it took 2.5 minutes before it was killed, which aligns with the above. So it looks like the block mirror is not completing.

Revision history for this message
Kashyap Chamarthy (kashyapc) wrote :

After looking a little more at the '_live_snapshot' function in Nova[*], the below seems to be the precise equivalent sequence of (libvirt) operations to what happens in Nova's '_live_snapshot' function. Thanks to libvirt developer Eric Blake for reviewing this:

(0) Take the Libvirt guest's XML backup:

    $ virsh dumpxml --inactive vm1 > /var/tmp/vm1.xml

(1) Abort any failed/finished block operations:

    $ virsh blockjob vm1 vda --abort

(2) Undefine the running domain. (Note: undefining a running domain does not
      _kill_ the domain; it just converts it from persistent to transient.)

    $ virsh undefine vm1

(3) Invoke 'virsh blockcopy' (This will take time, depending on the size of disk image vm1):

    $ virsh blockcopy \
        --domain vm1 vda \
        /export/backups/vm1-copy.qcow2 \
        --wait \
        --verbose

(4) Abort any failed/finished block operations (as Dan pointed out in comment #17, this is the abort operation where QEMU
    might be failing):

    $ virsh blockjob vm1 vda --abort

   NOTE: If we use the '--finish' option in step 3, it is equivalent to the
   above command (consequently, step 4 can be skipped).

(5) Define the guest again (to make it persistent):

    $ virsh define /var/tmp/vm1.xml

(6) From the obtained new copy, convert the QCOW2 with a backing file to a flat (raw) image with no backing file:

    $ qemu-img convert -f qcow2 -O raw vm1.qcow2 conv-vm1.img

Notes (from Eric Blake):
    The _live_snapshot function concludes it all with redefining the
    domain (umm, that part looks fishy in the code - you undefine it
    only if it was persistent, but redefine the domain unconditionally;
    so if you call your function on a domain that is initially
    transient, you end up with a persistent domain at the end of your
    function).

  [*] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L1593
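
Expressed against the libvirt Python bindings, the whole sequence is roughly the following (a condensed sketch of the kind of calls Nova's _live_snapshot makes -- not its actual code; error handling and the final qemu-img conversion are omitted, and the overlay file is assumed to be pre-created because of REUSE_EXT):

    import time
    import libvirt

    def live_copy(conn, name, disk_path, delta_path):
        dom = conn.lookupByName(name)

        # (0) Save the inactive XML so the guest can be re-defined later.
        xml = dom.XMLDesc(libvirt.VIR_DOMAIN_XML_INACTIVE)

        # (1) End any stale block job on the disk.
        dom.blockJobAbort(disk_path, 0)

        # (2) Make the running domain transient for the copy's duration.
        dom.undefine()
        try:
            # (3) Start a shallow copy into the pre-created qcow2 overlay.
            dom.blockRebase(disk_path, delta_path, 0,
                            libvirt.VIR_DOMAIN_BLOCK_REBASE_COPY |
                            libvirt.VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT |
                            libvirt.VIR_DOMAIN_BLOCK_REBASE_SHALLOW)

            # (4) Wait for the mirror to catch up, then cancel it.
            while True:
                info = dom.blockJobInfo(disk_path, 0)
                if not info or info.get('cur') == info.get('end'):
                    break
                time.sleep(0.5)
            dom.blockJobAbort(disk_path, 0)
        finally:
            # (5) Re-define the guest so it is persistent again.
            conn.defineXML(xml)
        # (6) qemu-img conversion of delta_path to a flat image happens after this.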

Revision history for this message
Dan Genin (daniel-genin) wrote :

FWIW, I have encountered libvirt connection reset errors when my DevStack VM ran low on memory. I first saw these when the more recent versions of DevStack started spawning numerous nova-api and nova-conductor instances, which ate up the relatively small RAM of the VM. When this happened, I was unable to boot any instances, with libvirt connection reset errors reported in the n-cpu log. Not sure that low memory is what's causing the errors here (maybe some other resource starvation), but they look awfully similar.

Revision history for this message
Kashyap Chamarthy (kashyapc) wrote :

Here's some investigation of what happens, at the libvirt level, when
_live_snapshot is invoked. I performed the live_snapshot test with current
git (after modifying MIN_LIBVIRT_LIVESNAPSHOT_VERSION=1.0.0 for testing
purposes) with the below log filters in /etc/libvirt/libvirtd.conf (and
restarted libvirtd):

    log_level = 1
    log_outputs="1:file:/var/tmp/libvirtd.log"

    # Find what QMP commands libvirt is sending to QEMU
    log_filters="1:qemu_monitor"

Libvirt call sequence
---------------------

(1) virDomainGetXMLDesc
(2) virDomainBlockJobAbort
(3) virDomainUndefine
(4) virDomainBlockRebase
     - NOTE (from libvirt documentation): By default, the copy job runs
       in the background and consists of two phases: (a) the block
       operation copies all data from the source, and during this phase
       the job can only be canceled to revert back to the source disk,
       with no guarantees about the destination. (b) After phase (a)
       completes, both the source and the destination remain mirrored
       until a call to the block operation with --abort.

(5) virDomainBlockJobAbort.

Test
----

Boot a new Nova instance:

    $ nova boot --flavor 1 --key_name oskey1 --image \
        cirros-0.3.0-x86_64-disk cvm1

Issue a snapshot (this should trigger the _live_snapshot code path):

    $ nova image-create --poll cvm1 snap1-cvm1

Ensure that "live snapshot" _did_ take place by searching the
'screen-n-cpu.log':

    $ grep -i "Beginning live snapshot process" ../data/new/screen-logs/screen-n-cpu.log
    2014-06-30 03:34:32.237 INFO nova.virt.libvirt.driver [req-b02ac5c5-7694-44ce-9185-129b91eaf6b5 admin admin] [instance: 4b62dab0-0efe-4125-84ec-3e24d3371082] Beginning live snapshot process
    $

Libvirt logs
------------

(1) Save a copy of the libvirt XML (virDomainGetXMLDesc):
----
2014-06-30 09:08:13.586+0000: 8470: debug : virDomainGetXMLDesc:4344 : dom=0x7f2994000fd0, (VM: name=instance-00000001, uuid=4b62dab0-0efe-4125-84ec-3e24d3371082), flags=0
----

(2) Issue a BlockJobAbort (virDomainBlockJobAbort), so that any prior active block operation on that disk will be cancelled:
----
2014-06-30 09:08:13.632+0000: 8470: debug : virDomainBlockJobAbort:19492 : dom=0x7f2994000af0, (VM: name=instance-00000001, uuid=4b62dab0-0efe-4125-84ec-3e24d3371082), disk=/home/kashyapc/src/openstack/data/nova/instances/4b62dab0-0efe-4125-84ec-3e24d3371082/disk, flags=0
----

(3) Undefining the running libvirt domain (virDomainUndefine), to make it transient[*]:
----
2014-06-30 09:08:14.069+0000: 8471: debug : virDomainUndefine:8683 : dom=0x7f29a0000910, (VM: name=instance-00000001, uuid=4b62dab0-0efe-4125-84ec-3e24d3371082)
2014-06-30 09:08:14.069+0000: 8471: info : qemuDomainUndefineFlags:6334 : Undefining domain 'instance-00000001'
----

We'll define the guest again further below, from the saved copy from step (1).

[*] Reasoning for making the domain transient: BlockRebase ('blockcopy')
jobs last forever until canceled, which implies that they should last
across domain restarts if the domain were persistent. But, QEMU doesn't
yet provide a way to restart a copy job on domain restart (while
mirroring is still intact). So the trick is to tempo...

Revision history for this message
Daniel Berrange (berrange) wrote :

I've added some more debugging to the live snapshot code in this change:

  https://review.openstack.org/#/c/103066/

When it failed in this test run:

  http://logs.openstack.org/66/103066/7/check/check-tempest-dsvm-postgres-full/02e97b1

I see

2014-06-30 11:55:55.398+0000: 18078: debug : virDomainGetBlockJobInfo:19415 : dom=0x7f13200018f0, (VM: name=instance-00000020, uuid=b1e3c5de-af31-4a4d-94dd-ef382936583b), disk=/opt/stack/data/nova/instances/b1e3c5de-af31-4a4d-94dd-ef382936583b/disk, info=0x7f1315ffa390, flags=0
2014-06-30 11:55:55.415 WARNING nova.virt.libvirt.driver [req-3e682ad2-5af5-47d2-a72d-9bac23e8c2bc ListImageFiltersTestJSON-1350287421 ListImageFiltersTestJSON-1927810029] blockJobInfo snapshot cur=0 end=25165824

2014-06-30 11:55:56.074+0000: 18071: debug : virDomainGetBlockJobInfo:19415 : dom=0x7f13200018f0, (VM: name=instance-00000020, uuid=b1e3c5de-af31-4a4d-94dd-ef382936583b), disk=/opt/stack/data/nova/instances/b1e3c5de-af31-4a4d-94dd-ef382936583b/disk, info=0x7f13256ea390, flags=0
2014-06-30 11:55:56.094 WARNING nova.virt.libvirt.driver [req-3e682ad2-5af5-47d2-a72d-9bac23e8c2bc ListImageFiltersTestJSON-1350287421 ListImageFiltersTestJSON-1927810029] blockJobInfo snapshot cur=25165824 end=25165824

This shows that as far as virDomainGetBlockJobInfo is concerned, the job has completed copying the data in not very much time at all, which seems reasonable considering it is a cirros image.

We then go into a virDomainBlockJobAbort call to finish the snapshot operation:

  2014-06-30 11:55:56.127+0000: 18070: debug : virDomainBlockJobAbort:19364 : dom=0x7f13200018f0, (VM: name=instance-00000020, uuid=b1e3c5de-af31-4a4d-94dd-ef382936583b), disk=/opt/stack/data/nova/instances/b1e3c5de-af31-4a4d-94dd-ef382936583b/disk, flags=0

This should take a fraction of a second, but after 3 minutes it still isn't done. Tempest gets fed up waiting and so issues a call to destroy the guest:

  2014-06-30 11:59:10.341+0000: 18090: debug : virDomainDestroy:2173 : dom=0x7f12f0002910, (VM: name=instance-00000020, uuid=b1e3c5de-af31-4a4d-94dd-ef382936583b)

Shortly thereafter QEMU is dead and the virDomainBlockJobAbort call returns, obviously with an error:

2014-06-30 11:59:21.279 17542 TRACE nova.compute.manager [instance: b1e3c5de-af31-4a4d-94dd-ef382936583b] libvirtError: Unable to read from monitor: Connection reset by peer

So, based on this debug info I think that Nova is doing the right thing, and this is probably a bug in QEMU (or possibly, but unlikely, a bug in libvirt). My inclination is that QEMU is basically hanging in the block job abort call, due to some fairly infrequently hit race condition.

Revision history for this message
Daniel Berrange (berrange) wrote :

I have managed to capture a failure with verbose libvirtd.log enabled.

http://logs.openstack.org/66/103066/11/check/check-tempest-dsvm-postgres-full/d8d3b5b/logs/libvirtd.txt.gz
http://logs.openstack.org/66/103066/11/check/check-tempest-dsvm-postgres-full/d8d3b5b/logs/screen-n-cpu.txt.gz

2014-07-09 11:05:50.701+0000: 21774: debug : virDomainBlockJobAbort:19364 : dom=0x7fc73c010700, (VM: name=instance-0000001d, uuid=6cfca6f5-8741-4844-a045-a14a644dd82d), disk=/opt/stack/data/nova/instances/6cfca6f5-8741-4844-a045-a14a644dd82d/disk, flags=0
2014-07-09 11:05:50.701+0000: 21774: debug : qemuDomainObjBeginJobInternal:1050 : Starting job: modify (async=none vm=0x7fc740009ee0 name=instance-0000001d)
2014-07-09 11:05:50.701+0000: 21774: debug : qemuDomainObjBeginJobInternal:1092 : Started job: modify (async=none vm=0x7fc740009ee0 name=instance-0000001d)
2014-07-09 11:05:50.701+0000: 21774: debug : qemuDomainObjEnterMonitorInternal:1278 : Entering monitor (mon=0x7fc73c010ad0 vm=0x7fc740009ee0 name=instance-0000001d)
2014-07-09 11:05:50.701+0000: 21774: debug : qemuMonitorBlockJob:3314 : mon=0x7fc73c010ad0, device=drive-virtio-disk0, base=<null>, bandwidth=0M, info=(nil), mode=0, modern=1
2014-07-09 11:05:50.701+0000: 21774: debug : qemuMonitorJSONCommandWithFd:264 : Send command '{"execute":"block-job-cancel","arguments":{"device":"drive-virtio-disk0"},"id":"libvirt-14"}' for write with FD -1
2014-07-09 11:05:50.701+0000: 21774: debug : qemuMonitorSend:959 : QEMU_MONITOR_SEND_MSG: mon=0x7fc73c010ad0 msg={"execute":"block-job-cancel","arguments":{"device":"drive-virtio-disk0"},"id":"libvirt-14"}
2014-07-09 11:05:50.705+0000: 21774: debug : qemuMonitorJSONCommandWithFd:269 : Receive command reply ret=0 rxObject=0x7fc76096cac0
2014-07-09 11:05:50.705+0000: 21774: debug : qemuDomainObjExitMonitorInternal:1301 : Exited monitor (mon=0x7fc73c010ad0 vm=0x7fc740009ee0 name=instance-0000001d)
2014-07-09 11:05:50.705+0000: 21774: debug : qemuDomainObjEnterMonitorInternal:1278 : Entering monitor (mon=0x7fc73c010ad0 vm=0x7fc740009ee0 name=instance-0000001d)
2014-07-09 11:05:50.705+0000: 21774: debug : qemuMonitorBlockJob:3314 : mon=0x7fc73c010ad0, device=drive-virtio-disk0, base=<null>, bandwidth=0M, info=0x7fc7572be9a0, mode=1, modern=1
2014-07-09 11:05:50.705+0000: 21774: debug : qemuMonitorJSONCommandWithFd:264 : Send command '{"execute":"query-block-jobs","id":"libvirt-15"}' for write with FD -1
2014-07-09 11:05:50.705+0000: 21774: debug : qemuMonitorSend:959 : QEMU_MONITOR_SEND_MSG: mon=0x7fc73c010ad0 msg={"execute":"query-block-jobs","id":"libvirt-15"}
2014-07-09 11:05:50.709+0000: 21774: debug : qemuMonitorJSONCommandWithFd:269 : Receive command reply ret=0 rxObject=0x7fc76099cce0
2014-07-09 11:05:50.709+0000: 21774: debug : qemuDomainObjExitMonitorInternal:1301 : Exited monitor (mon=0x7fc73c010ad0 vm=0x7fc740009ee0 name=instance-0000001d)
2014-07-09 11:05:50.759+0000: 21774: debug : qemuDomainObjEnterMonitorInternal:1278 : Entering monitor (mon=0x7fc73c010ad0 vm=0x7fc740009ee0 name=instance-0000001d)
2014-07-09 11:05:50.759+0000: 21774: debug : qemuMonitorBlockJob:3314 : mon=0x7fc73c010ad0, device=drive-virtio-disk0, base=<null>, bandwidth=0M, info=0x7...

Revision history for this message
Kashyap Chamarthy (kashyapc) wrote :

Ping.

I recall DanB noted somewhere in a mailing list thread -- and the QEMU devs I spoke to also suggested -- that one of the possible next steps to investigate this is to attach GDB to the hanging QEMU in the CI gate and get stack traces out of it.

Revision history for this message
Sean Dague (sdague) wrote :

We've basically just disabled this feature until someone can dig into why qemu doesn't work with it.

Changed in nova:
status: New → Confirmed
assignee: Sean Dague (sdague) → nobody
Chet Burgess (cfb-n)
Changed in nova:
assignee: nobody → Chet Burgess (cfb-n)
Revision history for this message
Chet Burgess (cfb-n) wrote :

This came up in the ML and in openstack-nova IRC chat today.

I'm going to look into adding some sane libvirt/qemu version checking to the feature, as well as wrapping the feature with some type of config option so that it's possible to conditionally enable it on different platforms.

Revision history for this message
Chet Burgess (cfb-n) wrote :

This came up again today in openstack-nova and recently on the ML.

I'll look at adding some version guarding around this feature. It definitely works on precise with libvirtd 1.1.3.5 and qemu 1.5, as we have been running it in production with those versions. So I think some basic version guarding should be sufficient.

mriedem has also requested the ability to conditionally enable it on different OS gates for easier testing so I will work on that as well.
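
The version-guard part is straightforward to sketch: compare libvirt's packed version number against a minimum tuple before selecting the live-snapshot path. The helper names and threshold below are illustrative, not the values Nova settled on:

    # libvirt packs its version as 1000000*major + 1000*minor + release.
    MIN_LIBVIRT_LIVESNAPSHOT_VERSION = (1, 3, 0)   # illustrative threshold

    def version_to_tuple(packed):
        major, rest = divmod(packed, 1000000)
        minor, release = divmod(rest, 1000)
        return (major, minor, release)

    def can_live_snapshot(conn):
        # conn is a libvirt.virConnect; getLibVersion() returns the packed int.
        return (version_to_tuple(conn.getLibVersion())
                >= MIN_LIBVIRT_LIVESNAPSHOT_VERSION)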

Changed in nova:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/147332

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.openstack.org/150639

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/150639
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3f69bca12c484a08d8f80c1db581e12607f13f7a
Submitter: Jenkins
Branch: master

commit 3f69bca12c484a08d8f80c1db581e12607f13f7a
Author: Tony Breeds <email address hidden>
Date: Tue Jan 27 14:05:02 2015 -0800

    Use a workarounds group option to disable live snaphots.

    Create a workarounds option to disable live snapshotting rather than
    hack MIN_LIBVIRT_LIVESNAPSHOT_VERSION.

    DocImpact
    Related-Bug: #1334398
    Change-Id: Iee9afc0afaffa1de509c357fcb9fdb18c650a70b
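
For readers following along, the shape of that workarounds option in oslo.config terms is roughly the following sketch (the option name and help text are assumptions for illustration; the merged change above is the authoritative definition):

    from oslo_config import cfg

    workarounds_opts = [
        cfg.BoolOpt('disable_libvirt_livesnapshot',   # name assumed for illustration
                    default=True,
                    help='Skip the libvirt live-snapshot path and fall back '
                         'to cold snapshots.'),
    ]

    CONF = cfg.CONF
    CONF.register_opts(workarounds_opts, group='workarounds')

    def use_live_snapshot(libvirt_new_enough):
        # Live snapshotting requires both a new-enough libvirt/qemu and the
        # workaround toggle being off.
        return libvirt_new_enough and not CONF.workarounds.disable_libvirt_livesnapshot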

Revision history for this message
Matt Riedemann (mriedem) wrote :

https://review.openstack.org/#/c/171795/ adds the fedora 21 job to nova's experimental queue so we can test changes against newer libvirt/qemu than what's in ubuntu 14.04.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Joe Gordon (<email address hidden>) on branch: master
Review: https://review.openstack.org/147332
Reason: This review is > 4 weeks without comment and currently blocked by a core reviewer with a -2. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and contacting the reviewer with the -2 on this review to ensure you address their concerns.

Changed in nova:
assignee: Chet Burgess (cfb-n) → nobody
Revision history for this message
Markus Zoeller (markus_z) (mzoeller) wrote : Cleanup

Solving an inconsistency: The bug is 'In Progress' but without an assignee. I set the status back to the last known status before the change to 'In Progress'.

Feel free to assign the bug to yourself. If you do so, please set it to 'In Progress'.

Changed in nova:
status: In Progress → Confirmed
Changed in nova:
assignee: nobody → Pranav Salunke (dguitarbite)
Changed in nova:
assignee: Pranav Salunke (dguitarbite) → nobody
Revision history for this message
Matt Riedemann (mriedem) wrote :

At some point soon in Newton we'll have ubuntu 16.04 in the gate jobs with libvirt 1.3.1; we should try turning this back on and see if it works.

Revision history for this message
Markus Zoeller (markus_z) (mzoeller) wrote : Cleanup EOL bug report

This is an automated cleanup. This bug report has been closed because it
is older than 18 months and there is no open code change to fix this.
After this time it is unlikely that the circumstances which led to
the observed issue can be reproduced.

If you can reproduce the bug, please:
* reopen the bug report (set to status "New")
* AND add the detailed steps to reproduce the issue (if applicable)
* AND leave a comment "CONFIRMED FOR: <RELEASE_NAME>"
  Only still supported release names are valid (LIBERTY, MITAKA, OCATA, NEWTON).
  Valid example: CONFIRMED FOR: LIBERTY

Changed in nova:
importance: High → Undecided
status: Confirmed → Expired
Revision history for this message
Markus Zoeller (markus_z) (mzoeller) wrote :

CONFIRMED FOR: NEWTON

Changed in nova:
status: Expired → Confirmed