Cinder fails to create image-based volume if mirroring is enabled
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Ceph RBD Mirror Charm | Invalid | Undecided | Unassigned | |
| Cinder | In Progress | Undecided | Corey Bryant | |
| Ubuntu Cloud Archive | Fix Released | High | Unassigned | |
| Mitaka | Won't Fix | High | Unassigned | |
| Queens | Fix Released | High | Unassigned | |
| Stein | Fix Released | High | Unassigned | |
| Train | Fix Released | High | Unassigned | |
| Ussuri | Fix Released | High | Unassigned | |
| Victoria | Fix Released | High | Unassigned | |
| cinder (Ubuntu) | Fix Released | High | Unassigned | |
| Xenial | Won't Fix | High | Unassigned | |
| Bionic | Fix Released | High | Unassigned | |
| Focal | Fix Released | High | Unassigned | |
| Groovy | Fix Released | High | Unassigned | |
Bug Description
[Impact]
OpenStack Train, Ceph Nautilus, with ceph-rbd-mirror deployed for two-way mirroring.
Cinder uses Ceph as the backend for volumes.
The problem occurs when creating a volume from a qcow2 image.
The current flow is the following:
1. Cinder creates empty volume in Ceph
2. Cinder downloads the image
3. Image is converted to raw
4. The empty volume from step 1 is deleted: https:/
5. Cinder performs "rbd import" using unpacked raw image as source.
Apparently the rbd-mirror daemon creates a snapshot on the image as soon as it is created in Ceph (presumably for mirroring purposes), and for an empty image that snapshot only exists for about a second.
Very often step 4 is performed during the window in which the snapshot still exists, and it then fails with a "Cannot delete the volume with snapshots" error.
The only way to avoid this behaviour is to disable mirroring of the backend pool, which is not desirable.
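For reference, the transient snapshot can be observed directly with the Python rbd bindings. The following is only an illustrative sketch; the pool name, image name and ceph.conf path are assumptions and need to be adapted to the deployment:

```python
# Illustrative only: poll a freshly created Cinder volume's RBD image and
# print any snapshots rbd-mirror attaches to it. Pool name, image name and
# conffile path below are assumptions, not values from this deployment.
import time

import rados
import rbd

POOL = 'cinder-ceph'          # assumed Cinder backend pool
IMAGE = 'volume-<uuid>'       # replace with the RBD image of the new volume

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    ioctx = cluster.open_ioctx(POOL)
    try:
        for _ in range(20):   # poll for roughly two seconds
            with rbd.Image(ioctx, IMAGE) as image:
                snaps = [snap['name'] for snap in image.list_snaps()]
            print(snaps or 'no snapshots')
            time.sleep(0.1)
    finally:
        ioctx.close()
finally:
    cluster.shutdown()
```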
[Test Case]
This is a light-weight test to ensure the code is working as expected, using the unit test from the patch:
lxc launch ubuntu-
lxc exec h1 /bin/bash
root@h1:~# sudo apt install python3-cinder
root@h1:~# cd /usr/lib/
root@h1:~# python3 -m unittest cinder.
/usr/lib/
self.
.
-------
Ran 1 test in 0.701s
OK
The test will fail if the fixed code is not installed.
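For readers without a cinder tree at hand, the shape of such a test is roughly the following self-contained sketch; the names here are hypothetical and are not taken from the actual cinder test suite:

```python
# Hypothetical, self-contained illustration of what the real cinder unit test
# asserts: a delete that first fails with VolumeIsBusy succeeds once retried.
# None of these names come from the cinder source tree.
import unittest
from unittest import mock


class VolumeIsBusy(Exception):
    """Stand-in for cinder.exception.VolumeIsBusy."""


def delete_with_retry(delete, attempts=3):
    """Call delete(), retrying while the transient mirror snapshot exists."""
    for attempt in range(attempts):
        try:
            return delete()
        except VolumeIsBusy:
            if attempt == attempts - 1:
                raise


class TestDeleteRetry(unittest.TestCase):
    def test_busy_volume_is_retried(self):
        delete = mock.Mock(side_effect=[VolumeIsBusy(), None])
        delete_with_retry(delete)
        self.assertEqual(2, delete.call_count)


if __name__ == '__main__':
    unittest.main()
```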
[Regression Potential]
This is a very minimal change that simply adds a retry when the delete in step 4 fails with the VolumeIsBusy ("volume has snapshots") error.
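As a rough sketch of that approach (not the actual cinder patch), the retry can be expressed as a small decorator around the delete path; every name below is illustrative:

```python
# Rough sketch of the approach described above, not the actual cinder change:
# retry the delete a few times with a short backoff while the transient
# rbd-mirror snapshot still makes the volume look busy. All names illustrative.
import functools
import time


class VolumeIsBusy(Exception):
    """Stand-in for cinder.exception.VolumeIsBusy."""


def retry(exc_type, attempts=3, delay=0.5):
    """Retry the wrapped call up to `attempts` times on `exc_type`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exc_type:
                    if attempt == attempts:
                        raise
                    time.sleep(delay * attempt)  # back off a little each time
        return wrapper
    return decorator


@retry(VolumeIsBusy)
def delete_temporary_volume(rbd_image_name):
    # In the real driver this removes the empty RBD image created in step 1;
    # it raises VolumeIsBusy while the short-lived mirror snapshot exists.
    print('deleting %s' % rbd_image_name)
```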
[Discussion]
This is accompanied by a unit test fix for: https:/
Changed in charm-ceph-rbd-mirror:
  assignee: nobody → Aurelien Lourot (aurelien-lourot)
Changed in cinder (Ubuntu Groovy):
  status: New → Triaged
Changed in cinder (Ubuntu Focal):
  status: New → Triaged
Changed in cinder (Ubuntu Bionic):
  status: New → Triaged
Changed in cinder (Ubuntu Xenial):
  status: New → Triaged
  importance: Undecided → High
Changed in cinder (Ubuntu Focal):
  importance: Undecided → High
Changed in cinder (Ubuntu Groovy):
  importance: Undecided → High
Changed in cinder (Ubuntu Bionic):
  importance: Undecided → High
Our environment has two ceph clusters (az1 and az2) with a separate ceph-rbd-mirror charm deployed to each. The ceph-rbd-mirror charm enables pool mirroring on all rbd pools.
Glance is backed by only one of the ceph clusters, az1.
We have volume types named after the ceph cluster, arbor-az1 and arbor-az2.
When we copy an image to cinder using the following openstack cli, it's fast and reliable when copying from az1 to az1, but we have greater than 50% failure when copying from az1 to az2.
openstack volume create --image bionic --size 3 --type arbor-az1 volume-az1 # works
openstack volume create --image bionic --size 3 --type arbor-az2 volume-az2 # fails >50%
When the copy to az2 fails, it always fails with "cinder.exception.VolumeIsBusy: deleting volume volume-<id> that has snapshot".
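A quick way to quantify the failure rate is to loop over the same CLI call and check the resulting volume status. The sketch below assumes OpenStack credentials are already sourced and reuses the image and volume-type names above:

```python
# Quick-and-dirty failure-rate check: repeat the CLI call quoted above and
# count how many volumes end up in 'error'. Assumes OpenStack credentials are
# already exported; image and volume-type names are the ones used above.
import subprocess
import time


def volume_status(name):
    out = subprocess.run(
        ['openstack', 'volume', 'show', name, '-f', 'value', '-c', 'status'],
        capture_output=True, text=True)
    return out.stdout.strip()


runs, failures = 20, 0
for i in range(runs):
    name = 'repro-volume-az2-%d' % i
    subprocess.run(
        ['openstack', 'volume', 'create', '--image', 'bionic', '--size', '3',
         '--type', 'arbor-az2', name],
        check=True, capture_output=True, text=True)
    status = 'creating'
    while status in ('creating', 'downloading'):
        time.sleep(5)
        status = volume_status(name)
    if status != 'available':
        failures += 1
    subprocess.run(['openstack', 'volume', 'delete', name],
                   capture_output=True, text=True)

print('%d/%d image-based volume creations failed' % (failures, runs))
```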
The full log from the cinder-volume log follows.
2020-10-19 10:15:35.871 45458 INFO cinder.volume.flows.manager.create_volume [req-a4905f43-c6f8-42e4-a602-a00238068492 f2ee9e8060e54d058060d220fab84088 0c0bb65d03f446ca8a845d1ae9229790 - 1e59a4057fb24003a902bc4df6247d3c 1e59a4057fb24003a902bc4df6247d3c] Volume 63e4d69c-0722-40fc-b44f-cb59fca0d435: being created as image with specification: {'status': 'creating', 'volume_name': 'volume-63e4d69c-0722-40fc-b44f-cb59fca0d435', 'volume_size': 20, 'image_id': 'e93f97e0-f514-435d-a277-0ac288ed0c6c', 'image_location': ('rbd://36bd979e-0511-11eb-bced-ecebb88db476/glance/e93f97e0-f514-435d-a277-0ac288ed0c6c/snap', [{'url': 'rbd://36bd979e-0511-11eb-bced-ecebb88db476/glance/e93f97e0-f514-435d-a277-0ac288ed0c6c/snap', 'metadata': {}}]), 'image_meta': {'name': 'UBUNTU-18.04', 'disk_format': 'qcow2', 'container_format': 'bare', 'visibility': 'public', 'size': 359923712, 'virtual_size': None, 'status': 'active', 'checksum': '9aa011b2b79b1fe42a7c306555923b1b', 'protected': False, 'min_ram': 0, 'min_disk': 0, 'owner': '86fcc0b3839b45029dd325641ddc2a09', 'os_hidden': False, 'os_hash_algo': 'sha512', 'os_hash_value': '03786c51866c1c6c50ef671502b265afb71084e0d018748ca9b29f871ca5445c9eefffffdf927d540bddaf0e8597adc673e218e785fcf16013fba7e7d9898e6e', 'id': 'e93f97e0-f514-435d-a277-0ac288ed0c6c', 'created_at': datetime.datetime(2020, 10, 15, 16, 49, tzinfo=<iso8601.Utc>), 'updated_at': datetime.datetime(2020, 10, 15, 16, 49, 18, tzinfo=<iso8601.Utc>), 'locations': [{'url': 'rbd://36bd979e-0511-11eb-bced-ecebb88db476/glance/e93f97e0-f514-435d-a277-0ac288ed0c6c/snap', 'metadata': {}}], 'direct_url': 'rbd://36bd979e-0511-11eb-bced-ecebb88db476/glance/e93f97e0-f514-435d-a277-0ac288ed0c6c/snap', 'tags': [], 'file': '/v2/images/e93f97e0-f514-435d-a277-0ac288ed0c6c/file', 'properties': {}}, 'image_service': <cinder.image.glance.GlanceImageService object at 0x7fb92712cda0>} image.image_utils [req-a4905f43-c6f8-42e4-a602-a00238068492 f2ee9e8060e54d058060d220fab84088 0c0bb65d03f446ca8a845d1ae9229790 - 1e59a4057fb24003a902bc4df6247d3c 1e59a4057fb24003a902bc4df6247d3c] Image download 343.25 MB at 70.20 MB/s image.image_utils [req-a4905f43-c6f8-42e4-a602-a00238068492 f2ee9e8060e54d058060d220fab84088 0c0bb65d03f446ca8a845d1ae9229790 - 1e59a4057fb24003a902bc...
2020-10-19 10:15:40.810 45458 INFO cinder.
2020-10-19 10:15:44.685 45458 INFO cinder.