RFE: enhancements in cinder-volume service configuration

Bug #1881911 reported by Andre Ruiz
Affects: OpenStack Cinder Charm
Status: In Progress
Importance: Wishlist
Assigned to: Andre Ruiz

Bug Description

This is a proposal to enhance the cinder charm with some new features that are needed in a customer installation. All changes are related to the cinder-volume functionality of the charm. The cinder-volume service essentially implements the CinderLVM driver, using LVM + iSCSI (tgt) to present "external/permanent" storage to a cluster that only has internal disks.

A little more context:

This is a "telco cloud" that has some specific characteristics. One of them is that workloads in this cloud are deployed using an orchestrator and tasks like creating VMs, deleting VMs and upgrading software inside those VMs are automatic and hardcoded, difficult to change (from different vendor).

Of special note is the fact that said orchestrator uses a particular method for upgrading software on the cloud: it keeps all important data (databases and other data) on secondary disks, leaving only the operating system on the first disk. When it needs to upgrade a machine, it tears down that machine, detaches all external volumes, deletes the VM, provisions a new VM with a more current image (newer software versions), configures the service inside that VM, re-attaches the external disks to that VM and starts all services again.

It's clear that nova's internal ephemeral storage is not adequate for this task. Said VMs frequently have 2 to 4 extra disks, which cannot be provided with ephemeral storage, and worse, none of those disks would survive the upgrade method used by the orchestrator.

Unfortunately, this cloud was designed by the client to have only local disks on compute nodes (a lot of disks) and no external storage (it does actually include a small amount of Ceph, but that is only used for backups since its performance is not good enough for the databases, so we are ignoring it here). The client expected to have permanent storage on local disks, and the software vendor did not steer the client away from the problem (and we came to know that too late).

The simplest solution was to use CinderLVM (the cinder-volume service in the cinder charm) to expose part of the internal disks with the semantics of external storage. This solved the problem while keeping performance good enough.

Unfortunately, there are some details of this implementation that are not addressed by the charm at the moment. These are:

- Thin provisioning vs Thick provisioning

The CinderLVM driver can provision both thin and thick volumes, but the charm only knows how to configure thin volumes. We need to add an option to enable the use of thick volumes (a client request).

This is simple enough to implement: it's just a true/false value in the volume definition section of the config file, and can be added to the template and controlled by a charm option (a sketch of the resulting backend section follows).
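
For reference, a minimal sketch of what the rendered backend section could look like in each mode. lvm_type is the upstream CinderLVM driver option that selects thick ("default") or thin provisioning; the section name here is illustrative:

========================8<-------------------------
[LVM-compute5-fast]
volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
# thick-provisioned volumes ("default" is the driver's thick mode)
lvm_type = default
# for thin-provisioned volumes the charm would instead render:
# lvm_type = thin
========================8<-------------------------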

- Multiple named backends

The charm today only accepts one device for provisioning. The client needs more than one device per host. For example, they have one NVMe and one RAID5 array (bcached by a second NVMe) on each compute host and would like to be able to address them separately, named "fast" and "slow" respectively.

Having more than one backend is just a matter of having more than one section in the config file, but the charm currently only provisions one. It needs to be changed to accept a list of devices (actually a dictionary mapping each name to a list of devices). The list of devices is one or more PVs that will be added to a VG, and the name is how that specific backend is addressed in the CinderLVM driver.

Note that there are two enhancements here: one is allowing more than one device to be added to the same VG, and the second is allowing more than one VG, giving each a specific name (like "lvm-fast" instead of the default "lvm"); a sketch of the LVM layout follows.
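
As a sketch of what this means at the LVM level (device names are illustrative; the VG names match the example config further below), several PVs are aggregated into one named VG while a second VG keeps its own name:

ubuntu@compute5:~$ sudo pvcreate /dev/nvme1n1 /dev/sdb /dev/sdc /dev/sdd
ubuntu@compute5:~$ sudo vgcreate cinder-volumes-fast /dev/nvme1n1
ubuntu@compute5:~$ sudo vgcreate cinder-volumes-slow /dev/sdb /dev/sdc /dev/sdd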

- Individually addressable hosts

For performance reasons, the client wants to keep volumes on the same host as the VM. Benchmarks and tests have been run, and they confirmed that performance is much improved when the volume is on the same host. To be able to control that, each CinderLVM backend on each compute node needs to have the compute node's name added to it, so that cinder sees many different backends instead of a single one that spans many hosts. This has been tested and proved to work well.

This is also a very easy change: it is just a matter of including the compute node name in the backend name. This can be changed in the template and controlled by a charm option. When enabled, it would make the backend name change from "lvm-fast" to "lvm-compute3-fast", for example.

- Control erase of old volumes

Add an option to control the erasing of deleted volumes. The default is to wipe the complete volume on deletion, which can take a lot of time. An option should be added to control that; the driver options involved are sketched below.
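
For reference, the upstream driver already exposes this through two options: volume_clear selects the wipe method ("zero" or "none") and volume_clear_size caps how many MiB are wiped (0 means the whole volume). A sketch, with illustrative values:

========================8<-------------------------
[LVM-compute5-slow]
# skip wiping entirely when a volume is deleted
volume_clear = none
# or: overwrite with zeros, but only the first 50 MiB
# volume_clear = zero
# volume_clear_size = 50
========================8<-------------------------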

---

Note: this is a very application-specific cloud and VMs are not normally moved around. It is understood that keeping volumes local to a VM means the volume would need to be moved whenever the VM is moved; this is an acceptable limitation.

With the last two features working together (multiple named backends and individually addressable hosts) we can have a scenario like this (a real example from the client today):

15 compute nodes, each with:
  - #1 NVMe
  - #2 NVMe
  - #3 RAID10 array
  - #4 RAID10 array
  - #5 RAID10 array

Configured as follows:

- #1 is mounted for nova local ephemeral storage
- #2 is fast storage for CinderLVM
- #3 is slow storage for CinderLVM
- #4 is slow storage for CinderLVM
- #5 is slow storage for CinderLVM

Expected config in the charm (cinder-volume) is:

- #2 appears as "lvm-<computename>-fast"
- #3 + #4 + #5 appear as "lvm-<computename>-slow" (many PVs in the same VG in the backend)

This is an example of a config from a cinder.conf in a compute node:

========================8<-------------------------
enabled_backends = LVM-compute5-fast,LVM-compute5-slow

[LVM-compute5-slow]
volumes_dir = /var/lib/cinder/volumes
volume_name_template = volume-%s
volume_group = cinder-volumes-slow
volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
volume_backend_name = LVM-compute5-slow
lvm_type = default
volume_clear_size=50

[LVM-compute5-fast]
volumes_dir = /var/lib/cinder/volumes
volume_name_template = volume-%s
volume_group = cinder-volumes-fast
volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
volume_backend_name = LVM-compute5-fast
lvm_type = default
volume_clear_size=50
========================8<-------------------------

After configuring all hosts like that, OpenStack presents them as different backends in cinder:

ubuntu@infra1:~$ openstack volume service list | grep -i LVM-
| cinder-volume | compute12@LVM-compute12-fast | nova | enabled | up | 2020-06-03T13:39:08.000000 |
| cinder-volume | compute12@LVM-compute12-slow | nova | enabled | up | 2020-06-03T13:39:09.000000 |
| cinder-volume | compute7@LVM-compute7-fast | nova | enabled | up | 2020-06-03T13:39:11.000000 |
| cinder-volume | compute7@LVM-compute7-slow | nova | enabled | up | 2020-06-03T13:39:09.000000 |
| cinder-volume | compute14@LVM-compute14-fast | nova | enabled | up | 2020-06-03T13:39:08.000000 |
| cinder-volume | compute14@LVM-compute14-slow | nova | enabled | up | 2020-06-03T13:39:08.000000 |
| cinder-volume | compute10@LVM-compute10-fast | nova | enabled | up | 2020-06-03T13:39:09.000000 |
| cinder-volume | compute10@LVM-compute10-slow | nova | enabled | up | 2020-06-03T13:39:09.000000 |
| cinder-volume | compute3@LVM-compute3-fast | nova | enabled | up | 2020-06-03T13:39:08.000000 |
| cinder-volume | compute3@LVM-compute3-slow | nova | enabled | up | 2020-06-03T13:39:08.000000 |
| cinder-volume | compute15@LVM-compute15-fast | nova | enabled | up | 2020-06-03T13:39:10.000000 |
| cinder-volume | compute15@LVM-compute15-slow | nova | enabled | up | 2020-06-03T13:39:11.000000 |
| cinder-volume | compute13@LVM-compute13-fast | nova | enabled | up | 2020-06-03T13:39:10.000000 |
| cinder-volume | compute13@LVM-compute13-slow | nova | enabled | up | 2020-06-03T13:39:09.000000 |
| cinder-volume | compute6@LVM-compute6-fast | nova | enabled | up | 2020-06-03T13:39:11.000000 |
| cinder-volume | compute6@LVM-compute6-slow | nova | enabled | up | 2020-06-03T13:39:08.000000 |
| cinder-volume | compute11@LVM-compute11-fast | nova | enabled | up | 2020-06-03T13:39:11.000000 |
| cinder-volume | compute11@LVM-compute11-slow | nova | enabled | up | 2020-06-03T13:39:11.000000 |
| cinder-volume | compute4@LVM-compute4-fast | nova | enabled | up | 2020-06-03T13:39:09.000000 |
| cinder-volume | compute4@LVM-compute4-slow | nova | enabled | up | 2020-06-03T13:39:08.000000 |
| cinder-volume | compute9@LVM-compute9-fast | nova | enabled | up | 2020-06-03T13:39:10.000000 |
| cinder-volume | compute9@LVM-compute9-slow | nova | enabled | up | 2020-06-03T13:39:09.000000 |
| cinder-volume | compute2@LVM-compute2-fast | nova | enabled | up | 2020-06-03T13:39:09.000000 |
| cinder-volume | compute2@LVM-compute2-slow | nova | enabled | up | 2020-06-03T13:39:08.000000 |
| cinder-volume | compute8@LVM-compute8-fast | nova | enabled | up | 2020-06-03T13:39:08.000000 |
| cinder-volume | compute8@LVM-compute8-slow | nova | enabled | up | 2020-06-03T13:39:11.000000 |
| cinder-volume | compute5@LVM-compute5-fast | nova | enabled | up | 2020-06-03T13:39:11.000000 |
| cinder-volume | compute5@LVM-compute5-slow | nova | enabled | up | 2020-06-03T13:39:10.000000 |
| cinder-volume | compute1@LVM-compute1-fast | nova | enabled | up | 2020-06-03T13:39:11.000000 |
| cinder-volume | compute1@LVM-compute1-slow | nova | enabled | up | 2020-06-03T13:39:12.000000 |

You can create different "types" to use when creating the volumes:

ubuntu@infra1:~$ openstack volume type list
+--------------------------------------+--------------------+-----------+
| ID | Name | Is Public |
+--------------------------------------+--------------------+-----------+
| 7a897b36-f98f-4e89-a85b-26f34e97c358 | lvm-compute15-slow | True |
| f09660e8-209a-44c5-b907-39743e00e27b | lvm-compute15-fast | True |
| 2d5ca8db-bcfa-49da-a0e0-06d3f4d4d276 | lvm-compute14-slow | True |
| ccfad51d-48b2-42b0-937b-bdbe95365f45 | lvm-compute14-fast | True |
| b720fb1b-21d2-487d-8aec-2e662202acf9 | lvm-compute13-slow | True |
| dd21e676-c53c-4d90-90ee-9ace58ddf1eb | lvm-compute13-fast | True |
| 33ca11ba-2b23-413f-902b-ba48ad93daba | lvm-compute12-slow | True |
| f2e2f294-5ae4-49bd-9b41-e70cc248c66b | lvm-compute12-fast | True |
| 4df2a465-191d-4801-8fb3-14b859330737 | lvm-compute11-slow | True |
| 12ed6955-be72-4af8-9148-cfff92f19d48 | lvm-compute11-fast | True |
| db074d42-a4e7-478e-b360-d6bdabcad9eb | lvm-compute10-slow | True |
| 840e169e-2bcf-4d63-9af5-430e6fb3db3c | lvm-compute10-fast | True |
| 2e277532-e389-42ad-87de-9ac2e0c4ddc3 | lvm-compute9-slow | True |
| 466c073a-0f7d-4564-a37f-1ecfa527ac41 | lvm-compute9-fast | True |
| ded6d88f-ab89-4faf-9917-33a9b2d159c9 | lvm-compute8-slow | True |
| 35fe4835-255d-4cc0-8feb-5996647cb35c | lvm-compute8-fast | True |
| c71448d4-91f2-473a-b178-aab8076c86c2 | lvm-compute7-slow | True |
| 983121f0-63e2-4222-b54c-79bc6978f84c | lvm-compute7-fast | True |
| 34d89d60-d367-4338-bb4b-776da6c42aaa | lvm-compute6-slow | True |
| d03f6b4b-ce4c-4743-b305-e07a9daaf58d | lvm-compute6-fast | True |
| d71e994c-fe05-4e20-808a-eedd179ba490 | lvm-compute5-slow | True |
| 78aec217-cb5b-4678-b108-1fe09dfb9c4d | lvm-compute5-fast | True |
| 70145942-81c3-4899-9545-8578bab2bfb5 | lvm-compute4-slow | True |
| 1206915a-0596-43ef-bb5d-f82195855e1b | lvm-compute4-fast | True |
| 194203a7-bab2-4094-a294-8fb6db4a0d0a | lvm-compute3-slow | True |
| 10e07b2b-7966-4d9f-b7e5-0040baabf5e1 | lvm-compute3-fast | True |
| 9083017d-b617-4a02-ae26-3bfc59e9579d | lvm-compute2-slow | True |
| efec0c91-e8b0-4f6c-9d2e-5bd5d6640652 | lvm-compute2-fast | True |
| 14986062-58d2-4b37-9232-eca02aec7ae6 | lvm-compute1-slow | True |
| 218c7e82-ae68-4322-a0c4-f5b6b861e95e | lvm-compute1-fast | True |
| 18cdbdd6-946d-4a8f-a808-a0cfa346390f | ceph | True |
+--------------------------------------+--------------------+-----------+

Each of these types has a backend filter that allows you to ask for a specific backend on a specific host:

ubuntu@infra1:~$ openstack volume type show lvm-compute15-slow
+--------------------+------------------------------------------+
| Field | Value |
+--------------------+------------------------------------------+
| access_project_ids | None |
| description | None |
| id | 7a897b36-f98f-4e89-a85b-26f34e97c358 |
| is_public | True |
| name | lvm-compute15-slow |
| properties | volume_backend_name='LVM-compute15-slow' |
| qos_specs_id | None |
+--------------------+------------------------------------------+
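
For reference, types like the one above can be created and pinned to a backend with the standard client, and then used when creating a volume (the volume name and size here are illustrative):

ubuntu@infra1:~$ openstack volume type create lvm-compute15-slow
ubuntu@infra1:~$ openstack volume type set --property volume_backend_name='LVM-compute15-slow' lvm-compute15-slow
ubuntu@infra1:~$ openstack volume create --type lvm-compute15-slow --size 100 db-data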

All of this works well and has been in testing for the last 6 months without problems. The only thing missing now is to incorporate these changes into the charm so that no manual intervention is necessary after deploying a new cloud.

This bug was opened as a way to keep track of these enhancements.

Tags: rfe
Ryan Beisner (1chb1n)
Changed in charm-cinder:
importance: Undecided → Wishlist
Andrea Ieri (aieri) wrote:

Marking as field-high to better reflect the importance of this bug, since this functionality is needed by a customer. The code will, however, be provided by Andre Ruiz.

Andre Ruiz (andre-ruiz)
Changed in charm-cinder:
assignee: nobody → Andre Ruiz (andre-ruiz)
Ryan Beisner (1chb1n) wrote:

New features should not be flagged as SLA field-high. After there is something to review as a contribution, they may be flagged as field-medium.

Changed in charm-cinder:
status: New → Opinion
Ryan Beisner (1chb1n) wrote:

When there is code to review, please reset to a New state. Thanks again!

Changed in charm-cinder:
status: Opinion → Incomplete
Andre Ruiz (andre-ruiz)
summary: - Feature Request: enhancements in cinder-volume service configuration
+ RFE: enhancements in cinder-volume service configuration
tags: added: rfe
Andre Ruiz (andre-ruiz)
description: updated
Chris Sanders (chris.sanders) wrote:

@Andre Ruiz, this has not been set back to 'new'; what's the status of the code for this bug?

Andre Ruiz (andre-ruiz) wrote:

Status is in progress; it is under development right now. No ETA yet. I expected a few days for changing and testing the original charm, but after discussing this with OpenStack Engineering it was decided to create a separate (subordinate) charm and move all this logic there. That complicates things a bit (it makes things easier in the long term, but is more work up front).

Changed in charm-cinder:
status: Incomplete → In Progress
Andre Ruiz (andre-ruiz) wrote:

Development has reached a stable point where the charm is usable. It is still undergoing review by the OpenStack engineering team, and unit/integration tests still need to be written, but a preliminary version is available and has passed all field tests for functionality.

Code is at:

~andre-ruiz/+git/charm-cinder-lvm
https://git.launchpad.net/~andre-ruiz/+git/charm-cinder-lvm/tree/

Test charm is at:

Cinder LVM charm at the Charm Store
https://jaas.ai/u/andre-ruiz/cinder-lvm

An example of use is documented in the charm README, but feel free to reach out if you have any questions.
