cinder schedules backups on disabled backup services
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Cinder | New | Medium | Unassigned | | |
Bug Description
To reproduce:
1) Set up multiple cinder-backup nodes
2) Mark one of them as disabled via 'openstack volume service set --disable'
3) Create some backups
4) Note that plenty of those backups get sent to the disabled node
Why this matters:
Running 17.2, I've seen quite a few ugly lockups where a backup is scheduled, gets placed in the 'creating' state, and then stays there forever.
At least some of the time this is caused by a backup node going down before the scheduler has noticed; in that case the job gets scheduled on the down node and stays stuck until the node comes back up, if it ever does.
A possible (if unfortunate) workaround for this would be to mark a node as disabled (by hand) anytime it's down, but the scheduler doesn't care whether a node is marked as disabled, so that workaround has no effect.
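To make the expected behaviour concrete, here is a minimal sketch of the host selection I would expect: skip any backup service that is marked disabled or whose last heartbeat is too old. This is not Cinder's actual code; the `BackupService` class, the function names, and the 60-second threshold are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class BackupService:
    """Hypothetical stand-in for a cinder-backup service record."""
    host: str
    disabled: bool
    updated_at: datetime  # time of the last heartbeat


def is_up(service: BackupService, now: datetime,
          down_after: timedelta = timedelta(seconds=60)) -> bool:
    # Assumed rule: a service counts as up if it reported in recently enough.
    return now - service.updated_at <= down_after


def eligible_services(services: List[BackupService],
                      now: datetime) -> List[BackupService]:
    # A host is a candidate only if it is both enabled and up.
    return [s for s in services if not s.disabled and is_up(s, now)]


def pick_backup_host(services: List[BackupService],
                     now: Optional[datetime] = None) -> Optional[str]:
    now = now or datetime.now(timezone.utc)
    candidates = eligible_services(services, now)
    if not candidates:
        return None  # nothing usable; the caller should fail the backup
    return random.choice(candidates).host
```

With a filter like this in place, disabling a node by hand would at least keep new backups off it, even when the scheduler has not yet noticed that the node is down.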
(Obviously the REAL issue here is the lack of any kind of feedback during the backup; ideally there would be some way to detect stuck jobs and retry them.)
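As a rough illustration of what detecting stuck jobs could look like, the sketch below flags backups that have sat in the 'creating' state for longer than some cutoff. The `Backup` class, the `find_stuck_backups` name, and the one-hour cutoff are hypothetical; a real check would need a cutoff that accounts for large volumes that legitimately take a long time to back up.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Backup:
    """Hypothetical stand-in for a backup record."""
    id: str
    status: str
    updated_at: datetime  # last time the record changed


def find_stuck_backups(backups: List[Backup], now: datetime,
                       max_age: timedelta = timedelta(hours=1)) -> List[Backup]:
    # Assumed rule: 'creating' with no update for longer than max_age is stuck.
    return [
        b for b in backups
        if b.status == "creating" and now - b.updated_at > max_age
    ]
```

Anything this returns could then be reset to an error state or rescheduled on another backup host.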
Changed in cinder: | |
importance: | Undecided → Medium |
tags: | added: backup-service schedules |