In rare cases L3 wasn't rescheduled after destroying controller
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Mirantis OpenStack | Fix Released | High | MOS Maintenance |
8.0.x | Invalid | High | MOS Maintenance |
9.x | Fix Released | High | Oleg Bondarev |
Bug Description
8.0 iso - 566
Scenario:
1. Deploy the following cluster: Neutron VXLAN, all other settings default, 3 controller, 2 compute, and 1 cinder node
2. Create an instance with a key pair
3. Manually reschedule the router from the primary controller to another one (from node-3 to node-4)
4. Destroy controller with l3-agent (node-4)
5. Wait until all HA OSTF tests pass
6. Check l3-agent was rescheduled
Actual result: the router is still hosted by the dead agent
neutron l3-agent-
+------
| id | host | admin_state_up | alive | ha_state |
+------
| 75362c69-
+------
Changed in fuel: | |
assignee: | MOS Neutron (mos-neutron) → Oleg Bondarev (obondarev) |
Changed in fuel: | |
status: | New → Confirmed |
tags: | added: area-neutron |
tags: | added: move-to-mu |
tags: | added: release-notes |
summary: |
- L3 wasn't rescheduled after destroying controller + In rare L3 wasn't rescheduled after destroying controller |
summary: |
- In rare L3 wasn't rescheduled after destroying controller + In rare cases L3 wasn't rescheduled after destroying controller |
tags: |
added: 8.0 release-notes-done removed: release-notes |
no longer affects: | fuel |
no longer affects: | fuel/8.0.x |
tags: | added: wait-for-stable |
tags: | added: on-verification |
For some reason, the looping "rescheduling" task in the running neutron server stops working. This task should check for down bindings and reschedule routers away from down agents. Restarting the neutron server fixes the issue and the router is rescheduled. This means the problem is not in the DB logic that identifies dead bindings but in the looping task itself, which either stops running or hangs. Further investigation is needed.
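To make the expected behaviour concrete, here is a minimal sketch of one pass of such a check: find routers bound to agents whose heartbeat is too old and move them to a live agent. It is an illustration only, not the neutron server code. The `Agent` class, the `reschedule_routers_from_down_agents` function, the in-memory `bindings` dict, and the least-loaded placement rule are all hypothetical; the 75-second threshold is assumed to play the role of Neutron's `agent_down_time` setting.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


@dataclass
class Agent:
    """Hypothetical stand-in for an L3 agent record."""
    host: str
    last_heartbeat: datetime
    admin_state_up: bool = True


def is_alive(agent: Agent, now: datetime, agent_down_time: timedelta) -> bool:
    # An agent counts as alive if it reported within the threshold.
    return now - agent.last_heartbeat <= agent_down_time


def reschedule_routers_from_down_agents(
    bindings: Dict[str, str],      # router_id -> host (hypothetical model)
    agents: Dict[str, Agent],      # host -> agent record
    now: datetime,
    agent_down_time: timedelta,
) -> List[Tuple[str, str, str]]:
    """Run one pass; return (router_id, old_host, new_host) for each move."""
    alive_hosts = sorted(
        host for host, agent in agents.items()
        if agent.admin_state_up and is_alive(agent, now, agent_down_time)
    )
    moved = []
    for router_id, host in sorted(bindings.items()):
        agent = agents.get(host)
        if agent is not None and is_alive(agent, now, agent_down_time):
            continue  # binding is healthy, nothing to do
        if not alive_hosts:
            continue  # no live agent to move the router to
        # Assumed placement rule: least-loaded live host, ties broken by name.
        new_host = min(
            alive_hosts,
            key=lambda h: (sum(1 for b in bindings.values() if b == h), h),
        )
        bindings[router_id] = new_host
        moved.append((router_id, host, new_host))
    return moved


if __name__ == "__main__":
    now = datetime(2016, 1, 1, 12, 0, 0)
    down_time = timedelta(seconds=75)
    agents = {
        "node-3": Agent("node-3", now - timedelta(seconds=5)),
        "node-4": Agent("node-4", now - timedelta(minutes=10)),  # destroyed
        "node-5": Agent("node-5", now - timedelta(seconds=5)),
    }
    bindings = {"router-1": "node-4"}
    print(reschedule_routers_from_down_agents(bindings, agents, now, down_time))
    # prints [('router-1', 'node-4', 'node-3')]
```

In the scenario from this report, a working periodic task would be expected to produce a move like the one above shortly after node-4 is destroyed. The observation that a server restart triggers the reschedule is consistent with the per-pass logic being sound while the loop that invokes it is not running.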