[HA] aborting a starting node introspection can lead to an error state

Bug #1721536 reported by milan k
Affects: Ironic Inspector
status: In Progress
importance: Medium
assignee: milan k

Bug Description

A state transition error seems to "leak" somewhere in the introspection abort call chain. Instead of just being reported to the user, it eventually makes it into the node state:

Mon Sep 25 15:23:24 CEST 2017
+-------------+-----------------------------------------------------------------------------------+
| Field       | Value                                                                             |
+-------------+-----------------------------------------------------------------------------------+
| error       | Can not transition from state 'starting' on event 'abort' (no defined transition) |
| finished    | True                                                                              |
| finished_at | 2017-09-25T13:23:24                                                               |
| started_at  | 2017-09-25T13:23:21                                                               |
| state       | error                                                                             |
| uuid        | 28c1ea1d-f03e-4603-95e6-0ece84d914e5                                              |
+-------------+-----------------------------------------------------------------------------------+
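
To make the suspected leak concrete, here is a minimal, self-contained Python sketch. It is not ironic-inspector code: `TRANSITIONS`, `ToyNode`, `abort_leaky` and `abort_reporting` are hypothetical names, and the transition table is an assumption that only mirrors the missing 'starting' + 'abort' transition from the output above.

```python
class InvalidTransition(Exception):
    """Raised when the toy FSM has no transition for (state, event)."""


# Assumed toy transition table; deliberately no ('starting', 'abort') entry.
TRANSITIONS = {
    ('starting', 'wait'): 'waiting',
    ('waiting', 'abort'): 'error',
    ('waiting', 'process'): 'processing',
}


class ToyNode:
    def __init__(self, state='starting'):
        self.state = state
        self.error = None

    def fire(self, event):
        """Apply an event, or raise if no transition is defined."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise InvalidTransition(
                "Can not transition from state %r on event %r "
                "(no defined transition)" % key)
        self.state = TRANSITIONS[key]


def abort_leaky(node):
    """The failed transition is written into the node's own state."""
    try:
        node.fire('abort')
    except InvalidTransition as exc:
        node.state = 'error'
        node.error = str(exc)


def abort_reporting(node):
    """The failed transition is only reported to the caller."""
    node.fire('abort')  # InvalidTransition propagates; node is untouched
```

With `abort_leaky`, a node that was still 'starting' ends up in 'error' carrying the transition message, which is what the output above looks like. With `abort_reporting`, the caller gets the exception and the node stays in 'starting'.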

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to ironic-inspector (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/510052

Dmitry Tantsur (divius)
Changed in ironic-inspector:
status: New → Confirmed
status: Confirmed → Triaged
importance: Undecided → Medium
milan k (vetrisko) wrote :

The root cause is the abort call being decorated with `fsm_transition(istate.Events.abort, reentrant=False)` [1]. If two inspector instances compete for a single node, errors like the following can appear:

  2017-09-27 04:33:20.560 ERROR ironic_inspector.node_cache [-]
  [node: 28c1ea1d-f03e-4603-95e6-0ece84d914e5 state error]
  Processing the error event because of an exception
  <class 'ironic_inspector.utils.NodeStateRaceCondition'>: Node state mismatch detected between the
  DB and the cached node_info object raised by ironic_inspector.introspect._abort:
  NodeStateRaceCondition: Node state mismatch detected between the DB and the cached node_info
  object

[1] https://github.com/openstack/ironic-inspector/blob/master/ironic_inspector/introspect.py#L154
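
A rough sketch of how two instances competing for one node could hit such a mismatch. This is a toy model, not the inspector's implementation: `DB`, `CachedNode` and `StateRace` are hypothetical names, and the assumption that a successful abort moves the node to 'error' is mine.

```python
class StateRace(Exception):
    """Hypothetical stand-in for a DB/cache state mismatch error."""


# Toy shared database: one row per node, holding its state.
DB = {'node-1': 'waiting'}


class CachedNode:
    """Per-instance cached view of a node (assumed model)."""

    def __init__(self, uuid):
        self.uuid = uuid
        self.state = DB[uuid]  # snapshot taken when the instance loads the node

    def abort(self):
        # Refuse to act on a stale snapshot of the node state.
        if DB[self.uuid] != self.state:
            raise StateRace('Node state mismatch detected between the DB '
                            'and the cached node_info object')
        # Assumed outcome of a successful abort.
        DB[self.uuid] = self.state = 'error'


# Two inspector instances load the same node before either one acts.
first = CachedNode('node-1')
second = CachedNode('node-1')

first.abort()  # wins the race and updates the DB

try:
    second.abort()  # cached state is now stale
except StateRace as exc:
    print('second instance failed: %s' % exc)
```

In this model the second instance's failure is legitimate, but it should only be reported to its caller, not recorded as the node's error.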

OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic-inspector (master)

Fix proposed to branch: master
Review: https://review.openstack.org/510929

Changed in ironic-inspector:
assignee: nobody → milan k (vetrisko)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Change abandoned on ironic-inspector (master)

Change abandoned by Milan Kováčik (<email address hidden>) on branch: master
Review: https://review.openstack.org/510052
Reason: not needed anymore

OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Milan Kováčik (<email address hidden>) on branch: master
Review: https://review.openstack.org/510929
Reason: not needed anymore
