Node stuck in CLEANWAIT if agent ramdisk fails to boot

Bug #1483120 reported by Ramakrishnan G (rameshg87)
This bug affects 1 person
Affects: Ironic
Status: Fix Released
Importance: High
Assigned to: Lucas Alvares Gomes
Milestone: 4.1.0

Bug Description

Currently, if the agent ramdisk fails to boot when cleaning is initiated with the agent_ipmitool driver, the node is stuck in states.CLEANWAIT forever. We need a periodic task that checks for nodes stuck in states.CLEANWAIT that also have node.clean_step (which indicates that cleaning was just initiated on the node).

Tags: agent
Lucas Alvares Gomes (lucasagomes) wrote :

Yeah, and we don't offer any API to actually abort the cleaning task.

Changed in ironic:
assignee: nobody → Lucas Alvares Gomes (lucasagomes)
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (master)

Fix proposed to branch: master
Review: https://review.openstack.org/213240

Changed in ironic:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/213241

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/213698

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/213699

Dmitry Tantsur (divius)
tags: added: agent
OpenStack Infra (hudson-openstack) wrote : Change abandoned on ironic (master)

Change abandoned by Lucas Alvares Gomes (<email address hidden>) on branch: master
Review: https://review.openstack.org/213241
Reason: Won't be needed anymore

OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Lucas Alvares Gomes (<email address hidden>) on branch: master
Review: https://review.openstack.org/213240
Reason: Won't be needed anymore

OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Lucas Alvares Gomes (<email address hidden>) on branch: master
Review: https://review.openstack.org/213698
Reason: Not needed anymore

OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (master)

Reviewed: https://review.openstack.org/213699
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=ea1b012e93da66d28369eadf687cacf1c1d7d994
Submitter: Jenkins
Branch: master

commit ea1b012e93da66d28369eadf687cacf1c1d7d994
Author: Lucas Alvares Gomes <email address hidden>
Date: Thu Aug 20 09:54:02 2015 +0100

    Periodically checks for nodes being cleaned

    This patch adds a periodic task to check for nodes waiting for the
    ramdisk callback while cleaning is being executed.

    A new configuration option called "clean_callback_timeout" was added.
    Its value is the number of seconds that the Ironic conductor will wait
    for the ramdisk doing the cleaning to contact Ironic back. Defaults
    to 1800.

    Closes-Bug: #1483120
    Change-Id: Id7f9e9018b5cb2389bbe556171e7a9d46425afba
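
The check described in the commit message can be pictured roughly as follows. This is a minimal, self-contained sketch and not the code from the merged patch: the `Node` class, the `provision_updated_at` field used as the start of the wait, the string values standing in for `states.CLEANWAIT` and the failure state, the move to a failure state on timeout, and the "0 disables the check" convention are all assumptions made for illustration. Only the option name `clean_callback_timeout` and its default of 1800 seconds come from the commit message.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List

# Stand-ins for Ironic's state constants (assumed values, for illustration).
CLEANWAIT = "clean wait"
CLEANFAIL = "clean failed"

# Default taken from the commit message: seconds to wait for the ramdisk.
CLEAN_CALLBACK_TIMEOUT = 1800


@dataclass
class Node:
    """Hypothetical, simplified node record used only in this sketch."""
    uuid: str
    provision_state: str
    provision_updated_at: datetime  # assumed: when the node entered its state
    last_error: str = ""


def fail_timed_out_cleaning(nodes: Iterable[Node], now: datetime,
                            timeout: int = CLEAN_CALLBACK_TIMEOUT) -> List[Node]:
    """Mark nodes that waited in CLEANWAIT longer than `timeout` as failed.

    Intended to be called from a periodic task. Returns the nodes it changed.
    """
    if timeout <= 0:
        # Assumption for this sketch: a non-positive timeout disables the check.
        return []
    deadline = now - timedelta(seconds=timeout)
    timed_out = [n for n in nodes
                 if n.provision_state == CLEANWAIT
                 and n.provision_updated_at <= deadline]
    for node in timed_out:
        node.provision_state = CLEANFAIL
        node.last_error = ("Timeout reached while waiting for the cleaning "
                           "ramdisk to call back")
    return timed_out


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    nodes = [
        Node("stuck", CLEANWAIT, now - timedelta(seconds=3600)),
        Node("recent", CLEANWAIT, now - timedelta(seconds=60)),
    ]
    for node in fail_timed_out_cleaning(nodes, now):
        print(node.uuid, node.provision_state)
```

With the default timeout, only the node that has been waiting for an hour is moved out of the waiting state; the node that started waiting a minute ago is left alone until a later run of the periodic task.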

Changed in ironic:
status: In Progress → Fix Committed
Changed in ironic:
milestone: none → 4.1.0
status: Fix Committed → Fix Released