Fuel for OpenStack

Stop deployment on redeployment after reset failed with Orchestrator error

Bug #1282065 reported by Anastasiia Naboikina on 2014-02-19

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
Fuel for OpenStack	Fix Released	Medium	Vladimir Sharshov	Fuel for OpenStack 4.1

Bug Description

{"build_id": "2014-02-19_02-05-26", "mirantis": "no", "build_number": "161", "nailgun_sha": "f97f3edcd8056aba3d4863a93d0d6ea917e23657", "ostf_sha": "f86abe5544b5ffcf621e0c450bca15737c92361f", "fuelmain_sha": "0b9ba969d1cff3d9de78d9feb4fb0f4539fc74de", "astute_sha": "581643fb9ace27282150fa3951660a9796acb867", "release": "4.1", "fuellib_sha": "8f5fc7f397646933ffba3acab8bb665756caa58b"}

Steps to reproduce:
1. Install ISO 161 on KVM.
2. Create a cluster with the following parameters:
   - CentOS simple;
   - nova network DHCP flat;
   - choose Ceilometer;
   - choose Ceph for images;
   - change common setting for usage of common scheduler.
3. Add the following nodes:
   - 1 controller;
   - 1 compute + ceph;
   - 1 cinder + ceph.
4. Deploy the cluster and wait until it deploys successfully.
5. Click reset cluster.
6. After the reset, change network settings to VLAN.
7. Re-deploy the cluster.
8. When the controller starts to install OpenStack, stop the cluster deployment.
9. Wait until the stop finishes.

Expected result:
Cluster is successfully stopped, nodes are not in error state.

Actual result:
Cluster is in failed state, all nodes are in error state. There is an orchestrator error:

2014-02-19 12:20:47 ERR
[1560] Error running RPC method stop_deploy_task: killed thread, trace: ["/opt/rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/naily-0.1.0/lib/naily/dispatcher.rb:190:in `run'", "/opt/rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/naily-0.1.0/lib/naily/dispatcher.rb:190:in `stop_current_task'", "/opt/rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/naily-0.1.0/lib/naily/dispatcher.rb:156:in `stop_deploy_task'", "/opt/rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/naily-0.1.0/lib/naily/server.rb:132:in `dispatch_message'", "/opt/rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/naily-0.1.0/lib/naily/server.rb:85:in `block in dispatch'", "/opt/rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/naily-0.1.0/lib/naily/server.rb:83:in `each'", "/opt/rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/naily-0.1.0/lib/naily/server.rb:83:in `each_with_index'", "/opt/rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/naily-0.1.0/lib/naily/server.rb:83:in `dispatch'", "/opt/rbenv/versions/1.9.3-p392/lib/ruby/gems/1.9.1/gems/naily-0.1.0/lib/naily/server.rb:78:in `block in perform_service_job'"]

Anastasiia Naboikina (anaboikina) wrote on 2014-02-19:

fuel-snapshot-2014-02-19_12-33-02.tgz (1.9 MiB, application/x-tar)

Evgeniy L (rustyrobot) on 2014-02-19

Changed in fuel:
status:	New → Confirmed
importance:	Undecided → High
assignee:	nobody → Vladimir Sharshov (vsharshov)

Anastasiia Naboikina (anaboikina) wrote on 2014-02-19:

cluster_error.png (711.8 KiB, image/png)

Mike Scherbakov (mihgen) wrote on 2014-02-21:

I'm curious whether this is really a High priority issue. It looks like the issue happens only in a tricky situation... Vladimir, what's the status of it? We need to triage this.

Mike Scherbakov (mihgen) on 2014-02-21

Changed in fuel:
assignee:	Vladimir Sharshov (vsharshov) → Nikolay Markov (nmarkov)

Nikolay Markov (nmarkov) on 2014-02-24

Changed in fuel:
importance:	High → Medium

Nikolay Markov (nmarkov) wrote on 2014-02-24:

I had a discussion with Vladimir Sharshov about this bug; he said it is a really rare case. We'll discuss possible solutions, but this is definitely not a blocker, so I updated its importance to "Medium" until further discussion.

Vladimir Sharshov (vsharshov) wrote on 2014-02-24:

Problem was here:
https://github.com/stackforge/fuel-web/blob/master/naily/lib/naily/dispatcher.rb#L190

This code just runs the main deploy process, which should perform some cleanup actions. The 'killed thread' error can be raised only if the main deploy process is already dead. This is a really rare situation (the cancel task runs at the moment when the deploy task is almost done), but I can easily provide a one-line fix for this.
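
The failure mode described above can be reproduced in plain Ruby. What follows is a minimal sketch, not the naily code: the `worker` thread and the `wake_if_alive` helper are hypothetical names chosen for illustration. It shows that calling `Thread#run` on a thread that has already finished raises `ThreadError` with the message "killed thread", and that an `alive?` check avoids the call. The check is still racy if the thread can exit between the check and `run`, so it is only an illustration of the condition, not necessarily the fix that was applied.

```ruby
# Hypothetical stand-in for the main deploy thread; not the real naily code.
worker = Thread.new { :deploy_finished }
worker.join # the thread has now terminated

# Waking a dead thread raises ThreadError.
error_message =
  begin
    worker.run
    nil
  rescue ThreadError => e
    e.message
  end

# Illustrative guard: only wake the thread if it is still alive.
def wake_if_alive(thread)
  return false unless thread.alive?
  thread.run
  true
end

woke = wake_if_alive(worker)
puts "error: #{error_message.inspect}, woke: #{woke}"
```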

OpenStack Infra (hudson-openstack) wrote on 2014-02-25: Fix proposed to fuel-web (master)

Fix proposed to branch: master
Review: https://review.openstack.org/76098

Changed in fuel:
assignee:	Nikolay Markov (nmarkov) → Vladimir Sharshov (vsharshov)
status:	Confirmed → In Progress

OpenStack Infra (hudson-openstack) wrote on 2014-02-25: Fix merged to fuel-web (master)

Reviewed: https://review.openstack.org/76098
Committed: https://git.openstack.org/cgit/stackforge/fuel-web/commit/?id=511153a10a8e1d5bbc0bbfd9078eebed04bb22a1
Submitter: Jenkins
Branch: master

commit 511153a10a8e1d5bbc0bbfd9078eebed04bb22a1
Author: Vladimir Sharshov <email address hidden>
Date: Mon Feb 24 13:57:23 2014 +0400

    New way to stop a main thread

    Use the kill instead of raise a custom exception.
    For some reason mcollective capture all exceptions
    if one of node becames inaccessible.

    Bug 1282065 closes because the problem condition was deleted.

    Change-Id: Ia7b9ef9734883a470bea592c398359f75b807d45
    Closes-Bug: #1283812
    Closes-Bug: #1282065
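
The reasoning in the commit message can be illustrated with a small Ruby sketch. It is an assumption-laden model, not the actual change: `StopRequested`, `start_worker`, and the inner `rescue Exception` are hypothetical stand-ins for the custom exception and for the exception-capturing behaviour the commit attributes to mcollective. It shows that an exception injected with `Thread#raise` can be swallowed by a broad `rescue` inside the thread, while `Thread#kill` cannot be rescued and still lets `ensure` cleanup run.

```ruby
class StopRequested < StandardError; end # hypothetical custom exception

# Worker whose inner call rescues everything, standing in for a library
# that captures all exceptions.
def start_worker(log)
  Thread.new do
    begin
      begin
        sleep # long-running "library" call
      rescue Exception => e
        log << "swallowed #{e.class}"
      end
      sleep # work continues after the exception was swallowed
    ensure
      log << "cleanup"
    end
  end
end

log = []
worker = start_worker(log)
sleep 0.01 until worker.status == "sleep" # wait until it blocks in the first sleep

worker.raise(StopRequested) # delivered inside the inner begin/rescue
sleep 0.01 until log.include?("swallowed StopRequested")
survived_raise = worker.alive? # the thread kept going

worker.kill # not rescuable; ensure blocks still run
worker.join
puts "survived raise: #{survived_raise}, log: #{log.inspect}"
```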

Changed in fuel:
status:	In Progress → Fix Committed

Anastasia Palkina (apalkina) on 2014-02-26

tags:	added: in progress

Anastasia Palkina (apalkina) wrote on 2014-02-26:

Verified on ISO #211
"build_id": "2014-02-26_13-39-45",
"mirantis": "yes",
"build_number": "211",
"nailgun_sha": "ea08cef3e06a72f47cfaa8cd8fe6d034e2cf722e",
"ostf_sha": "8e6681b6d06c7cb20a84c1cc740d5f2492fb9d85",
"fuelmain_sha": "baa8bb07393698f1186cb67bb65f1b93907c59bd",
"astute_sha": "10cccc87f2ee35510e43c8fa19d2bf916ca1fced",
"release": "4.1",
"fuellib_sha": "0a2e5bdc01c1e3bb285acb7b39125101e950ac72"

tags:	removed: in progress
Changed in fuel:
status:	Fix Committed → Fix Released
