Maintenance updates break controller addition/replacement

Bug #1470650 reported by Dmitry Nikishov
This bug affects 10 people
Affects             Status     Importance  Assigned to             Milestone
Fuel for OpenStack  Won't Fix  Critical    Alexander Nevenchannyy
5.1.x               Won't Fix  High        Vitaly Sedelnik
6.0.x               Won't Fix  High        Alexander Nevenchannyy
6.1.x               Invalid    High        Vitaly Sedelnik

Bug Description

MOS 6.0 (may also be relevant to 6.1, but no MUs are available for it yet).

Environment: HA, CentOS (Ubuntu to be verified later)

Steps to reproduce:
1. Deploy HA/CentOS environment. All other options are irrelevant.
2. Apply Maintenance Update
3. Remove existing controller (optional)
4. Add controller(s)
5. Set up HDDs/NICs for new node(s)
6. Press "Deploy changes"

Result: deployment fails.

Root cause: the library uses centos-versions.yaml and ubuntu-versions.yaml to look up package versions during puppet runs. When a controller node is added, puppet runs on all controllers, not just the new one. When puppet encounters a package whose installed version differs from the one in centos/ubuntu-versions.yaml, it tries to reinstall it. After Maintenance Updates have been applied, this leads to a failed deployment.
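To see which packages would trigger a reinstall, the installed versions can be compared against the pinned ones. The following is a minimal sketch, not part of the original report. It assumes the pinned versions are available as flat `name: version` lines (the real layout of centos-versions.yaml is not shown in this report and may differ) and that the installed list is one `name version` pair per line. The function name find_version_mismatches is hypothetical.

```shell
# Hypothetical helper: print packages whose installed version differs
# from the pinned one.
# $1: file with pinned versions, one "name: version" per line (assumed layout)
# $2: file with installed versions, one "name version" per line
find_version_mismatches() {
    pinned_file=$1
    installed_file=$2
    awk '
        # First file: remember the pinned version of every package.
        NR == FNR {
            key = $1
            sub(/:$/, "", key)
            pinned[key] = $2
            next
        }
        # Second file: report packages that are pinned to another version.
        ($1 in pinned) && pinned[$1] != $2 {
            printf "%s pinned=%s installed=%s\n", $1, pinned[$1], $2
        }
    ' "$pinned_file" "$installed_file"
}
```

On CentOS the installed list could be produced with `rpm -qa --qf '%{NAME} %{VERSION}-%{RELEASE}\n'`. Whether that version string has the same form as the pinned one is an assumption to check before relying on the output.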

Revision history for this message
Dmitry Nikishov (nikishov-da) wrote :

This affects all our 6.0 CentOS deployments (this still has to be verified on Ubuntu, but there is a good chance that it won't work there either).

description: updated
description: updated
summary: - Maintenance updates break controller replacement
+ Maintenance updates break controller addition/replacement
Revision history for this message
Dmitry Nikishov (nikishov-da) wrote :

After the deployment had already failed, I removed centos-versions.yaml/ubuntu-versions.yaml and pressed "Deploy changes" again. It failed with an error that corresponds to this bug: https://bugs.launchpad.net/fuel/+bug/1402637

Changed in fuel:
assignee: nobody → Dmitry Ilyin (idv1985)
Revision history for this message
Dmitry Nikishov (nikishov-da) wrote :

Workaround:

1. Remove centos/ubuntu-versions.yaml from /etc/puppet/manifests
E.g. mv /etc/puppet/manifests/*.yaml ~/
2. (If controller node has been deleted) Remove deleted controller node from:
2a. CIB:
crm node delete <node>
(You might have to execute this command twice, since the first attempt can time out and remove the node only from membership, not from the CIB.)
2b. neutron agents:
for agent in `neutron agent-list | awk '/<node>/ {print $2}'`; do neutron agent-delete $agent; done
2c. nova services:
mysql nova -e "delete from services where host like '%<node>%'"
2d. ceph osd (if enabled):
ceph osd remove <node>
3. Press deploy changes
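
Step 2a notes that the first `crm node delete` can time out. A minimal sketch of a generic retry wrapper follows. It is not part of the original workaround, and it assumes the command exits with a non-zero status when it times out, which this report does not confirm. If it exits with zero in that case, simply run the command twice by hand. The function name retry is hypothetical.

```shell
# Hypothetical helper: run a command up to MAX_ATTEMPTS times,
# waiting DELAY_SECONDS between failed attempts.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Stop as soon as the command succeeds.
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max_attempts failed: $*" >&2
        attempt=$((attempt + 1))
        # Sleep only if another attempt will follow.
        if [ "$attempt" -le "$max_attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

For example, `retry 2 5 crm node delete <node>` would try the deletion at most twice with a five-second pause in between.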

Revision history for this message
Dmitry Nikishov (nikishov-da) wrote :

Workaround (updated):

1. Remove centos/ubuntu-versions.yaml from /etc/puppet/manifests
E.g. mv /etc/puppet/manifests/*.yaml ~/

2. (If controller node has been deleted) Remove deleted controller node from:
2a. CIB:
crm node delete <node>
(You might have to execute this command twice, since the first attempt can time out and remove the node only from membership, not from the CIB.)

2b. neutron agents:
for agent in `neutron agent-list | awk '/<node>/ {print $2}'`; do neutron agent-delete $agent; done

2c. nova services:
mysql nova -e "delete from services where host like '%<node>%'"

2d. ceph osd (if enabled):
ceph osd remove <node>

3. Press deploy changes
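
The pattern match in step 2b selects every line that contains the node name, so a name such as node-1 would also match node-10. A minimal sketch of a stricter filter follows. It is not part of the original workaround, and it assumes that `neutron agent-list` prints a table whose columns are id, agent_type, host, in that order, and that the host column holds the full host name. The function name agent_ids_for_host is hypothetical.

```shell
# Hypothetical helper: read `neutron agent-list` table output on stdin and
# print the IDs of agents whose host column equals HOST exactly.
# Assumed column order: | id | agent_type | host | ...
# Usage: neutron agent-list | agent_ids_for_host HOST
agent_ids_for_host() {
    awk -F'|' -v host="$1" '
        # Table rows have at least four "|"-separated fields;
        # border lines such as +----+ have only one and are skipped.
        NF >= 4 {
            id = $2
            h = $4
            gsub(/^[ \t]+|[ \t]+$/, "", id)
            gsub(/^[ \t]+|[ \t]+$/, "", h)
            if (h == host) print id
        }
    '
}
```

The resulting IDs can then be passed to `neutron agent-delete` one at a time, as in step 2b.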

Revision history for this message
Dmitry Ilyin (idv1985) wrote :

Actually, you should remove the "package" module. It is left over from the unsuccessful patching feature and is not used at all.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/6.0)

Fix proposed to branch: stable/6.0
Review: https://review.openstack.org/197977

Revision history for this message
Mike Scherbakov (mihgen) wrote :

We need to involve the sustaining team here, folks. Dmitry Ilyin, do I understand correctly that if you just remove the package puppet module (patchset above), the issue goes away? How can we fix live installs, not just something we deploy from scratch? Just by delivering an updated puppet modules archive without this package module?
The sustaining team should be able to provide the required tarball / updated package (if the modules were already packaged as RPMs in 6.0).

Changed in fuel:
assignee: Dmitry Ilyin (idv1985) → MOS Sustaining (mos-sustaining)
milestone: none → 6.0-updates
Changed in fuel:
importance: Undecided → Critical
Revision history for this message
Miroslav Anashkin (manashkin) wrote :

The proposed workaround is basically correct, except for the Ceph OSD node removal procedure.
Dmitry, please follow the steps described in this technical bulletin to remove the Ceph OSD node correctly, without possible data loss:
http://online.mirantis.com/hubfs/Technical_Bulletins/Mirantis-Technical-Bulletin-5-Removing-Ceph-OSD-node.pdf

Changed in fuel:
status: New → Confirmed
Revision history for this message
Aleksandr Didenko (adidenko) wrote :

I think that we should create a separate bug for ceph-osd removal problems.

Changed in fuel:
milestone: 6.0-updates → 6.0-mu-4
Changed in fuel:
assignee: MOS Sustaining (mos-sustaining) → Vitaly Sedelnik (vsedelnik)
Revision history for this message
Vitaly Sedelnik (vsedelnik) wrote :

I am working with the Documentation team to document this as a known issue and to provide step-by-step instructions for adding a controller to an existing MOS 6.0 deployment with maintenance updates applied. ETA for this: 07/07.

Next action items wrt this issue are:
1. Figure out whether it's possible to fix it permanently for 6.0 with MU5.
2. Test whether this issue affects 6.1 and 5.1.1/5.1 (I nominated the bug for the appropriate series) and proceed with fixes if needed.

Revision history for this message
Alexander Adamov (aadamov) wrote :

Added as a known issue to the 6.0 release notes: https://review.openstack.org/#/c/199011/

Revision history for this message
Vitaly Sedelnik (vsedelnik) wrote :

Alexander Nevenchannyy is working on validating the proposed workaround for both CentOS and Ubuntu (MOS 6.0 + MU4); we will proceed with publishing when done. ETA: 07/08

Changed in fuel:
assignee: Vitaly Sedelnik (vsedelnik) → Alexander Nevenchannyy (anevenchannyy)
Revision history for this message
Denis Meltsaykin (dmeltsaykin) wrote :

In a conversation with Dmitry Ilyin and Vladimir Kuklin, it was stated that the suggested fix (https://review.openstack.org/#/c/197977/) should be used to avoid this bug in the future. After intensive testing, it will be possible to include the fix in MU5 as an automated script.

Revision history for this message
Alexander Nevenchannyy (anevenchannyy) wrote :

Folks, I tried to use the workaround from https://bugs.launchpad.net/fuel/+bug/1470650/comments/4 on a CentOS-based cluster.
However, provisioning of the new controller failed with this bug: https://bugs.launchpad.net/mos/+bug/1391438

After I started the metadata agents by hand, provisioning started.

Revision history for this message
Alexander Nevenchannyy (anevenchannyy) wrote :

I verified the workaround of deleting the centos/ubuntu-versions.yaml files on MOS 6.0 (Ubuntu/CentOS); it works fine.

However, in two of the three cases the deployment was accompanied by the error https://bugs.launchpad.net/mos/+bug/1391438

Revision history for this message
Vitaly Sedelnik (vsedelnik) wrote :

The documentation update is merged, so the issue is now documented as a known issue with a workaround: https://docs.mirantis.com/openstack/fuel/fuel-6.0/release-notes.html#how-to-update-the-product (see the OpenStack Fuel section in the Maintenance updates chapter).

Let's proceed with implementing a permanent fix in MU5 for 6.0 and MU1 for 5.1/5.1.1.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/6.0)

Reviewed: https://review.openstack.org/197977
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=cc866d995f11b652568b345ece51f19258d1b06c
Submitter: Jenkins
Branch: stable/6.0

commit cc866d995f11b652568b345ece51f19258d1b06c
Author: Dmitry Ilyin <email address hidden>
Date: Thu Jul 2 16:38:23 2015 +0300

    Remove package module for 6.0

    Change-Id: Id053d80517cd8d7dfc337886e8987067347c3e32
    Closes-Bug: 1470650

Revision history for this message
OSCI Robot (oscirobot) wrote :

Changeset merged. Package placed on primary repository.
RPM package fuel-library6.0 has been built for project stackforge/fuel-library.
Files placed in repository:
fuel-ha-utils6.0-6.0.0-6204.1.noarch.rpm
fuel-library6.0-6.0.0-6204.1.noarch.rpm
Repository URL: http://osci-obs.vm.mirantis.net:82/centos-fuel-6.0-updates-stable/centos .

Revision history for this message
OSCI Robot (oscirobot) wrote :

Changeset merged. Package placed on primary repository.
DEB package fuel-library has been built for project stackforge/fuel-library.
Files placed in repository:
fuel-ha-utils6.0_6.0.0-6204.1_all.deb
fuel-library6.0_6.0.0-6204.1_all.deb
Repository URL: http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-6.0-updates-stable/ubuntu .

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/5.1)

Fix proposed to branch: stable/5.1
Review: https://review.openstack.org/201786

Changed in fuel:
status: Confirmed → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/5.1)

Reviewed: https://review.openstack.org/201786
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=afed4badf133fe65543676ecf650209e38f7f892
Submitter: Jenkins
Branch: stable/5.1

commit afed4badf133fe65543676ecf650209e38f7f892
Author: Dmitry Ilyin <email address hidden>
Date: Thu Jul 2 16:38:23 2015 +0300

    Remove package module for 5.1

    Change-Id: Id053d80517cd8d7dfc337886e8987067347c3e32
    Closes-Bug: 1470650
    (cherry picked from commit cc866d995f11b652568b345ece51f19258d1b06c)

Changed in fuel:
milestone: 6.0-mu-4 → 6.0-mu-5
Revision history for this message
Vadim Rovachev (vrovachev) wrote :

Version 6.0. Reopening.
The file /etc/puppet/${OPENSTACK_VERSION}/manifests/site.pp needs to be updated; otherwise, deploying a new environment fails.

Changed in fuel:
status: Fix Committed → Confirmed
tags: added: done release-notes
tags: added: 6.0-mu-5
Changed in fuel:
milestone: 6.0-mu-5 → 6.0-mu-6
Revision history for this message
Vitaly Sedelnik (vsedelnik) wrote :

Won't Fix for 6.0-updates, as there is no way to update the 6.0 puppet manifests using packages. We investigated the possibility of backporting the fuel-library package from 6.1, but it turned out that there is a high risk of introducing regressions. This issue is documented in the 6.0 release notes as a known issue with a workaround. We need to do the same for 5.1/5.1.1.

Changed in fuel:
milestone: 6.0-mu-6 → 6.1-updates
status: Confirmed → Won't Fix
Revision history for this message
Vitaly Sedelnik (vsedelnik) wrote :

Won't Fix for 5.1.1-updates, per the comment above.
