IBP provision fails for previously deployed node

Bug #1588260 reported by Nikita Koshikov
Affects             Status        Importance  Assigned to        Milestone
Fuel for OpenStack  Fix Released  High        Alexander Gordeev
6.1.x               Fix Released  High        Sergii Rizvan
7.0.x               Fix Released  High        Sergii Rizvan
8.0.x               Fix Released  High        Sergii Rizvan
Mitaka              Fix Released  High        Alexander Gordeev

Bug Description

This is MOS7+MU3

Steps to reproduce:
1) Install an environment with Ceph nodes (the Ceph hardware is DL180 G9 servers in this case).
2) Prepare a Ceph node for deletion (https://docs.mirantis.com/openstack/fuel/fuel-7.0/operations.html#how-to-safely-remove-a-ceph-osd-node).
3) Delete one Ceph node.
4) Deploy changes.

Then two disks were replaced in this node.

5) Try to add the same node (with the new disks) to the environment.
6) Deploy changes.

Actual result: the deployment does not finish because the system cannot boot after IBP installs the OS. On reboot there is an error: no such device "id-goes-here".

So fuel-agent did not erase the disk data while running its commands. Booting the system from a repair CD and generating a report confirms this (the report will be attached to this bug). grub.cfg also contains the old ID ("id-goes-here") left over from the previous setup.

I will also attach the full bootstrap directory for this node, where you can find fuel-agent.log.

Expected result - deployment finished successfully
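
The stale-ID symptom described above can be checked mechanically. The following is a minimal Python sketch, not part of fuel-agent. It assumes that grub.cfg references filesystems by UUID (through `search --fs-uuid ... <uuid>` lines or `root=UUID=...` kernel arguments) and that the UUIDs actually present on the node were collected separately, for example with blkid. The function names and the sample UUIDs are hypothetical.

```python
import re

# Matches kernel command-line references such as root=UUID=<uuid>.
ROOT_UUID_RE = re.compile(r"root=UUID=([0-9A-Fa-f-]+)")


def uuids_in_grub_cfg(text):
    """Collect filesystem UUIDs referenced by a grub.cfg-style text."""
    found = set()
    for line in text.splitlines():
        tokens = line.split()
        # Assumption: a GRUB 2 'search --fs-uuid' line ends with the UUID.
        if tokens and tokens[0] == "search" and "--fs-uuid" in tokens:
            found.add(tokens[-1])
        found.update(ROOT_UUID_RE.findall(line))
    return found


def find_stale_uuids(grub_cfg_text, present_uuids):
    """Return UUIDs that grub.cfg references but the node does not have."""
    return uuids_in_grub_cfg(grub_cfg_text) - set(present_uuids)


# Hypothetical sample data, for illustration only.
sample_cfg = """
search --no-floppy --fs-uuid --set=root 1111-aaaa
linux /vmlinuz root=UUID=2222-bbbb ro
"""
print(sorted(find_stale_uuids(sample_cfg, {"2222-bbbb"})))
```

A non-empty result would suggest that the grub.cfg being read at boot was written by an earlier deployment.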

Changed in fuel:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Alexander Gordeev (a-gordeev)
milestone: none → 10.0
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-agent (master)

Fix proposed to branch: master
Review: https://review.openstack.org/324513

Changed in fuel:
status: Confirmed → In Progress
Ilya Kutukov (ikutukov)
tags: added: area-python
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-agent (master)

Reviewed: https://review.openstack.org/324513
Committed: https://git.openstack.org/cgit/openstack/fuel-agent/commit/?id=e2a20044b23ece6627a90827f3fc235b24d3880a
Submitter: Jenkins
Branch: master

commit e2a20044b23ece6627a90827f3fc235b24d3880a
Author: Alexander Gordeev <email address hidden>
Date: Thu Jun 2 15:53:57 2016 +0300

    Use any disk for /boot regardless of its size

    Apparently, the disks which're bigger than 2T were excluded from list
    of bootable disks if possible. Just because in the past, those disks
    were unrecognized by BIOS (or UEFI under CSM).

    In fact, it was just misconfiguration of RAID controller or BIOS
    itself.

    GRUB uses BIOS INT13h in order to find all available disks.
    Therefore, unless particular disk is not configured as 'bootable',
    there's no change for GRUB to find it.

    One should configure hardware in the following way assuming that
    the first disk (hd0) is bootable and is used for operating system
    purposes.
    In case of hardware RAID, FC multipath or any other HBA, the disk
    (or lun, or whatever) which was configured as 'bootable' will be
    reported as hd0 via INT13h. So, GRUB will be able to boot from it.

    DocImpact
    Closes-Bug: #1588260

    Change-Id: I7bc729ffafa3b9d6bfe8521fa38599d36d02f7e1

Changed in fuel:
status: In Progress → Fix Committed
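
The selection change described in the commit message above can be illustrated with a small Python sketch. This is not the fuel-agent code: the `Disk` record, both function names, and the reading of "2T" as 2 TiB are assumptions made for illustration. The "old" function models the described behaviour of skipping disks larger than 2T when a smaller one exists, and the "new" one takes the first disk regardless of size.

```python
from collections import namedtuple

# Hypothetical disk record; sizes are in bytes.
Disk = namedtuple("Disk", ["name", "size"])

# Assumed threshold: the "2T" from the commit message, taken as 2 TiB.
TWO_TIB = 2 * 1024 ** 4


def pick_boot_disk_old(disks):
    """Pre-fix behaviour as described: skip >2T disks when possible."""
    small = [d for d in disks if d.size <= TWO_TIB]
    candidates = small or disks  # fall back if every disk is large
    return candidates[0]


def pick_boot_disk_new(disks):
    """Post-fix behaviour as described: first disk regardless of size."""
    return disks[0]


disks = [Disk("vda", 3 * 1024 ** 4), Disk("vdb", 100 * 1024 ** 3)]
print(pick_boot_disk_old(disks).name)  # vdb
print(pick_boot_disk_new(disks).name)  # vda
```

With a 3T first disk and a 100G second disk, the old rule puts /boot on the second disk while the firmware still boots from the first one, which matches the symptom reported in this bug.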
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-agent (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/325832

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-agent (stable/mitaka)

Reviewed: https://review.openstack.org/325832
Committed: https://git.openstack.org/cgit/openstack/fuel-agent/commit/?id=7ffbf39caf5845bd82b8ce20a7766cf24aa803fb
Submitter: Jenkins
Branch: stable/mitaka

commit 7ffbf39caf5845bd82b8ce20a7766cf24aa803fb
Author: Alexander Gordeev <email address hidden>
Date: Thu Jun 2 15:53:57 2016 +0300

    Use any disk for /boot regardless of its size

    Apparently, the disks which're bigger than 2T were excluded from list
    of bootable disks if possible. Just because in the past, those disks
    were unrecognized by BIOS (or UEFI under CSM).

    In fact, it was just misconfiguration of RAID controller or BIOS
    itself.

    GRUB uses BIOS INT13h in order to find all available disks.
    Therefore, unless particular disk is not configured as 'bootable',
    there's no change for GRUB to find it.

    One should configure hardware in the following way assuming that
    the first disk (hd0) is bootable and is used for operating system
    purposes.
    In case of hardware RAID, FC multipath or any other HBA, the disk
    (or lun, or whatever) which was configured as 'bootable' will be
    reported as hd0 via INT13h. So, GRUB will be able to boot from it.

    DocImpact
    Closes-Bug: #1588260

    Change-Id: I7bc729ffafa3b9d6bfe8521fa38599d36d02f7e1
    (cherry picked from commit e2a20044b23ece6627a90827f3fc235b24d3880a)

tags: added: on-verification
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-agent (stable/8.0)

Fix proposed to branch: stable/8.0
Review: https://review.openstack.org/327179

Andrey Lavrentyev (alavrentyev) wrote :

Unable to reproduce it on 9.0-mos #450

[root@nailgun ~]# shotgun2 short-report
cat /etc/fuel_build_id:
 450
cat /etc/fuel_build_number:
 450
cat /etc/fuel_release:
 9.0
cat /etc/fuel_openstack_version:
 mitaka-9.0
rpm -qa | egrep 'fuel|astute|network-checker|nailgun|packetary|shotgun':
 fuel-release-9.0.0-1.mos6347.noarch
 fuel-provisioning-scripts-9.0.0-1.mos8723.noarch
 python-packetary-9.0.0-1.mos140.noarch
 fuel-bootstrap-cli-9.0.0-1.mos285.noarch
 fuel-migrate-9.0.0-1.mos8435.noarch
 rubygem-astute-9.0.0-1.mos748.noarch
 fuel-mirror-9.0.0-1.mos140.noarch
 shotgun-9.0.0-1.mos90.noarch
 fuel-openstack-metadata-9.0.0-1.mos8723.noarch
 fuel-notify-9.0.0-1.mos8435.noarch
 nailgun-mcagents-9.0.0-1.mos748.noarch
 python-fuelclient-9.0.0-1.mos323.noarch
 fuel-9.0.0-1.mos6347.noarch
 fuel-utils-9.0.0-1.mos8435.noarch
 fuel-setup-9.0.0-1.mos6347.noarch
 fuel-misc-9.0.0-1.mos8435.noarch
 fuel-library9.0-9.0.0-1.mos8435.noarch
 network-checker-9.0.0-1.mos74.x86_64
 fuel-agent-9.0.0-1.mos285.noarch
 fuel-ui-9.0.0-1.mos2715.noarch
 fuel-ostf-9.0.0-1.mos935.noarch
 fuelmenu-9.0.0-1.mos272.noarch
 fuel-nailgun-9.0.0-1.mos8723.noarch

tags: removed: on-verification
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-agent (stable/7.0)

Fix proposed to branch: stable/7.0
Review: https://review.openstack.org/327549

Sergii Rizvan (srizvan) wrote :

Note for QA:
On 8.0 the issue is reproducible only if the first disk on the Ceph node is larger than 2TB and the second disk is smaller than 2TB.
For example, here is the output of the lsblk command on a bootstrap node:
root@bootstrap:~# lsblk
NAME  MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
vda   253:0    0     3T  0 disk
vdb   253:16   0   100G  0 disk
loop0 7:0      0 261.5M  0 loop /lib/live/mount/rootfs/root.squashfs

In this situation the issue is reproducible.
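
The precondition in this note can be checked from lsblk output. The following Python sketch is an illustration only: the function names are hypothetical, "2TB" is taken as 2 TiB, and "second disk" is relaxed to "any later disk", which is an assumption rather than something stated in the note. It parses rows in the column order shown above (NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT).

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}
TWO_TIB = 2 * 1024 ** 4  # assumed meaning of "2TB" in the note


def parse_size(text):
    """Convert an lsblk size such as '3T' or '261.5M' to bytes."""
    if text[-1] in UNITS:
        return int(float(text[:-1]) * UNITS[text[-1]])
    return int(text)


def whole_disks(lsblk_output):
    """Return (name, size_in_bytes) for each row whose TYPE is 'disk'."""
    disks = []
    for line in lsblk_output.splitlines():
        fields = line.split()
        # Columns: NAME MAJ:MIN RM SIZE RO TYPE [MOUNTPOINT]
        if len(fields) >= 6 and fields[5] == "disk":
            disks.append((fields[0], parse_size(fields[3])))
    return disks


def matches_repro_layout(lsblk_output):
    """True if the first disk is >2T and some later disk is smaller."""
    disks = whole_disks(lsblk_output)
    if len(disks) < 2:
        return False
    first_size = disks[0][1]
    return first_size > TWO_TIB and any(s < TWO_TIB for _, s in disks[1:])


sample = """NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 253:0 0 3T 0 disk
vdb 253:16 0 100G 0 disk
loop0 7:0 0 261.5M 0 loop /lib/live/mount/rootfs/root.squashfs"""
print(matches_repro_layout(sample))  # True
```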

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-web (stable/6.1)

Fix proposed to branch: stable/6.1
Review: https://review.openstack.org/328238

Sergii Rizvan (srizvan) wrote :

Notes for QA (6.1, 7.0, 8.0):

The issue is only reproducible on specific hardware, so I verified the fix as follows.
You need at least one node in the environment whose first disk is larger than 2TB:
root@bootstrap:~# lsblk
NAME  MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
vda   253:0    0     3T  0 disk
vdb   253:16   0   100G  0 disk
loop0 7:0      0 261.5M  0 loop /lib/live/mount/rootfs/root.squashfs

Start deploying the cluster. After OS provisioning, the partitions on the disks are visible.
Before applying the patch, the /boot partition is not on the first disk (again, only if the first disk is larger than 2TB):

root@node-8:~# lsblk
NAME               MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
vda                253:0    0     3T  0 disk
|-vda1             253:1    0    24M  0 part
|-vda2             253:2    0   200M  0 part
|-vda3             253:3    0  54.1G  0 part
| |-os-root (dm-0) 252:0    0    50G  0 lvm  /
| `-os-swap (dm-1) 252:1    0     4G  0 lvm  [SWAP]
`-vda4             253:4    0    20M  0 part
vdb                253:16   0   100G  0 disk
|-vdb1             253:17   0    24M  0 part
|-vdb2             253:18   0   200M  0 part
|-vdb3             253:19   0   200M  0 part /boot
`-vdb4             253:20   0  99.5G  0 part

After applying the patch for the appropriate version, the /boot partition is located on the first drive regardless of its size:

root@node-9:~# lsblk
NAME               MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
vda                253:0    0     3T  0 disk
|-vda1             253:1    0    24M  0 part
|-vda2             253:2    0   200M  0 part
|-vda3             253:3    0   200M  0 part /boot
|-vda4             253:4    0  54.1G  0 part
| |-os-root (dm-0) 252:0    0    50G  0 lvm  /
| `-os-swap (dm-1) 252:1    0     4G  0 lvm  [SWAP]
`-vda5             253:5    0    20M  0 part
vdb                253:16   0   100G  0 disk
|-vdb1             253:17   0    24M  0 part
|-vdb2             253:18   0   200M  0 part
`-vdb3             253:19   0  99.5G  0 part
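
The before/after comparison above comes down to one question: which whole disk holds the partition mounted at /boot. A minimal Python sketch of that check follows. The function name is hypothetical, and the parser assumes lsblk tree output in the column order shown in these notes, with either ASCII or Unicode tree-drawing characters.

```python
TYPES = {"disk", "part", "lvm", "loop"}
TREE_CHARS = "|`-├└─│"  # prefixes lsblk uses to draw the device tree


def disk_with_mountpoint(lsblk_output, mountpoint="/boot"):
    """Return the name of the whole disk that hosts the given mountpoint."""
    current_disk = None
    for line in lsblk_output.splitlines():
        tokens = line.split()
        # Drop tokens that are pure tree drawing, e.g. a lone '|'.
        while tokens and not tokens[0].lstrip(TREE_CHARS):
            tokens.pop(0)
        # Locate the TYPE column; the name may span two tokens ("x (dm-0)").
        type_idx = next(
            (i for i, t in enumerate(tokens) if i > 0 and t in TYPES), None
        )
        if type_idx is None:
            continue  # header, shell prompt, or blank line
        if tokens[type_idx] == "disk":
            current_disk = tokens[0].lstrip(TREE_CHARS)
        if tokens[type_idx + 1:type_idx + 2] == [mountpoint]:
            return current_disk
    return None


before = """vda 253:0 0 3T 0 disk
|-vda3 253:3 0 54.1G 0 part
| |-os-root (dm-0) 252:0 0 50G 0 lvm /
vdb 253:16 0 100G 0 disk
|-vdb3 253:19 0 200M 0 part /boot"""
print(disk_with_mountpoint(before))  # vdb
```

Run against the post-patch output, the same function would be expected to report the first disk instead.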

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-agent (stable/8.0)

Reviewed: https://review.openstack.org/327179
Committed: https://git.openstack.org/cgit/openstack/fuel-agent/commit/?id=8c118a477603e3c75e336c3a5cc9b36cc9edc05f
Submitter: Jenkins
Branch: stable/8.0

commit 8c118a477603e3c75e336c3a5cc9b36cc9edc05f
Author: Alexander Gordeev <email address hidden>
Date: Thu Jun 2 15:53:57 2016 +0300

    Use any disk for /boot regardless of its size

    Apparently, the disks which're bigger than 2T were excluded from list
    of bootable disks if possible. Just because in the past, those disks
    were unrecognized by BIOS (or UEFI under CSM).

    In fact, it was just misconfiguration of RAID controller or BIOS
    itself.

    GRUB uses BIOS INT13h in order to find all available disks.
    Therefore, unless particular disk is not configured as 'bootable',
    there's no change for GRUB to find it.

    One should configure hardware in the following way assuming that
    the first disk (hd0) is bootable and is used for operating system
    purposes.
    In case of hardware RAID, FC multipath or any other HBA, the disk
    (or lun, or whatever) which was configured as 'bootable' will be
    reported as hd0 via INT13h. So, GRUB will be able to boot from it.

    DocImpact
    Closes-Bug: #1588260

    Change-Id: I7bc729ffafa3b9d6bfe8521fa38599d36d02f7e1
    (cherry picked from commit e2a20044b23ece6627a90827f3fc235b24d3880a)
    (cherry picked from commit 7ffbf39caf5845bd82b8ce20a7766cf24aa803fb)

tags: added: on-verification
TatyanaGladysheva (tgladysheva) wrote :

Verified on MOS 8.0 + MU2 updates.

One node in environment with the first disk larger than 2TB:
root@bootstrap:~# lsblk
NAME  MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
vda   253:0    0     3T  0 disk
vdb   253:16   0    50G  0 disk
vdc   253:32   0    50G  0 disk
loop0 7:0      0 261.5M  0 loop /lib/live/mount/rootfs/root.squashfs

Actual result:
Before applying the patch, /boot partition was on the second disk 'vdb', with size less than 2T.
After applying MU2 patch, /boot partition is located on the first drive 'vda' with size 3T.

tags: removed: on-verification
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-agent (stable/7.0)

Reviewed: https://review.openstack.org/327549
Committed: https://git.openstack.org/cgit/openstack/fuel-agent/commit/?id=5cc463edfbc04e763cb77c01c3e5a3b430b778be
Submitter: Jenkins
Branch: stable/7.0

commit 5cc463edfbc04e763cb77c01c3e5a3b430b778be
Author: Alexander Gordeev <email address hidden>
Date: Thu Jun 2 15:53:57 2016 +0300

    Use any disk for /boot regardless of its size

    Apparently, the disks which're bigger than 2T were excluded from list
    of bootable disks if possible. Just because in the past, those disks
    were unrecognized by BIOS (or UEFI under CSM).

    In fact, it was just misconfiguration of RAID controller or BIOS
    itself.

    GRUB uses BIOS INT13h in order to find all available disks.
    Therefore, unless particular disk is not configured as 'bootable',
    there's no change for GRUB to find it.

    One should configure hardware in the following way assuming that
    the first disk (hd0) is bootable and is used for operating system
    purposes.
    In case of hardware RAID, FC multipath or any other HBA, the disk
    (or lun, or whatever) which was configured as 'bootable' will be
    reported as hd0 via INT13h. So, GRUB will be able to boot from it.

    DocImpact
    Closes-Bug: #1588260

    Change-Id: I7bc729ffafa3b9d6bfe8521fa38599d36d02f7e1
    (backported from commit 8c118a477603e3c75e336c3a5cc9b36cc9edc05f)

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-web (stable/6.1)

Reviewed: https://review.openstack.org/328238
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=0d50791fe6021644eaf0f86de85251cf49f7ff9c
Submitter: Jenkins
Branch: stable/6.1

commit 0d50791fe6021644eaf0f86de85251cf49f7ff9c
Author: Alexander Gordeev <email address hidden>
Date: Thu Jun 2 15:53:57 2016 +0300

    Use any disk for /boot regardless of its size

    Apparently, the disks which're bigger than 2T were excluded from list
    of bootable disks if possible. Just because in the past, those disks
    were unrecognized by BIOS (or UEFI under CSM).

    In fact, it was just misconfiguration of RAID controller or BIOS
    itself.

    GRUB uses BIOS INT13h in order to find all available disks.
    Therefore, unless particular disk is not configured as 'bootable',
    there's no change for GRUB to find it.

    One should configure hardware in the following way assuming that
    the first disk (hd0) is bootable and is used for operating system
    purposes.
    In case of hardware RAID, FC multipath or any other HBA, the disk
    (or lun, or whatever) which was configured as 'bootable' will be
    reported as hd0 via INT13h. So, GRUB will be able to boot from it.

    DocImpact
    Closes-Bug: #1588260

    Change-Id: I7bc729ffafa3b9d6bfe8521fa38599d36d02f7e1
    (backported from commit 5cc463edfbc04e763cb77c01c3e5a3b430b778be)

tags: added: on-verification
Ekaterina Shutova (eshutova) wrote :

Verified on MOS6.1 + MU7 updates.

One node in the environment has a first disk larger than 2TB.
Before applying the patch, the /boot partition is on vdb; with the fix, /boot is located on the first drive:
root@node-9:~# lsblk
NAME                    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
vda                     253:0    0     3T  0 disk
|-vda1                  253:1    0    24M  0 part
|-vda2                  253:2    0   200M  0 part
|-vda3                  253:3    0   200M  0 part /boot
|-vda4                  253:4    0  53.1G  0 part
| |-os-root (dm-1)      252:1    0    50G  0 lvm  /
| `-os-swap (dm-2)      252:2    0     3G  0 lvm  [SWAP]
|-vda5                  253:5    0     3T  0 part
| `-image-glance (dm-0) 252:0    0     3T  0 lvm  /var/lib/glance
`-vda6                  253:6    0    20M  0 part

tags: removed: on-verification
tags: added: on-verification
TatyanaGladysheva (tgladysheva) wrote :

Verified on MOS 7.0 + MU5 updates.

One node in environment has the first disk with size 3TB:
[root@bootstrap ~]# lsblk
NAME MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
vda  253:0    0     3T  0 disk
vdb  253:16   0    50G  0 disk
vdc  253:32   0    50G  0 disk

Actual result:
Before applying the patch /boot partition was on the second disk 'vdb', with size less than 2TB:
vdb                     253:16   0    50G  0 disk
|-vdb1                  253:17   0    24M  0 part
|-vdb2                  253:18   0   200M  0 part
|-vdb3                  253:19   0   200M  0 part /boot
`-vdb4                  253:20   0  49.3G  0 part
  `-image-glance (dm-0) 252:0    0     3T  0 lvm  /var/lib/glance

After the fix, the /boot partition is located on the first disk, which is 3TB in size:
root@node-12:~# lsblk
NAME                    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
vda                     253:0    0     3T  0 disk
|-vda1                  253:1    0    24M  0 part
|-vda2                  253:2    0   200M  0 part
|-vda3                  253:3    0   200M  0 part /boot
|-vda4                  253:4    0  53.1G  0 part
| |-os-root (dm-3)      252:3    0    50G  0 lvm  /
| `-os-swap (dm-4)      252:4    0     3G  0 lvm  [SWAP]
|-vda5                  253:5    0  10.1G  0 part
| `-logs-log (dm-2)     252:2    0    10G  0 lvm  /var/log
|-vda6                  253:6    0  20.1G  0 part
| `-mysql-root (dm-1)   252:1    0    20G  0 lvm  /var/lib/mysql
|-vda7                  253:7    0   2.9T  0 part
| `-image-glance (dm-0) 252:0    0     3T  0 lvm  /var/lib/glance
`-vda8                  253:8    0    20M  0 part

tags: removed: on-verification
tags: added: on-verification
TatyanaGladysheva (tgladysheva) wrote :

Verified on 10.0 build #1566.

tags: removed: on-verification
Changed in fuel:
status: Fix Committed → Fix Released