Cobbler Error during the Fuel master node installation: Could not evaluate: cobblerd does not appear to be running/accessible

Bug #1338552 reported by Timur Nurlygayanov
This bug affects 1 person
Affects             Status         Importance  Assigned to        Milestone
Fuel for OpenStack  Fix Committed  High        Matthew Mosesohn
5.0.x               Won't Fix      High        Sergii Golovatiuk

Bug Description

This issue does not reproduce in every environment. My environment: one hardware server with 24 GB RAM running Ubuntu 12.04.4 / Ubuntu 14.10, with KVM.

{"build_id": "2014-07-04_13-44-50", "mirantis": "yes", "build_number": "97", "ostf_sha": "09b6bccf7d476771ac859bb3c76c9ebec9da9e1f", "nailgun_sha": "d01b4efc0fc4af9d0e316b9dfc7974f16975f822", "production": "docker", "api": "1.0", "fuelmain_sha": "e312e03dbe29d3436958f7ac024402b1c468e2e4", "astute_sha": "644d279970df3daa5f5a2d2ccf8b4d22d53386ff", "release": "5.0.1", "fuellib_sha": "8a7d86a033b82520abe611bc2c286a10eae42d93"}

Steps To Reproduce:
1. Install the Fuel master node with Fuel 5.0.1 or 5.1+.
2. Reboot the Fuel master node.
3. Start the slave nodes.
4. Log in to the master node via SSH.
5. Check the status of cobbler.

Observed Result:
The slave nodes are not bootstrapped, and the cobbler logs contain many errors:
http://paste.openstack.org/show/85568/

The diagnostic snapshot also failed with error: exit code: 1 stderr: (please see the attached screenshot)

Tags: cobbler
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :
Timur Nurlygayanov (tnurlygayanov) wrote :

^^^^ archive with all files from /var/log/*

tags: added: cobbler
Changed in fuel:
status: New → Confirmed
importance: Undecided → High
Timur Nurlygayanov (tnurlygayanov) wrote :

From Fuel master node:
___________________________
[root@nailgun ~]# ps ax | grep cobbler
19394 ? S 0:00 /bin/bash /usr/bin/dockerctl start cobbler --attach
22209 pts/8 S+ 0:00 grep cobbler
[root@nailgun ~]#

____________________________________

[root@nailgun ~]# docker images;docker ps
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
storage/log latest 4ac401f34608 About an hour ago 2.489 MB
storage/puppet latest 42960fc6da1f About an hour ago 2.489 MB
storage/repo latest f34d9db59a75 About an hour ago 2.489 MB
storage/dump latest 031227898e16 About an hour ago 2.489 MB
fuel/rsyslog_5.0.1 latest 7ce4606aba69 3 days ago 401.6 MB
fuel/postgres_5.0.1 latest 98751c2ac81e 3 days ago 475.9 MB
fuel/rabbitmq_5.0.1 latest 2f1b50386068 3 days ago 566.3 MB
fuel/rsync_5.0.1 latest a28cc131a443 3 days ago 388.8 MB
fuel/ostf_5.0.1 latest 84e85eef7e4c 3 days ago 532.8 MB
fuel/nginx_5.0.1 latest 143c69ed3be3 3 days ago 447.2 MB
fuel/nailgun_5.0.1 latest bc5a61d3c0b1 3 days ago 528.9 MB
fuel/mcollective_5.0.1 latest 0bff15040728 3 days ago 485.5 MB
fuel/cobbler_5.0.1 latest 03d4f0c8c309 3 days ago 574.5 MB
fuel/astute_5.0.1 latest 92c5ef5ed603 3 days ago 439.5 MB
busybox latest 2d8e5b282c81 10 weeks ago 2.489 MB
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
e87e6bb2a7d1 fuel/mcollective_5.0.1:latest /usr/local/bin/start About an hour ago Up About an hour fuel-core-5.0.1-mcollective
60d387fe906f fuel/nginx_5.0.1:latest /usr/local/bin/start About an hour ago Up About an hour 0.0.0.0:8000->8000/tcp, 0.0.0.0:8080->8080/tcp fuel-core-5.0.1-nginx
cf81fb4797d1 fuel/ostf_5.0.1:latest /bin/sh -c /usr/loca About an hour ago Up About an hour 0.0.0.0:8777->8777/tcp fuel-core-5.0.1-ostf
d40ed3d06ad7 fuel/nailgun_5.0.1:latest /bin/sh -c /usr/loca About an hour ago Up About an hour 0.0.0.0:8001->8001/tcp fuel-core-5.0.1-nailgun
938613f70c1a fuel/rsyslog_5.0.1:latest /usr/local/bin/start About an hour ago Up About an hour 0.0.0.0:514->514/tcp, 0.0.0.0:514->514/udp, 0.0.0.0:49153->25150/tcp ...


description: updated
Changed in fuel:
milestone: none → 5.0.1
assignee: nobody → Fuel Library Team (fuel-library)
Vladimir Kuklin (vkuklin) wrote :

Timur, you need to wait for the containers to restart, e.g. for cobbler to configure itself again. Otherwise you will have a half-started master node, and in that case nothing is guaranteed to work. Please provide the output of the "dockerctl check cobbler" command at the moment when you start the slave nodes.

Changed in fuel:
status: Confirmed → Incomplete
milestone: 5.0.1 → none
Dmitry Pyzhov (dpyzhov)
Changed in fuel:
milestone: none → 5.1
Timur Nurlygayanov (tnurlygayanov) wrote :

Vladimir, I just want to install the master node.

After the master node installation, I can see that the master node does not work properly.

[root@nailgun ~]# dockerctl check cobbler
checking container cobbler
checking with command "shell_container cobbler ps aux | grep -q 'cobblerd -F'"
lxc-attach: failed to get the init pid
try number 1
return code is 1
lxc-attach: failed to get the init pid
try number 2
return code is 1
lxc-attach: failed to get the init pid
try number 3
return code is 1
lxc-attach: failed to get the init pid
try number 4
return code is 1
lxc-attach: failed to get the init pid
try number 5
return code is 1
lxc-attach: failed to get the init pid
try number 6
return code is 1
lxc-attach: failed to get the init pid
try number 7
return code is 1
lxc-attach: failed to get the init pid
try number 8
return code is 1
lxc-attach: failed to get the init pid
try number 9
return code is 1
lxc-attach: failed to get the init pid
try number 10
return code is 1
lxc-attach: failed to get the init pid
try number 11
return code is 1
lxc-attach: failed to get the init pid
try number 12
return code is 1
lxc-attach: failed to get the init pid
try number 13
return code is 1
lxc-attach: failed to get the init pid
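
The output above has the shape of a readiness check that is retried a fixed number of times, printing the attempt number and return code after each failure. A minimal sketch of that pattern follows. It is not dockerctl's actual code: the function name `wait_until_ready`, the attempt limit, and the delay are all hypothetical.

```python
import time
from typing import Callable


def wait_until_ready(check: Callable[[], int],
                     attempts: int = 5,
                     delay: float = 0.0) -> bool:
    """Call `check` until it returns 0 or the attempts are used up.

    `check` stands in for a command such as a process lookup inside a
    container; it must return an exit code (0 means ready).
    """
    for attempt in range(1, attempts + 1):
        code = check()
        if code == 0:
            return True
        # Report the failed attempt in the same style as the log above.
        print(f"try number {attempt}")
        print(f"return code is {code}")
        time.sleep(delay)
    return False
```

With a check that always returns 1, as in this report, the loop exhausts its attempts and returns False, which is the point at which the container would be treated as not ready.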

Changed in fuel:
status: Incomplete → Confirmed
Timur Nurlygayanov (tnurlygayanov) wrote :

After the installation we can see that cobbler doesn't work:

[root@nailgun ~]# supervisorctl status
dhcrelay_monitor BACKOFF Exited too quickly (process log may have details)
docker-astute RUNNING pid 8716, uptime 0:01:23
docker-cobbler RUNNING pid 8556, uptime 0:01:24
docker-mcollective RUNNING pid 8784, uptime 0:01:23
docker-nailgun RUNNING pid 8754, uptime 0:01:23
docker-nginx RUNNING pid 8847, uptime 0:01:23
docker-ostf RUNNING pid 8813, uptime 0:01:23
docker-postgres RUNNING pid 8557, uptime 0:01:24
docker-rabbitmq RUNNING pid 8601, uptime 0:01:23
docker-rsync RUNNING pid 8639, uptime 0:01:23
docker-rsyslog RUNNING pid 8877, uptime 0:01:23
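
A listing like this can be filtered for programs that are not in the RUNNING state. The sketch below assumes only the column layout visible above (program name, then state, then free-form details); the helper names `parse_status` and `not_running` are hypothetical, and the sample text is a subset of the listing.

```python
from typing import Dict, List

# A few lines copied from the supervisorctl output above.
STATUS_OUTPUT = """\
dhcrelay_monitor BACKOFF Exited too quickly (process log may have details)
docker-astute RUNNING pid 8716, uptime 0:01:23
docker-cobbler RUNNING pid 8556, uptime 0:01:24
"""


def parse_status(text: str) -> Dict[str, str]:
    """Map each program name to its reported state."""
    states: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split(None, 2)  # name, state, remaining details
        if len(parts) >= 2:
            states[parts[0]] = parts[1]
    return states


def not_running(text: str) -> List[str]:
    """Return the names of programs whose state is not RUNNING."""
    return [name for name, state in parse_status(text).items()
            if state != "RUNNING"]
```

Here only dhcrelay_monitor is flagged. docker-cobbler is reported as RUNNING even though the dockerctl check for cobbler fails, so the supervisor state alone does not show that cobblerd is reachable.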

___________________________

[root@nailgun ~]# dockerctl check
checking container postgres
checking with command "PGPASSWORD=nailgun shell_container postgres psql -h 127.0.0.1 -U nailgun nailgun -c '\copyright' 2>&1 1>/dev/null"
postgres is ready.
checking container rabbitmq
checking with command "curl -f -L -i -u naily:naily http://127.0.0.1:15672/api/nodes 1>/dev/null 2>&1"
checking with command "curl -f -L -u mcollective:marionette -s http://127.0.0.1:15672/api/exchanges | grep -qw 'mcollective_broadcast'"
checking with command "curl -f -L -u mcollective:marionette -s http://127.0.0.1:15672/api/exchanges | grep -qw 'mcollective_directed'"
rabbitmq is ready.
checking container rsync
checking with command "shell_container rsync netstat -ntl | grep -q 873"
rsync is ready.
checking container astute
checking with command "shell_container astute ps aux | grep -q 'astuted'"
checking with command "curl -f -L -u naily:naily -s http://127.0.0.1:15672/api/exchanges | grep -qw 'nailgun'"
checking with command "curl -f -L -u naily:naily -s http://127.0.0.1:15672/api/exchanges | grep -qw 'naily_service'"
astute is ready.
checking container rsyslog
checking with command "shell_container rsyslog netstat -nl | grep -q 514"
rsyslog is ready.
checking container nailgun
checking with command "[ $(curl --connect-timeout 1 -s -w %{http_code} http://127.0.0.1:8000/api/version -o /dev/null) = "200" ]"
nailgun is ready.
checking container ostf
checking with command "[ $(curl --connect-timeout 1 -s -w %{http_code} http://127.0.0.1:8000/ostf/not_found -o /dev/null) = "404" ]"
ostf is ready.
checking container nginx
checking with command "shell_container nginx ps aux | grep -q nginx"
nginx is ready.
checking container cobbler
checking with command "shell_container cobbler ps aux | grep -q 'cobblerd -F'"
lxc-attach: failed to get the init pid
try number 1
return code is 1
lxc-attach: failed to get the init pid
try number 2
return code is 1
lxc-attach: failed to get the init pid
try number 3
return code is 1

Timur Nurlygayanov (tnurlygayanov) wrote :

[root@nailgun ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ce9db2d1d838 fuel/mcollective_5.0.1:latest /usr/local/bin/start 56 minutes ago Up 53 minutes fuel-core-5.0.1-mcollective
c854ffa5befa fuel/cobbler_5.0.1:latest /bin/sh -c /usr/loca 56 minutes ago Up 53 minutes fuel-core-5.0.1-cobbler
1837633cfa59 fuel/nginx_5.0.1:latest /usr/local/bin/start 56 minutes ago Up 53 minutes 0.0.0.0:8000->8000/tcp, 0.0.0.0:8080->8080/tcp fuel-core-5.0.1-nginx
1bf95213be7f fuel/ostf_5.0.1:latest /bin/sh -c /usr/loca 56 minutes ago Up 53 minutes 0.0.0.0:8777->8777/tcp fuel-core-5.0.1-ostf
cf69626d0ff5 fuel/nailgun_5.0.1:latest /bin/sh -c /usr/loca 56 minutes ago Up 53 minutes 0.0.0.0:8001->8001/tcp fuel-core-5.0.1-nailgun
c25f68d00025 fuel/rsyslog_5.0.1:latest /usr/local/bin/start 56 minutes ago Up 53 minutes 0.0.0.0:514->514/tcp, 0.0.0.0:514->514/udp, 0.0.0.0:49153->25150/tcp fuel-core-5.0.1-rsyslog
cb033fc96a1b fuel/astute_5.0.1:latest /bin/sh -c /usr/loca 56 minutes ago Up 53 minutes fuel-core-5.0.1-astute
14deeaa80e2d fuel/rsync_5.0.1:latest /bin/sh -c /usr/loca 56 minutes ago Up 53 minutes 0.0.0.0:873->873/tcp fuel-core-5.0.1-rsync
32ac5feaa8d9 fuel/rabbitmq_5.0.1:latest /usr/local/bin/start 57 minutes ago Up 53 minutes 0.0.0.0:4369->4369/tcp, 0.0.0.0:5672->5672/tcp, 0.0.0.0:15672->15672/tcp, 0.0.0.0:61613->61613/tcp fuel-core-5.0.1-rabbitmq
9021e744da10 fuel/postgres_5.0.1:latest /usr/local/bin/start 57 minutes ago Up 53 minutes 0.0.0.0:5432->5432/tcp fuel-core-5.0.1-postgres

Timur Nurlygayanov (tnurlygayanov) wrote :
Timur Nurlygayanov (tnurlygayanov) wrote :

[root@nailgun ~]# tail -n 200 -f /var/log/docker-cobbler.log
2014/07/08 12:17:46 dial unix /var/run/docker.sock: no such file or directory
fuel-core-5.0.1-cobbler is already running.
Attaching to container fuel-core-5.0.1-cobbler...
fuel-core-5.0.1-cobbler is already running.
Starting dhcrelay: [FAILED]
Attaching to container fuel-core-5.0.1-cobbler...

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/105449

Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Matthew Mosesohn (raytrac3r)
status: Confirmed → In Progress
Matthew Mosesohn (raytrac3r) wrote :

An update on this issue: I worked with Timur to try to reproduce this bug about 40 times in a controlled environment. It appears to be quite random. Patch 105449 is up for review and a test ISO is being built. I hope it will solve the issue by adding a check that httpd is ready.

Timur Nurlygayanov (tnurlygayanov) wrote :

Matthew, with this custom ISO I can see: http://paste.openstack.org/show/85685/

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/105449
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=c9dc047f8065820f99716b47d162037148147edf
Submitter: Jenkins
Branch: master

commit c9dc047f8065820f99716b47d162037148147edf
Author: Matthew Mosesohn <email address hidden>
Date: Tue Jul 8 17:36:16 2014 +0400

    Ensure web service starts before cobbler sync

    An issue occurs on certain deployment types
    (mainly virtual) that causes cobbler sync to
    fail because httpd is not ready yet. Adding a
    service check should cover this case.

    Change-Id: Ibc47b3a970756fd7ed49e7fd5558bf24929f7a76
    Closes-Bug: #1338552
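
The idea in the commit message, waiting for the web service before running a step that depends on it, can be illustrated with a generic readiness wait. This is only a sketch and not the fuel-library change itself. The function name `wait_for_port`, the timeout values, and the use of a plain TCP connection as the readiness signal are assumptions.

```python
import socket
import time


def wait_for_port(host: str, port: int,
                  timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds.

    Retries every `interval` seconds and gives up, returning False,
    once `timeout` seconds have passed without a successful connection.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A caller would run the dependent step only after `wait_for_port` returns True for the web server's address. The host and port to use are not given in this report.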

Changed in fuel:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/5.0)

Fix proposed to branch: stable/5.0
Review: https://review.openstack.org/114214

OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-library (stable/5.0)

Change abandoned by Sergii Golovatiuk (<email address hidden>) on branch: stable/5.0
Review: https://review.openstack.org/114214
Reason: There won't be ISO for 5.0.2. There will be tarball for upgrade only.
