Nova compute failed to start: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory

Bug #1534069 reported by Tatyanka
This bug affects 1 person
Affects                   Status     Importance  Assigned to  Milestone
Mirantis OpenStack        Confirmed  High        MOS Nova
  8.0.x series            New        Undecided   Unassigned

Bug Description

Steps to reproduce:
1. Deploy a Neutron cluster: 1 controller + 1 compute + 1 cinder
2. Run OSTF

Actual result:
The instance creation test failed with:
The requested availability zone is not available

At the same time, nova-compute logged:
2016-01-14T09:15:53.281436+00:00 info: 2016-01-14 09:15:53.278 13596 INFO nova.service [-] Starting compute node (version 12.0.0)
2016-01-14T09:15:53.282614+00:00 debug: 2016-01-14 09:15:53.279 13596 DEBUG nova.virt.libvirt.host [-] Starting native event thread _init_events /usr/lib/python2.7/dist-packages/nova/virt/libvirt/host.py:452
2016-01-14T09:15:53.283157+00:00 debug: 2016-01-14 09:15:53.280 13596 DEBUG nova.virt.libvirt.host [-] Starting green dispatch thread _init_events /usr/lib/python2.7/dist-packages/nova/virt/libvirt/host.py:458
2016-01-14T09:15:53.283735+00:00 debug: 2016-01-14 09:15:53.280 13596 DEBUG nova.virt.libvirt.host [-] Connecting to libvirt: qemu:///system _get_new_connection /usr/lib/python2.7/dist-packages/nova/virt/libvirt/host.py:463
2016-01-14T09:15:53.291281+00:00 info: 2016-01-14 09:15:53.288 13596 INFO nova.virt.libvirt.driver [-] Connection event '0' reason 'Failed to connect to libvirt'
2016-01-14T09:15:53.315712+00:00 warning: 2016-01-14 09:15:53.312 13596 WARNING nova.virt.libvirt.driver [req-7662a4bf-0b28-46a2-b8e8-75bffa653ead - - - - -] Cannot update service status on host "node-2.test.domain.local" since it is not registered.
2016-01-14T09:15:53.318287+00:00 err: 2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host [req-7662a4bf-0b28-46a2-b8e8-75bffa653ead - - - - -] Connection to libvirt failed: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host Traceback (most recent call last):
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/host.py", line 528, in get_connection
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host conn = self._get_connection()
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/host.py", line 515, in _get_connection
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host wrapped_conn = self._get_new_connection()
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/host.py", line 467, in _get_new_connection
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host wrapped_conn = self._connect(self._uri, self._read_only)
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/host.py", line 321, in _connect
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host libvirt.openAuth, uri, auth, flags)
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 141, in proxy_call
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host rv = execute(f, *args, **kwargs)
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 122, in execute
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host six.reraise(c, e, tb)
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 80, in tworker
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host rv = meth(*args, **kwargs)
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host File "/usr/lib/python2.7/dist-packages/libvirt.py", line 105, in openAuth
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host if ret is None:raise libvirtError('virConnectOpenAuth() failed')
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host libvirtError: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
2016-01-14 09:15:53.313 13596 ERROR nova.virt.libvirt.host

VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "9.0"
  api: "1.0"
  build_number: "87"
  build_id: "87"
  fuel-nailgun_sha: "5e1d3d02c2174957113a7146557c351b63d4547d"
  python-fuelclient_sha: "d665010426f1d3270590b8d78a030af0336a0d5a"
  fuel-agent_sha: "cfa3fee5815c93b02e8e3d4e3bed1367b66341a1"
  fuel-nailgun-agent_sha: "9d818c7f8908182016550c0bc2946363d69af73e"
  astute_sha: "330ac35b26d01f882adf62b060888349e3200b0b"
  fuel-library_sha: "39aac5a938605ea0b08637ec1fd697dabbcb39ec"
  fuel-ostf_sha: "c42f2e3e85d0287a0907999f3f8703c9db93b3b8"
  fuel-mirror_sha: "a22ce4730e7c671ac4b18287717bcb98ed0c8f58"
  fuelmenu_sha: "29ddf3d8007e25c922b4a2788c52c750a7145461"
  shotgun_sha: "0682f20c42150962096e6e43ddbcfe1fe1a6d98f"
  network-checker_sha: "cfd9dbc995ee85a6f2dee9c53299f26e42205fd4"
  fuel-upgrade_sha: "e95dc05071b40b630e9c5bc11b6dc00025f6de79"
  fuelmain_sha: "f4035a263472c8523eb308abd4b1509927acbf77"

Revision history for this message
Tatyanka (tatyana-leontovich) wrote :

Set to Critical because this issue happens on the 9.0 BVT: https://product-ci.infra.mirantis.net/job/9.0-liberty.ubuntu.smoke_neutron/

Revision history for this message
Tatyanka (tatyana-leontovich) wrote :
Changed in mos:
status: New → Confirmed
tags: added: area-nova
removed: area-mos
Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

This is weird: after reverting a snapshot, libvirtd is down on both compute nodes:

root@node-2:~# sudo service libvirtd status
 * Checking status of libvirt management daemon libvirtd [fail]
root@node-2:~# ps -ef | grep libvirt
root 2894 304 0 11:07 pts/5 00:00:00 grep --color=auto libvirt

At the same time restart helps:

root@node-1:~# sudo service libvirtd restart
 * Restarting libvirt management daemon /usr/sbin/libvirtd [ OK ]
root@node-1:~# sudo service libvirtd status
 * Checking status of libvirt management daemon libvirtd [ OK ]
root@node-1:~# ps -ef | grep libvirt
root 4989 1 1 11:07 ? 00:00:00 /usr/sbin/libvirtd -d -d -l

The only log entry about libvirtd is in dmesg:

http://paste.openstack.org/show/483869/

It's not clear why the process did not start, nor why there are no entries about it in the upstart logs.
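Until the root cause is found, the missing socket can at least be detected mechanically before nova-compute is started. A minimal sketch of such a check; the helper name, the default socket path, and the retry budget are our assumptions, not part of the deployment:

```shell
#!/bin/sh
# Poll for the libvirt control socket; return 0 once it appears, 1 on
# timeout. On failure, an operator (or an init hook) could restart
# libvirtd, which is the workaround that helped above.
wait_for_libvirt_sock() {
    sock="${1:-/var/run/libvirt/libvirt-sock}"
    tries="${2:-30}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -S is true only if the path exists and is a socket
        [ -S "$sock" ] && return 0
        sleep 1
        i=$((i + 1))
    done
    return 1
}

# Usage (the restart itself is left to the operator):
#   wait_for_libvirt_sock || service libvirtd restart
```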

Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

According to the runlevel settings, libvirtd should have been started:

root@node-2:~# runlevel
N 2
root@node-2:~# ls -la /etc/rc2.d/S28libvirtd
lrwxrwxrwx 1 root root 18 Jan 14 09:15 /etc/rc2.d/S28libvirtd -> ../init.d/libvirtd
root@node-2:~# head -n 20 /etc/rc2.d/S28libvirtd
#! /bin/sh
#
# Init script for libvirtd
#
# (c) 2007 Guido Guenther <email address hidden>
# based on the skeletons that comes with dh_make
#
### BEGIN INIT INFO
# Provides: libvirtd
# Required-Start: $network $local_fs $remote_fs $syslog
# Required-Stop: $local_fs $remote_fs $syslog
# Should-Start: avahi-daemon cgconfig
# Should-Stop: avahi-daemon cgconfig
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: libvirt management daemon
### END INIT INFO
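For reference, the runlevels that produce the S28 start link come from the Default-Start line of that LSB header. A small sketch for pulling them out of any init script (the helper name is ours):

```shell
#!/bin/sh
# Print the Default-Start runlevels declared in an LSB init-script header;
# these are the runlevels for which /etc/rc?.d/S?? start links are created.
lsb_default_start() {
    sed -n 's/^# Default-Start:[[:space:]]*//p' "$1"
}

# Usage, given the header quoted above:
#   lsb_default_start /etc/init.d/libvirtd   # prints: 2 3 4 5
```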

Revision history for this message
Sergey Nikitin (snikitin) wrote :

I tried to reproduce it on ISO 87. All OSTF tests passed.

Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

The BVT passed as well; let's wait for another repro.

Changed in mos:
status: Confirmed → Incomplete
Revision history for this message
Nastya Urlapova (aurlapova) wrote :

@Roman, which BVT passed as well?

Changed in mos:
status: Incomplete → Confirmed
Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

@Nastya, I meant:

https://product-ci.infra.mirantis.net/view/9.0-liberty/job/9.0-liberty.ubuntu.bvt_2/ has never failed with such an error.

https://product-ci.infra.mirantis.net/view/9.0-liberty/job/9.0-liberty.ubuntu.smoke_neutron/ failed only once (the original failure this bug was filed for). Other failures are unrelated.

We didn't manage to reproduce this locally on 9.0 either. This looks like a sporadic issue. We did the RCA (please see my previous comments), but it's not clear why this failed in the first place.

I suggest we keep this as Incomplete until there is a new repro.

Changed in mos:
status: Confirmed → Incomplete
Revision history for this message
Tatyanka (tatyana-leontovich) wrote :

Set to Confirmed because we have one more occurrence, but downgraded the importance because it only happens intermittently (since the last occurrence, 2 test runs have been green).

Changed in mos:
importance: Critical → High
status: Incomplete → Confirmed
tags: added: customer-found