rabbitmq bundle failed or stopped after fresh install

Bug #1875238 reported by Reza
Affects: tripleo
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (not set)

Bug Description

Description
===========
After a fresh installation of TripleO Stable/Stein on 5 nodes (3 HA controllers and 2 computes),
the rabbitmq bundle and some other resources are Failed or Stopped in Pacemaker.

Steps to reproduce
==================
1- Install the undercloud.
2- Install the overcloud with this command:

openstack overcloud deploy \
--control-flavor control \
--compute-flavor compute \
--templates ~/openstack-tripleo-heat-templates \
-r /home/stack/roles_data.yaml \
-e /home/stack/containers-prepare-parameter.yaml \
-e environment.yaml \
-e ~/openstack-tripleo-heat-templates/environments/services/neutron-ovn-dvr-ha.yaml \
-e ~/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
-e ~/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e ~/openstack-tripleo-heat-templates/environments/network-environment.yaml \
--timeout 360 \
--ntp-server pool.ntp.org

I got the same result without network isolation and without a custom network environment, using completely default settings.

Expected result
===============
Fresh healthy HA OpenStack.

Actual result
=============
pcs status output is as follows:

Full list of resources:

 Docker container set: rabbitmq-bundle [192.168.24.1:8787/tripleostein/centos-binary-rabbitmq:pcmklatest]
   rabbitmq-bundle-0 (ocf::heartbeat:rabbitmq-cluster): FAILED overcloud-controller-0 (Monitoring)
   rabbitmq-bundle-1 (ocf::heartbeat:rabbitmq-cluster): Stopped overcloud-controller-1
   rabbitmq-bundle-2 (ocf::heartbeat:rabbitmq-cluster): Stopped overcloud-controller-2
 Docker container set: galera-bundle [192.168.24.1:8787/tripleostein/centos-binary-mariadb:pcmklatest]
   galera-bundle-0 (ocf::heartbeat:galera): Master overcloud-controller-0
   galera-bundle-1 (ocf::heartbeat:galera): Master overcloud-controller-1
   galera-bundle-2 (ocf::heartbeat:galera): Master overcloud-controller-2
 Docker container set: redis-bundle [192.168.24.1:8787/tripleostein/centos-binary-redis:pcmklatest]
   redis-bundle-0 (ocf::heartbeat:redis): Master overcloud-controller-0
   redis-bundle-1 (ocf::heartbeat:redis): Slave overcloud-controller-1
   redis-bundle-2 (ocf::heartbeat:redis): Slave overcloud-controller-2
 ip-192.168.24.10 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0
 ip-X.X.X.X (ocf::heartbeat:IPaddr2): Started overcloud-controller-1
 ip-172.16.2.175 (ocf::heartbeat:IPaddr2): Started overcloud-controller-2
 ip-172.16.2.41 (ocf::heartbeat:IPaddr2): Started overcloud-controller-1
 ip-172.16.1.166 (ocf::heartbeat:IPaddr2): Stopped
 ip-172.16.3.10 (ocf::heartbeat:IPaddr2): Stopped
 Docker container set: haproxy-bundle [192.168.24.1:8787/tripleostein/centos-binary-haproxy:pcmklatest]
   haproxy-bundle-docker-0 (ocf::heartbeat:docker): Started overcloud-controller-1
   haproxy-bundle-docker-1 (ocf::heartbeat:docker): Started overcloud-controller-2
   haproxy-bundle-docker-2 (ocf::heartbeat:docker): Started overcloud-controller-0
 Docker container set: ovn-dbs-bundle [192.168.24.1:8787/tripleostein/centos-binary-ovn-northd:pcmklatest]
   ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers): Master overcloud-controller-1
   ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers): Slave overcloud-controller-2
   ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers): Slave overcloud-controller-0
 Docker container: openstack-cinder-volume [192.168.24.1:8787/tripleostein/centos-binary-cinder-volume:pcmklatest]
   openstack-cinder-volume-docker-0 (ocf::heartbeat:docker): Started overcloud-controller-0

Failed Resource Actions:
* ip-172.16.1.166_start_0 on overcloud-controller-0 'unknown error' (1): call=89, status=complete, exitreason='[findif] failed',
    last-rc-change='Sun Apr 26 17:19:01 2020', queued=0ms, exec=111ms
* ip-172.16.3.10_start_0 on overcloud-controller-0 'unknown error' (1): call=95, status=complete, exitreason='[findif] failed',
    last-rc-change='Sun Apr 26 17:19:42 2020', queued=0ms, exec=103ms
* ip-172.16.1.166_start_0 on overcloud-controller-1 'unknown error' (1): call=87, status=complete, exitreason='[findif] failed',
    last-rc-change='Sun Apr 26 17:19:00 2020', queued=1ms, exec=147ms
* ip-172.16.3.10_start_0 on overcloud-controller-1 'unknown error' (1): call=93, status=complete, exitreason='[findif] failed',
    last-rc-change='Sun Apr 26 17:19:41 2020', queued=0ms, exec=99ms
* ip-172.16.1.166_start_0 on overcloud-controller-2 'unknown error' (1): call=87, status=complete, exitreason='[findif] failed',
    last-rc-change='Sun Apr 26 17:19:00 2020', queued=0ms, exec=105ms
* ip-172.16.3.10_start_0 on overcloud-controller-2 'unknown error' (1): call=93, status=complete, exitreason='[findif] failed',
    last-rc-change='Sun Apr 26 17:19:42 2020', queued=0ms, exec=93ms
* rabbitmq_start_0 on rabbitmq-bundle-1 'unknown error' (1): call=2121, status=Timed Out, exitreason='',
    last-rc-change='Sun Apr 26 18:18:42 2020', queued=0ms, exec=200049ms
* rabbitmq_start_0 on rabbitmq-bundle-2 'unknown error' (1): call=1979, status=Timed Out, exitreason='',
    last-rc-change='Sun Apr 26 18:04:59 2020', queued=0ms, exec=200031ms
* rabbitmq_monitor_10000 on rabbitmq-bundle-0 'unknown error' (1): call=2298, status=Timed Out, exitreason='',
    last-rc-change='Sun Apr 26 18:36:59 2020', queued=0ms, exec=40036ms
* ovndb_servers_monitor_30000 on ovn-dbs-bundle-2 'not running' (7): call=23, status=complete, exitreason='',
    last-rc-change='Sun Apr 26 17:33:03 2020', queued=1ms, exec=1806ms

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

The cluster appears to be completely unhealthy. Even running these commands does not help:
pcs resource restart rabbitmq-bundle
pcs resource cleanup rabbitmq-bundle

Restarting the whole cluster, or rebooting all nodes and deleting the RabbitMQ mnesia directory, does not help either.
All requests to the overcloud are extremely slow; Horizon takes about one minute for each refresh.
Adding additional services such as Octavia causes the overcloud installation to fail with a 504 Gateway Timeout.
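
The failed actions listed above fall into a few distinct groups (the `[findif] failed` starts on the IPaddr2 resources, and the timed-out rabbitmq start and monitor actions). As a triage aid, here is a minimal Python sketch that groups such entries by status and exit reason. It assumes the `pcs status` line format shown in this report; the function name `summarize_failures` and the `SAMPLE` excerpt are illustrative and not part of any pcs tooling.

```python
import re
from collections import defaultdict

# Matches the first line of each entry under "Failed Resource Actions:".
# The pattern is an assumption based on the output pasted in this report.
FAILED_ACTION = re.compile(
    r"^\*\s+(?P<action>\S+)\s+on\s+(?P<node>\S+)\s+'(?P<rc_text>[^']*)'\s+\((?P<rc>\d+)\):"
    r".*?status=(?P<status>[^,]+),\s*exitreason='(?P<reason>[^']*)'"
)


def summarize_failures(pcs_status_text):
    """Group failed actions by (status, exitreason) -> list of 'action on node'."""
    groups = defaultdict(list)
    for line in pcs_status_text.splitlines():
        match = FAILED_ACTION.match(line.strip())
        if match:
            key = (match["status"].strip(), match["reason"])
            groups[key].append(f"{match['action']} on {match['node']}")
    return dict(groups)


# Excerpt copied from the "Failed Resource Actions" list in this report.
SAMPLE = """\
* ip-172.16.1.166_start_0 on overcloud-controller-0 'unknown error' (1): call=89, status=complete, exitreason='[findif] failed',
    last-rc-change='Sun Apr 26 17:19:01 2020', queued=0ms, exec=111ms
* rabbitmq_start_0 on rabbitmq-bundle-1 'unknown error' (1): call=2121, status=Timed Out, exitreason='',
    last-rc-change='Sun Apr 26 18:18:42 2020', queued=0ms, exec=200049ms
* rabbitmq_monitor_10000 on rabbitmq-bundle-0 'unknown error' (1): call=2298, status=Timed Out, exitreason='',
    last-rc-change='Sun Apr 26 18:36:59 2020', queued=0ms, exec=40036ms
"""

if __name__ == "__main__":
    for (status, reason), actions in summarize_failures(SAMPLE).items():
        print(f"{status} / {reason or '(no exitreason)'}: {', '.join(actions)}")
```

Feeding the full `pcs status` text to `summarize_failures` instead of `SAMPLE` should separate the virtual IP failures from the rabbitmq timeouts, which may be two unrelated problems.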

Environment
===========
TripleO OpenStack Stable/Stein

Logs & Configs
==============

I can provide any required log or config.

Reza (reza-b2008)
description: updated