Ceilometer Can't Connect to AMQP after Controller Down

Bug #1373569 reported by Tyler Wilson
This bug affects 4 people
Affects             Status        Importance  Assigned to      Milestone
Fuel for OpenStack  Fix Released  High        Bogdan Dobrelya
5.1.x               Fix Released  High        Bogdan Dobrelya
6.0.x               Fix Released  High        Bogdan Dobrelya

Bug Description

{"build_id": "2014-09-18_06-04-08", "ostf_sha": "64cb59c681658a7a55cc2c09d079072a41beb346", "build_number": "31", "auth_required": true, "api": "1.0", "nailgun_sha": "eb8f2b358ea4bb7eb0b2a0075e7ad3d3a905db0d", "production": "docker", "fuelmain_sha": "8ef433e939425eabd1034c0b70e90bdf888b69fd", "astute_sha": "f5fbd89d1e0e1f22ef9ab2af26da5ffbfbf24b13", "feature_groups": ["experimental"], "release": "5.1", "release_versions": {"2014.1.1-5.1": {"VERSION": {"build_id": "2014-09-18_06-04-08", "ostf_sha": "64cb59c681658a7a55cc2c09d079072a41beb346", "build_number": "31", "api": "1.0", "nailgun_sha": "eb8f2b358ea4bb7eb0b2a0075e7ad3d3a905db0d", "production": "docker", "fuelmain_sha": "8ef433e939425eabd1034c0b70e90bdf888b69fd", "astute_sha": "f5fbd89d1e0e1f22ef9ab2af26da5ffbfbf24b13", "feature_groups": ["experimental"], "release": "5.1", "fuellib_sha": "d9b16846e54f76c8ebe7764d2b5b8231d6b25079"}}}, "fuellib_sha": "d9b16846e54f76c8ebe7764d2b5b8231d6b25079"}

1. Create new environment (Ubuntu, HA mode)
2. Choose GRE segmentation
3. Add controller x3 + Ceilometer
4. Add computes x3 + Ceph OSD

Shut down the primary controller and bring it back up, then attempt to use the Resource Usage pages in Horizon.

==> /var/log/docker-logs/remote/node-70.lax3.ubiquity.io/ceilometer-agent-notification.log <==
2014-09-24T19:58:38.372884+01:00 info: 2014-09-24 18:58:38.369 19184 INFO ceilometer.openstack.common.rpc.common [-] Reconnecting to AMQP server on 127.0.0.1:5673
2014-09-24T19:58:38.374048+01:00 info: 2014-09-24 18:58:38.370 19184 INFO ceilometer.openstack.common.rpc.common [-] Delaying reconnect for 5.0 seconds...

==> /var/log/docker-logs/remote/node-68.lax3.ubiquity.io/ceilometer-agent-notification.log <==
2014-09-24T19:58:40.411124+01:00 info: 2014-09-24 18:58:40.408 10059 INFO ceilometer.openstack.common.rpc.common [-] Reconnecting to AMQP server on 192.168.0.5:5673
2014-09-24T19:58:40.412381+01:00 info: 2014-09-24 18:58:40.409 10059 INFO ceilometer.openstack.common.rpc.common [-] Delaying reconnect for 5.0 seconds...

==> /var/log/docker-logs/remote/node-70.lax3.ubiquity.io/ceilometer-agent-notification.log <==
2014-09-24T19:58:43.394037+01:00 err: 2014-09-24 18:58:43.390 19184 ERROR ceilometer.openstack.common.rpc.common [-] AMQP server on 192.168.0.3:5673 is unreachable: timed out. Trying again in 30 seconds.

==> /var/log/docker-logs/remote/node-68.lax3.ubiquity.io/ceilometer-agent-notification.log <==
2014-09-24T19:58:45.432197+01:00 err: 2014-09-24 18:58:45.429 10059 ERROR ceilometer.openstack.common.rpc.common [-] AMQP server on 127.0.0.1:5673 is unreachable: [Errno 32] Broken pipe. Trying again in 19 seconds.

==> /var/log/docker-logs/remote/node-69.lax3.ubiquity.io/ceilometer-agent-notification.log <==
2014-09-24T19:58:55.426343+01:00 info: 2014-09-24 18:58:55.428 30562 INFO ceilometer.openstack.common.rpc.common [-] Reconnecting to AMQP server on 127.0.0.1:5673
2014-09-24T19:58:55.427363+01:00 info: 2014-09-24 18:58:55.428 30562 INFO ceilometer.openstack.common.rpc.common [-] Delaying reconnect for 5.0 seconds...
2014-09-24T19:59:00.449697+01:00 err: 2014-09-24 18:59:00.451 30562 ERROR ceilometer.openstack.common.rpc.common [-] AMQP server on 192.168.0.5:5673 is unreachable: Socket closed. Trying again in 30 seconds.

==> /var/log/docker-logs/remote/node-68.lax3.ubiquity.io/ceilometer-agent-notification.log <==
2014-09-24T19:59:04.452501+01:00 info: 2014-09-24 18:59:04.449 10059 INFO ceilometer.openstack.common.rpc.common [-] Reconnecting to AMQP server on 192.168.0.5:5673
2014-09-24T19:59:04.453660+01:00 info: 2014-09-24 18:59:04.451 10059 INFO ceilometer.openstack.common.rpc.common [-] Delaying reconnect for 5.0 seconds...
2014-09-24T19:59:09.469792+01:00 err: 2014-09-24 18:59:09.467 10059 ERROR ceilometer.openstack.common.rpc.common [-] AMQP server on 127.0.0.1:5673 is unreachable: [Errno 32] Broken pipe. Trying again in 19 seconds.

==> /var/log/docker-logs/remote/node-70.lax3.ubiquity.io/ceilometer-agent-notification.log <==
2014-09-24T19:59:13.422414+01:00 info: 2014-09-24 18:59:13.419 19184 INFO ceilometer.openstack.common.rpc.common [-] Reconnecting to AMQP server on 127.0.0.1:5673
2014-09-24T19:59:13.424022+01:00 info: 2014-09-24 18:59:13.420 19184 INFO ceilometer.openstack.common.rpc.common [-] Delaying reconnect for 5.0 seconds...
2014-09-24T19:59:18.433425+01:00 err: 2014-09-24 18:59:18.430 19184 ERROR ceilometer.openstack.common.rpc.common [-] AMQP server on 192.168.0.3:5673 is unreachable: timed out. Trying again in 30 seconds.

==> /var/log/docker-logs/remote/node-68.lax3.ubiquity.io/ceilometer-agent-notification.log <==
2014-09-24T19:59:28.490133+01:00 info: 2014-09-24 18:59:28.487 10059 INFO ceilometer.openstack.common.rpc.common [-] Reconnecting to AMQP server on 192.168.0.5:5673
2014-09-24T19:59:28.491235+01:00 info: 2014-09-24 18:59:28.488 10059 INFO ceilometer.openstack.common.rpc.common [-] Delaying reconnect for 5.0 seconds...

==> /var/log/docker-logs/remote/node-69.lax3.ubiquity.io/ceilometer-agent-notification.log <==
2014-09-24T19:59:30.479988+01:00 info: 2014-09-24 18:59:30.481 30562 INFO ceilometer.openstack.common.rpc.common [-] Reconnecting to AMQP server on 192.168.0.3:5673
2014-09-24T19:59:30.481187+01:00 info: 2014-09-24 18:59:30.482 30562 INFO ceilometer.openstack.common.rpc.common [-] Delaying reconnect for 5.0 seconds...

==> /var/log/docker-logs/remote/node-68.lax3.ubiquity.io/ceilometer-agent-notification.log <==
2014-09-24T19:59:33.515571+01:00 err: 2014-09-24 18:59:33.512 10059 ERROR ceilometer.openstack.common.rpc.common [-] AMQP server on 127.0.0.1:5673 is unreachable: [Errno 32] Broken pipe. Trying again in 19 seconds.

==> /var/log/docker-logs/remote/node-69.lax3.ubiquity.io/ceilometer-agent-notification.log <==
2014-09-24T19:59:35.504238+01:00 err: 2014-09-24 18:59:35.505 30562 ERROR ceilometer.openstack.common.rpc.common [-] AMQP server on 192.168.0.5:5673 is unreachable: Socket closed. Trying again in 30 seconds.

==> /var/log/docker-logs/remote/node-70.lax3.ubiquity.io/ceilometer-agent-notification.log <==
2014-09-24T19:59:48.464659+01:00 info: 2014-09-24 18:59:48.461 19184 INFO ceilometer.openstack.common.rpc.common [-] Reconnecting to AMQP server on 127.0.0.1:5673
2014-09-24T19:59:48.465735+01:00 info: 2014-09-24 18:59:48.462 19184 INFO ceilometer.openstack.common.rpc.common [-] Delaying reconnect for 5.0 seconds...

==> /var/log/docker-logs/remote/node-68.lax3.ubiquity.io/ceilometer-agent-notification.log <==
2014-09-24T19:59:52.523420+01:00 info: 2014-09-24 18:59:52.520 10059 INFO ceilometer.openstack.common.rpc.common [-] Reconnecting to AMQP server on 192.168.0.5:5673
2014-09-24T19:59:52.524857+01:00 info: 2014-09-24 18:59:52.522 10059 INFO ceilometer.openstack.common.rpc.common [-] Delaying reconnect for 5.0 seconds...

==> /var/log/docker-logs/remote/node-70.lax3.ubiquity.io/ceilometer-agent-notification.log <==
2014-09-24T19:59:53.471741+01:00 err: 2014-09-24 18:59:53.468 19184 ERROR ceilometer.openstack.common.rpc.common [-] AMQP server on 192.168.0.4:5673 is unreachable: timed out. Trying again in 30 seconds.

==> /var/log/docker-logs/remote/node-68.lax3.ubiquity.io/ceilometer-agent-notification.log <==
2014-09-24T19:59:57.533816+01:00 err: 2014-09-24 18:59:57.531 10059 ERROR ceilometer.openstack.common.rpc.common [-] AMQP server on 127.0.0.1:5673 is unreachable: [Errno 32] Broken pipe. Trying again in 19 seconds.

==> /var/log/docker-logs/remote/node-69.lax3.ubiquity.io/ceilometer-agent-notification.log <==
2014-09-24T20:00:05.531734+01:00 info: 2014-09-24 19:00:05.533 30562 INFO ceilometer.openstack.common.rpc.common [-] Reconnecting to AMQP server on 192.168.0.3:5673
2014-09-24T20:00:05.532827+01:00 info: 2014-09-24 19:00:05.534 30562 INFO ceilometer.openstack.common.rpc.common [-] Delaying reconnect for 5.0 seconds...
2014-09-24T20:00:10.543608+01:00 err: 2014-09-24 19:00:10.545 30562 ERROR ceilometer.openstack.common.rpc.common [-] AMQP server on 192.168.0.5:5673 is unreachable: Socket closed. Trying again in 30 seconds.

==> /var/log/docker-logs/remote/node-68.lax3.ubiquity.io/ceilometer-agent-notification.log <==
2014-09-24T20:00:16.554651+01:00 info: 2014-09-24 19:00:16.551 10059 INFO ceilometer.openstack.common.rpc.common [-] Reconnecting to AMQP server on 192.168.0.5:5673
2014-09-24T20:00:16.556032+01:00 info: 2014-09-24 19:00:16.553 10059 INFO ceilometer.openstack.common.rpc.common [-] Delaying reconnect for 5.0 seconds...

root@node-68:~# crm status
Last updated: Wed Sep 24 19:01:46 2014
Last change: Wed Sep 24 19:00:30 2014 via crm_attribute on node-69
Stack: classic openais (with plugin)
Current DC: node-69 - partition with quorum
Version: 1.1.10-42f2063
3 Nodes configured, 3 expected votes
22 Resources configured

Online: [ node-68 node-69 node-70 ]

 vip__management_old (ocf::mirantis:ns_IPaddr2): Started node-69
 vip__public_old (ocf::mirantis:ns_IPaddr2): Started node-70
 p_ceilometer-alarm-evaluator (ocf::mirantis:ceilometer-alarm-evaluator): Started node-68
 p_ceilometer-agent-central (ocf::mirantis:ceilometer-agent-central): Started node-69
 Master/Slave Set: master_p_rabbitmq-server [p_rabbitmq-server]
     Masters: [ node-68 ]
     Slaves: [ node-69 node-70 ]
 Clone Set: clone_p_mysql [p_mysql]
     Started: [ node-68 node-69 node-70 ]
 Clone Set: clone_p_haproxy [p_haproxy]
     Started: [ node-68 node-69 node-70 ]
 p_heat-engine (ocf::mirantis:heat-engine): Started node-69
 Clone Set: clone_p_neutron-plugin-openvswitch-agent [p_neutron-plugin-openvswitch-agent]
     Started: [ node-68 node-69 node-70 ]
 Clone Set: clone_p_neutron-metadata-agent [p_neutron-metadata-agent]
     Started: [ node-68 node-69 node-70 ]
 p_neutron-dhcp-agent (ocf::mirantis:neutron-agent-dhcp): Started node-69
 p_neutron-l3-agent (ocf::mirantis:neutron-agent-l3): Started node-68

Failed actions:
    p_mysql_monitor_120000 (node=node-68, call=97, rc=1, status=complete, last-rc-change=Wed Sep 24 15:17:52 2014, queued=0ms, exec=0ms): unknown error
    p_neutron-metadata-agent_monitor_60000 (node=node-68, call=283, rc=7, status=complete, last-rc-change=Wed Sep 24 01:58:05 2014, queued=0ms, exec=0ms): not running

Fuel Snapshot: http://23.80.0.2:8000/dump/fuel-snapshot-2014-09-24_20-01-30.tgz

Tags: ceilometer
Tyler Wilson (loth)
description: updated
Changed in fuel:
milestone: none → 6.0
importance: Undecided → High
assignee: nobody → Fuel Library Team (fuel-library)
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

The crm status results look wrong - when you shut down a controller node, it should be OFFLINE in crm and all resources should migrate as well. Please double-check the description.

Changed in fuel:
status: New → Incomplete
Revision history for this message
Tyler Wilson (loth) wrote :

Hello,

The controller was shut down and then came back up; we started seeing the Ceilometer connection issues to AMQP after it came back up.

Changed in fuel:
status: Incomplete → Confirmed
description: updated
Changed in fuel:
status: Confirmed → Invalid
Revision history for this message
Tyler Wilson (loth) wrote :

@Bogdan The issue remains after the network traffic has dissipated; would it still be valid in that case, or is this an issue of Pacemaker configuration?

Changed in fuel:
status: Invalid → Confirmed
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

@Tyler, according to the logs, node-68 was shut down at
2014-09-24T01:41:53.181218 node-68 ./node-68.lax3.ubiquity.io/kernel.log:2014-09-24T01:41:53.181218+01:00 info: Kernel logging (proc) stopped.

and brought back at

2014-09-24T01:48:19.556624 node-68 ./node-68.lax3.ubiquity.io/kernel.log:2014-09-24T01:48:19.556624+01:00 info: [ 33.598870] EXT3-fs (sda2): mounted filesystem with ordered data mode

and the Ceilometer AMQP connectivity issues started ~18h later, at
2014-09-24T19:58:38.372884+01:00

Please confirm whether this is correct.

Revision history for this message
Tyler Wilson (loth) wrote :

@Bogdan Looks like 00:06:05 was when the first hints of issues appeared, possibly when the flapping started (check ceilometer-agent-notification.log):

2014-09-24T00:06:05.493953+01:00 err: 2014-09-23 23:06:05.480 21070 ERROR ceilometer.openstack.common.rpc.common [-] Failed to consume message from queue: (0, 0): (320) CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'
2014-09-23 23:06:05.480 21070 TRACE ceilometer.openstack.common.rpc.common Traceback (most recent call last):
2014-09-23 23:06:05.480 21070 TRACE ceilometer.openstack.common.rpc.common File "/usr/lib/python2.7/dist-packages/ceilometer/openstack/common/rpc/impl_kombu.py", line 591, in ensure
2014-09-23 23:06:05.480 21070 TRACE ceilometer.openstack.common.rpc.common return method(*args, **kwargs)
2014-09-23 23:06:05.480 21070 TRACE ceilometer.openstack.common.rpc.common File "/usr/lib/python2.7/dist-packages/ceilometer/openstack/common/rpc/impl_kombu.py", line 671, in _consume
2014-09-23 23:06:05.480 21070 TRACE ceilometer.openstack.common.rpc.common return self.connection.drain_events(timeout=timeout)
2014-09-23 23:06:05.480 21070 TRACE ceilometer.openstack.common.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 279, in drain_events
2014-09-23 23:06:05.480 21070 TRACE ceilometer.openstack.common.rpc.common return self.transport.drain_events(self.connection, **kwargs)
2014-09-23 23:06:05.480 21070 TRACE ceilometer.openstack.common.rpc.common File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqp.py", line 91, in drain_events
2014-09-23 23:06:05.480 21070 TRACE ceilometer.openstack.common.rpc.common return connection.drain_events(**kwargs)
2014-09-23 23:06:05.480 21070 TRACE ceilometer.openstack.common.rpc.common File "/usr/lib/python2.7/dist-packages/amqp/connection.py", line 320, in drain_events
2014-09-23 23:06:05.480 21070 TRACE ceilometer.openstack.common.rpc.common return amqp_method(channel, args)
2014-09-23 23:06:05.480 21070 TRACE ceilometer.openstack.common.rpc.common File "/usr/lib/python2.7/dist-packages/amqp/connection.py", line 526, in _close
2014-09-23 23:06:05.480 21070 TRACE ceilometer.openstack.common.rpc.common (class_id, method_id), ConnectionError)
2014-09-23 23:06:05.480 21070 TRACE ceilometer.openstack.common.rpc.common ConnectionForced: (0, 0): (320) CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'
2014-09-24T00:06:05.493953+01:00 info: 2014-09-23 23:06:05.484 21070 INFO ceilometer.openstack.common.rpc.common [-] Reconnecting to AMQP server on 127.0.0.1:5673
2014-09-24T00:06:05.495063+01:00 info: 2014-09-23 23:06:05.484 21070 INFO ceilometer.openstack.common.rpc.common [-] Delaying reconnect for 5.0 seconds...
2014-09-24T00:06:10.511682+01:00 err: 2014-09-23 23:06:10.501 21070 ERROR ceilometer.openstack.common.rpc.common [-] AMQP server on 127.0.0.1:5673 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 1 seconds.

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

Looks like shutting down node-68 was not the source of the issue - AMQP became unreachable before that:
The issue started at 2014-09-24T00:06:06.038365 on node-68 and continued on all controllers until 2014-09-24T20:12:02.263801 (which is the end of the logs in the snapshot).

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

Note, the rabbitmq cluster started reassembling at 2014-09-24T00:05:54.637861+01:00 (first WARN in ./node-68.lax3.ubiquity.io/rabbitmq-server.log).
And it looks like it had reassembled OK (at least according to its logs) - but the timestamp looks suspicious.

Revision history for this message
Tyler Wilson (loth) wrote :

@Bogdan, I can supply credentials to the test Env if needed.

Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Bogdan Dobrelya (bogdando)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/124121

Changed in fuel:
status: Confirmed → In Progress
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

Ok, I found the root cause - there is a bug in the rabbitmq OCF script's temporary iptables REJECT rule logic, and here is the patch for it (already applied in place on your test env's controller nodes, Tyler): https://review.openstack.org/124121
The fix works for new RabbitMQ sessions, but I still have to figure out how to deal with conntrack - removing this REJECT rule does not restore connectivity for the already running OpenStack services.

I mean, restarting the OpenStack services would resolve the issue completely, but I have to finish the fix properly. Thank you for your cooperation, Tyler.
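
A minimal sketch of inspecting the connection-tracking state for the RabbitMQ port (5673, as seen in the logs above); this assumes the conntrack-tools utility is available on the controllers and is illustration only, not part of the proposed patch:

# List connection-tracking entries towards the RabbitMQ port.
conntrack -L -p tcp --orig-port-dst 5673

# Delete those entries, forcing clients to re-establish their connections.
conntrack -D -p tcp --orig-port-dst 5673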

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

As far as I can see from the current state of the env, Tyler, there are no conntrack issues with traffic. The issue is that murano-api, murano-engine, and ceilometer-agent-notification on all controller nodes are stuck on a closed socket: they try to connect and fail. So I believe these just have to be restarted to complete the fix.

Revision history for this message
Tyler Wilson (loth) wrote : Re: [Bug 1373569] Re: Ceilometer Can't Connect to AMQP after Controller Down

I've run 'service ceilometer-agent-notification restart'; how are the Murano services restarted, as they aren't in init.d?

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

@Tyler you can restart them with 'crm resource restart p_foo_name', where p_foo_name is the name of the corresponding resource in 'crm status'.

Anyway, it looks like the 2nd discovered issue is not about sockets but about AMQP channels, i.e. RabbitMQ cluster health.
If I issue 'rabbitmqctl list_channels', it hangs forever - you can check it as well.

Normally, if you restart the rabbitmq cluster with 'crm resource restart master_p_rabbitmq-server' and give it ~5 min to reassemble completely, the problem with the hanging list_channels should be gone. Note that this action causes *complete* downtime for your cloud until the rabbit cluster has reassembled. You can check whether the rabbit cluster has reassembled with 'fuel health --env 3 --check ha'.
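
For convenience, the above can be run as one sequence (a sketch only; the ~5 minute wait and the env id 3 are taken from this comment and will differ per deployment, and the fuel command is run on the Fuel master node):

# On a controller: restart the whole RabbitMQ master/slave set via Pacemaker.
crm resource restart master_p_rabbitmq-server

# Give the cluster ~5 minutes to reassemble, then verify it is healthy.
sleep 300
rabbitmqctl list_channels          # should now return instead of hanging forever

# On the Fuel master node:
fuel health --env 3 --check ha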

tags: added: ceilometer
Revision history for this message
Tyler Wilson (loth) wrote :

@Bogdan is the way to patch this in 5.1/stable just adding that single line before env deployment?

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

@Tyler, you could do this by manually applying https://review.openstack.org/124121 to your /etc/puppet/modules on the master node *before* starting new deployments.

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

The full path is /etc/puppet/modules/nova/files/ocf/rabbitmq

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

@Tyler, the suggested fix affects the blocking rule logic and should be changed as appropriate:
By design, the iptables blocking rule should persist from the moment start_rmq_server_app() starts until it exits.

Putting the unblock in the middle of this function, as my initial patch suggests, is a poor workaround that breaks the block/unblock implementation.

The proper one, perhaps, would be to put the unblock right at the start, just before applying the blocking rule.

I analyzed the attached logs and found the "moment of truth":
2014-09-24T01:42:30.328672 node-69 ./node-69.lax3.ubiquity.io/rabbitmq-server.log:2014-09-24T01:42:30.328672+01:00 info: INFO: p_rabbitmq-server: start_rmq_server_app(): begin.
(the action failed by timeout; there is no "end." event, hence the iptables blocking rule was not removed!)
2014-09-24T01:45:48.730049 node-69 ./node-69.lax3.ubiquity.io/rabbitmq-server.log:2014-09-24T01:45:48.730049+01:00 info: INFO: p_rabbitmq-server: start_rmq_server_app(): begin.
(2nd blocking rule added...)

If we apply the suggested solution - put a "safe" unblock prior to the blocking - that would resolve the idempotency issue.
There is another way - make the block and unblock calls idempotent by checking whether the rule already exists - but I guess it would complicate the OCF script logic even more and make it more error prone... Anyway, the "safe unblock" solution works as well.
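
To illustrate, a minimal bash sketch of both ideas combined (a safe unblock before blocking, plus an idempotent block); the function names and the exact iptables rule are simplified stand-ins, not the real fuel-library OCF code:

RMQ_PORT=5673   # RabbitMQ port used throughout this report

block_rmq_clients() {
    # Idempotent: add the REJECT rule only if it is not already present.
    iptables -C INPUT -p tcp --dport "$RMQ_PORT" -j REJECT 2>/dev/null && return 0
    iptables -I INPUT -p tcp --dport "$RMQ_PORT" -j REJECT
}

unblock_rmq_clients() {
    # Safe: remove every copy of the blocking rule, however many were left
    # behind by start attempts that failed on timeout.
    while iptables -C INPUT -p tcp --dport "$RMQ_PORT" -j REJECT 2>/dev/null; do
        iptables -D INPUT -p tcp --dport "$RMQ_PORT" -j REJECT
    done
}

start_rmq_server_app() {
    unblock_rmq_clients   # clear any rule orphaned by a previously failed start
    block_rmq_clients     # keep clients out while the app is (re)starting
    # ... start the RabbitMQ application here ...
    unblock_rmq_clients   # and always unblock again on the way out
}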

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

Well, I decided to combine both solutions as well :) See an updated patch https://review.openstack.org/#/c/124121/

Revision history for this message
Tyler Wilson (loth) wrote :

@Bogdan, after applying your patch it looks like there are still issues after a controller failover:

2014-10-07T19:50:52.664160+01:00 warning: 2014-10-07 18:50:52.664 1940 AUDIT ceilometer.publisher.rpc [-] Publishing 16 samples on metering
2014-10-07T19:50:52.669765+01:00 warning: 2014-10-07 18:50:52.669 1940 AUDIT ceilometer.pipeline [-] Pipeline meter_sink: Published samples
2014-10-07T19:50:52.670846+01:00 info: 2014-10-07 18:50:52.670 1940 INFO ceilometer.agent [-] Polling pollster switch.port.receive.packets
2014-10-07T19:50:52.671906+01:00 info: 2014-10-07 18:50:52.670 1940 INFO ceilometer.agent [-] Polling pollster switch.flow.packets
2014-10-07T19:50:52.671994+01:00 info: 2014-10-07 18:50:52.670 1940 INFO ceilometer.agent [-] Polling pollster switch.port.transmit.bytes
2014-10-07T19:50:52.671994+01:00 info: 2014-10-07 18:50:52.671 1940 INFO ceilometer.agent [-] Polling pollster storage.objects
2014-10-07T19:50:52.675931+01:00 warning: 2014-10-07 18:50:52.675 1940 WARNING ceilometer.agent [-] Continue after error from storage.objects: Account HEAD failed: http://23.109.32.2:8080/v1/AUTH_6c5c801c896d4c41831fa2ec1faa002c 400 Bad Request
2014-10-07 18:50:52.675 1940 TRACE ceilometer.agent Traceback (most recent call last):
2014-10-07 18:50:52.675 1940 TRACE ceilometer.agent File "/usr/lib/python2.7/dist-packages/ceilometer/agent.py", line 90, in poll_and_publish
2014-10-07 18:50:52.675 1940 TRACE ceilometer.agent resources=source_resources or agent_resources,
2014-10-07 18:50:52.675 1940 TRACE ceilometer.agent File "/usr/lib/python2.7/dist-packages/ceilometer/objectstore/swift.py", line 92, in get_samples
2014-10-07 18:50:52.675 1940 TRACE ceilometer.agent for tenant, account in self._iter_accounts(manager.keystone, cache):
2014-10-07 18:50:52.675 1940 TRACE ceilometer.agent File "/usr/lib/python2.7/dist-packages/ceilometer/objectstore/swift.py", line 61, in _iter_accounts
2014-10-07 18:50:52.675 1940 TRACE ceilometer.agent ksclient, cache))
2014-10-07 18:50:52.675 1940 TRACE ceilometer.agent File "/usr/lib/python2.7/dist-packages/ceilometer/objectstore/swift.py", line 77, in _get_account_info
2014-10-07 18:50:52.675 1940 TRACE ceilometer.agent ksclient.auth_token))
2014-10-07 18:50:52.675 1940 TRACE ceilometer.agent File "/usr/lib/python2.7/dist-packages/swiftclient/client.py", line 426, in head_account
2014-10-07 18:50:52.675 1940 TRACE ceilometer.agent http_response_content=body)
2014-10-07 18:50:52.675 1940 TRACE ceilometer.agent ClientException: Account HEAD failed: http://23.109.32.2:8080/v1/AUTH_6c5c801c896d4c41831fa2ec1faa002c 400 Bad Request
2014-10-07 18:50:52.675 1940 TRACE ceilometer.agent
2014-10-07T19:50:52.677008+01:00 info: 2014-10-07 18:50:52.676 1940 INFO ceilometer.agent [-] Polling pollster image.size
2014-10-07T19:50:52.686693+01:00 warning: 2014-10-07 18:50:52.686 1940 AUDIT ceilometer.pipeline [-] Pipeline meter_sink: Publishing samples
2014-10-07T19:50:52.688321+01:00 warning: 2014-10-07 18:50:52.688 1940 AUDIT ceilometer.publisher.rpc [-] Publishing 16 samples on metering
2014-10-07T19:50:52.691010+01:00 warning: 2014-10-07 18:50:52.691 1940 AUDIT ceilometer.pipeline [-] Pi...

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

This issue looks completely different, so I have to find some ceilometer and, perhaps, swift experts...

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

@Tyler, as far as I can see, /usr/lib/ocf/resource.d/mirantis/rabbitmq-server on the controller nodes is not completely patched according to https://review.openstack.org/#/c/124121/ - there are some newer patchsets, and the 5th one is likely to be the last one.

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

I applied the OCF patches on the controllers and restarted the ceilometer-agent-notification agents; now they can connect to AMQP without any issues.

Before restarting the agents, I noticed that lsof showed ~400 sockets in use for every agent which wasn't able to reconnect to AMQP, like:
ceilomete 2296 ceilometer 422u sock 0,7 0t0 815215954 can't identify protocol
That is quite strange behavior; it looks like the ceilo agents have some re-connection handling issues inside their Python code.
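
A quick way to spot this state, using only the lsof output shown above (the PID is the one from that example line):

# Count the unidentifiable sockets held by one agent process.
lsof -n -p 2296 2>/dev/null | grep -c "can't identify protocol"

# Or check every ceilometer-agent-notification process on a controller at once.
for pid in $(pgrep -f ceilometer-agent-notification); do
    printf '%s: ' "$pid"
    lsof -n -p "$pid" 2>/dev/null | grep -c "can't identify protocol"
done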

So I will assign the MOS-ceilometer guys to troubleshoot this one as well.

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

Regarding these tracebacks from ceilometer-agent-central:
TRACE ceilometer.agent ClientException: Account GET failed: http://23.109.32.2:8080/v1/AUTH_6c5c801c896d4c41831fa2ec1faa002c?format=json 400 Bad Request [first 60 chars of response] <?xml version="1.0" encoding="UTF-8"?><Error><Code>InvalidBu

I still have no idea what the root cause for them is. I suggest restarting this one as well.
Note that you should use 'crm resource restart p_ceilometer-agent-central' to accomplish that.

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

I can see that the central agent running on node-2 has some issues connecting to the nova endpoint:
tcp CLOSE-WAIT 1 0 23.109.32.4:56639 23.109.32.2:8774 users:(("ceilometer-agen",32510,7))

this one hangs in CLOSE-WAIT. I believe a restart should resolve the issue, but it would be nice to figure out the root cause before doing so...
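
One way to spot such sockets (the filter matches the nova-api port 8774 from the output above; adjust the port as needed):

# List TCP sockets stuck in CLOSE-WAIT towards port 8774, with owning processes.
ss -tnp state close-wait '( dport = :8774 )'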

Revision history for this message
Tyler Wilson (loth) wrote :

@Bogdan I am still getting an error loading IP/horizon/admin/metering/. Is this because of the hang in CLOSE-WAIT?

==> /var/log/docker-logs/remote/node-1.lax3.ubiquity.io/ceilometer-alarm-evaluator.log <==
2014-10-08T18:20:34.262950+01:00 warning: 2014-10-08 17:20:34.254 5260 WARNING ceilometerclient.common.http [-] Request returned failure status.
2014-10-08T18:20:34.264554+01:00 err: 2014-10-08 17:20:34.255 5260 ERROR ceilometer.alarm.service [-] alarm evaluation cycle failed
2014-10-08 17:20:34.255 5260 TRACE ceilometer.alarm.service Traceback (most recent call last):
2014-10-08 17:20:34.255 5260 TRACE ceilometer.alarm.service File "/usr/lib/python2.7/dist-packages/ceilometer/alarm/service.py", line 91, in _evaluate_assigned_alarms
2014-10-08 17:20:34.255 5260 TRACE ceilometer.alarm.service alarms = self._assigned_alarms()
2014-10-08 17:20:34.255 5260 TRACE ceilometer.alarm.service File "/usr/lib/python2.7/dist-packages/ceilometer/alarm/service.py", line 134, in _assigned_alarms
2014-10-08 17:20:34.255 5260 TRACE ceilometer.alarm.service 'value': True}])
2014-10-08 17:20:34.255 5260 TRACE ceilometer.alarm.service File "/usr/lib/python2.7/dist-packages/ceilometerclient/v2/alarms.py", line 71, in list
2014-10-08 17:20:34.255 5260 TRACE ceilometer.alarm.service return self._list(options.build_url(self._path(), q))
2014-10-08 17:20:34.255 5260 TRACE ceilometer.alarm.service File "/usr/lib/python2.7/dist-packages/ceilometerclient/common/base.py", line 58, in _list
2014-10-08 17:20:34.255 5260 TRACE ceilometer.alarm.service resp, body = self.api.json_request('GET', url)
2014-10-08 17:20:34.255 5260 TRACE ceilometer.alarm.service File "/usr/lib/python2.7/dist-packages/ceilometerclient/common/http.py", line 191, in json_request
2014-10-08 17:20:34.255 5260 TRACE ceilometer.alarm.service resp, body_iter = self._http_request(url, method, **kwargs)
2014-10-08 17:20:34.255 5260 TRACE ceilometer.alarm.service File "/usr/lib/python2.7/dist-packages/ceilometerclient/common/http.py", line 174, in _http_request
2014-10-08 17:20:34.255 5260 TRACE ceilometer.alarm.service raise exc.from_response(resp, ''.join(body_iter))
2014-10-08 17:20:34.255 5260 TRACE ceilometer.alarm.service HTTPBadGateway: HTTPBadGateway (HTTP 502)
2014-10-08 17:20:34.255 5260 TRACE ceilometer.alarm.service

==> /var/log/docker-logs/remote/node-1.lax3.ubiquity.io/ceilometer-agent-central.log <==
2014-10-08T18:21:05.748833+01:00 warning: 2014-10-08 17:21:05.739 1940 AUDIT ceilometer.pipeline [-] Pipeline meter_sink: Publishing samples
2014-10-08T18:21:05.751635+01:00 warning: 2014-10-08 17:21:05.743 1940 AUDIT ceilometer.publisher.rpc [-] Publishing 16 samples on metering
2014-10-08T18:21:05.754505+01:00 warning: 2014-10-08 17:21:05.746 1940 AUDIT ceilometer.pipeline [-] Pipeline meter_sink: Published samples
2014-10-08T18:21:05.755592+01:00 info: 2014-10-08 17:21:05.746 1940 INFO ceilometer.agent [-] Polling pollster switch.port.receive.packets
2014-10-08T18:21:05.756700+01:00 info: 2014-10-08 17:21:05.746 1940 INFO ceilometer.agent [-] Polling pollster switch.flow.packets
2014-10-08T18:21:05.756821+01:00 info: 2014-10-08 17:21:05.747 1940 I...

Changed in fuel:
assignee: MOS Ceilometer (mos-ceilometer) → Bogdan Dobrelya (bogdando)
Revision history for this message
Tyler Wilson (loth) wrote :

@Bogdan, Applied your latest patch on build

 {"build_id": "2014-10-10_05-20-01", "ostf_sha": "64cb59c681658a7a55cc2c09d079072a41beb346", "build_number": "60", "auth_required": true, "api": "1.0", "nailgun_sha": "eb8f2b358ea4bb7eb0b2a0075e7ad3d3a905db0d", "production": "docker", "fuelmain_sha": "7308d5d780b8424f6da1a08873db6b9c9f9bb07b", "astute_sha": "f5fbd89d1e0e1f22ef9ab2af26da5ffbfbf24b13", "feature_groups": ["experimental"], "release": "5.1.1", "release_versions": {"2014.1.1-5.1": {"VERSION": {"build_id": "2014-10-10_05-20-01", "ostf_sha": "64cb59c681658a7a55cc2c09d079072a41beb346", "build_number": "60", "api": "1.0", "nailgun_sha": "eb8f2b358ea4bb7eb0b2a0075e7ad3d3a905db0d", "production": "docker", "fuelmain_sha": "7308d5d780b8424f6da1a08873db6b9c9f9bb07b", "astute_sha": "f5fbd89d1e0e1f22ef9ab2af26da5ffbfbf24b13", "feature_groups": ["experimental"], "release": "5.1.1", "fuellib_sha": "46ad455514614ec2600314ac80191e0539ddfc04"}}}, "fuellib_sha": "46ad455514614ec2600314ac80191e0539ddfc04"}

Shut down the primary controller, waited 5 minutes, and was able to load the Ceilometer resource report in Horizon. Started the primary controller back up, waited 5 minutes, and was still able to pull the report. Looks like this fixed it.

Revision history for this message
Tyler Wilson (loth) wrote :

However, the Swift errors remain:

2014-10-13T19:48:28.279010+01:00 warning: 2014-10-13 18:48:28.280 1954 WARNING ceilometer.agent [-] Continue after error from storage.objects: Account HEAD failed: http://23.109.32.2:8080/v1/AUTH_767a480426d541c0b873d6a7a04eb561 400 Bad Request
2014-10-13 18:48:28.280 1954 TRACE ceilometer.agent Traceback (most recent call last):
2014-10-13 18:48:28.280 1954 TRACE ceilometer.agent File "/usr/lib/python2.7/dist-packages/ceilometer/agent.py", line 90, in poll_and_publish
2014-10-13 18:48:28.280 1954 TRACE ceilometer.agent resources=source_resources or agent_resources,
2014-10-13 18:48:28.280 1954 TRACE ceilometer.agent File "/usr/lib/python2.7/dist-packages/ceilometer/objectstore/swift.py", line 92, in get_samples
2014-10-13 18:48:28.280 1954 TRACE ceilometer.agent for tenant, account in self._iter_accounts(manager.keystone, cache):
2014-10-13 18:48:28.280 1954 TRACE ceilometer.agent File "/usr/lib/python2.7/dist-packages/ceilometer/objectstore/swift.py", line 61, in _iter_accounts
2014-10-13 18:48:28.280 1954 TRACE ceilometer.agent ksclient, cache))
2014-10-13 18:48:28.280 1954 TRACE ceilometer.agent File "/usr/lib/python2.7/dist-packages/ceilometer/objectstore/swift.py", line 77, in _get_account_info
2014-10-13 18:48:28.280 1954 TRACE ceilometer.agent ksclient.auth_token))
2014-10-13 18:48:28.280 1954 TRACE ceilometer.agent File "/usr/lib/python2.7/dist-packages/swiftclient/client.py", line 426, in head_account
2014-10-13 18:48:28.280 1954 TRACE ceilometer.agent http_response_content=body)
2014-10-13 18:48:28.280 1954 TRACE ceilometer.agent ClientException: Account HEAD failed: http://23.109.32.2:8080/v1/AUTH_767a480426d541c0b873d6a7a04eb561 400 Bad Request
2014-10-13 18:48:28.280 1954 TRACE ceilometer.agent
2014-10-13T19:48:28.280104+01:00 info: 2014-10-13 18:48:28.281 1954 INFO ceilometer.agent [-] Polling pollster switch
2014-10-13T19:48:28.281171+01:00 info: 2014-10-13 18:48:28.282 1954 INFO ceilometer.agent [-] Polling pollster switch.flow.duration.nanoseconds
2014-10-13T19:48:28.281238+01:00 info: 2014-10-13 18:48:28.282 1954 INFO ceilometer.agent [-] Polling pollster hardware.cpu.load.5min
2014-10-13T19:48:28.281238+01:00 info: 2014-10-13 18:48:28.282 1954 INFO ceilometer.agent [-] Polling pollster switch.port.receive.frame_error
2014-10-13T19:48:28.282264+01:00 info: 2014-10-13 18:48:28.282 1954 INFO ceilometer.agent [-] Polling pollster hardware.cpu.load.1min
2014-10-13T19:48:28.282326+01:00 info: 2014-10-13 18:48:28.283 1954 INFO ceilometer.agent [-] Polling pollster switch.table.active.entries
2014-10-13T19:48:28.282326+01:00 info: 2014-10-13 18:48:28.283 1954 INFO ceilometer.agent [-] Polling pollster hardware.cpu.load.15min
2014-10-13T19:48:28.282326+01:00 info: 2014-10-13 18:48:28.283 1954 INFO ceilometer.agent [-] Polling pollster switch.table
2014-10-13T19:48:28.283374+01:00 info: 2014-10-13 18:48:28.283 1954 INFO ceilometer.agent [-] Polling pollster switch.port.receive.bytes
2014-10-13T19:48:28.283429+01:00 info: 2014-10-13 18:48:28.284 1954 INFO ceilometer.agent [-] Polling pollster switch.port.transmit.drops
2014-10-13T19:48:28.283429+01:00 info: ...


Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/124121
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=f390f336683250ebd2cf41110edab33f4ed5ef80
Submitter: Jenkins
Branch: master

commit f390f336683250ebd2cf41110edab33f4ed5ef80
Author: Bogdan Dobrelya <email address hidden>
Date: Thu Sep 25 20:55:43 2014 +0300

    Fix blocking reject rule for rabbit ocf

    * Make RMQ unblock call safe (remove all discovered RMQ
      blocking rules, if there are many of them).
    * Use unblock safe call prior to the blocking one.
    * Make block call idempotent and add 5 retries for iptables.
    * Add info log messages about block/unblock actions. Notify
      if RMQ blocking rule cannot be added for some strange reason.

    Partial-bug: #1373569
    Closes-bug: #1375824

    Change-Id: I46c6bf3c83ada4273eaa05530e80886ebac7e75f
    Signed-off-by: Bogdan Dobrelya <email address hidden>

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/5.1)

Fix proposed to branch: stable/5.1
Review: https://review.openstack.org/128305

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

MOS-team, please address reconnect issues for ceilometer agents https://bugs.launchpad.net/fuel/+bug/1373569/comments/25

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/5.1)

Reviewed: https://review.openstack.org/128305
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=9be38a613486eb115050d8d165ce212d4a185f35
Submitter: Jenkins
Branch: stable/5.1

commit 9be38a613486eb115050d8d165ce212d4a185f35
Author: Bogdan Dobrelya <email address hidden>
Date: Thu Sep 25 20:55:43 2014 +0300

    Fix blocking reject rule for rabbit ocf

    * Make RMQ unblock call safe (remove all discovered RMQ
      blocking rules, if there are many of them).
    * Use unblock safe call prior to the blocking one.
    * Make block call idempotent and add 5 retries for iptables.
    * Add info log messages about block/unblock actions. Notify
      if RMQ blocking rule cannot be added for some strange reason.

    Partial-bug: #1373569
    Closes-bug: #1375824

    Change-Id: I46c6bf3c83ada4273eaa05530e80886ebac7e75f
    Signed-off-by: Bogdan Dobrelya <email address hidden>

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

Closed due to separate bug created to track ceilometer reconnect issues https://bugs.launchpad.net/fuel/+bug/1380800

Revision history for this message
Vadim Rovachev (vrovachev) wrote :

{"build_id": "2014-11-20_21-01-00", "ostf_sha": "64cb59c681658a7a55cc2c09d079072a41beb346", "build_number": "28", "auth_required": true, "api": "1.0", "nailgun_sha": "7580f6341a726c2019f880ae23ff3f1c581fd850", "production": "docker", "fuelmain_sha": "eac9e2704424d1cb3f183c9f74567fd42a1fa6f3", "astute_sha": "51087c92a50be982071a074ff2bea01f1a5ddb76", "feature_groups": ["mirantis"], "release": "5.1.1", "release_versions": {"2014.1.3-5.1.1": {"VERSION": {"build_id": "2014-11-20_21-01-00", "ostf_sha": "64cb59c681658a7a55cc2c09d079072a41beb346", "build_number": "28", "api": "1.0", "nailgun_sha": "7580f6341a726c2019f880ae23ff3f1c581fd850", "production": "docker", "fuelmain_sha": "eac9e2704424d1cb3f183c9f74567fd42a1fa6f3", "astute_sha": "51087c92a50be982071a074ff2bea01f1a5ddb76", "feature_groups": ["mirantis"], "release": "5.1.1", "fuellib_sha": "b3d9f0e203f2f0faf3763e871a8dc31570777fed"}}}, "fuellib_sha": "b3d9f0e203f2f0faf3763e871a8dc31570777fed"}

Can't verify, because I found a bug: https://bugs.launchpad.net/fuel/+bug/1395078

Revision history for this message
Vadim Rovachev (vrovachev) wrote :

verified on:
{"build_id": "2014-11-30_22-41-00", "ostf_sha": "dc66fd39d4d035bb972e4c0225591290593c459d", "build_number": "25", "auth_required": true, "api": "1.0", "nailgun_sha": "58e5f47457a0e832c005ce350e01b75a0c01b90a", "production": "docker", "fuelmain_sha": "f324b592399c544eace2f64cb499564da01ab38c", "astute_sha": "1da516b88d1a8d0014d78ab0d796e5b08379a59b", "feature_groups": ["mirantis"], "release": "6.0", "release_versions": {"2014.2-6.0": {"VERSION": {"build_id": "2014-11-30_22-41-00", "ostf_sha": "dc66fd39d4d035bb972e4c0225591290593c459d", "build_number": "25", "api": "1.0", "nailgun_sha": "58e5f47457a0e832c005ce350e01b75a0c01b90a", "production": "docker", "fuelmain_sha": "f324b592399c544eace2f64cb499564da01ab38c", "astute_sha": "1da516b88d1a8d0014d78ab0d796e5b08379a59b", "feature_groups": ["mirantis"], "release": "6.0", "fuellib_sha": "bbf26b499bf47ca41302ba6f62c3ebc5a493013d"}}}, "fuellib_sha": "bbf26b499bf47ca41302ba6f62c3ebc5a493013d"}

Revision history for this message
Stanislav Makar (smakar) wrote :

Verifying 5.1.1

Revision history for this message
Stanislav Makar (smakar) wrote :

verified
{"build_id": "2014-12-03_01-07-36", "ostf_sha": "64cb59c681658a7a55cc2c09d079072a41beb346", "build_number": "48", "auth_required": true, "api": "1.0", "nailgun_sha": "500e36d08a45dbb389bf2bd97673d9bff48ee84d", "production": "docker", "fuelmain_sha": "7626c5aeedcde77ad22fc081c25768944697d404", "astute_sha": "ef8aa0fd0e3ce20709612906f1f0551b5682a6ce", "feature_groups": ["mirantis"], "release": "5.1.1", "release_versions": {"2014.1.3-5.1.1": {"VERSION": {"build_id": "2014-12-03_01-07-36", "ostf_sha": "64cb59c681658a7a55cc2c09d079072a41beb346", "build_number": "48", "api": "1.0", "nailgun_sha": "500e36d08a45dbb389bf2bd97673d9bff48ee84d", "production": "docker", "fuelmain_sha": "7626c5aeedcde77ad22fc081c25768944697d404", "astute_sha": "ef8aa0fd0e3ce20709612906f1f0551b5682a6ce", "feature_groups": ["mirantis"], "release": "5.1.1", "fuellib_sha": "a3043477337b4a0a8fd166dc83d6cd5d504f5da8"}}}, "fuellib_sha": "a3043477337b4a0a8fd166dc83d6cd5d504f5da8"}

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

An update. Ceilometer agent AMQP reconnect issues are addressed by this separate bug https://bugs.launchpad.net/mos/+bug/1393505
