2015-08-17 18:29:30 |
Ryan Beisner |
bug |
|
|
added bug |
2015-08-17 18:30:22 |
Ryan Beisner |
summary |
rmq on > vivid has mnesia |
rmq on >= vivid has mnesia |
|
2015-08-17 18:30:42 |
Ryan Beisner |
summary |
rmq on >= vivid has mnesia |
rmq on >= vivid has mnesia (no data dir) |
|
2015-08-17 19:19:11 |
Ryan Beisner |
description |
For Vivid-Kilo (and presumably later), the /var/lib/rabbitmq/data/ dir does not exist. This definitely impacts nrpe checks, potentially other things.
Test scenario: a basic 3-node rabbitmq-server native cluster, with nrpe as a subordinate to exercise nrpe-external-master functionality, and with cinder to exercise and inspect amqp relation data. After basic rmq cluster, config and relations are validated, amqp messaging and queue replication are functionally tested with and without ssl, and nrpe checks are fired then checked.
# amulet test (this passes Trusty I|J|K, fails Vivid K):
2015-08-17 18:11:56,220 run_cmd_unit DEBUG: rabbitmq-server/2 `bash -c "$(egrep -oh /usr/local.* /etc/nagios/nrpe.d/check_rabbitmq.cfg)"` command returned 0 (OK)
2015-08-17 18:11:56,220 test_414_rmq_nrpe_monitors DEBUG: Output: Ok: sent and received 10 test messages
2015-08-17 18:11:56,221 test_414_rmq_nrpe_monitors DEBUG: Sleeping 70s for 1m cron job to run...
2015-08-17 18:13:06,291 test_414_rmq_nrpe_monitors DEBUG: Checking nrpe monitor check_rabbitmq_queue on rabbitmq-server/0...
rabbitmq-server/0 `bash -c "$(egrep -oh /usr/local.* /etc/nagios/nrpe.d/check_rabbitmq_queue.cfg)"` command returned 1 Traceback (most recent call last):
File "/usr/local/lib/nagios/plugins/check_rabbitmq_queues.py", line 80, in <module>
stats_collated = collate_stats(stats, args.c)
File "/usr/local/lib/nagios/plugins/check_rabbitmq_queues.py", line 36, in collate_stats
for vhost, queue, m_all in stats:
File "/usr/local/lib/nagios/plugins/check_rabbitmq_queues.py", line 21, in gen_stats
for line in data_lines:
File "/usr/local/lib/nagios/plugins/check_rabbitmq_queues.py", line 14, in gen_data_lines
with open(filename, "rb") as fin:
IOError: [Errno 2] No such file or directory: '/var/lib/rabbitmq/data/juju-beis0-machine-2_queue_stats.dat'
ERROR subprocess encountered error code 1
# /var/lib/rabbitmq/data/ dir not present, mnesia/ dir exists instead
root@juju-beis0-machine-2:/var/lib/rabbitmq# ls -alhR
.:
total 16K
drwxrwxr-x 3 rabbitmq rabbitmq 4.0K Aug 17 17:50 .
drwxr-xr-x 50 root root 4.0K Aug 17 17:51 ..
-r-------- 1 rabbitmq rabbitmq 20 Aug 17 00:00 .erlang.cookie
drwxr-xr-x 4 rabbitmq rabbitmq 4.0K Aug 17 18:11 mnesia
./mnesia:
total 20K
drwxr-xr-x 4 rabbitmq rabbitmq 4.0K Aug 17 18:11 .
drwxrwxr-x 3 rabbitmq rabbitmq 4.0K Aug 17 17:50 ..
drwxr-xr-x 4 rabbitmq rabbitmq 4.0K Aug 17 18:14 rabbit@juju-beis0-machine-2
-rw-r--r-- 1 rabbitmq rabbitmq 6 Aug 17 18:11 rabbit@juju-beis0-machine-2.pid
drwxr-xr-x 2 rabbitmq rabbitmq 4.0K Aug 17 18:11 rabbit@juju-beis0-machine-2-plugins-expand
./mnesia/rabbit@juju-beis0-machine-2:
total 112K
drwxr-xr-x 4 rabbitmq rabbitmq 4.0K Aug 17 18:14 .
drwxr-xr-x 4 rabbitmq rabbitmq 4.0K Aug 17 18:11 ..
-rw-r--r-- 1 rabbitmq rabbitmq 187 Aug 17 18:11 cluster_nodes.config
-rw-r--r-- 1 rabbitmq rabbitmq 170 Aug 17 18:14 DECISION_TAB.LOG
-rw-r--r-- 1 rabbitmq rabbitmq 106 Aug 17 18:14 LATEST.LOG
drwxr-xr-x 2 rabbitmq rabbitmq 4.0K Aug 17 18:11 msg_store_persistent
drwxr-xr-x 2 rabbitmq rabbitmq 4.0K Aug 17 18:11 msg_store_transient
-rw-r--r-- 1 rabbitmq rabbitmq 93 Aug 17 18:11 nodes_running_at_shutdown
-rw-r--r-- 1 rabbitmq rabbitmq 6.2K Aug 17 18:11 rabbit_durable_exchange.DCD
-rw-r--r-- 1 rabbitmq rabbitmq 104 Aug 17 18:11 rabbit_durable_queue.DCD
-rw-r--r-- 1 rabbitmq rabbitmq 104 Aug 17 18:11 rabbit_durable_route.DCD
-rw-r--r-- 1 rabbitmq rabbitmq 1.2K Aug 17 18:11 rabbit_runtime_parameters.DCD
-rw-r--r-- 1 rabbitmq rabbitmq 4 Aug 17 18:11 rabbit_serial
-rw-r--r-- 1 rabbitmq rabbitmq 498 Aug 17 18:11 rabbit_user.DCD
-rw-r--r-- 1 rabbitmq rabbitmq 719 Aug 17 18:11 rabbit_user_permission.DCD
-rw-r--r-- 1 rabbitmq rabbitmq 366 Aug 17 18:11 rabbit_vhost.DCD
-rw-r--r-- 1 rabbitmq rabbitmq 5.4K Aug 17 18:11 recovery.dets
-rw-r--r-- 1 rabbitmq rabbitmq 26K Aug 17 17:51 schema.DAT
-rw-r--r-- 1 rabbitmq rabbitmq 263 Aug 17 17:50 schema_version
./mnesia/rabbit@juju-beis0-machine-2/msg_store_persistent:
total 8.0K
drwxr-xr-x 2 rabbitmq rabbitmq 4.0K Aug 17 18:11 .
drwxr-xr-x 4 rabbitmq rabbitmq 4.0K Aug 17 18:14 ..
-rw-r--r-- 1 rabbitmq rabbitmq 0 Aug 17 18:11 0.rdq
./mnesia/rabbit@juju-beis0-machine-2/msg_store_transient:
total 8.0K
drwxr-xr-x 2 rabbitmq rabbitmq 4.0K Aug 17 18:11 .
drwxr-xr-x 4 rabbitmq rabbitmq 4.0K Aug 17 18:14 ..
-rw-r--r-- 1 rabbitmq rabbitmq 0 Aug 17 18:11 0.rdq
./mnesia/rabbit@juju-beis0-machine-2-plugins-expand:
total 8.0K
drwxr-xr-x 2 rabbitmq rabbitmq 4.0K Aug 17 18:11 .
drwxr-xr-x 4 rabbitmq rabbitmq 4.0K Aug 17 18:11 ..
# juju stat
[Services]
NAME STATUS EXPOSED CHARM
cinder unknown false local:vivid/cinder-136
nrpe false local:vivid/nrpe-0
rabbitmq-server unknown false local:vivid/rabbitmq-server-150
[Units]
ID WORKLOAD-STATE AGENT-STATE VERSION MACHINE PORTS PUBLIC-ADDRESS MESSAGE
cinder/0 unknown idle 1.24.5 1 172.18.99.77
rabbitmq-server/0 unknown idle 1.24.5 2 5671/tcp,5672/tcp,5999/tcp 172.18.99.78
nrpe/1 unknown idle 1.24.5 172.18.99.78
rabbitmq-server/1 unknown idle 1.24.5 3 5671/tcp,5672/tcp,5999/tcp 172.18.99.79
nrpe/0 unknown idle 1.24.5 172.18.99.79
rabbitmq-server/2 unknown idle 1.24.5 4 5671/tcp,5672/tcp,5999/tcp 172.18.99.80
nrpe/2 unknown idle 1.24.5 172.18.99.80
[Machines]
ID STATE VERSION DNS INS-ID SERIES HARDWARE
0 started 1.24.5 172.18.99.76 9ee95b18-293e-4ecf-be17-ee2d3db9e536 trusty arch=amd64 cpu-cores=1 mem=1536M root-disk=10240M availability-zone=nova
1 started 1.24.5 172.18.99.77 6aa5dbd6-6446-4354-afca-0e45b69068e1 vivid arch=amd64 cpu-cores=1 mem=1536M root-disk=10240M availability-zone=nova
2 started 1.24.5 172.18.99.78 2e0277c6-ad6d-487b-80f9-40d17c71f332 vivid arch=amd64 cpu-cores=1 mem=1536M root-disk=10240M availability-zone=nova
3 started 1.24.5 172.18.99.79 4265f59e-b003-4149-b8ca-4071a9af1f02 vivid arch=amd64 cpu-cores=1 mem=1536M root-disk=10240M availability-zone=nova
4 started 1.24.5 172.18.99.80 77d5e739-aef9-4ce1-a575-b51e0e220611 vivid arch=amd64 cpu-cores=1 mem=1536M root-disk=10240M availability-zone=nova
# rabbitmq-server version on units
ubuntu@beisner-bastion:~/bzr/next/rabbitmq-server$ juju run --service rabbitmq-server "apt-cache policy rabbitmq-server"
- MachineId: "2"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/0
- MachineId: "3"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/1
- MachineId: "4"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/2 |
For Vivid-Kilo (and presumably later), the /var/lib/rabbitmq/data/ dir does not exist. This definitely impacts nrpe checks, potentially other things.
Test scenario: a basic 3-node rabbitmq-server native cluster, with nrpe as a subordinate to exercise nrpe-external-master functionality, and with cinder to exercise and inspect amqp relation data. After basic rmq cluster, config and relations are validated, amqp messaging and queue replication are functionally tested with and without ssl, and nrpe checks are fired then checked.
IOError: [Errno 2] No such file or directory: '/var/lib/rabbitmq/data/juju-beis0-machine-2_queue_stats.dat'
ERROR subprocess encountered error code 1
Details @ http://paste.ubuntu.com/12110980/ |
|
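The traceback in the description above shows the queue check dying with an unhandled IOError when the stats file is absent. The following is a minimal sketch, not the charm's actual fix, of how a check could report that condition as a Nagios UNKNOWN result instead. The function name check_stats_file and the message wording are hypothetical; the exit codes follow the usual Nagios plugin convention.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_stats_file(filename):
    """Return (exit_code, message) for a queue stats file.

    Hypothetical helper: a missing or unreadable file yields UNKNOWN
    rather than an uncaught exception and traceback.
    """
    try:
        with open(filename, "r") as fin:
            # Count non-blank lines; a real check would parse them.
            lines = [line for line in fin if line.strip()]
    except IOError as exc:  # IOError is an alias of OSError on Python 3
        return UNKNOWN, "UNKNOWN: cannot read %s (%s)" % (
            filename, exc.strerror)
    return OK, "OK: read %d stats lines" % len(lines)
```

A caller would print the message and exit with the returned code, so a missing /var/lib/rabbitmq/data/ directory shows up in monitoring as a readable status line.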
2015-08-18 17:39:42 |
Ryan Beisner |
bug task added |
|
nrpe (Juju Charms Collection) |
|
2015-08-25 15:36:01 |
David Ames |
nrpe (Juju Charms Collection): status |
New |
Invalid |
|
2015-08-25 15:36:05 |
David Ames |
rabbitmq-server (Juju Charms Collection): status |
New |
Invalid |
|
2015-08-26 02:17:26 |
Ryan Beisner |
rabbitmq-server (Juju Charms Collection): status |
Invalid |
New |
|
2015-08-27 00:35:34 |
Ryan Beisner |
branch linked |
|
lp:~1chb1n/charms/trusty/rabbitmq-server/lp1485722-pidfile |
|
2015-08-27 01:20:22 |
Ryan Beisner |
summary |
rmq on >= vivid has mnesia (no data dir) |
rmq + nrpe on >= Vivid pid file location changed |
|
2015-08-27 01:34:08 |
Ryan Beisner |
summary |
rmq + nrpe on >= Vivid pid file location changed |
rmq + nrpe on >= Vivid No PID file found |
|
2015-08-27 01:37:04 |
Ryan Beisner |
description |
For Vivid-Kilo (and presumably later), the /var/lib/rabbitmq/data/ dir does not exist. This definitely impacts nrpe checks, potentially other things.
Test scenario: a basic 3-node rabbitmq-server native cluster, with nrpe as a subordinate to exercise nrpe-external-master functionality, and with cinder to exercise and inspect amqp relation data. After basic rmq cluster, config and relations are validated, amqp messaging and queue replication are functionally tested with and without ssl, and nrpe checks are fired then checked.
IOError: [Errno 2] No such file or directory: '/var/lib/rabbitmq/data/juju-beis0-machine-2_queue_stats.dat'
ERROR subprocess encountered error code 1
Details @ http://paste.ubuntu.com/12110980/ |
For Vivid-Kilo (and presumably later), the rabbitmq pid file is in a different location than earlier versions. The script in the cron job errors out, but that is not evident unless the cron fail mail is inspected:
Return-Path: <root@juju-beis0-machine-2.openstacklocal>
X-Original-To: root
Delivered-To: root@juju-beis0-machine-2.openstacklocal
Received: by juju-beis0-machine-2.openstacklocal (Postfix, from userid 0)
id 41CF73E528; Wed, 26 Aug 2015 01:38:01 +0000 (UTC)
From: root@juju-beis0-machine-2.openstacklocal (Cron Daemon)
To: root@juju-beis0-machine-2.openstacklocal
Subject: Cron <root@juju-beis0-machine-2> /usr/local/bin/collect_rabbitmq_stats.sh
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 8bit
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <HOME=/root>
X-Cron-Env: <PATH=/usr/bin:/bin>
X-Cron-Env: <LOGNAME=root>
Message-Id: <20150826013801.41CF73E528@juju-beis0-machine-2.openstacklocal>
Date: Wed, 26 Aug 2015 01:38:01 +0000 (UTC)
No PID file found
This bubbles up to the user as an error that the /var/lib/rabbitmq/data/ dir does not exist. This definitely impacts nrpe checks, potentially other things.
It affects next and stable and can be considered a high-priority deployment blocker for Vivid (and possibly Wily).
Test scenario: a basic 3-node rabbitmq-server native cluster, with nrpe as a subordinate to exercise nrpe-external-master functionality, and with cinder to exercise and inspect amqp relation data. After basic rmq cluster, config and relations are validated, amqp messaging and queue replication are functionally tested with and without ssl, and nrpe checks are fired then checked.
IOError: [Errno 2] No such file or directory: '/var/lib/rabbitmq/data/juju-beis0-machine-2_queue_stats.dat'
ERROR subprocess encountered error code 1
Details @ http://paste.ubuntu.com/12110980/
And @ http://paste.ubuntu.com/12196571/ |
|
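The updated description attributes the failure to the pid file having moved. A rough sketch of tolerating more than one location follows; it is an illustration only, not the change made in the linked branch. The helper names are hypothetical. The first candidate path mirrors the mnesia/rabbit@<hostname>.pid entry visible in the Vivid directory listing in this report, and the second path is an assumed legacy location.

```python
import os


def candidate_pid_files(hostname, mnesia_dir="/var/lib/rabbitmq/mnesia"):
    """List places to look for the rabbitmq pid file (hypothetical helper)."""
    return [
        # Layout seen in the Vivid listing: mnesia/rabbit@<hostname>.pid
        os.path.join(mnesia_dir, "rabbit@%s.pid" % hostname),
        # ASSUMPTION: legacy location, included for illustration only.
        "/var/run/rabbitmq/pid",
    ]


def find_pid_file(candidates):
    """Return the first candidate that exists as a regular file, else None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

Returning None rather than raising lets the caller decide how loudly to fail, which matters here because the original "No PID file found" message was only visible in cron mail.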
2015-08-27 01:46:56 |
Ryan Beisner |
rabbitmq-server (Juju Charms Collection): assignee |
|
Ryan Beisner (1chb1n) |
|
2015-09-02 03:11:50 |
Ryan Beisner |
rabbitmq-server (Juju Charms Collection): status |
New |
Fix Committed |
|
2015-11-04 10:45:14 |
Edward Hope-Morley |
rabbitmq-server (Juju Charms Collection): status |
Fix Committed |
Fix Released |
|
2015-11-04 10:45:37 |
Edward Hope-Morley |
rabbitmq-server (Juju Charms Collection): milestone |
|
15.10 |
|