Gnocchi receives no metrics after the gnocchi:// publisher fails to load

Bug #2033036 reported by Yadnesh Kulkarni
Affects: Ceilometer
Status: New
Importance: Undecided
Assigned to: Yadnesh Kulkarni

Bug Description

The archive policy used in the configuration is `ceilometer-high`:
~~~
# cat pipeline.yaml
---
sources:
    - name: meter_source
      meters:
          - "*"
      sinks:
          - meter_sink
sinks:
    - name: meter_sink
      publishers:
          - gnocchi://?filter_project=service&archive_policy=ceilometer-high
          - notifier://172.17.1.73:5666/?driver=amqp&topic=osp17-metering

# cat event_pipeline.yaml
---
sources:
    - name: event_source
      events:
          - "*"
      sinks:
          - event_sink
sinks:
    - name: event_sink
      transformers:
      triggers:
      publishers:
          - gnocchi://?filter_project=service&archive_policy=ceilometer-high
          - notifier://172.17.1.73:5666/?driver=amqp&topic=osp17-event
~~~

No such archive policy exists in Gnocchi, although it should have been created during "ceilometer-upgrade".
However, nothing is logged about the incoming metrics referencing an undefined archive policy.
~~~
$ openstack metric archive-policy list
+--------+-------------+-----------------------------------------------------------------------+---------------------------------+
| name   | back_window | definition                                                            | aggregation_methods             |
+--------+-------------+-----------------------------------------------------------------------+---------------------------------+
| bool   | 3600        | - timespan: 365 days, 0:00:00, granularity: 0:00:01, points: 31536000 | last                            |
| high   | 0           | - timespan: 1:00:00, granularity: 0:00:01, points: 3600               | min, mean, count, max, sum, std |
|        |             | - timespan: 7 days, 0:00:00, granularity: 0:01:00, points: 10080      |                                 |
|        |             | - timespan: 365 days, 0:00:00, granularity: 1:00:00, points: 8760     |                                 |
| low    | 0           | - timespan: 30 days, 0:00:00, granularity: 0:05:00, points: 8640      | min, mean, count, max, sum, std |
| medium | 0           | - timespan: 7 days, 0:00:00, granularity: 0:01:00, points: 10080      | min, mean, count, max, sum, std |
|        |             | - timespan: 365 days, 0:00:00, granularity: 1:00:00, points: 8760     |                                 |
+--------+-------------+-----------------------------------------------------------------------+---------------------------------+
~~~
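
The comparison made above (the policy named in the publisher URL against the policies Gnocchi actually has) can be scripted. The following is a minimal sketch, not ceilometer code: the helper names `archive_policy_from_publisher` and `missing_policies` are hypothetical, and the list of existing policies is typed in by hand from the listing above rather than fetched from Gnocchi.

```python
from urllib.parse import parse_qs, urlsplit


def archive_policy_from_publisher(url):
    """Return the archive_policy named in a gnocchi:// publisher URL, or None."""
    parts = urlsplit(url)
    if parts.scheme != "gnocchi":
        return None  # other publishers (e.g. notifier://) carry no archive policy
    values = parse_qs(parts.query).get("archive_policy")
    return values[0] if values else None


def missing_policies(publisher_urls, existing_policy_names):
    """List policies that publishers reference but that do not exist yet."""
    wanted = {archive_policy_from_publisher(u) for u in publisher_urls}
    wanted.discard(None)
    return sorted(wanted - set(existing_policy_names))


# Publishers copied from pipeline.yaml in this report.
publishers = [
    "gnocchi://?filter_project=service&archive_policy=ceilometer-high",
    "notifier://172.17.1.73:5666/?driver=amqp&topic=osp17-metering",
]
# Policy names copied from the archive-policy listing above.
existing = ["bool", "high", "low", "medium"]

print(missing_policies(publishers, existing))
```

With the values from this report, the only policy reported as missing is `ceilometer-high`.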

After the notification agent was restarted on one of the controller nodes, the missing policies were created and Gnocchi started processing metrics:
~~~
$ openstack metric archive-policy list
+----------------------+-------------+-----------------------------------------------------------------------+---------------------------------+
| name                 | back_window | definition                                                            | aggregation_methods             |
+----------------------+-------------+-----------------------------------------------------------------------+---------------------------------+
| bool                 | 3600        | - timespan: 365 days, 0:00:00, granularity: 0:00:01, points: 31536000 | last                            |
| ceilometer-high      | 0           | - timespan: 1:00:00, granularity: 0:00:01, points: 3600               | mean                            |
|                      |             | - timespan: 1 day, 0:00:00, granularity: 0:01:00, points: 1440        |                                 |
|                      |             | - timespan: 365 days, 0:00:00, granularity: 1:00:00, points: 8760     |                                 |
| ceilometer-high-rate | 0           | - timespan: 1:00:00, granularity: 0:00:01, points: 3600               | mean, rate:mean                 |
|                      |             | - timespan: 1 day, 0:00:00, granularity: 0:01:00, points: 1440        |                                 |
|                      |             | - timespan: 365 days, 0:00:00, granularity: 1:00:00, points: 8760     |                                 |
| ceilometer-low       | 0           | - timespan: 30 days, 0:00:00, granularity: 0:05:00, points: 8640      | mean                            |
| ceilometer-low-rate  | 0           | - timespan: 30 days, 0:00:00, granularity: 0:05:00, points: 8640      | mean, rate:mean                 |
| high                 | 0           | - timespan: 1:00:00, granularity: 0:00:01, points: 3600               | mean, count, max, min, sum, std |
|                      |             | - timespan: 7 days, 0:00:00, granularity: 0:01:00, points: 10080      |                                 |
|                      |             | - timespan: 365 days, 0:00:00, granularity: 1:00:00, points: 8760     |                                 |
| low                  | 0           | - timespan: 30 days, 0:00:00, granularity: 0:05:00, points: 8640      | mean, count, max, min, sum, std |
| medium               | 0           | - timespan: 7 days, 0:00:00, granularity: 0:01:00, points: 10080      | mean, count, max, min, sum, std |
|                      |             | - timespan: 365 days, 0:00:00, granularity: 1:00:00, points: 8760     |                                 |
+----------------------+-------------+-----------------------------------------------------------------------+---------------------------------+
~~~

Yadnesh Kulkarni (ykulkarn) wrote:

It seems that during deployment, keystone did not respond to ceilometer's request to obtain the gnocchi endpoint using gnocchiclient [1]:
~~~
2023-05-08 18:33:49.147 14 WARNING keystoneauth.identity.generic.base [-] Failed to discover available identity versions when contacting http://172.17.1.82:5000. Attempting to parse version from URL.: keystoneauth1.exceptions.connection.ConnectTimeout: Request to http://172.17.1.82:5000 timed out
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base [-] Unable to load publisher gnocchi://?filter_project=service&archive_policy=ceilometer-high: keystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find versioned identity endpoints when attempting to authenticate. Please check that your auth_url is correct. Request to http://172.17.1.82:5000 timed out
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base Traceback (most recent call last):
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 445, in _make_request
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base six.raise_from(e, None)
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base File "<string>", line 3, in raise_from
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 440, in _make_request
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base httplib_response = conn.getresponse()
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base File "/usr/lib64/python3.9/http/client.py", line 1377, in getresponse
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base response.begin()
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base File "/usr/lib64/python3.9/http/client.py", line 320, in begin
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base version, status, reason = self._read_status()
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base File "/usr/lib64/python3.9/http/client.py", line 281, in _read_status
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base File "/usr/lib64/python3.9/socket.py", line 704, in readinto
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base return self._sock.recv_into(b)
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base socket.timeout: timed out
2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base
~~~
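
Because the failure only shows up in the notification agent log, scanning that log for the "Unable to load publisher" message is a quick way to spot affected nodes. The sketch below is an assumption-laden illustration: the regular expression is derived solely from the log line quoted above, and `failed_publishers` is a hypothetical helper, not part of ceilometer.

```python
import re

# Pattern modelled on the ERROR line quoted in this report; other releases
# may word the message differently.
PUBLISHER_FAILURE = re.compile(
    r"ERROR ceilometer\.pipeline\.base \[-\] "
    r"Unable to load publisher (?P<url>\S+): (?P<reason>.*)"
)


def failed_publishers(log_lines):
    """Return (publisher_url, reason) pairs for publishers that failed to load."""
    failures = []
    for line in log_lines:
        match = PUBLISHER_FAILURE.search(line)
        if match:
            failures.append((match.group("url"), match.group("reason")))
    return failures


# The two lines below are copied from the log excerpt above.
sample = [
    "2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base [-] Unable to load publisher gnocchi://?filter_project=service&archive_policy=ceilometer-high: keystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find versioned identity endpoints when attempting to authenticate. Please check that your auth_url is correct. Request to http://172.17.1.82:5000 timed out",
    "2023-05-08 18:33:49.150 14 ERROR ceilometer.pipeline.base Traceback (most recent call last):",
]

for url, reason in failed_publishers(sample):
    print(url)
    print(reason)
```

Traceback continuation lines share the same logger prefix but lack the "Unable to load publisher" text, so they are skipped.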

Since ceilometer couldn't get a gnocchiclient [2] with proper auth values, it couldn't create the necessary archive policies [3].

Restarting the agent_notification service after deployment fixes this, because by that time keystone is healthy and responding. The failure seems intermittent: the ceilometer and gnocchi services are spawned during steps 4 and 5, by which time keystone should be fully operational.
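
One possible mitigation for a keystone that is briefly unresponsive is to retry the client setup with exponential backoff instead of giving up on the first timeout. The sketch below only illustrates that idea and is not the fix adopted in ceilometer: `call_with_retry` and `FlakyFactory` are hypothetical names, and the built-in `TimeoutError` stands in for the keystoneauth exceptions seen in the log (for example `keystoneauth1.exceptions.connection.ConnectTimeout`), which a real version would pass in `retry_on`.

```python
import time


def call_with_retry(func, attempts=5, initial_delay=1.0,
                    retry_on=(OSError,), sleep=time.sleep):
    """Call func(), retrying with exponential backoff on the given errors."""
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= 2  # double the wait before the next attempt


class FlakyFactory:
    """Hypothetical stand-in for a client factory that fails while the
    identity service is still starting, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise TimeoutError("identity endpoint timed out")
        return "client"


waits = []  # record the delays instead of actually sleeping
factory = FlakyFactory(failures=2)
client = call_with_retry(factory, sleep=waits.append)
print(client, factory.calls, waits)
```

Passing `sleep=waits.append` keeps the demonstration instantaneous; with the default `time.sleep` the same call would wait 1 s and then 2 s before the third, successful attempt.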

[1] https://github.com/openstack/ceilometer/blob/stable/wallaby/ceilometer/gnocchi_client.py#L36-L39
[2] https://github.com/openstack/ceilometer/blob/stable/wallaby/ceilometer/publisher/gnocchi.py#L216-L217
[3] https://github.com/openstack/ceilometer/blob/stable...


Changed in ceilometer:
assignee: nobody → Yadnesh Kulkarni (ykulkarn)