ipmi sensor naming in ceilometer is not consumer friendly

Bug #1377157 reported by Jim Mankovich
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Ceilometer
Fix Released
High
Chris Dent
Ironic
Won't Fix
Undecided
Unassigned

Bug Description

ipmi sensor naming in ceilometer is not consumer friendly

I've found the IPMI ceilometer sensor naming structure difficult to deal with from a programmatic perspective if a consumer wants to simply read all the sensors for a given Ironic Node. The IPMI sensor implementation names the ceilometer meters generically as "hardware.ipm.sensortype" and the resource ID in ceilometer is a concatenation of Ironic Node UUID followed by a dash and a ceilometer capable transformation of the ipmitool reported "Sensor ID". With this structure, there is no simple ceilometer API available to return all the sensors within a platform. The only way I could think of to find all the sensors for a given Ironic Node UUID is to query ceilometer for every resource and then hand match the resource ID text string with the Ironic Node UUID. If there are a lot of platforms, the number of Resource IDs will be very large and the returned resource ID list could easily be huge. Another interesting thing to ponder is what happens when you attempt to list all the samples for the meter named say "hardware.ipmi.temperature". Listing all the samples for "hardware.ipmi.temperature" will attempt to return you every sensor sample for every temperature sensor ever captured since the beginning of time.

Here is an example output from ceilometer meter-list with the current IPMI sensor naming scheme.
| Name | Type | Unit | Resource ID
| hardware.ipmi.current | gauge | W | edafe6f4-5996-4df8-bc84-7d92439e15c0-power_meter_(0x16)
| hardware.ipmi.temperature | gauge | C | edafe6f4-5996-4df8-bc84-7d92439e15c0-16-system_board_(0x15)

I believe a better approach would be to name each ceilometer meter uniquely within a platform and assign each meter a Resource ID equal to the Ironic Node UUID that the meter is associated with. This naming would enable the consumer to query ceilometer for a specific Resource ID which matches the Ironic Node UUID. This query would result in ceilometer returning a single resource containing direct links to all the sensors associated with the resource. Here is an example of a different ceilometer naming scheme which enables ceilometer querying for a specific Ironic Nodes platform information.

| Name | Type | Unit | Resource ID
| hardware.ipmi.current: power_meter_(0x16) | gauge | W | edafe6f4-5996-4df8-bc84-7d92439e15c0
| hardware.ipmi.current:16-system_board_(0x15) | gauge | W | edafe6f4-5996-4df8-bc84-7d92439e15c0

Revision history for this message
Chris Dent (cdent) wrote :

There was quite a lot of discussion about this when it was created, so I think if we dredge up the respective reviews we should be able to find the rationale and either all will become clear or we can fix it. I can't look now, but will if no one else has gotten around to it by early next week.

Revision history for this message
Chris Dent (cdent) wrote :

I think the general idea with meters in ceilometer is that the meter name is generic across a type of thing and that things like resource id and metadata are used to narrow the output.

I agree that this can result in some unfriendliness on the consumption side. What you really want to be able to say is "give me some temperature information on the node x" not have to construct some magical query involving an inscrutable query syntax.

As is so often the case with APIs that are supposed to make everything possible, nobody is made happy. Presumably a workaround are intermediary commands which call the API for specific needs.

Looking at the code[1] the ID of the node should show up as a key value pair in the resource_metadata so a query could be constructed as:

     ceilometer sample-list -m hardware.ipmi.fan -q 'metadata.node_uuid=something'

Note: I haven't actually tested this as I don't currently have an active Ironic setup, but that covers the concept. Does it work for you Jim?

[1] https://github.com/openstack/ceilometer/blob/master/ceilometer/ipmi/notifications/ironic.py#L98

Revision history for this message
Jim Mankovich (jman-a) wrote : Re: [Bug 1377157] Re: ipmi sensor naming in ceilometer is not consumer friendly

Chris,

With the the current implementation, the node_uuid is not available the
metadata, it
is only available in the resource ID. If the node_uuid were replicated
in the metadata
for every sensor, then the query you mentioned would work. I would
still rather see
the resource ID as just the node_uuid so you could find a platform
directly using resource
list and get links to all the sensors in the platform. With the
current naming of resource
ID, each sensor is its own resource so to find all the sensors for a
platform will require
some kind of searching through all resource IDs to find
metadata.node_uuid matches.

On 10/6/2014 5:42 AM, Chris Dent wrote:
> I think the general idea with meters in ceilometer is that the meter
> name is generic across a type of thing and that things like resource id
> and metadata are used to narrow the output.
>
> I agree that this can result in some unfriendliness on the consumption
> side. What you really want to be able to say is "give me some
> temperature information on the node x" not have to construct some
> magical query involving an inscrutable query syntax.
>
> As is so often the case with APIs that are supposed to make everything
> possible, nobody is made happy. Presumably a workaround are intermediary
> commands which call the API for specific needs.
>
> Looking at the code[1] the ID of the node should show up as a key value
> pair in the resource_metadata so a query could be constructed as:
>
> ceilometer sample-list -m hardware.ipmi.fan -q
> 'metadata.node_uuid=something'
>
> Note: I haven't actually tested this as I don't currently have an active
> Ironic setup, but that covers the concept. Does it work for you Jim?
>
> [1]
> https://github.com/openstack/ceilometer/blob/master/ceilometer/ipmi/notifications/ironic.py#L98
>

--
--- Jim Mankovich | <email address hidden> (US Mountain Time) ---

Chris Dent (cdent)
Changed in ceilometer:
assignee: nobody → Chris Dent (chdent)
Revision history for this message
Chris Dent (cdent) wrote :

Thanks for the quick response Jim.

Turns out the code I thought was present in the ipmi notification handling code is not there, so I'm going to fix that in time to get it in Juno.

It doesn't address the larger issue about easy consumption but will at least make it possible to do a reasonable query to narrow things dow. That fix will link here when I commit it for review.

Revision history for this message
Jim Mankovich (jman-a) wrote :

Thanks Chris,

FYI: I'm looking at providing the ability in Ironic to support more
than just IPMI
sensor logging. This would be something for Kilo. I would like to
put in place a
single generic ceilometer sensor notification plugin that can support
creating samples which
come from any ironic sensor producer, not just IPMI. Consistent sensor
naming
conventions will be key to being able to accomplish this.

On 10/6/2014 10:19 AM, Chris Dent wrote:
> Thanks for the quick response Jim.
>
> Turns out the code I thought was present in the ipmi notification
> handling code is not there, so I'm going to fix that in time to get it
> in Juno.
>
> It doesn't address the larger issue about easy consumption but will at
> least make it possible to do a reasonable query to narrow things dow.
> That fix will link here when I commit it for review.
>

--
--- Jim Mankovich | <email address hidden> (US Mountain Time) ---

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ceilometer (master)

Fix proposed to branch: master
Review: https://review.openstack.org/126397

Changed in ceilometer:
status: New → In Progress
Dmitry Tantsur (divius)
Changed in ironic:
status: New → Confirmed
aeva black (tenbrae)
Changed in ironic:
status: Confirmed → Won't Fix
Revision history for this message
aeva black (tenbrae) wrote :

Since this is an issue with Ceilometer's formatting of the notification, I'm marking it as WontFix in Ironic.

While I agree that the current metric name "hardware.ipmi.*" is too specific to certain vendor drivers within Ironic, I think thats a separate bug to the meter naming structure in Ceilometer.

Eoghan Glynn (eglynn)
Changed in ceilometer:
importance: Undecided → High
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ceilometer (master)

Reviewed: https://review.openstack.org/126397
Committed: https://git.openstack.org/cgit/openstack/ceilometer/commit/?id=3b20fa08527170aaca38253277feedf0f2d11217
Submitter: Jenkins
Branch: master

commit 3b20fa08527170aaca38253277feedf0f2d11217
Author: Chris Dent <email address hidden>
Date: Mon Oct 6 20:00:08 2014 +0100

    Include a 'node' key and value in ipmi metadata

    The ipmi meters use resource_ids which make it challenging
    to select information for all the sensors of a particular type.
    This change adds metadata that identifies the physical node on
    which a sensor exists.

    There are still remaining issues with the naming of ipmi and
    sensor-related meters that will need to be addressed separately.

    Closes-Bug: 1377157
    Change-Id: Ib53755112e4757aa53ab60079f3c555e21a72b63

Changed in ceilometer:
status: In Progress → Fix Committed
Eoghan Glynn (eglynn)
Changed in ceilometer:
milestone: none → juno-rc2
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ceilometer (proposed/juno)

Fix proposed to branch: proposed/juno
Review: https://review.openstack.org/126538

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ceilometer (proposed/juno)

Reviewed: https://review.openstack.org/126538
Committed: https://git.openstack.org/cgit/openstack/ceilometer/commit/?id=cac21417118c9aa72705088f8d04ff5250d93123
Submitter: Jenkins
Branch: proposed/juno

commit cac21417118c9aa72705088f8d04ff5250d93123
Author: Chris Dent <email address hidden>
Date: Mon Oct 6 20:00:08 2014 +0100

    Include a 'node' key and value in ipmi metadata

    The ipmi meters use resource_ids which make it challenging
    to select information for all the sensors of a particular type.
    This change adds metadata that identifies the physical node on
    which a sensor exists.

    There are still remaining issues with the naming of ipmi and
    sensor-related meters that will need to be addressed separately.

    Closes-Bug: 1377157
    Change-Id: Ib53755112e4757aa53ab60079f3c555e21a72b63
    (cherry picked from commit 3b20fa08527170aaca38253277feedf0f2d11217)

Changed in ceilometer:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in ceilometer:
milestone: juno-rc2 → 2014.2
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ceilometer (master)

Fix proposed to branch: master
Review: https://review.openstack.org/128905

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on ceilometer (master)

Change abandoned by gordon chung (<email address hidden>) on branch: master
Review: https://review.openstack.org/128905
Reason: i've no idea what this is but the changes don't appear to exist in code

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ceilometer (master)
Download full text (4.4 KiB)

Reviewed: https://review.openstack.org/128905
Committed: https://git.openstack.org/cgit/openstack/ceilometer/commit/?id=e67bdae2a2430d5d7763e83daa202bcb5cebf619
Submitter: Jenkins
Branch: master

commit 9899b6f12f5072787ac72a1a64d4a712db222cb3
Author: Samta <email address hidden>
Date: Wed Oct 8 16:27:12 2014 +0530

    Fix recording failure for system pollster

    The parameter "cumulative" was interpreted as a list and could
    not be recognized as a valid Type for SQL query hence failing
    to record the data for the pollster.

    The presence of ',' operator in the pollster class after the term
    CUMULATIVE caused it to be treated as a list

    This is corrected to interpret the field as a string which will
    be successfully recorded to the meter database.

    Change-Id: I10a69134a7f0c42a3e6c0d9bb7568e8d8fd2a932
    Closes-Bug: 1378742
    (cherry picked from commit f8f63d4b15ce68797d6e16943bd85efb19a77752)

commit 619291bd6c5d9775aaa9fcc2b47e8120e6f3443d
Author: Julien Danjou <email address hidden>
Date: Fri Oct 10 16:25:58 2014 +0200

    Add oslo.db to config generator

    Otherwise we miss the oslo.db configuration option in the sample config
    file.

    Closes-Bug: 1379808

    Change-Id: I3a70e5da42562081002286d37ba1a200150c8cfc
    (cherry picked from commit f7392e47575c8baa081406b03cf14ce2894a5996)

commit 5b966ba778ad9e87a41fc639a64e68f84d6ca8e2
Author: Eoghan Glynn <email address hidden>
Date: Tue Oct 7 16:25:34 2014 +0000

    Manually updated translations

    The workflow boils down to:

     $ sudo easy_install Babel
     $ sudo yum install gettext
     $ python setup.py extract_messages
     $ python setup.py update_catalog --no-fuzzy-matching \
       --ignore-obsolete=true
     $ source \
       ../project-config/jenkins/scripts/common_translation_update.sh
     $ setup_loglevel_vars
     $ cleanup_po_files ceilometer

    Change-Id: Ia16b2b15004e0e0cbd4332ed5106ba04f1736ade

commit aa15b2d7ed02822b72f878e889b3a77c97b4a6c5
Author: Pradeep Kilambi <email address hidden>
Date: Tue Sep 30 11:24:20 2014 -0700

    Fix neutron client to catch 404 exceptions

    When network services such as lbaas, fwaas or vpnaas are disabled
    in neutron, the discovery continues to poll the api calls and gets
    back a not found exception. The fix here is to catch the exception
    so it doesn't go unhandled.

    Change-Id: I8f350b9009f0d8c172836b1dd1123e966f887fdb
    Closes-Bug: #1374012
    (cherry picked from commit b65554eb460a282a2ab0a2dcc0053a8691cb9373)

commit fc22a04cfbab9a868347bea78a0b55c2b3316ef1
Author: Thomas Bechtold <email address hidden>
Date: Mon Sep 29 10:53:50 2014 +0200

    Fix OrderedDict usage for Python 2.6

    "import collections" also works on Python 2.6 but
    collections.OrderedDict() is not available so the current check is
    wrong. Using a function to get an OrderedDict() instance works fine for
    that.

    Closes-Bug: #1375568
    Change-Id: Iaf739dc2deb7d4b09bf477be60de4df8c4fcf5c0
    (cherry picked from commit 7212f7dc92c1c6fd5d7e36fc270b74efec412d72)

commit cac21417118c9aa72705088f8d04ff5250d93123
Author: Chris Dent <chd...

Read more...

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.