Queue Manager fails in podman containerised environments

Bug #2095178 reported by Matt Crees
This bug affects 2 people
Affects          Status         Importance   Assigned to   Milestone
oslo.messaging   Fix Released   Medium       Matt Crees

Bug Description

We hit this when trying to use queue manager in Kolla-Ansible. An invocation of nova-manage fails with the following traceback:

root@primary:~# podman exec nova_conductor nova-manage cell_v2 discover_hosts --by-service --cell_uuid c321ca04-f6e5-44dc-b257-b9ebac11a5eb
Traceback (most recent call last):
  File "/var/lib/kolla/venv/bin/nova-manage", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/var/lib/kolla/venv/lib/python3.12/site-packages/nova/cmd/manage.py", line 3815, in main
    config.parse_args(sys.argv)
  File "/var/lib/kolla/venv/lib/python3.12/site-packages/nova/config.py", line 101, in parse_args
    rpc.init(CONF)
  File "/var/lib/kolla/venv/lib/python3.12/site-packages/nova/rpc.py", line 68, in init
    TRANSPORT = create_transport(get_transport_url())
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/kolla/venv/lib/python3.12/site-packages/nova/rpc.py", line 255, in create_transport
    return messaging.get_rpc_transport(CONF,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/kolla/venv/lib/python3.12/site-packages/oslo_messaging/rpc/transport.py", line 50, in get_rpc_transport
    return msg_transport._get_transport(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/kolla/venv/lib/python3.12/site-packages/oslo_messaging/transport.py", line 205, in _get_transport
    mgr = driver.DriverManager('oslo.messaging.drivers',
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/kolla/venv/lib/python3.12/site-packages/stevedore/driver.py", line 54, in __init__
    super().__init__(
  File "/var/lib/kolla/venv/lib/python3.12/site-packages/stevedore/named.py", line 78, in __init__
    extensions = self._load_plugins(invoke_on_load,
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/kolla/venv/lib/python3.12/site-packages/stevedore/extension.py", line 218, in _load_plugins
    self._on_load_failure_callback(self, ep, err)
  File "/var/lib/kolla/venv/lib/python3.12/site-packages/stevedore/extension.py", line 206, in _load_plugins
    ext = self._load_one_plugin(ep,
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/kolla/venv/lib/python3.12/site-packages/stevedore/named.py", line 156, in _load_one_plugin
    return super()._load_one_plugin(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/kolla/venv/lib/python3.12/site-packages/stevedore/extension.py", line 242, in _load_one_plugin
    obj = plugin(*invoke_args, **invoke_kwds)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/kolla/venv/lib/python3.12/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 1787, in __init__
    super().__init__(
  File "/var/lib/kolla/venv/lib/python3.12/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 691, in __init__
    self._q_manager = QManager(
                      ^^^^^^^^^
  File "/var/lib/kolla/venv/lib/python3.12/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 73, in __init__
    with open(f'/proc/{self.pg}/stat') as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/proc/0/stat'

The issue here is that the process group ID is 0 when the command is invoked directly through podman exec. /proc/0 doesn't exist, so opening /proc/0/stat fails and the command aborts.

podman exec nova_conductor ps xao pid,ppid,pgid,sid,comm
    PID    PPID    PGID     SID COMMAND
      1       0       1       1 dumb-init
      2       1       1       1 nova-conductor
   2389       0       0       0 ps

As an aside, this doesn't affect Docker, because there the process group ID is not 0:

docker exec nova_conductor ps xao pid,ppid,pgid,sid,comm
    PID    PPID    PGID     SID COMMAND
      1       0       1       1 dumb-init
      6       1       1       1 nova-conductor
     39       6       1       1 nova-conductor
     40       6       1       1 nova-conductor
   2418       0    2418    2418 ps
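
The same comparison can be made from inside Python. This is a minimal sketch for diagnosis only, not the oslo.messaging code: `describe_ids` is a hypothetical helper name, and it assumes a Linux container with /proc mounted. It reports the PID and PGID of the current process and whether the matching /proc entries exist.

```python
import os


def describe_ids():
    """Report PID/PGID of this process and whether /proc/<id>/stat exists.

    Hypothetical helper for illustration; assumes Linux with /proc mounted.
    """
    pid = os.getpid()
    pgid = os.getpgrp()
    return {
        "pid": pid,
        "pgid": pgid,
        "pid_stat_exists": os.path.exists(f"/proc/{pid}/stat"),
        "pgid_stat_exists": os.path.exists(f"/proc/{pgid}/stat"),
    }


if __name__ == "__main__":
    print(describe_ids())
```

Saved to a file inside the container and run through both `podman exec` and `docker exec`, this should show whether the PGID-based path is the one that is missing.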

OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.messaging (master)
Changed in oslo.messaging:
status: New → In Progress
Changed in oslo.messaging:
importance: Undecided → Medium
assignee: nobody → Matt Crees (mattcrees)
Dmitriy Rabotyagov (noonedeadpunk) wrote :

This issue is not limited to podman, as it also occurs with LXC. I've accidentally submitted a duplicate report for it: https://bugs.launchpad.net/oslo.messaging/+bug/2121444

OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.messaging (master)

Reviewed: https://review.opendev.org/c/openstack/oslo.messaging/+/939540
Committed: https://opendev.org/openstack/oslo.messaging/commit/9043d6583ec1aa8fd792d3e3cd488b00beab2994
Submitter: "Zuul (22348)"
Branch: master

commit 9043d6583ec1aa8fd792d3e3cd488b00beab2994
Author: Matt Crees <email address hidden>
Date: Fri Jan 17 15:09:10 2025 +0000

    Fix Queue Manager in podman containerised env

    When commands such as ``nova-manage`` are invoked via a podman call,
    e.g. ``podman exec nova_conductor nova-manage ...``, the process group
    ID assigned is 0. This breaks the check of start time since system boot
    as ``/proc/0`` does not exist.

    Fix this behaviour by instead using the process ID to look up the start
    time since system boot.

    Closes-Bug: #2095178
    Change-Id: Ie89783d1e8a2891bfbb5b516eff08ed694353d6d
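
The approach described in the commit message can be illustrated roughly as follows. This is a sketch under stated assumptions, not the merged patch: `parse_start_ticks` and `read_start_ticks` are hypothetical names, and the field position follows the Linux proc(5) layout, where field 22 of /proc/<pid>/stat is the process start time in clock ticks since boot.

```python
import os


def parse_start_ticks(stat_line):
    """Extract starttime (field 22) from the contents of /proc/<pid>/stat."""
    # Field 2 (comm) is wrapped in parentheses and may contain spaces,
    # so split on the last ')' before tokenising the remaining fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 22 sits at index 19.
    return int(rest[19])


def read_start_ticks(pid=None):
    """Read the start time of `pid`, defaulting to the current process.

    Looking up by PID rather than PGID avoids /proc/0 when the process
    group ID is reported as 0.
    """
    pid = os.getpid() if pid is None else pid
    with open(f"/proc/{pid}/stat") as f:
        return parse_start_ticks(f.read())
```

The key point is only the choice of ID: a process's own PID always has a /proc entry while it is running, whereas a PGID of 0 does not.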

Changed in oslo.messaging:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.messaging (stable/2025.1)

Fix proposed to branch: stable/2025.1
Review: https://review.opendev.org/c/openstack/oslo.messaging/+/958848

OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.messaging (stable/2024.2)

Fix proposed to branch: stable/2024.2
Review: https://review.opendev.org/c/openstack/oslo.messaging/+/958849

OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.messaging (stable/2024.1)

Fix proposed to branch: stable/2024.1
Review: https://review.opendev.org/c/openstack/oslo.messaging/+/958850

OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.messaging (stable/2025.1)

Reviewed: https://review.opendev.org/c/openstack/oslo.messaging/+/958848
Committed: https://opendev.org/openstack/oslo.messaging/commit/a9a29615321cb5a6e093329bbd3365191a717edf
Submitter: "Zuul (22348)"
Branch: stable/2025.1

commit a9a29615321cb5a6e093329bbd3365191a717edf
Author: Matt Crees <email address hidden>
Date: Fri Jan 17 15:09:10 2025 +0000

    Fix Queue Manager in podman containerised env

    When commands such as ``nova-manage`` are invoked via a podman call,
    e.g. ``podman exec nova_conductor nova-manage ...``, the process group
    ID assigned is 0. This breaks the check of start time since system boot
    as ``/proc/0`` does not exist.

    Fix this behaviour by instead using the process ID to look up the start
    time since system boot.

    Closes-Bug: #2095178
    Change-Id: Ie89783d1e8a2891bfbb5b516eff08ed694353d6d
    (cherry picked from commit 9043d6583ec1aa8fd792d3e3cd488b00beab2994)
    Signed-off-by: Matt Crees <email address hidden>

OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.messaging (stable/2024.1)

Reviewed: https://review.opendev.org/c/openstack/oslo.messaging/+/958850
Committed: https://opendev.org/openstack/oslo.messaging/commit/089b95a556dc45d32c0b98f72329cedc320da1b2
Submitter: "Zuul (22348)"
Branch: stable/2024.1

commit 089b95a556dc45d32c0b98f72329cedc320da1b2
Author: Matt Crees <email address hidden>
Date: Fri Jan 17 15:09:10 2025 +0000

    Fix Queue Manager in podman containerised env

    When commands such as ``nova-manage`` are invoked via a podman call,
    e.g. ``podman exec nova_conductor nova-manage ...``, the process group
    ID assigned is 0. This breaks the check of start time since system boot
    as ``/proc/0`` does not exist.

    Fix this behaviour by instead using the process ID to look up the start
    time since system boot.

    Closes-Bug: #2095178
    Change-Id: Ie89783d1e8a2891bfbb5b516eff08ed694353d6d
    (cherry picked from commit 9043d6583ec1aa8fd792d3e3cd488b00beab2994)
    Signed-off-by: Matt Crees <email address hidden>

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/oslo.messaging 2024.1-eom

This issue was fixed in the openstack/oslo.messaging 2024.1-eom Caracal release.

OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.messaging (stable/2024.2)

Reviewed: https://review.opendev.org/c/openstack/oslo.messaging/+/958849
Committed: https://opendev.org/openstack/oslo.messaging/commit/4e75deec33725b19718e805fe5ac3b965e5e5aae
Submitter: "Zuul (22348)"
Branch: stable/2024.2

commit 4e75deec33725b19718e805fe5ac3b965e5e5aae
Author: Matt Crees <email address hidden>
Date: Fri Jan 17 15:09:10 2025 +0000

    Fix Queue Manager in podman containerised env

    When commands such as ``nova-manage`` are invoked via a podman call,
    e.g. ``podman exec nova_conductor nova-manage ...``, the process group
    ID assigned is 0. This breaks the check of start time since system boot
    as ``/proc/0`` does not exist.

    Fix this behaviour by instead using the process ID to look up the start
    time since system boot.

    Closes-Bug: #2095178
    Change-Id: Ie89783d1e8a2891bfbb5b516eff08ed694353d6d
    (cherry picked from commit 9043d6583ec1aa8fd792d3e3cd488b00beab2994)
    Signed-off-by: Matt Crees <email address hidden>
