Potential deadlock in check_quota_met during the first call of a tenant

Bug #2038798 reported by Gregory Thiemonge
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
octavia
In Progress
Medium
Gregory Thiemonge

Bug Description

There's deadlock in check_quota_met that can occur under the following conditions:
- the Quota entry of the project doesn't exist yet (first POST call to the Octavia API by a tenant)
- 2 resources are created at the same time

It may not impact production environments as the deadlock may occur only in the 2 first (simultaneous) API calls to Octavia in a project.
But it may happen in CI as the tests are executed in parallel in a newly created project.

After the first API call to Octavia, the Quota entry exists in the DB, and deadlock is not reproducible

To reproduce it, I run a loop with:

- delete the quota entry from the DB
- create 8 load balancers simultaneously
- if the deadlock occurs, an exception is raised
- delete the load balancers

It needs only a few iterations of the loop to reproduce it.

[..]
Oct 08 10:28:51 gthiemon-devstack <email address hidden>[2890367]: DEBUG octavia.db.repositories [None req-fdc9e4bb-63a5-4383-b64c-25d9ac950604 demo admin] Checking quota for project: 0758ab98a0d9459a958bb7acf51c4805 object: <class 'octavia.common.data_models.LoadBalancer'> {{(pid=2890367) check_quota_met /opt/stack/octavia/octavia/db/repositories.py:382}}
Oct 08 10:28:51 gthiemon-devstack <email address hidden>[2890366]: DEBUG octavia.db.repositories [None req-bbe9492f-aa48-43e2-8d1e-6531df04be5f demo admin] Checking quota for project: 0758ab98a0d9459a958bb7acf51c4805 object: <class 'octavia.common.data_models.LoadBalancer'> {{(pid=2890366) check_quota_met /opt/stack/octavia/octavia/db/repositories.py:382}}
Oct 08 10:28:51 gthiemon-devstack <email address hidden>[2890366]: DEBUG oslo_db.api [None req-bbe9492f-aa48-43e2-8d1e-6531df04be5f demo admin] DB error: (pymysql.err.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
Oct 08 10:28:51 gthiemon-devstack <email address hidden>[2890366]: [SQL: INSERT INTO quotas (project_id, health_monitor, listener, load_balancer, member, pool, l7policy, l7rule, in_use_health_monitor, in_use_listener, in_use_load_balancer, in_use_member, in_use_pool, in_use_l7policy, in_use_l7rule) VALUES (%(project_id)s, %(health_monitor)s, %(listener)s, %(load_balancer)s, %(member)s, %(pool)s, %(l7policy)s, %(l7rule)s, %(in_use_health_monitor)s, %(in_use_listener)s, %(in_use_load_balancer)s, %(in_use_member)s, %(in_use_pool)s, %(in_use_l7policy)s, %(in_use_l7rule)s)]
Oct 08 10:28:51 gthiemon-devstack <email address hidden>[2890366]: [parameters: {'project_id': '0758ab98a0d9459a958bb7acf51c4805', 'health_monitor': None, 'listener': None, 'load_balancer': None, 'member': None, 'pool': None, 'l7policy': None, 'l7rule': None, 'in_use_health_monitor': None, 'in_use_listener': None, 'in_use_load_balancer': None, 'in_use_member': None, 'in_use_pool': None, 'in_use_l7policy': None, 'in_use_l7rule': None}]
Oct 08 10:28:51 gthiemon-devstack <email address hidden>[2890366]: (Background on this error at: https://sqlalche.me/e/14/e3q8) {{(pid=2890366) _is_exception_expected /usr/local/lib/python3.9/site-packages/oslo_db/api.py:184}}
Oct 08 10:28:51 gthiemon-devstack <email address hidden>[2890366]: DEBUG oslo_db.api [None req-bbe9492f-aa48-43e2-8d1e-6531df04be5f demo admin] Performing DB retry for function octavia.db.repositories.QuotasRepository.update {{(pid=2890366) wrapper /usr/local/lib/python3.9/site-packages/oslo_db/api.py:155}}
[..]
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890367]: INFO octavia.api.drivers.amphora_driver.v2.driver [None req-fdc9e4bb-63a5-4383-b64c-25d9ac950604 demo admin] Amphora provider created VIP port d4a6b6af-b82f-4e79-a805-3bcc2734c143 for load balancer 9b9adabb-405b-40fd-92ce-929ca74efbc8.
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890367]: INFO octavia.api.v2.controllers.load_balancer [None req-fdc9e4bb-63a5-4383-b64c-25d9ac950604 demo admin] Sending create Load Balancer 9b9adabb-405b-40fd-92ce-929ca74efbc8 to provider amphora
Oct 08 10:28:52 gthiemon-devstack octavia-worker[2890454]: INFO octavia.controller.queue.v2.endpoints [-] Creating load balancer '9b9adabb-405b-40fd-92ce-929ca74efbc8'...
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890367]: [pid: 2890367|app: 0|req: 16/35] 192.168.1.101 () {66 vars in 1335 bytes} [Sun Oct 8 10:28:51 2023] POST /load-balancer/v2.0/lbaas/loadbalancers => generated 717 bytes in 562 msecs (HTTP/1.1 201) 4 headers in 162 bytes (1 switches on core 0)
Oct 08 10:28:52 gthiemon-devstack octavia-worker[2890454]: DEBUG octavia.common.base_taskflow [-] Waiting for job get_create_load_balancer_flow-46bdb5e7-06c8-43a1-8f9e-240e44493431 to finish {{(pid=2890454) _wait_for_job /opt/stack/octavia/octavia/common/base_taskflow.py:203}}
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890367]: DEBUG octavia.db.repositories [None req-0e9a0198-0bd6-41ce-ad42-9e5ec14f093c demo admin] Checking quota for project: 0758ab98a0d9459a958bb7acf51c4805 object: <class 'octavia.common.data_models.LoadBalancer'> {{(pid=2890367) check_quota_met /opt/stack/octavia/octavia/db/repositories.py:382}}
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890367]: INFO octavia.api.drivers.amphora_driver.v2.driver [None req-0e9a0198-0bd6-41ce-ad42-9e5ec14f093c demo admin] Amphora provider created VIP port 15318a06-bea4-4955-ac65-f5fe0ccc252a for load balancer 23e87030-1708-422b-b7f4-06aa2911ee4c.
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890367]: INFO octavia.api.v2.controllers.load_balancer [None req-0e9a0198-0bd6-41ce-ad42-9e5ec14f093c demo admin] Sending create Load Balancer 23e87030-1708-422b-b7f4-06aa2911ee4c to provider amphora
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890367]: [pid: 2890367|app: 0|req: 17/36] 192.168.1.101 () {66 vars in 1335 bytes} [Sun Oct 8 10:28:52 2023] POST /load-balancer/v2.0/lbaas/loadbalancers => generated 717 bytes in 478 msecs (HTTP/1.1 201) 4 headers in 162 bytes (1 switches on core 0)
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: ERROR wsme.api [None req-bbe9492f-aa48-43e2-8d1e-6531df04be5f demo admin] Server-side error: "This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (pymysql.err.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: [SQL: INSERT INTO quotas (project_id, health_monitor, listener, load_balancer, member, pool, l7policy, l7rule, in_use_health_monitor, in_use_listener, in_use_load_balancer, in_use_member, in_use_pool, in_use_l7policy, in_use_l7rule) VALUES (%(project_id)s, %(health_monitor)s, %(listener)s, %(load_balancer)s, %(member)s, %(pool)s, %(l7policy)s, %(l7rule)s, %(in_use_health_monitor)s, %(in_use_listener)s, %(in_use_load_balancer)s, %(in_use_member)s, %(in_use_pool)s, %(in_use_l7policy)s, %(in_use_l7rule)s)]
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: [parameters: {'project_id': '0758ab98a0d9459a958bb7acf51c4805', 'health_monitor': None, 'listener': None, 'load_balancer': None, 'member': None, 'pool': None, 'l7policy': None, 'l7rule': None, 'in_use_health_monitor': None, 'in_use_listener': None, 'in_use_load_balancer': None, 'in_use_member': None, 'in_use_pool': None, 'in_use_l7policy': None, 'in_use_l7rule': None}]
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: (Background on this error at: https://sqlalche.me/e/14/e3q8) (Background on this error at: https://sqlalche.me/e/14/7s2a)". Detail:
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: Traceback (most recent call last):
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib/python3.9/site-packages/wsmeext/pecan.py", line 82, in callfunction
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: result = f(self, *args, **kwargs)
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/opt/stack/octavia/octavia/api/v2/controllers/load_balancer.py", line 577, in post
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: lock_session.rollback()
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: self.force_reraise()
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: raise self.value
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/opt/stack/octavia/octavia/api/v2/controllers/load_balancer.py", line 467, in post
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: if self.repositories.check_quota_met(
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/opt/stack/octavia/octavia/db/repositories.py", line 396, in check_quota_met
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: self.quotas.update(
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib/python3.9/site-packages/oslo_db/api.py", line 154, in wrapper
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: ectxt.value = e.inner_exc
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: self.force_reraise()
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: raise self.value
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib/python3.9/site-packages/oslo_db/api.py", line 142, in wrapper
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: return f(*args, **kwargs)
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/opt/stack/octavia/octavia/db/repositories.py", line 1895, in update
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: session.query(self.model_class)
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib64/python3.9/site-packages/sqlalchemy/orm/query.py", line 2823, in first
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: return self.limit(1)._iter().first()
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib64/python3.9/site-packages/sqlalchemy/orm/query.py", line 2907, in _iter
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: result = self.session.execute(
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 1711, in execute
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: conn = self._connection_for_bind(bind)
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 1552, in _connection_for_bind
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: return self._transaction._connection_for_bind(
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 721, in _connection_for_bind
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: self._assert_active()
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: File "/usr/local/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 601, in _assert_active
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: raise sa_exc.PendingRollbackError(
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: sqlalchemy.exc.PendingRollbackError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (pymysql.err.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: [SQL: INSERT INTO quotas (project_id, health_monitor, listener, load_balancer, member, pool, l7policy, l7rule, in_use_health_monitor, in_use_listener, in_use_load_balancer, in_use_member, in_use_pool, in_use_l7policy, in_use_l7rule) VALUES (%(project_id)s, %(health_monitor)s, %(listener)s, %(load_balancer)s, %(member)s, %(pool)s, %(l7policy)s, %(l7rule)s, %(in_use_health_monitor)s, %(in_use_listener)s, %(in_use_load_balancer)s, %(in_use_member)s, %(in_use_pool)s, %(in_use_l7policy)s, %(in_use_l7rule)s)]
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: [parameters: {'project_id': '0758ab98a0d9459a958bb7acf51c4805', 'health_monitor': None, 'listener': None, 'load_balancer': None, 'member': None, 'pool': None, 'l7policy': None, 'l7rule': None, 'in_use_health_monitor': None, 'in_use_listener': None, 'in_use_load_balancer': None, 'in_use_member': None, 'in_use_pool': None, 'in_use_l7policy': None, 'in_use_l7rule': None}]
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: (Background on this error at: https://sqlalche.me/e/14/e3q8) (Background on this error at: https://sqlalche.me/e/14/7s2a)
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: : sqlalchemy.exc.PendingRollbackError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (pymysql.err.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
Oct 08 10:28:52 gthiemon-devstack <email address hidden>[2890366]: [pid: 2890366|app: 0|req: 20/37] 192.168.1.101 () {66 vars in 1335 bytes} [Sun Oct 8 10:28:51 2023] POST /load-balancer/v2.0/lbaas/loadbalancers => generated 1366 bytes in 1147 msecs (HTTP/1.1 500) 4 headers in 177 bytes (1 switches on core 0)

Revision history for this message
Gregory Thiemonge (gthiemonge) wrote :
Revision history for this message
Gregory Thiemonge (gthiemonge) wrote :

another job faced it:

octavia-v2-dsvm-scenario-non-traffic-ops stable/2023.2

https://zuul.opendev.org/t/openstack/build/1fe585720e49430b8bc10e3313e931e9

Revision history for this message
Gregory Thiemonge (gthiemonge) wrote :
Changed in octavia:
status: New → In Progress
Changed in octavia:
assignee: nobody → Gregory Thiemonge (gthiemonge)
status: In Progress → New
status: New → In Progress
importance: Undecided → Medium
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.