General scale issue on neutron-fwaas due to RPC broadcast usage (fanout)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
networking-midonet | Fix Released | Medium | YAMAMOTO Takashi |
neutron | Fix Released | Medium | Bertrand Lallau |
Bug Description
Currently, every CRUD method on FWaaS resources (Firewall, FirewallPolicy, FirewallRule, Firewallgroup, ...) sends an AMQP fanout cast to all L3 agents.
This is the wrong design: the AMQP cast should be sent only to the L3 agents managing routers with firewalls related to the tenant.
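To make the difference concrete, here is a minimal sketch in plain Python (no messaging library involved) that compares the set of agents reached by a fanout with the set that actually needs the message. The `Router` record, its fields, and both helper functions are hypothetical names used only for illustration; they are not taken from the neutron-fwaas code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Router:
    # Hypothetical record: which L3 agent host schedules this router.
    router_id: str
    tenant_id: str
    l3_agent_host: str
    has_firewall: bool


def fanout_targets(all_agent_hosts):
    """Behaviour described in this report: every L3 agent gets the cast."""
    return set(all_agent_hosts)


def targeted_hosts(routers, tenant_id):
    """Desired behaviour: only agents hosting the tenant's firewalled routers."""
    return {
        r.l3_agent_host
        for r in routers
        if r.tenant_id == tenant_id and r.has_firewall
    }


routers = [
    Router("r1", "tenant-a", "agent1", True),
    Router("r2", "tenant-b", "agent2", True),
    Router("r3", "tenant-a", "agent3", False),
]
all_hosts = ["agent1", "agent2", "agent3"]

print(sorted(fanout_targets(all_hosts)))            # all three agents
print(sorted(targeted_hosts(routers, "tenant-a")))  # only agent1
```

With a targeted cast, agent2 and agent3 would never see a message about tenant-a's firewall, so they would have nothing to reply to.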
This design flaw has resulted in many bugs that are already reported:
1) FirewallNotFound during firewall_deleted
https:/
https:/
Explanation using 2 L3 agents:
agent1: hosts a router with a firewall for the tenant
agent2: doesn't host any router for the tenant
1. neutron firewall-delete <firewall>
2. neutron-server sends an AMQP fanout cast "delete_firewall" to agent1 and agent2
3. agent1 cleans up the router firewall and sends "firewall_deleted" back to neutron-server
4. neutron-server deletes the firewall resource from the database
5. agent2 has nothing to clean up and also sends "firewall_deleted" back to neutron-server
6. neutron-server gets a "FirewallNotFound" exception
http://
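Steps 1 to 6 can be reproduced with a toy model. This is only a sketch under simplifying assumptions: delivery is sequential, agent1 replies before agent2, and the "database" is a Python set. The `Server` class and `delete_with_fanout` helper are hypothetical; `FirewallNotFound` here is a local stand-in for the exception named in step 6.

```python
class FirewallNotFound(Exception):
    """Local stand-in for the exception named in step 6."""


class Server:
    def __init__(self):
        self.firewalls = {"fw1"}  # toy "database"

    def firewall_deleted(self, firewall_id):
        # The first reply removes the row (step 4); any later reply
        # finds nothing left to delete (step 6).
        if firewall_id not in self.firewalls:
            raise FirewallNotFound(firewall_id)
        self.firewalls.remove(firewall_id)


def delete_with_fanout(server, agents, firewall_id):
    """Deliver delete_firewall to every agent and record each reply's outcome."""
    outcomes = {}
    for agent in agents:
        # Every agent replies firewall_deleted, whether or not it hosted
        # a router with this firewall (steps 3 and 5).
        try:
            server.firewall_deleted(firewall_id)
            outcomes[agent] = "ok"
        except FirewallNotFound:
            outcomes[agent] = "FirewallNotFound"
    return outcomes


result = delete_with_fanout(Server(), ["agent1", "agent2"], "fw1")
print(result)  # {'agent1': 'ok', 'agent2': 'FirewallNotFound'}
```

In this model every agent after the first one ends up with the error, which is why the number of agents reached by the fanout matters.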
But it does not end there :(
7. agent2 receives the "FirewallNotFound" exception
8. on an RPC error it performs a kind of "full synchronisation" (process_
it sends an AMQP call "get_tenants_
9. neutron-server responds with ALL tenants (even those not related to this agent)
10. For each tenant, agent2 sends an AMQP call:
get_
Full sync bug is already reported here:
https:/
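A rough count of the extra RPC calls produced by steps 7 to 10 shows how quickly this grows. The sketch below assumes that exactly one agent's reply succeeds, that every other agent gets FirewallNotFound and resyncs, and that a resync costs one tenant-listing call plus one call per tenant. The function names and the deployment sizes are hypothetical, not measurements.

```python
def full_sync_calls(num_tenants):
    """Calls one agent sends during a full sync: 1 tenant listing + 1 per tenant."""
    return 1 + num_tenants


def calls_after_delete(num_agents, num_tenants):
    """Extra calls triggered by a single firewall delete under the fanout design.

    Assumption: the first reply succeeds, all remaining agents resync.
    """
    resyncing_agents = max(num_agents - 1, 0)
    return resyncing_agents * full_sync_calls(num_tenants)


# Hypothetical deployment: 50 L3 agents, 200 tenants.
print(calls_after_delete(50, 200))  # 9849
```

Under these assumptions a single delete costs 49 * 201 = 9849 additional calls, almost all of them about tenants the resyncing agents do not serve.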
2) Intermittent failures on the Tempest check are probably linked:
https:/
3) More generally, on FWaaS CRUD operations neutron-server floods the agents with AMQP requests and is flooded by them in return.
=> this results in the neutron-server RPC workers being fully busy
=> AMQP messages accumulate in the q-firewall-plugin queue
=> RPC timeouts appear on the agents (after 60s)
=> full synchronisation is triggered
=> etc, etc...
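The chain above can be approximated with a simple fluid model of the queue: if requests arrive faster than the RPC workers drain them, the backlog grows linearly, and a newly queued request eventually waits longer than the 60s timeout. This is a back-of-the-envelope sketch only; the function name and the rates are hypothetical, and real worker behaviour is burstier than this.

```python
def seconds_until_timeout(arrival_rate, service_rate, rpc_timeout=60.0):
    """Seconds of sustained overload before a new request waits > rpc_timeout.

    Rates are in messages per second. Returns None when the workers keep up.
    """
    if arrival_rate <= service_rate:
        return None
    # backlog(t) = (arrival_rate - service_rate) * t
    # wait(t)    = backlog(t) / service_rate
    # Solve wait(t) = rpc_timeout for t.
    return rpc_timeout * service_rate / (arrival_rate - service_rate)


# Hypothetical rates: 150 msg/s arriving, workers handle 100 msg/s.
print(seconds_until_timeout(150, 100))  # 120.0
print(seconds_until_timeout(50, 100))   # None
```

Once the timeout is reached, each timed-out agent starts a full synchronisation, which raises the arrival rate further and keeps the loop going.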
Changed in neutron: | |
assignee: | nobody → Bertrand Lallau (bertrand-lallau) |
Changed in neutron: | |
status: | New → In Progress |
description: | updated |
Changed in neutron: | |
assignee: | Bertrand Lallau (bertrand-lallau) → Cedric Brandily (cbrandily) |
Changed in neutron: | |
assignee: | Cedric Brandily (cbrandily) → Bertrand Lallau (bertrand-lallau) |
tags: | added: fwaas |
Changed in neutron: | |
importance: | Undecided → Medium |
Changed in networking-midonet: | |
importance: | Undecided → Medium |
assignee: | nobody → YAMAMOTO Takashi (yamamoto) |
milestone: | none → 5.0.0 |
status: | New → In Progress |
Fix proposed to branch: master (review.openstack.org/426287)
Review: https:/