Management traffic prioritization

Bug #1452922 reported by Aleksandr Shaposhnikov
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Invalid
Importance: Medium
Assigned to: Igor Gajsin

Bug Description

We have to prioritize management traffic on controller and compute nodes.

Bug https://bugs.launchpad.net/mos/+bug/1447619 doesn't fully describe the problem, because even image transfers to/from glance can cause network disruption.

Basically, we need to at least prioritize corosync/pacemaker, rabbitmq and mysql-galera traffic ahead of other traffic.

So to implement that we have to do the following:
1. Prioritize traffic on controllers from/to: rabbitmq, corosync/pacemaker, mysql-galera ports.
2. Prioritize traffic on computes to: rabbitmq ports.

As an example, we could use -j TOS --set-tos Minimize-Delay for corosync/pacemaker and rabbitmq, and either the same setting or Maximize-Throughput for mysql-galera.
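
To illustrate the proposal above, here is a rough sketch (not taken from Fuel) that only generates the iptables commands as strings, without running them. The port numbers are assumed upstream defaults (5672 for rabbitmq, 5404/5405 UDP for corosync, 3306 and 4567 for mysql-galera) and may differ in an actual deployment; the SERVICES table and the tos_rules helper are hypothetical names. The TOS target is only valid in the mangle table, so the rules are placed there.

```python
# Sketch: build iptables commands that set the TOS field for service traffic.
# ASSUMPTION: ports below are upstream defaults, not values confirmed for Fuel.
SERVICES = {
    "rabbitmq": ("tcp", [5672], "Minimize-Delay"),
    "corosync": ("udp", [5404, 5405], "Minimize-Delay"),
    "mysql-galera": ("tcp", [3306, 4567], "Maximize-Throughput"),
}


def tos_rules(services, directions=("--dports", "--sports"), chain="OUTPUT"):
    """Return one iptables command string per service and port direction."""
    rules = []
    for proto, ports, tos in services.values():
        port_list = ",".join(str(p) for p in ports)
        for direction in directions:
            rules.append(
                f"iptables -t mangle -A {chain} -p {proto} -m multiport "
                f"{direction} {port_list} -j TOS --set-tos {tos}"
            )
    return rules


# Controllers: mark traffic both to and from all three services.
controller_rules = tos_rules(SERVICES)

# Computes: mark only traffic going to rabbitmq.
compute_rules = tos_rules(
    {"rabbitmq": SERVICES["rabbitmq"]}, directions=("--dports",)
)

if __name__ == "__main__":
    for rule in controller_rules + compute_rules:
        print(rule)
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; the output would still need review against the real port assignments before being applied on a node.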

description: updated
Mike Scherbakov (mihgen)
Changed in mos:
milestone: none → 6.1
Bogdan Dobrelya (bogdando) wrote :

I believe we only need QoS at the management network level; there is no need to prioritize individual services within it.

Bogdan Dobrelya (bogdando) wrote :

But doing so might not be enough unless we also do the following:
* glance image traffic, cinder volume iSCSI traffic, ceph osd2node and other storage-related traffic shouldn't pass through the management network.
* we should move all of these types of traffic to one of the storage networks we have, or introduce more.
* ideally, Fuel should support only deployments where storage networks are separated onto their own physical interfaces. QoS is not a panacea: AFAIK, it may drop packets, raising retransmission rates, and when there is no CPU power left for processing IRQs the results might be very sad and lead to storage degradation.

Dmitry Mescheryakov (dmitrymex) wrote :

Moving to Fuel since the bug requires changes in deployment only.

affects: mos → fuel
Changed in fuel:
milestone: 6.1 → none
Bogdan Dobrelya (bogdando) wrote :

Note: according to the comment https://bugs.launchpad.net/fuel/+bug/1447619/comments/32, the QoS implementation should not be based on "drop-packets-if-exceeded-limits", as that would make things even worse.

Changed in fuel:
milestone: none → 7.0
Changed in fuel:
importance: Undecided → Medium
assignee: nobody → Fuel Library Team (fuel-library)
status: New → Confirmed
Igor Gajsin (igajsin)
Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Igor Gajsin (igajsin)
Igor Gajsin (igajsin)
Changed in fuel:
status: Confirmed → Triaged
Igor Gajsin (igajsin) wrote :

There is a blueprint, https://blueprints.launchpad.net/fuel/+spec/templates-for-networking, which provides a proper way to fix this problem by using network roles, which can flexibly attach a certain kind of traffic to a designated interface.

See "Adapt Cinder for advancing networking" for an example: https://review.openstack.org/#/c/197092/

I am setting this bug to Invalid because future work will be done in the context of this blueprint.

Changed in fuel:
status: Triaged → Invalid