poor networking throughput through veth interfaces
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
linux (Ubuntu) | Fix Released | Medium | Unassigned | |
Precise | Fix Released | Medium | Chris J Arges | |
Quantal | Fix Released | Medium | Chris J Arges | |
Raring | Fix Released | Medium | Chris J Arges | |
Bug Description
SRU Justification:
Impact:
Users of the 3.2/3.5/3.8 series kernels may see poor network throughput when using OpenStack Neutron, depending on their setup.
Fix:
These upstream patches are necessary to fix the issue:
2681128f0ced8aa
8093315a91340bc
d0e2c55e7c940a3
2efd32ee1b60b0b
f45a5c267da3517
Testcase:
Set up OpenStack Neutron. Test throughput between internal and external nodes.
The following explains an example VLAN+namespace configuration:
Internal Node: [10.x.x.
netns: qrouter-123 ---> qg-234[
Where:
1) tap123+qr-123 and tap234+qg-234 are veth pairs
2) qr-123 and qg-234 reside inside the qrouter-123 namespace
Another testcase without OpenStack:
* create two VMs (vm1, vm2) and install iperf on both
* connect the VMs via an isolated bridge
* measure baseline performance
- iperf -s # on machine 1
- iperf -t 60 -l 4M -c <machine 1 IP> # on machine 2
* create veth pairs between the VMs using the attached script:
- ./setup-
- ./setup-
* attach the VMs' interfaces to the created bridges (qbrvm1 / qbrvm2)
* in the VMs, set up static IPs
- sudo ifconfig eth0 10.10.10.1/24 up #vm1
- sudo ifconfig eth0 10.10.10.2/24 up #vm2
* measure performance again
* we expect this to be close to the baseline performance
NOTE: the fixed kernel needs to be on the _hypervisor_
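For readers without access to the attached script, the veth step of the second testcase might look roughly like the sketch below. This is not the attached script: the veth names (veth-vm1, veth-vm2), the `run` helper and the DRY_RUN switch are hypothetical, and it assumes an iproute2 version that can create bridges with `ip link`. Only the bridge names qbrvm1 / qbrvm2 come from the testcase. By default it only prints the commands.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the script attached to this bug.
# Joins two bridges on the hypervisor with a single veth pair.
# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 and run as root to apply.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Arguments: bridge1 bridge2 veth-end1 veth-end2
setup_veth_bridges() {
    br1="$1"; br2="$2"; v1="$3"; v2="$4"
    run ip link add "$v1" type veth peer name "$v2"
    run ip link add name "$br1" type bridge
    run ip link add name "$br2" type bridge
    run ip link set "$v1" master "$br1"
    run ip link set "$v2" master "$br2"
    for dev in "$v1" "$v2" "$br1" "$br2"; do
        run ip link set "$dev" up
    done
}

setup_veth_bridges qbrvm1 qbrvm2 veth-vm1 veth-vm2
```

With this layout, traffic between vm1 and vm2 has to cross the veth pair, which is the path the testcase is meant to exercise.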
--
OpenStack Neutron does IP forwarding through a network namespace, and a veth pair is used to connect into the namespace. The veth pair appears to be the bottleneck, independent of the network namespace. In newer kernels (Ubuntu-3.9.0-7.15 / v3.9-rc1 and later) throughput is almost three times higher. For example, in some testing throughput was 3.5 Gbps on pre-3.9-rc1 kernels and 9.1 Gbps with these patches applied.
This has been confirmed on kernels from 3.5.x to 3.8.x (Quantal and Raring LTS backports).
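As a quick check of the figures above, the improvement can be expressed as a ratio of the two measurements. The `speedup` helper below is a hypothetical convenience, not part of any tool mentioned in this report; it simply divides the second value by the first.

```shell
#!/bin/sh
# Ratio between two throughput measurements (same unit), one decimal place.
# Arguments: before after
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", after / before }'
}

# Figures quoted in this report: 3.5 Gbps before, 9.1 Gbps after.
speedup 3.5 9.1   # prints 2.6
```

The same helper can be fed the baseline and veth iperf results from the testcase to see how close the two are.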
Changed in linux (Ubuntu): | |
assignee: | Chris J Arges (arges) → nobody |
status: | New → Fix Released |
Changed in linux (Ubuntu Quantal): | |
status: | New → In Progress |
Changed in linux (Ubuntu Raring): | |
status: | New → In Progress |
assignee: | nobody → Chris J Arges (arges) |
Changed in linux (Ubuntu Quantal): | |
assignee: | nobody → Chris J Arges (arges) |
Changed in linux (Ubuntu Raring): | |
importance: | Undecided → Medium |
Changed in linux (Ubuntu Quantal): | |
importance: | Undecided → Medium |
description: | updated |
description: | updated |
tags: | added: verification-needed-raring |
Changed in linux (Ubuntu Quantal): | |
status: | In Progress → Fix Committed |
Changed in linux (Ubuntu Raring): | |
status: | In Progress → Fix Committed |
tags: | added: kernel-key |
tags: | added: quantal raring |
Changed in linux (Ubuntu): | |
importance: | Undecided → Medium |
Changed in linux (Ubuntu Quantal): | |
status: | Fix Released → In Progress |
tags: | removed: kernel-key |
Changed in linux (Ubuntu Raring): | |
status: | Fix Released → In Progress |
tags: | removed: verification-failed-quantal verification-failed-raring |
description: | updated |
tags: |
added: verification-done-quantal removed: verification-quantal-done |
Changed in linux (Ubuntu Quantal): | |
status: | In Progress → Fix Committed |
Changed in linux (Ubuntu Raring): | |
status: | In Progress → Fix Committed |
tags: |
added: verification-done-raring removed: verification-needed-raring |
description: | updated |
description: | updated |
description: | updated |
Changed in linux (Ubuntu Precise): | |
status: | In Progress → Fix Committed |
description: | updated |
A bisection reveals that the following patch solves the issue:
8093315a91340bca52549044975d8c7f673b28a1 veth: extend device features
However it relies on:
2681128f0ced8aa4e66f221197e183cc16d244fe veth: reduce stat overhead
For a clean cherry-pick.
And after building and testing the following bug needed to be fixed with:
d0e2c55e7c940a3ee91e9e23a2683b593690f1e9 veth: avoid a NULL deref in veth_stats_one
A test build is available here that solves the issue backported to 3.5/3.8: http://people.canonical.com/~arges/lp1201869/
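The dependency order described in this comment (prerequisite first, then the fix, then the follow-up NULL deref fix) could be applied along these lines. This is only a sketch: it assumes it is run inside a kernel git tree where the upstream commits are reachable, the DRY_RUN switch is hypothetical, and it covers only the three commits named in this comment, not the full list in the SRU justification. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: cherry-pick the three veth commits in dependency order.
# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 inside a kernel tree to apply.
DRY_RUN="${DRY_RUN:-1}"

# 1) veth: reduce stat overhead (prerequisite for a clean pick)
# 2) veth: extend device features (the fix found by bisection)
# 3) veth: avoid a NULL deref in veth_stats_one (follow-up fix)
PATCHES="2681128f0ced8aa 8093315a91340bc d0e2c55e7c940a3"

pick_patches() {
    for sha in $PATCHES; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "git cherry-pick -x $sha"
        else
            # Stop at the first conflict so it can be resolved by hand.
            git cherry-pick -x "$sha" || return 1
        fi
    done
}

pick_patches
```

The `-x` option records the original commit id in each backported commit message, which makes the backport easier to trace back to upstream.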