Flow setup sometimes takes upto 50ms even on a setup with less flows

Bug #1572458 reported by Vedamurthy Joshi
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Juniper Openstack
Status tracked in Trunk
R3.0
Fix Committed
High
Praveen
Trunk
Fix Committed
High
Praveen

Bug Description

R3.0.2.0 Build 26 with vrouter-agent fixes upto April-19

On this setup with 9 compute nodes, there are ~100 VMs across these 9 nodes.
A compute node nodel7 for instance has about 50 active flows.

A VM 10.1.1.4 on nodel7 initiates tcp hping3 traffic to 10.1.1.10 which is also on the same compute node.

hping3 -p 22 -S 10.1.1.10 -i u1000

Praveen observed that some flow ksync tasks could be causing this delay

------------------
Sample :
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77054 win=14600 rtt=34.3 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77055 win=14600 rtt=51.5 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77056 win=14600 rtt=50.4 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77057 win=14600 rtt=49.4 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77058 win=14600 rtt=48.4 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77059 win=14600 rtt=47.4 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77060 win=14600 rtt=46.5 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77061 win=14600 rtt=45.5 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77062 win=14600 rtt=44.5 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77063 win=14600 rtt=43.5 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77064 win=14600 rtt=42.5 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77065 win=14600 rtt=41.5 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77066 win=14600 rtt=40.5 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77067 win=14600 rtt=39.6 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77068 win=14600 rtt=38.5 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77069 win=14600 rtt=37.6 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77070 win=14600 rtt=36.6 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77071 win=14600 rtt=35.6 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77072 win=14600 rtt=34.6 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77073 win=14600 rtt=33.6 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77074 win=14600 rtt=32.6 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77075 win=14600 rtt=31.6 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77076 win=14600 rtt=37.8 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77077 win=14600 rtt=36.7 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77078 win=14600 rtt=35.8 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77079 win=14600 rtt=34.8 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77080 win=14600 rtt=33.7 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77081 win=14600 rtt=32.7 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77082 win=14600 rtt=31.8 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77083 win=14600 rtt=30.8 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77084 win=14600 rtt=29.8 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77085 win=14600 rtt=28.9 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=77086 win=14600 rtt=27.9 ms

------------------
Normal flow setup rate is usually <10ms :

len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=575 win=14600 rtt=1.0 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=576 win=14600 rtt=1.1 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=577 win=14600 rtt=1.1 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=578 win=14600 rtt=0.8 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=579 win=14600 rtt=1.2 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=580 win=14600 rtt=1.2 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=581 win=14600 rtt=1.1 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=582 win=14600 rtt=1.2 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=583 win=14600 rtt=1.2 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=584 win=14600 rtt=1.2 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=585 win=14600 rtt=1.2 ms
len=44 ip=10.1.1.10 ttl=64 DF id=0 sport=22 flags=SA seq=586 win=14600 rtt=1.2 ms

---------------
Tasktrace log :

2016-04-20 10:57:54.113 TaskTrace: 6 5 Run time(in usec) 23867 Proto work queue. Module 5 controller/src/base/task.cc 258
2016-04-20 10:58:01.780 TaskTrace: 2 3 Run time(in usec) 23225 Flow Delete Queue-3 controller/src/base/task.cc 258
2016-04-20 10:58:01.780 TaskTrace: 27 2 Run time(in usec) 23390 Flow Log Queue controller/src/base/task.cc 258
2016-04-20 10:58:02.331 TaskTrace: 6 5 Run time(in usec) 23828 Proto work queue. Module 5 controller/src/base/task.cc 258
2016-04-20 10:58:02.332 TaskTrace: 27 2 Run time(in usec) 26637 Flow Log Queue controller/src/base/task.cc 258
2016-04-20 10:58:02.332 TaskTrace: 2 3 Run time(in usec) 22125 Flow KSync Queue-3 controller/src/base/task.cc 258
2016-04-20 10:58:06.448 TaskTrace: 6 5 Run time(in usec) 27000 Proto work queue. Module 5 controller/src/base/task.cc 258
2016-04-20 10:58:10.560 TaskTrace: 6 5 Run time(in usec) 22942 Proto work queue. Module 5 controller/src/base/task.cc 258
2016-04-20 10:58:13.047 TaskTrace: 7 1 Run time(in usec) 21423 Agent Stats collector controller/src/base/task.cc 258
2016-04-20 10:58:13.066 TaskTrace: 2 3 Run time(in usec) 20541 Flow Delete Queue-3 controller/src/base/task.cc 258
2016-04-20 10:58:13.067 TaskTrace: 10 0 Run time(in usec) 20780 Function 1 controller/src/base/task.cc 258
2016-04-20 10:58:16.150 TaskTrace: 2 3 Run time(in usec) 33878 Flow Delete Queue-3 controller/src/base/task.cc 258
2016-04-20 10:58:16.150 TaskTrace: 27 2 Run time(in usec) 29346 Flow Log Queue controller/src/base/task.cc 258
2016-04-20 10:58:21.233 TaskTrace: 2 3 Run time(in usec) 133029 Flow KSync Queue-3 controller/src/base/task.cc 258
2016-04-20 10:58:21.233 TaskTrace: 2 3 Schedule delay(in usec) 133063 Flow Delete Queue-3 controller/src/base/task.cc 1316
2016-04-20 10:58:21.233 TaskTrace: 2 3 TBB schedule time(in usec) 133087 Flow Delete Queue-3 controller/src/base/task.cc 249
2016-04-20 10:58:22.883 TaskTrace: 6 5 Run time(in usec) 20295 Proto work queue. Module 5 controller/src/base/task.cc 258
2016-04-20 10:58:35.199 TaskTrace: 6 5 Run time(in usec) 22592 Proto work queue. Module 5 controller/src/base/task.cc 258
2016-04-20 11:47:06.061 TaskTrace: 6 5 Run time(in usec) 100326 Proto work queue. Module 5 controller/src/base/task.cc 258
2016-04-20 11:55:08.878 TaskTrace: 2 0 Run time(in usec) 41437 Flow Event Queue-0 controller/src/base/task.cc 258
2016-04-20 11:55:08.878 TaskTrace: 2 3 Run time(in usec) 41653 Flow Event Queue-3 controller/src/base/task.cc 258
2016-04-20 11:55:08.878 TaskTrace: 2 1 Run time(in usec) 41778 Flow Event Queue-1 controller/src/base/task.cc 258
2016-04-20 11:55:08.878 TaskTrace: 11 0 Run time(in usec) 41413 Function 1 controller/src/base/task.cc 258
2016-04-20 11:55:41.105 TaskTrace: 2 3 Run time(in usec) 43590 Flow Event Queue-3 controller/src/base/task.cc 258
2016-04-20 11:55:41.106 TaskTrace: 2 1 Run time(in usec) 43765 Flow Event Queue-1 controller/src/base/task.cc 258
2016-04-20 11:55:41.106 TaskTrace: 2 0 Run time(in usec) 43832 Flow Event Queue-0 controller/src/base/task.cc 258
2016-04-20 11:55:41.106 TaskTrace: 11 0 Run time(in usec) 43913 Function 1 controller/src/base/task.cc 258
2016-04-20 11:57:49.006 TaskTrace: 10 0 Run time(in usec) 24894 Virtual-Machine UVE controller/src/base/task.cc 258
2016-04-20 12:55:31.033 TaskTrace: 6 5 Run time(in usec) 37324 Proto work queue. Module 5 controller/src/base/task.cc 258
2016-04-20 12:55:31.782 TaskTrace: 2 1 Run time(in usec) 24426 Flow KSync Queue-1 controller/src/base/task.cc 258
2016-04-20 13:42:49.087 TaskTrace: 2 1 Run time(in usec) 20218 Flow Event Queue-1 controller/src/base/task.cc 258
2016-04-20 13:42:49.090 TaskTrace: 11 0 Run time(in usec) 21785 Function 1 controller/src/base/task.cc 258
2016-04-20 13:42:49.090 TaskTrace: 27 2 Run time(in usec) 22409 Flow Log Queue controller/src/base/task.cc 258

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/19595
Submitter: Praveen K V (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/19595
Committed: http://github.org/Juniper/contrail-controller/commit/77cd042305ed4019ccf6c0f0cef60589b95e9258
Submitter: Zuul
Branch: R3.0

commit 77cd042305ed4019ccf6c0f0cef60589b95e9258
Author: Praveen K V <email address hidden>
Date: Mon Apr 25 10:10:47 2016 +0530

Move unrelated tasks from db::DBTable context

ConfigManager processing and vnsw-if listener dont need to run in
db::DBTable. Fix the dependencies

Change-Id: Ie727d3365e48cc59422bc2cbde45f44646e20548
Partial-Bug: #1572458

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/19783
Submitter: Praveen K V (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/19783
Committed: http://github.org/Juniper/contrail-controller/commit/47927a880f917933023fd562d0638304309f7c1a
Submitter: Zuul
Branch: master

commit 47927a880f917933023fd562d0638304309f7c1a
Author: Praveen K V <email address hidden>
Date: Mon Apr 25 10:10:47 2016 +0530

Move unrelated tasks from db::DBTable context

ConfigManager processing and vnsw-if listener dont need to run in
db::DBTable. Fix the dependencies

Conflicts:
 src/vnsw/agent/cmn/agent.cc

Change-Id: Ie727d3365e48cc59422bc2cbde45f44646e20548
Partial-Bug: #1572458

Jeba Paulaiyan (jebap)
tags: added: performance
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.