void boost::intrusive::list_impl<boost::intrusive::listopt<boost::intrusive::detail::member_hook_traits<Path, boost::intrusive::list_member_hook<boost::intrusive::none, boost::intrusive::none, boost::intrusive::none>, &Path::node_>, unsigned long, true> >::sort<bool (*)(Path const&, Path const&)>(bool (*)(Path const&, Path const&)) ()

Bug #1543901 reported by Daisuke Nakajima
Affects            Status                   Importance  Assigned to           Milestone
Juniper Openstack  Status tracked in Trunk
  R2.20            Fix Committed            High        Prabhjot Singh Sethi
  R2.21.x          Fix Committed            High        Prabhjot Singh Sethi
  R2.22.x          Fix Committed            High        Prabhjot Singh Sethi
  Trunk            Fix Committed            High        Prabhjot Singh Sethi

Bug Description

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/contrail-tor-agent --config_file /etc/contrail/contrail-tor-agent-1.co'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f464ca1c6f0 in ?? ()
(gdb) bt
#0 0x00007f464ca1c6f0 in ?? ()
#1 0x0000000000d09504 in void boost::intrusive::list_impl<boost::intrusive::listopt<boost::intrusive::detail::member_hook_traits<Path, boost::intrusive::list_member_hook<boost::intrusive::none, boost::intrusive::none, boost::intrusive::none>, &Path::node_>, unsigned long, true> >::sort<bool (*)(Path const&, Path const&)>(bool (*)(Path const&, Path const&)) ()
#2 0x0000000000d09368 in Route::Sort(bool (*)(Path const&, Path const&), Path const*) ()
#3 0x000000000084e865 in AgentRoute::RemovePath(AgentPath*) ()
#4 0x000000000084f328 in AgentRouteTable::DeletePathFromPeer(DBTablePartBase*, AgentRoute*, AgentPath*) ()
#5 0x00000000008502d8 in AgentRouteTable::Input(DBTablePartition*, DBClient*, DBRequest*) ()
#6 0x0000000000862021 in EvpnAgentRouteTable::Delete(Peer const*, std::string const&, MacAddress const&, boost::asio::ip::address const&, unsigned int) ()
#7 0x000000000094f358 in OvsPeer::DeleteOvsRoute(VrfEntry*, unsigned int, MacAddress const&, bool) ()
#8 0x00000000009a27cd in OVSDB::HaStaleL2RouteEntry::DeleteEvent() ()
#9 0x00000000009a35a8 in OVSDB::HaStaleL2RouteTable::ProcessExportEntry(OVSDB::HaStaleL2RouteEntry*) ()
#10 0x00000000009a640f in QueueTaskRunner<OVSDB::HaStaleL2RouteEntry*, WorkQueue<OVSDB::HaStaleL2RouteEntry*> >::Run() ()
#11 0x0000000000e15df0 in TaskImpl::execute() ()
#12 0x00007f46ad2d8b3a in ?? () from /usr/lib/libtbb.so.2
#13 0x00007f46ad2d4816 in ?? () from /usr/lib/libtbb.so.2
#14 0x00007f46ad2d3f4b in ?? () from /usr/lib/libtbb.so.2
#15 0x00007f46ad2d00ff in ?? () from /usr/lib/libtbb.so.2
#16 0x00007f46ad2d02f9 in ?? () from /usr/lib/libtbb.so.2
#17 0x00007f46ad4f4182 in start_thread (arg=0x7f46a518d700) at pthread_create.c:312
#18 0x00007f46ac7cd47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) quit
root@openc-14:/var/crashes# contrail-version
Package Version Build-ID | Repo | Package Name
-------------------------------------- ------------------------------ ----------------------------------
contrail-fabric-utils 2.21.1-22 22
contrail-install-packages 2.21.1-22~juno 22
contrail-lib 2.21.1-22 22
contrail-nodemgr 2.21.1-22 22
contrail-nova-vif 2.21.1-22 22
contrail-setup 2.21.1-22 22
contrail-utils 2.21.1-22 22
contrail-vrouter-3.13.0-40-generic 2.21.1-22 22
contrail-vrouter-agent 2.21.1-22 22
contrail-vrouter-common 2.21.1-22 22
contrail-vrouter-init 2.21.1-22 22
contrail-vrouter-utils 2.21.1-22 22
python-contrail 2.21.1-22 22
python-contrail-vrouter-api 2.21.1-22 22
python-neutronclient 1:2.3.8-0ubuntu1~cloud0.2contrail22
python-nova 1:2014.2.3-0ubuntu1~cloud0.3contrail22
python-opencontrail-vrouter-netns 2.21.1-22 22

Tags: vrouter
Changed in juniperopenstack:
assignee: nobody → Hari Prasad Killi (haripk)
milestone: none → r3.0-fcs
tags: added: vrouter
Prabhjot Singh Sethi (prabhjot) wrote :

The crash was caused by two threads modifying the same agent route at the same time: one in the context of "Agent::RouteWalker" and the other in the context of "Agent::KSync". Both were trying to manage their paths on the same route for the dynamic router peer.

Before the path was removed, the previous front of the list pointed to a BGP path, and that BGP path was removed before sort was called on the pending path list.

Currently, this parallel access happens only when the OVS and BGP connections are down at the same time.
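
To illustrate the hazard, here is a minimal sketch of a path list whose remove-then-sort step is serialized. It is not the agent code: DemoPath, DemoRoute and RunConcurrentRemovals are hypothetical names, std::list stands in for the boost::intrusive list, and a mutex is used in place of the scheduler-level task exclusion that the actual fix relies on.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a path entry; not the agent's Path/AgentPath.
struct DemoPath {
    int peer_id;
    std::uint64_t time_stamp_usecs;
};

// Hypothetical stand-in for a route that owns a sorted path list.
class DemoRoute {
public:
    void InsertPath(const DemoPath &path) {
        std::lock_guard<std::mutex> lock(mutex_);
        paths_.push_back(path);
        paths_.sort(Older);
    }

    // Erase and re-sort happen inside one critical section, so no other
    // thread can unlink a node while the sort is walking the list.
    bool RemovePath(int peer_id) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = std::find_if(paths_.begin(), paths_.end(),
                               [peer_id](const DemoPath &p) {
                                   return p.peer_id == peer_id;
                               });
        if (it == paths_.end()) {
            return false;
        }
        paths_.erase(it);
        paths_.sort(Older);
        return true;
    }

    std::size_t Count() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return paths_.size();
    }

private:
    static bool Older(const DemoPath &a, const DemoPath &b) {
        return a.time_stamp_usecs < b.time_stamp_usecs;
    }

    mutable std::mutex mutex_;
    std::list<DemoPath> paths_;
};

// Two workers remove disjoint sets of paths from the same route at the
// same time; returns the number of paths left when both have finished.
std::size_t RunConcurrentRemovals(int paths_per_worker) {
    assert(paths_per_worker >= 0);
    const int total = 2 * paths_per_worker;
    DemoRoute route;
    for (int i = 0; i < total; ++i) {
        route.InsertPath(DemoPath{i, static_cast<std::uint64_t>(total - i)});
    }
    std::thread even([&route, total] {
        for (int i = 0; i < total; i += 2) route.RemovePath(i);
    });
    std::thread odd([&route, total] {
        for (int i = 1; i < total; i += 2) route.RemovePath(i);
    });
    even.join();
    odd.join();
    return route.Count();
}
```

Without the lock, the two workers could unlink nodes while the other is sorting, which is the kind of list corruption the backtrace points to. The committed fix avoids the overlap at the task level rather than with a per-route lock.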

#4 0x000000000084e865 in AgentRoute::RemovePath (this=0x7f4644d07580, path=0x7f46245d2850) at controller/src/vnsw/agent/oper/agent_route.cc:559
559 controller/src/vnsw/agent/oper/agent_route.cc: No such file or directory.
(gdb) p *path
$1 = (AgentPath) {
  <Path> = {
    _vptr.Path = 0xe36390 <vtable for AgentPath+16>,
    node_ = {
      <boost::intrusive::detail::generic_hook<boost::intrusive::get_list_node_algo<void*>, boost::intrusive::member_tag, (boost::intrusive::link_mode_type)1, 0>> = {
        <boost::intrusive::detail::no_default_definer> = {<No data fields>},
        <boost::intrusive::list_node<void*>> = {
          next_ = 0x0,
          prev_ = 0x0
        }, <No data fields>}, <No data fields>},
    time_stamp_usecs_ = 1455067642637189
  },
  members of AgentPath:
  peer_ = 0x7f4644002e90,
  nh_ = {
    px = 0x7f463001fa90
  },
  label_ = 4294967295,
  vxlan_id_ = 6,
  dest_vn_name_ = {
    static npos = <optimized out>,
    _M_dataplus = {
      <std::allocator<char>> = {
        <__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>},
      members of std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Alloc_hider:
      _M_p = 0x7f4694108c78 "default-domain:commonmax:commonmax-vn31-001"
    }
  },
  sync_ = false,
  force_policy_ = false,
  sg_list_ = {
    <std::_Vector_base<int, std::allocator<int> >> = {
      _M_impl = {
        <std::allocator<int>> = {
          <__gnu_cxx::new_allocator<int>> = {<No data fields>}, <No data fields>},
        members of std::_Vector_base<int, std::allocator<int> >::_Vector_impl:
        _M_start = 0x0,
        _M_finish = 0x0,
        _M_end_of_storage = 0x0
      }
    }, <No data fields>},
  tunnel_dest_ = {
    addr_ = {
      s_addr = 688592812
    }
  },
  tunnel_bmap_ = 8,
  tunnel_type_ = TunnelType::VXLAN,
  vrf_name_ = {
    static npos = <optimized out>,
    _M_dataplus = {
      <std::allocator<char>> = {
        <__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>},
      members of std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Alloc_hider:
      _M_p = 0x132a898 <std::string::_Rep::_S_empty_rep_storage@@GLIBCXX_3.4+24> ""
    }
  },
  gw_ip_ = {
    addr_ = {
      s_addr = 0
    }
  },
  unresolved_ = false,
  is_stale_ = false,
  is_subnet_discard_ = false,
  dependant_rt_ = {
    node_ = {
      <boost::intrusive::detail::generic_hook<boost::intrusive::get_list_node_algo<void*>, boost::intrusive::member_tag, (boost::intrusive::link_mode_type)1, 0>> = {
        <boost::intrusive::detail::no_default_definer> = {<No data fields>},
        <boost::intrusive::list_node<void*>> = {
          next_ = 0x0,
          pr...

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/17233
Submitter: Prabhjot Singh Sethi (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/17233
Committed: http://github.org/Juniper/contrail-controller/commit/f9981edb0627b600b0a664173b77e9e09079556c
Submitter: Zuul
Branch: master

commit f9981edb0627b600b0a664173b77e9e09079556c
Author: Prabhjot Singh Sethi <email address hidden>
Date: Mon Feb 15 14:40:16 2016 +0530

Fix parallel access to route path list

Issue:
------
Route Path list is getting modified from two threads at
the same time, resulting in path list sanity issues and
causing crash

Fix:
----
Adding task exclusion between "Agent::KSync" and
"Agent::RouteWalker" to resolve this issue.
TODO: this exclusion can be removed once handling for
dynamic peer is complete

Closes-Bug: 1543901
Change-Id: Ia5380210adfe868bcdc0559c21c5e938b3d445d3
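
As a rough picture of what task exclusion means here, the sketch below keeps a symmetric table of task names that must not run together and checks it before a task starts. DemoExclusionPolicy and its methods are hypothetical and are not the agent's scheduler API, which this report does not show; only the two task names come from the commit message.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical policy table; not the real task scheduler interface.
class DemoExclusionPolicy {
public:
    // Record that tasks a and b must never run at the same time.
    void AddExclusion(const std::string &a, const std::string &b) {
        assert(!a.empty() && !b.empty());
        excludes_[a].insert(b);
        excludes_[b].insert(a);
    }

    // A task may start only if none of the tasks it excludes is running.
    bool CanStart(const std::string &task,
                  const std::set<std::string> &running) const {
        auto it = excludes_.find(task);
        if (it == excludes_.end()) {
            return true;
        }
        for (const std::string &other : it->second) {
            if (running.count(other) != 0) {
                return false;
            }
        }
        return true;
    }

private:
    std::map<std::string, std::set<std::string> > excludes_;
};

// Usage sketch: true when KSync is held back while RouteWalker is running.
bool DemoKSyncBlockedByRouteWalker() {
    DemoExclusionPolicy policy;
    policy.AddExclusion("Agent::KSync", "Agent::RouteWalker");
    std::set<std::string> running;
    running.insert("Agent::RouteWalker");
    return !policy.CanStart("Agent::KSync", running);
}
```

With such a rule in place the two contexts never touch the route path list at the same moment, at the cost of some lost parallelism, which is presumably why the commit marks the exclusion as temporary.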

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/17290
Submitter: Prabhjot Singh Sethi (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.21.x

Review in progress for https://review.opencontrail.org/17291
Submitter: Prabhjot Singh Sethi (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22.x

Review in progress for https://review.opencontrail.org/17292
Submitter: Prabhjot Singh Sethi (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/17292
Committed: http://github.org/Juniper/contrail-controller/commit/5e7d003089daac3501c9bdbc820de03b08645cef
Submitter: Zuul
Branch: R2.22.x

commit 5e7d003089daac3501c9bdbc820de03b08645cef
Author: Prabhjot Singh Sethi <email address hidden>
Date: Mon Feb 15 14:40:16 2016 +0530

Fix parallel access to route path list

Issue:
------
Route Path list is getting modified from two threads at
the same time, resulting in path list sanity issues and
causing crash

Fix:
----
Adding task exclusion between "Agent::KSync" and
"Agent::RouteWalker" to resolve this issue.
TODO: this exclusion can be removed once handling for
dynamic peer is complete

Conflicts:
 src/vnsw/agent/cmn/agent.cc

Closes-Bug: 1543901
Change-Id: Ia5380210adfe868bcdc0559c21c5e938b3d445d3
(cherry picked from commit f9981edb0627b600b0a664173b77e9e09079556c)

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/17290
Committed: http://github.org/Juniper/contrail-controller/commit/a9480663472b317138ebf1374af3f0352150723d
Submitter: Zuul
Branch: R2.20

commit a9480663472b317138ebf1374af3f0352150723d
Author: Prabhjot Singh Sethi <email address hidden>
Date: Mon Feb 15 14:40:16 2016 +0530

Fix parallel access to route path list

Issue:
------
Route Path list is getting modified from two threads at
the same time, resulting in path list sanity issues and
causing crash

Fix:
----
Adding task exclusion between "Agent::KSync" and
"Agent::RouteWalker" to resolve this issue.
TODO: this exclusion can be removed once handling for
dynamic peer is complete

Conflicts:
 src/vnsw/agent/cmn/agent.cc

Closes-Bug: 1543901
Change-Id: Ia5380210adfe868bcdc0559c21c5e938b3d445d3
(cherry picked from commit f9981edb0627b600b0a664173b77e9e09079556c)

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/17291
Committed: http://github.org/Juniper/contrail-controller/commit/34ad87ec8413d5ed9ae32bd4f23f71510011f8ff
Submitter: Zuul
Branch: R2.21.x

commit 34ad87ec8413d5ed9ae32bd4f23f71510011f8ff
Author: Prabhjot Singh Sethi <email address hidden>
Date: Mon Feb 15 14:40:16 2016 +0530

Fix parallel access to route path list

Issue:
------
Route Path list is getting modified from two threads at
the same time, resulting in path list sanity issues and
causing crash

Fix:
----
Adding task exclusion between "Agent::KSync" and
"Agent::RouteWalker" to resolve this issue.
TODO: this exclusion can be removed once handling for
dynamic peer is complete

Conflicts:
 src/vnsw/agent/cmn/agent.cc

Closes-Bug: 1543901
Change-Id: Ia5380210adfe868bcdc0559c21c5e938b3d445d3
(cherry picked from commit f9981edb0627b600b0a664173b77e9e09079556c)

information type: Proprietary → Public