I hit this core only once on my setup and could not reproduce it again with the same test.
The core file, binary, and symbols are kept at nodem4:/cs-/shared/bugs
Orchestrator: Kubernetes
HOSTOS: centos7.5
SKU: queens
build: 5.0-291
deployer: contrail-ansible-deployer
========================================
Topology
========================================
vrouter + k8s_node:
ip: nodec60
ip: nodec61
config + control + kubemanager:
ip: nodeg12(k8s_master)
ip: nodeg31
ip: nodec58
========================================
backtrace
===============================================================================================
[root@nodec61 crashes]# gdb contrail-vrouter-agent core.contrail-vroute.5010.nodec61.1539766364
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-110.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /var/crashes/contrail-vrouter-agent...Reading symbols from /var/crashes/contrail-vrouter-agent.debug...done.
done.
warning: core file may not match specified executable file.
[New LWP 5065]
[New LWP 5061]
[New LWP 5068]
[New LWP 5066]
[New LWP 5010]
[New LWP 5064]
[New LWP 5063]
[New LWP 5062]
[New LWP 5067]
warning: Could not load shared library symbols for 14 libraries, e.g. /lib64/libtcmalloc.so.4.
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/bin/contrail-vrouter-agent'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007fb39ee46350 in std::_Rb_tree_insert_and_rebalance(bool, std::_Rb_tree_node_base*, std::_Rb_tree_node_base*, std::_Rb_tree_node_base&) () from /lib64/libstdc++.so.6
Missing separate debuginfos, use: debuginfo-install cyrus-sasl-lib-2.1.26-23.el7.x86_64 glibc-2.17-222.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-19.el7.x86_64 libcom_err-1.42.9-12.el7_5.x86_64 libcurl-7.29.0-46.el7.x86_64 libgcc-4.8.5-28.el7_5.1.x86_64 libidn-1.28-4.el7.x86_64 libselinux-2.5-12.el7.x86_64 libssh2-1.4.3-10.el7_2.1.x86_64 libstdc++-4.8.5-28.el7_5.1.x86_64 libxml2-2.9.1-6.el7_2.3.x86_64 nspr-4.19.0-1.el7_5.x86_64 nss-3.36.0-7.el7_5.x86_64 nss-softokn-freebl-3.36.0-5.el7_5.x86_64 nss-util-3.36.0-1.el7_5.x86_64 openldap-2.4.44-15.el7_5.x86_64 openssl-libs-1.0.2k-12.el7.x86_64 pcre-8.32-17.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0 0x00007fb39ee46350 in std::_Rb_tree_insert_and_rebalance(bool, std::_Rb_tree_node_base*, std::_Rb_tree_node_base*, std::_Rb_tree_node_base&) () from /lib64/libstdc++.so.6
#1 0x0000000000ec685d in _M_insert_ (__v=..., __p=0xa2ccf00, __x=0x0, this=0xa2a3808) at /usr/include/c++/4.8.2/bits/stl_tree.h:1025
#2 std::_Rb_tree<boost::intrusive_ptr<DBTableWalk>, boost::intrusive_ptr<DBTableWalk>, std::_Identity<boost::intrusive_ptr<DBTableWalk> >, std::less<boost::intrusive_ptr<DBTableWalk> >, std::allocator<boost::intrusive_ptr<DBTableWalk> > >::_M_insert_unique (this=0xa2a3808, __v=...) at /usr/include/c++/4.8.2/bits/stl_tree.h:1382
#3 0x0000000000ec619b in insert (__x=..., this=<optimized out>) at /usr/include/c++/4.8.2/bits/stl_set.h:463
#4 AppendWalkReq (ref=..., this=<optimized out>) at controller/src/db/db_table_walk_mgr.h:133
#5 DBTableWalkMgr::WalkTable (this=0x2c9e2c0, walk=...) at controller/src/db/db_table_walk_mgr.cc:104
#6 0x0000000000ec6564 in DBTableWalkMgr::WalkAgain (this=<optimized out>, ref=...) at controller/src/db/db_table_walk_mgr.cc:86
#7 0x0000000000ec00ae in DBTable::WalkAgain (this=this@entry=0x3466d80, walk=...) at controller/src/db/db_table.cc:625
#8 0x0000000000be4132 in AgentSandesh::DoSandeshInternal (this=0xa2b2390, sandesh=..., first=<optimized out>, first@entry=0, last=<optimized out>, last@entry=99)
at controller/src/vnsw/agent/oper/agent_sandesh.cc:963
#9 0x0000000000be4462 in AgentSandesh::DoSandesh (sandesh=..., first=first@entry=0, last=last@entry=99) at controller/src/vnsw/agent/oper/agent_sandesh.cc:967
#10 0x0000000000be44da in AgentSandesh::DoSandesh (sandesh=...) at controller/src/vnsw/agent/oper/agent_sandesh.cc:971
#11 0x0000000000cf19fd in VmListReq::HandleRequest (this=<optimized out>) at controller/src/vnsw/agent/oper/vm.cc:210
#12 0x0000000000dab46d in Sandesh::ProcessRecv (rsnh=0xa20fc00) at src/contrail-common/sandesh/library/cpp/sandesh.cc:566
#13 0x0000000000dbf8c4 in operator() (a0=0xa20fc00, this=0x7fb397a05af0) at /usr/include/boost/function/function_template.hpp:767
#14 RunQueue (this=0x89abb60) at src/contrail-common/base/queue_task.h:67
#15 QueueTaskRunner<SandeshRequest*, WorkQueue<SandeshRequest*> >::Run (this=0x89abb60) at src/contrail-common/base/queue_task.h:42
#16 0x0000000000e9ad5f in TaskImpl::execute (this=0x89912c0) at src/contrail-common/base/task.cc:281
#17 0x00007fb39f31a66a in ?? ()
#18 0x0000000000000001 in ?? ()
#19 0x0000000000000000 in ?? ()
(gdb) q
===============================================================================================
[root@nodec61 crashes]# contrail-status
Pod Service Original Name State Status
vrouter agent contrail-vrouter-agent running Up About an hour
vrouter nodemgr contrail-nodemgr running Up 3 hours
vrouter kernel module is PRESENT
== Contrail vrouter ==
nodemgr: active
agent: active
[root@nodec61 crashes]#
===============================================================================================
[root@nodeg12 ~]#
[root@nodeg12 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
nodec60 Ready <none> 4h v1.9.2
nodec61 Ready <none> 4h v1.9.2
nodeg12 NotReady master 4h v1.9.2
[root@nodeg12 ~]# cntrail
-bash: cntrail: command not found
[root@nodeg12 ~]# contrail-status
Pod Service Original Name State Status
redis contrail-external-redis running Up 4 hours
analytics alarm-gen contrail-analytics-alarm-gen running Up 4 hours
analytics api contrail-analytics-api running Up 4 hours
analytics collector contrail-analytics-collector running Up 4 hours
analytics nodemgr contrail-nodemgr running Up 4 hours
analytics query-engine contrail-analytics-query-engine running Up 4 hours
analytics snmp-collector contrail-analytics-snmp-collector running Up 4 hours
analytics topology contrail-analytics-topology running Up 4 hours
config api contrail-controller-config-api running Up 3 hours
config device-manager contrail-controller-config-devicemgr running Up 4 hours
config nodemgr contrail-nodemgr running Up 4 hours
config schema contrail-controller-config-schema running Up 4 hours
config svc-monitor contrail-controller-config-svcmonitor running Up 4 hours
config-database cassandra contrail-external-cassandra running Up 4 hours
config-database nodemgr contrail-nodemgr running Up 4 hours
config-database rabbitmq contrail-external-rabbitmq running Up 4 hours
config-database zookeeper contrail-external-zookeeper running Up 4 hours
control control contrail-controller-control-control running Up 4 hours
control dns contrail-controller-control-dns running Up 4 hours
control named contrail-controller-control-named running Up 4 hours
control nodemgr contrail-nodemgr running Up 4 hours
database cassandra contrail-external-cassandra running Up 4 hours
database kafka contrail-external-kafka running Up 4 hours
database nodemgr contrail-nodemgr running Up 4 hours
database zookeeper contrail-external-zookeeper running Up 4 hours
kubernetes kube-manager contrail-kubernetes-kube-manager running Up 29 minutes
webui job contrail-controller-webui-job running Up 4 hours
webui web contrail-controller-webui-web running Up 4 hours
WARNING: container with original name 'contrail-external-redis' have Pod or Service empty. Pod: '' / Service: 'redis'. Please pass NODE_TYPE with pod name to container's env
== Contrail control ==
control: active
nodemgr: active
named: active
dns: active
== Contrail config-database ==
nodemgr: initializing (Disk for DB is too low. )
zookeeper: active
rabbitmq: active
cassandra: active
== Contrail kubernetes ==
kube-manager: backup
== Contrail database ==
kafka: active
nodemgr: initializing (Disk for DB is too low. )
zookeeper: active
cassandra: active
== Contrail analytics ==
snmp-collector: active
query-engine: active
api: active
alarm-gen: active
nodemgr: active
collector: active
topology: active
== Contrail webui ==
web: active
job: active
== Contrail config ==
svc-monitor: backup
nodemgr: active
device-manager: backup
api: active
schema: backup