k8s: observing agent core while running k8s sanity
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Juniper Openstack | Status tracked in Trunk | | |
R5.0 | New | High | Sachchidanand Vaidya |
Trunk | New | High | Sachchidanand Vaidya |
Bug Description
I hit this core only once on my setup and could not reproduce it with the same test.
The test creates two namespaces, one of them custom isolated, spawns a few pods in each namespace, and performs reachability checks within each namespace and across namespaces.
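The reachability step of the test can be sketched as a list of pairwise checks. This is only a minimal sketch under stated assumptions, not the actual sanity test: the pod names and IPs are hypothetical placeholders, and it assumes `ping` is available in the pod image. It builds the `kubectl exec` command lines and tags each pair as intra- or inter-namespace; the expected pass or fail result per pair depends on the isolation settings and is not encoded here.

```python
from itertools import permutations


def build_reachability_checks(pods):
    """Return (scope, argv) tuples for every ordered pair of pods.

    pods: list of dicts with 'namespace', 'name' and 'ip' keys.
    scope is 'intra' when both pods share a namespace, else 'inter'.
    """
    checks = []
    for src, dst in permutations(pods, 2):
        scope = "intra" if src["namespace"] == dst["namespace"] else "inter"
        # One ICMP echo from the source pod to the destination pod IP.
        argv = [
            "kubectl", "exec", "-n", src["namespace"], src["name"],
            "--", "ping", "-c", "1", dst["ip"],
        ]
        checks.append((scope, argv))
    return checks


# Hypothetical pods; only the namespace names come from the report.
example_pods = [
    {"namespace": "ctest-ns1-62653991", "name": "pod-a", "ip": "10.0.0.1"},
    {"namespace": "ctest-ns1-62653991", "name": "pod-b", "ip": "10.0.0.2"},
    {"namespace": "ctest-ns2-83384090", "name": "pod-c", "ip": "10.0.1.1"},
]
example_checks = build_reachability_checks(example_pods)
```

Each `argv` can be handed to `subprocess.run` on a node that has `kubectl` configured; the sketch itself does not contact a cluster.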
The core file, binary, and symbols are kept on nodem4:
[root@nodem4 1798371]# ls -ltrh
total 511M
-rwxr-xr-x 1 fedora fedora 25M Oct 17 2018 contrail-
-r--r--r-- 1 fedora fedora 324M Oct 17 2018 contrail-
-rw------- 1 fedora fedora 163M Oct 17 2018 core.contrail-
[root@nodem4 1798371]# pwd
/cs-shared/
[root@nodem4 1798371]#
Orchestrator :Kubernetes
HOSTOS :centos7.5
SKU :queens
build :5.0-291
deployer :contrail-
=======
Topology
=======
vrouter +k8s_node:
ip: nodec60
ip: nodec61
config +control+
ip: nodeg12(k8s_master)
ip: nodeg31
ip: nodec58
=======
backtrace
=======
[root@nodec61 crashes]# gdb contrail-
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-110.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-
For bug reporting instructions, please see:
<http://
Reading symbols from /var/crashes/
done.
warning: core file may not match specified executable file.
[New LWP 5065]
[New LWP 5061]
[New LWP 5068]
[New LWP 5066]
[New LWP 5010]
[New LWP 5064]
[New LWP 5063]
[New LWP 5062]
[New LWP 5067]
warning: Could not load shared library symbols for 14 libraries, e.g. /lib64/
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/
Core was generated by `/usr/bin/
Program terminated with signal 11, Segmentation fault.
#0 0x00007fb39ee46350 in std::_Rb_
Missing separate debuginfos, use: debuginfo-install cyrus-sasl-
(gdb) bt
#0 0x00007fb39ee46350 in std::_Rb_
#1 0x0000000000ec685d in _M_insert_ (__v=..., __p=0xa2ccf00, __x=0x0, this=0xa2a3808) at /usr/include/
#2 std::_Rb_
#3 0x0000000000ec619b in insert (__x=..., this=<optimized out>) at /usr/include/
#4 AppendWalkReq (ref=..., this=<optimized out>) at controller/
#5 DBTableWalkMgr:
#6 0x0000000000ec6564 in DBTableWalkMgr:
#7 0x0000000000ec00ae in DBTable::WalkAgain (this=this@
#8 0x0000000000be4132 in AgentSandesh:
at controller/
#9 0x0000000000be4462 in AgentSandesh:
#10 0x0000000000be44da in AgentSandesh:
#11 0x0000000000cf19fd in VmListReq:
#12 0x0000000000dab46d in Sandesh:
#13 0x0000000000dbf8c4 in operator() (a0=0xa20fc00, this=0x7fb397a0
#14 RunQueue (this=0x89abb60) at src/contrail-
#15 QueueTaskRunner
#16 0x0000000000e9ad5f in TaskImpl::execute (this=0x89912c0) at src/contrail-
#17 0x00007fb39f31a66a in ?? ()
#18 0x0000000000000001 in ?? ()
#19 0x0000000000000000 in ?? ()
(gdb) q
=======
[root@nodec61 crashes]# contrail-status
Pod Service Original Name State Status
vrouter agent contrail-
vrouter nodemgr contrail-nodemgr running Up 3 hours
vrouter kernel module is PRESENT
== Contrail vrouter ==
nodemgr: active
agent: active
[root@nodec61 crashes]#
=======
[root@nodeg12 ~]#
[root@nodeg12 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
nodec60 Ready <none> 4h v1.9.2
nodec61 Ready <none> 4h v1.9.2
nodeg12 NotReady master 4h v1.9.2
[root@nodeg12 ~]# cntrail
-bash: cntrail: command not found
[root@nodeg12 ~]# contrail-status
Pod Service Original Name State Status
analytics alarm-gen contrail-
analytics api contrail-
analytics collector contrail-
analytics nodemgr contrail-nodemgr running Up 4 hours
analytics query-engine contrail-
analytics snmp-collector contrail-
analytics topology contrail-
config api contrail-
config device-manager contrail-
config nodemgr contrail-nodemgr running Up 4 hours
config schema contrail-
config svc-monitor contrail-
config-database cassandra contrail-
config-database nodemgr contrail-nodemgr running Up 4 hours
config-database rabbitmq contrail-
config-database zookeeper contrail-
control control contrail-
control dns contrail-
control named contrail-
control nodemgr contrail-nodemgr running Up 4 hours
database cassandra contrail-
database kafka contrail-
database nodemgr contrail-nodemgr running Up 4 hours
database zookeeper contrail-
kubernetes kube-manager contrail-
webui job contrail-
webui web contrail-
WARNING: container with original name 'contrail-
== Contrail control ==
control: active
nodemgr: active
named: active
dns: active
== Contrail config-database ==
nodemgr: initializing (Disk for DB is too low. )
zookeeper: active
rabbitmq: active
cassandra: active
== Contrail kubernetes ==
kube-manager: backup
== Contrail database ==
kafka: active
nodemgr: initializing (Disk for DB is too low. )
zookeeper: active
cassandra: active
== Contrail analytics ==
snmp-collector: active
query-engine: active
api: active
alarm-gen: active
nodemgr: active
collector: active
topology: active
== Contrail webui ==
web: active
job: active
== Contrail config ==
svc-monitor: backup
nodemgr: active
device-manager: backup
api: active
schema: backup
[root@nodeg12 ~]#
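The per-role summary at the end of the `contrail-status` output above is easy to scan by script. The following is a rough sketch, assuming the summary keeps the `== Contrail <role> ==` header and `<service>: <state> (<detail>)` line shapes seen in this report; the function name is made up for illustration. It lists every service whose state is not `active`. Note that `backup` may be a normal state on a multi-node setup, so the result is a list to review rather than a list of failures.

```python
import re

SECTION_RE = re.compile(r"^== Contrail (?P<role>.+?) ==$")
SERVICE_RE = re.compile(
    r"^(?P<name>[\w-]+): (?P<state>\w+)(?: \((?P<detail>.*)\))?$"
)


def non_active_services(status_text):
    """Return (role, service, state, detail) for services not 'active'."""
    role = None
    findings = []
    for raw in status_text.splitlines():
        line = raw.strip()
        section = SECTION_RE.match(line)
        if section:
            role = section.group("role")
            continue
        service = SERVICE_RE.match(line)
        # Ignore anything before the first section header (the pod table).
        if service and role is not None and service.group("state") != "active":
            detail = (service.group("detail") or "").strip()
            findings.append(
                (role, service.group("name"), service.group("state"), detail)
            )
    return findings


# Lines copied from the summary in this report.
sample = """== Contrail config-database ==
nodemgr: initializing (Disk for DB is too low. )
zookeeper: active
== Contrail kubernetes ==
kube-manager: backup
"""
sample_findings = non_active_services(sample)
```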
=======
[root@nodeg12 ~]# kubectl get pods -n ctest-ns1-62653991
NAME READY STATUS RESTARTS AGE
ctest-busybox-
ctest-busybox-
ctest-busybox-
ctest-nginx-
ctest-nginx-
[root@nodeg12 ~]#
[root@nodeg12 ~]# kubectl get pods -n ctest-ns2-83384090
NAME READY STATUS RESTARTS AGE
ctest-busybox-
ctest-busybox-
ctest-nginx-
ctest-nginx-
[root@nodeg12 ~]# kubectl describe ns ctest-ns2-83384090
Name: ctest-ns2-83384090
Labels: <none>
Annotations: opencontrail.
Status: Active
No resource quota.
No resource limits.
[root@nodeg12 ~]# kubectl describe ns ctest-ns1-62653991
Name: ctest-ns1-62653991
Labels: <none>
Annotations: <none>
Status: Active
No resource quota.
Changed in juniperopenstack:
milestone: none → r5.0.3