Tempest crashes on 'ContainerCreating' status during pod creation

Bug #1940955 reported by Sunday Mgbogu
This bug affects 1 person
Affects: kuryr-kubernetes
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

While running Tempest on an Ubuntu VM, I noticed that pod creation gets stuck in the 'ContainerCreating' state. To recover, I had to delete the stuck pods, restart the controller, and manually recreate the pods and services. If the Kuryr CNI hangs in 'ContainerCreating' while Tempest checks connectivity, Tempest crashes.
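
The cleanup I used was essentially "find every pod stuck in ContainerCreating and delete it so kubelet retries the CNI setup". A minimal sketch of that step, assuming the official kubernetes Python client and a reachable kubeconfig (the stuck-pod check below is my own heuristic, not something taken from kuryr or Tempest):

from kubernetes import client, config

def delete_stuck_pods():
    # Assumes a local kubeconfig; inside a pod, use config.load_incluster_config().
    config.load_kube_config()
    v1 = client.CoreV1Api()

    for pod in v1.list_pod_for_all_namespaces(watch=False).items:
        # Pods stuck on CNI/sandbox setup stay in phase "Pending" with their
        # containers waiting on reason "ContainerCreating".
        waiting = [
            cs.state.waiting
            for cs in (pod.status.container_statuses or [])
            if cs.state and cs.state.waiting
        ]
        if pod.status.phase == "Pending" and any(
            w.reason == "ContainerCreating" for w in waiting
        ):
            print("deleting %s/%s" % (pod.metadata.namespace, pod.metadata.name))
            v1.delete_namespaced_pod(pod.metadata.name, pod.metadata.namespace)

if __name__ == "__main__":
    delete_stuck_pods()

This only covers the pod deletion; the controller restart and the re-creation of the pods and services were still done by hand.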

Below is the kuryr-cni daemon crash log:

Aug 24 00:00:28 localhost systemd[1]: logrotate.service: Succeeded.
Aug 24 00:00:28 localhost systemd[1]: Finished Rotate log files.
Aug 24 00:00:28 localhost dockerd[135885]: time="2021-08-24T00:00:28.328779284Z" level=debug msg="Calling GET /v1.40/containers/json?all=1&filters=%7B%22label%22%3A%7B%22io.kubernetes.docker.type%3Dpodsandbox%22%3Atrue%7D%7D&limit=0"
Aug 24 00:00:28 localhost dockerd[135885]: time="2021-08-24T00:00:28.330992323Z" level=debug msg="Calling GET /v1.40/containers/json?all=1&filters=%7B%22label%22%3A%7B%22io.kubernetes.docker.type%3Dcontainer%22%3Atrue%7D%7D&limit=0"
Aug 24 00:00:28 localhost etcd[117986]: start time = 2021-08-24 00:00:28.917600772 +0000 UTC m=+397649.044032333, time spent = 102.8µs, remote = 10.196.0.57:41988, response type = /etcdserverpb.KV/Range, request count = 0, request size = 18, response count = 0, response size = 29, request content = key:"/registry/health"
Aug 24 00:00:29 localhost dockerd[135885]: time="2021-08-24T00:00:29.332865434Z" level=debug msg="Calling GET /v1.40/containers/json?all=1&filters=%7B%22label%22%3A%7B%22io.kubernetes.docker.type%3Dpodsandbox%22%3Atrue%7D%7D&limit=0"
Aug 24 00:00:29 localhost dockerd[135885]: time="2021-08-24T00:00:29.334545884Z" level=debug msg="Calling GET /v1.40/containers/json?all=1&filters=%7B%22label%22%3A%7B%22io.kubernetes.docker.type%3Dcontainer%22%3Atrue%7D%7D&limit=0"
Aug 24 00:00:29 localhost kuryr-daemon[1367910]: 2021-08-24 00:00:29.460 1367910 DEBUG ovsdbapp.backend.ovs_idl.vlog [-] 4999-ms timeout __log_wakeup /usr/local/lib/python3.8/dist-packages/ovs/poller.py:248#033[00m
Aug 24 00:00:29 localhost kuryr-daemon[1367910]: 2021-08-24 00:00:29.461 1367910 DEBUG ovsdbapp.backend.ovs_idl.vlog [-] tcp:127.0.0.1:6640: idle 5001 ms, sending inactivity probe run /usr/local/lib/python3.8/dist-packages/ovs/reconnect.py:103#033[00m
Aug 24 00:00:29 localhost kuryr-daemon[1367910]: 2021-08-24 00:00:29.461 1367910 DEBUG ovsdbapp.backend.ovs_idl.vlog [-] tcp:127.0.0.1:6640: entering IDLE _transition /usr/local/lib/python3.8/dist-packages/ovs/reconnect.py:488#033[00m
Aug 24 00:00:29 localhost kuryr-daemon[1367910]: 2021-08-24 00:00:29.461 1367910 DEBUG ovsdbapp.backend.ovs_idl.vlog [-] tcp:127.0.0.1:6640: entering ACTIVE _transition /usr/local/lib/python3.8/dist-packages/ovs/reconnect.py:488#033[00m
Aug 24 00:00:29 localhost kuryr-daemon[1367910]: 2021-08-24 00:00:29.462 1367910 DEBUG ovsdbapp.backend.ovs_idl.vlog [-] [POLLIN] on fd 23 __log_wakeup /usr/local/lib/python3.8/dist-packages/ovs/poller.py:263#033[00m
Aug 24 00:00:29 localhost dockerd[135885]: time="2021-08-24T00:00:29.713522606Z" level=debug msg="Calling GET /v1.40/version"
Aug 24 00:00:29 localhost etcd[117986]: start time = 2021-08-24 00:00:29.917526837 +0000 UTC m=+397650.043958386, time spent = 134.91µs, remote = 10.196.0.57:41988, response type = /etcdserverpb.KV/Range, request count = 0, request size = 18, response count = 0, response size = 29, request content = key:"/registry/health"
Aug 24 00:00:30 localhost dockerd[135885]: time="2021-08-24T00:00:30.135753359Z" level=debug msg="Calling GET /v1.40/containers/json?filters=%7B%22label%22%3A%7B%22io.kubernetes.docker.type%3Dpodsandbox%22%3Atrue%7D%7D&limit=0"
Aug 24 00:00:30 localhost dockerd[135885]: time="2021-08-24T00:00:30.137158238Z" level=debug msg="Calling GET /v1.40/containers/json?all=1&filters=%7B%22label%22%3A%7B%22io.kubernetes.docker.type%3Dcontainer%22%3Atrue%7D%2C%22status%22%3A%7B%22running%22%3Atrue%7D%7D&limit=0"
Aug 24 00:00:30 localhost dockerd[135885]: time="2021-08-24T00:00:30.138326837Z" level=debug msg="Calling GET /v1.40/containers/json?filters=%7B%22label%22%3A%7B%22io.kubernetes.docker.type%3Dpodsandbox%22%3Atrue%7D%7D&limit=0"
Aug 24 00:00:30 localhost dockerd[135885]: time="2021-08-24T00:00:30.139723271Z" level=debug msg="Calling GET /v1.40/containers/json?all=1&filters=%7B%22label%22%3A%7B%22io.kubernetes.docker.type%3Dcontainer%22%3Atrue%7D%2C%22status%22%3A%7B%22running%22%3Atrue%7D%7D&limit=0"
Aug 24 00:00:30 localhost kubelet[153109]: E0824 00:00:30.140783 153109 kubelet_volumes.go:154] orphaned pod "5966eaa4-e460-43df-b97d-71876c0f1468" found, but volume paths are still present on disk : There were a total of 1 errors similar to this. Turn up verbosity to see them.
Aug 24 00:00:30 localhost etcd[117986]: start time = 2021-08-24 00:00:30.211035578 +0000 UTC m=+397650.337467128, time spent = 117.446µs, remote = 10.196.0.57:41436, response type = /etcdserverpb.KV/Range, request count = 0, request size = 66, response count = 0, response size = 29, request content = key:"/registry/controllerrevisions/" range_end:"/registry/controllerrevisions0" count_only:true
Aug 24 00:00:30 localhost <email address hidden>[1290303]: [pid: 1290303|app: 0|req: 152/15393] 10.196.0.57 () {56 vars in 1305 bytes} [Tue Aug 24 00:00:30 2021] GET /load-balancer/v2.0/lbaas/loadbalancers/17f24dc2-11c1-42b4-b381-d2930ade1680 => generated 814 bytes in 50 msecs (HTTP/1.1 200) 4 headers in 157 bytes (1 switches on core 0)
Aug 24 00:00:30 localhost kuryr-k8s-controller[1342264]: 2021-08-24 00:00:30.233 1342264 DEBUG kuryr_kubernetes.controller.drivers.lbaasv2 [-] Provisioning status PENDING_UPDATE for {'id': [...]
"reconcile" 547102L, 164078826C
