[k8s-R5.0]: Docker restart hangs indefinitely and docker daemon stops running post that

Bug #1764739 reported by Pulkit Tandon
This bug affects 1 person
Affects            Status   Importance  Assigned to          Milestone
Juniper Openstack  (status tracked in Trunk)
  R5.0             Invalid  Medium      Prasanna Mucharikar
  Trunk            Invalid  Medium      Prasanna Mucharikar

Bug Description

Configuration:
K8s 1.9.2
coat-5.0-15
Centos-7.4

Setup:
5 node setup.
1 Kube master. 3 Controller.
2 Agent+ K8s slaves

Description:
Restarting Docker on a k8s slave/compute node causes the restart command to hang indefinitely.
After that, no further docker commands or other system commands work on the node.

The issue is sporadic and has been observed multiple times during k8s sanity runs.
All tests that run after it occurs fail.
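
A hang like this can be caught by a sanity run instead of blocking it forever if the restart is wrapped in a timeout. The following is a minimal sketch, not the sanity suite's actual code. The restart command (`systemctl restart docker`), the helper name `run_with_timeout`, and the result labels are assumptions for illustration; the report does not say how Docker was restarted.

```python
import subprocess

# Assumption: Docker is restarted through systemd. The report does not
# state which command was used.
RESTART_CMD = ["systemctl", "restart", "docker"]


def run_with_timeout(cmd, timeout_s):
    """Run cmd and classify the outcome as 'ok', 'failed', or 'hung'.

    'hung' means the command did not finish within timeout_s seconds.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return "hung"
    return "ok" if result.returncode == 0 else "failed"
```

For example, `run_with_timeout(RESTART_CMD, 120)` would return `"hung"` if the restart takes longer than two minutes (an arbitrary limit). If the child process is stuck in an uninterruptible kernel state, even killing it may not return promptly, so this is a detection aid rather than a fix.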

Pulkit Tandon (pulkitt) wrote :

[root@nodel8 log]# docker ps
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
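
The error above means the client could not connect to the daemon's Unix socket. As a rough sketch, the same condition can be probed without the docker CLI by trying to connect to that socket directly. The socket path comes from the error message; the function name `daemon_socket_reachable` and the 2-second timeout are illustrative assumptions.

```python
import socket

# Socket path taken from the error message in the report.
DOCKER_SOCKET = "/var/run/docker.sock"


def daemon_socket_reachable(path=DOCKER_SOCKET, timeout_s=2.0):
    """Return True if something accepts connections on the Unix socket at path."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout_s)
    try:
        sock.connect(path)
    except OSError:
        # Covers a missing socket file, a refused connection, and a timeout.
        return False
    finally:
        sock.close()
    return True
```

A `True` result only shows that the socket accepts connections. A daemon that accepts connections but never answers requests would still pass this check, so it complements rather than replaces a real `docker ps`.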

summary: - [k8s-R5.0]: Vrouter agent crash and Pods creation fails on build 16
+ [k8s-R5.0]: Docker restart hangs indefinitely and docker daemon stops running post that

Sachchidanand Vaidya (vaidyasd) wrote :

It's intermittent. It may be a side effect of the memory issue.

Sachchidanand Vaidya (vaidyasd) wrote :

It's intermittent. It may be a side effect of the memory issue.
Rerun and check whether you still see it.

Sachchidanand Vaidya (vaidyasd) wrote :

Please rerun and check.

Pulkit Tandon (pulkitt) wrote :

Venky (@vvelpula) observed this issue again on his setup.
I have added him as a watcher on this bug. He may contact you about it today.

Although intermittent, it has now been observed many times.

Venkatesh Velpula (vvelpula) wrote :

Hi Sachin,

I came across the same issue today with the 18th build. Let me know if you want to have a look at the setup.

Pulkit Tandon (pulkitt) wrote :

As the occurrence is frequent and impact is high, changing the Importance for this bug.

tags: added: sanityblocker

Jeba Paulaiyan (jebap)
tags: added: releasenote

Jeba Paulaiyan (jebap)
tags: added: blocker sanity
removed: sanityblocker

Prasanna Mucharikar (mprasanna) wrote :

Unable to reproduce on a local cluster. I need to debug a cluster on which the issue has been reproduced.

Prasanna Mucharikar (mprasanna) wrote :

Recommended a Docker upgrade to 17.0.3.2, or even to the upcoming 17.07 release, to overcome the intermittent Docker hang issues.

https://github.com/moby/moby/issues/33710

Pulkit Tandon (pulkitt) wrote :

Hi Prasanna,

The recommendation for a Docker upgrade should go to Andrey or Micheal Henkel. It should be handled as part of contrail-ansible-deployer.

With respect to this bug, I have not seen it occur for quite some time now.
On build R5.0-86, I made 6 consecutive attempts to reproduce it, but the problem did not occur.
Thus, I am lowering the priority and removing the blocker tag.

If this problem does not occur over the next few releases, we can close this bug.
Thanks!
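
A repeated reproduction attempt like the one described above could be scripted roughly as follows. This is only a sketch under assumptions: the helper name `first_hang` is hypothetical, and the command passed in (for example `["systemctl", "restart", "docker"]`) is a guess at how the restart was issued. The loop stops at the first hang, because the report says the node is unusable afterwards.

```python
import subprocess


def first_hang(cmd, attempts, timeout_s):
    """Run cmd up to `attempts` times.

    Return the 1-based number of the first attempt that exceeded
    timeout_s seconds, or None if every attempt finished in time.
    """
    for attempt in range(1, attempts + 1):
        try:
            subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # Stop here: later attempts on a hung node are not meaningful.
            return attempt
    return None
```

With `attempts=6`, a result of `None` corresponds to the outcome reported here: six consecutive restarts without a hang. Because the issue is intermittent, such a result lowers confidence that the bug is present but does not prove it is gone.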

tags: removed: blocker

Pulkit Tandon (pulkitt) wrote :

As no code change or fix has gone in for this issue, marking it "Fix Released" would not be correct.
Unfortunately, there is no "not reproducible" state.

Thus, I am marking the bug as "Invalid".
