[K8s]: Agent down in new pod after initial agent pod is killed

Bug #1732607 reported by Pulkit Tandon
This bug affects 1 person
Affects                Status     Importance   Assigned to            Milestone
Juniper Openstack (status tracked in Trunk)
  R4.0                 New        High         Sachchidanand Vaidya
  R4.1                 New        High         Sachchidanand Vaidya
  Trunk                Invalid    High         Sachchidanand Vaidya

Bug Description

Build: R4.0-98
contrail-agent-rz6kv

Setup:
5 node system
3 controllers; one of the Contrail controllers is the Kube master
2 agent + Kube slaves
Setup provisioned through Single Yaml.

Steps:
After provisioning, everything came up fine.
Delete the agent pod on one of the slaves.
A new container is triggered automatically.

Observation:
The agent was in the "inactive" state.
Agent bring-up failed in the new pod.
The reason was that agent.conf had nothing populated.

Workaround:
Reprovision the contrail containers using single yaml.

It's easily reproducible. Please let me know if any specific logs are required.

Tags: provisioning
Changed in juniperopenstack:
assignee: nobody → Sachchidanand Vaidya (vaidyasd)
OpenContrail Admin (ci-admin-f) wrote : [Review update] R4.1

Review in progress for https://review.opencontrail.org/37794
Submitter: Prasanna Mucharikar (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/37794
Committed: http://github.com/Juniper/contrail-controller/commit/e38379297fa4954ff3c04149cb1c97a3fb4bb117
Submitter: Zuul (<email address hidden>)
Branch: R4.1

commit e38379297fa4954ff3c04149cb1c97a3fb4bb117
Author: Prasanna Mucharikar <email address hidden>
Date: Wed Nov 22 10:15:41 2017 -0800

If vhost0 is present but info in contrail-vrouter-agent.conf is missing,
treat it as fresh pod install.
Closes-Bug: #1732607

Change-Id: Ic125ce638a7f4836c11afc6e9cfee9fefdafda43
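
The rule in this commit message can be sketched roughly as below. This is only an illustration, not the code from the change itself: the helper names (interface_exists, conf_is_populated, needs_fresh_install) are hypothetical, the sysfs path is an assumption about how one might detect vhost0, and treating "no vhost0 at all" as a fresh install is also an assumption.

```python
import configparser
import os


def interface_exists(name, sysfs_root="/sys/class/net"):
    """Return True if a network interface directory exists under sysfs.

    Assumption: vhost0 presence is detected via /sys/class/net.
    """
    return os.path.isdir(os.path.join(sysfs_root, name))


def conf_is_populated(path):
    """Return True if the INI-style file holds at least one key."""
    parser = configparser.ConfigParser(
        strict=False, allow_no_value=True, interpolation=None
    )
    try:
        with open(path) as handle:
            parser.read_file(handle)
    except (OSError, configparser.Error):
        # Missing, unreadable, or unparsable files count as "no info".
        return False
    if parser.defaults():
        return True
    return any(len(parser[section]) > 0 for section in parser.sections())


def needs_fresh_install(vhost0_present, conf_populated):
    """Decide whether the pod should run its first-time setup.

    Assumption: without vhost0 the node was never set up, so it is fresh.
    With vhost0 present but an empty config (the case in this bug), the
    pod is also treated as a fresh install.
    """
    return (not vhost0_present) or (not conf_populated)
```

With vhost0 present and an empty contrail-vrouter-agent.conf, needs_fresh_install(True, False) returns True, so the new pod would regenerate its configuration instead of starting the agent with an empty file.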

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/38499
Submitter: Prasanna Mucharikar (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R4.0

Review in progress for https://review.opencontrail.org/38522
Submitter: Prasanna Mucharikar (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/38499
Committed: http://github.com/Juniper/contrail-controller/commit/2b4731b7d8ae60cf540cdb18f891bb26b94d36b7
Submitter: Zuul (<email address hidden>)
Branch: master

commit 2b4731b7d8ae60cf540cdb18f891bb26b94d36b7
Author: Prasanna Mucharikar <email address hidden>
Date: Wed Nov 22 10:15:41 2017 -0800

If vhost0 is present but info in contrail-vrouter-agent.conf is missing,
treat it as fresh pod install.
Closes-Bug: #1732607

Change-Id: Ic125ce638a7f4836c11afc6e9cfee9fefdafda43
(cherry picked from commit e38379297fa4954ff3c04149cb1c97a3fb4bb117)

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/38522
Committed: http://github.com/Juniper/contrail-controller/commit/34fe93a76ac665d459afeb8998521ecc9f82de54
Submitter: Zuul (<email address hidden>)
Branch: R4.0

commit 34fe93a76ac665d459afeb8998521ecc9f82de54
Author: Prasanna Mucharikar <email address hidden>
Date: Wed Nov 22 10:15:41 2017 -0800

If vhost0 is present but info in contrail-vrouter-agent.conf is missing,
treat it as fresh pod install.
Closes-Bug: #1732607

Change-Id: Ic125ce638a7f4836c11afc6e9cfee9fefdafda43
(cherry picked from commit e38379297fa4954ff3c04149cb1c97a3fb4bb117)

Pulkit Tandon (pulkitt) wrote :

The issue still exists in mainline:
R5.0-ocata-ubuntu16-93

After deleting the agent pod, multiple vrouter crashes are observed when the new pod is spawned.
Also, the agent state remains inactive.

root@nodei18(agent):/# contrail-status
== Contrail vRouter ==
contrail-vrouter-agent: inactive
contrail-vrouter-nodemgr: active
========Run time service failures=============
/var/crashes/core.contrail-vroute.1463.nodei18.1517385953
/var/crashes/core.contrail-vroute.1062.nodei18.1517385945
/var/crashes/core.contrail-vroute.1524.nodei18.1517385954
/var/crashes/core.contrail-vroute.1213.nodei18.1517385947
/var/crashes/core.contrail-vroute.1338.nodei18.1517385948
/var/crashes/core.contrail-vroute.864.nodei18.1517385944
/var/crashes/core.contrail-vroute.1434.nodei18.1517385951
/var/crashes/core.contrail-vroute.1380.nodei18.1517385950
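
To get a quick picture of how tightly the crashes are clustered, the core file names above can be summarized with a small script. This is a sketch under an assumption: that the names follow a core.<executable>.<pid>.<host>.<epoch seconds> pattern, which the listing suggests but which is not confirmed here. The function names are hypothetical.

```python
import re

# Assumed layout: core.<executable>.<pid>.<host>.<epoch seconds>
CORE_NAME = re.compile(
    r"core\.(?P<exe>.+)\.(?P<pid>\d+)\.(?P<host>[^.]+)\.(?P<epoch>\d+)$"
)


def parse_core(path):
    """Split one core file path into its fields, or return None if it
    does not match the assumed naming pattern."""
    name = path.rsplit("/", 1)[-1]
    match = CORE_NAME.match(name)
    if match is None:
        return None
    return {
        "exe": match.group("exe"),
        "pid": int(match.group("pid")),
        "host": match.group("host"),
        "epoch": int(match.group("epoch")),
    }


def summarize(paths):
    """Return (number of cores, seconds between first and last core)."""
    cores = [c for c in (parse_core(p) for p in paths) if c is not None]
    if not cores:
        return (0, 0)
    epochs = [c["epoch"] for c in cores]
    return (len(cores), max(epochs) - min(epochs))


if __name__ == "__main__":
    listing = [
        "/var/crashes/core.contrail-vroute.1463.nodei18.1517385953",
        "/var/crashes/core.contrail-vroute.1062.nodei18.1517385945",
        "/var/crashes/core.contrail-vroute.1524.nodei18.1517385954",
        "/var/crashes/core.contrail-vroute.1213.nodei18.1517385947",
        "/var/crashes/core.contrail-vroute.1338.nodei18.1517385948",
        "/var/crashes/core.contrail-vroute.864.nodei18.1517385944",
        "/var/crashes/core.contrail-vroute.1434.nodei18.1517385951",
        "/var/crashes/core.contrail-vroute.1380.nodei18.1517385950",
    ]
    count, span = summarize(listing)
    print(f"{count} cores within {span} seconds")
```

If the naming assumption holds, the eight cores listed fall within a 10-second window, each with a different PID, which would be consistent with the agent crashing and being restarted repeatedly.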

Note that this time agent.conf was not empty. It had all the variables populated correctly.

Crash logs:
server : 10.204.216.50 (bhushana@mayamruga)
Path: /home/bhushana/Documents/technical/bugs/1732607

Prasanna Mucharikar (mprasanna) wrote :

Hi Pulkit,
Could you show this to the vrouter team? I am not sure how I can be of help.

regards
Prasanna

Sachchidanand Vaidya (vaidyasd) wrote :

Not applicable.
