RHOSP10 Contrail 4.1.1 Upgrade

Bug #1797981 reported by vinaykumar tejavath
This bug affects 2 people
Affects: Juniper Openstack R4.1
Status: Fix Committed
Importance: High
Assigned to: alexey-mr

Bug Description

Environment:
Deployment: Contrail Networking 4.1.0
Openstack/Docker/Kubernetes/Vmware SKU*:RHOSP10
Exact Host OS version*: RH7.4

Description:
The customer is unable to upgrade from 4.1.0 to 4.1.1.

Execution of provision_control.py failed: the Contrail API couldn't connect to Keystone while the script was running. Contrail tries to execute the command while the Keystone service is down.
This is an HA setup.

We need to add a delay before executing the provisioning scripts and make sure the API service is up before running provision_control.py, provision_config.py, etc.

    Error: python /opt/contrail/utils/provision_control.py --router_asn 64512 --api_server_ip 172.18.225.34 --api_server_port 8082 --api_server_use_ssl false --admin_user admin --admin_password 6EEgq8WesfseXA7g9YpR3M29Z --admin_tenant admin returned 1 instead of one of [0]
    Error: /Stage[main]/Contrail::Control::Provision_control/Exec[provision_control.py srbhonciye63 api_server config]/returns: change from notrun to 0 failed: python /opt/contrail/utils/provision_control.py --router_asn 64512 --api_server_ip 172.18.225.34 --api_server_port 8082 --api_server_use_ssl false --admin_user admin --admin_password 6EEgq8WesfseXA7g9YpR3M29Z --admin_tenant admin returned 1 instead of one of [0]
    Warning: /Stage[main]/Contrail::Control::Provision_control/Exec[provision_control.py srbhonciye63 bgp speaker]: Skipping because of failed dependencies
    Warning: /Stage[main]/Contrail::Control::Provision_encap/Exec[provision_encap.py 172.18.225.34]: Skipping because of failed dependencies
    Error: python /opt/contrail/utils/provision_config_node.py --host_name srbhonciye63 --host_ip 172.18.225.47 --api_server_ip 172.18.225.34 --api_server_port 8082 --api_server_use_ssl false --admin_user admin --admin_password 6EEgq8WesfseXA7g9YpR3M29Z --admin_tenant admin --openstack_ip 192.168.0.6 --oper add returned 1 instead of one of [0]
    Error: /Stage[main]/Contrail::Config::Provision_config/Exec[provision_config_node.py srbhonciye63]/returns: change from notrun to 0 failed: python /opt/contrail/utils/provision_config_node.py --host_name srbhonciye63 --host_ip 172.18.225.47 --api_server_ip 172.18.225.34 --api_server_port 8082 --api_server_use_ssl false --admin_user admin --admin_password 6EEgq8WesfseXA7g9YpR3M29Z --admin_tenant admin --openstack_ip 192.168.0.6 --oper add returned 1 instead of one of [0]

Attached the log files below.
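
The readiness check requested in the description could look roughly like the sketch below. It is an illustration only and not part of puppet-contrail: `tcp_probe` and `wait_until_ready` are hypothetical helper names, and a successful TCP connection to port 8082 shows only that the API server is listening, not that it can reach Keystone.

```python
import socket
import time


def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def wait_until_ready(probe, retries=100, delay=3.0, sleep=time.sleep):
    """Call probe() up to `retries` times, sleeping `delay` seconds between
    attempts. Return True as soon as probe() succeeds, False otherwise."""
    for attempt in range(retries):
        if probe():
            return True
        # No need to sleep after the final attempt.
        if attempt < retries - 1:
            sleep(delay)
    return False
```

For example, `wait_until_ready(lambda: tcp_probe("172.18.225.34", 8082))` would poll the API address from the failing command before any provisioning script is started.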

Tags: jtac-p1 rhosp
vinaykumar tejavath (vtejavath) wrote :
information type: Proprietary → Public
Changed in juniperopenstack:
importance: Undecided → Critical
assignee: nobody → alexey-mr (alexey-morlang)
Jeba Paulaiyan (jebap)
Changed in juniperopenstack:
importance: Critical → High
no longer affects: juniperopenstack
alexey-mr (alexey-morlang) wrote :

1) Is it really a timing issue? I mean, are these calls successful if you run them manually after some time?

2) The attached logs contain the following:
       (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:19:in `deprecation')
    Error: Could not update: Execution of '/usr/bin/yum -d 0 -e 0 -y downgrade java-1.8.0-openjdk-1.8.0.151-5.b12.el7_4' returned 1: Error: Package: 1:java-1.8.0-openjdk-1.8.0.151-5.b12.el7_4.x86_64 (contrail-4.1.1)
               Requires: java-1.8.0-openjdk-headless(x86-64) = 1:1.8.0.151-5.b12.el7_4
               Installed: 1:java-1.8.0-openjdk-headless-1.8.0.181-3.b13.el7_5.x86_64 (@rhel-7-server-rpms)
                   java-1.8.0-openjdk-headless(x86-64) = 1:1.8.0.181-3.b13.el7_5
               Available: 1:java-1.8.0-openjdk-headless-1.8.0.151-5.b12.el7_4.x86_64 (contrail-4.1.1)
                   java-1.8.0-openjdk-headless(x86-64) = 1:1.8.0.151-5.b12.el7_4

     You could try using --skip-broken to work around the problem
     You could try running: rpm -Va --nofiles --nodigest
    Error: /Stage[main]/Contrail::Analyticsdatabase::Install/Package[java-1.8.0-openjdk]/ensure: change from 1.8.0.181-3.b13.el7_5 to 1.8.0.151-5.b12.el7_4 failed: Could not update: Execution of '/usr/bin/yum -d 0 -e 0 -y downgrade java-1.8.0-openjdk-1.8.0.151-5.b12.el7_4' returned 1: Error: Package: 1:java-1.8.0-openjdk-1.8.0.151-5.b12.el7_4.x86_64 (contrail-4.1.1)
               Requires: java-1.8.0-openjdk-headless(x86-64) = 1:1.8.0.151-5.b12.el7_4
               Installed: 1:java-1.8.0-openjdk-headless-1.8.0.181-3.b13.el7_5.x86_64 (@rhel-7-server-rpms)
                   java-1.8.0-openjdk-headless(x86-64) = 1:1.8.0.181-3.b13.el7_5
               Available: 1:java-1.8.0-openjdk-headless-1.8.0.151-5.b12.el7_4.x86_64 (contrail-4.1.1)
                   java-1.8.0-openjdk-headless(x86-64) = 1:1.8.0.151-5.b12.el7_4
     You could try using --skip-broken to work around the problem
     You could try running: rpm -Va --nofiles --nodigest

It looks like the puppet-contrail module is outdated. The latest module has no dependency on Java 1.8.0.151. The dependency was removed by this commit:

commit 6e00428a77aaa5213c9b40d74d3ea1f9cc595241
Author: Santosh Gupta <email address hidden>
Date: Thu Jun 21 14:13:04 2018 -0700

    Upgrade cassandra to version 3.11.2 (3/3)

    - upgrade cassandra package to 3.11.2
    - remove dependency on jdk 1.8.0.151 in puppet files

    Change-Id: Ic23f16aba003f5cecf34327d65c831900ab4709e
    Depends-On: I45cd32dc6a7bc87295de941e7e6131c5308b539f
    Depends-On: I3894dfdb688a21451866af2bb201c7f9ae8fb721
    Partial-Bug: #1776656

shajuvk (shajuvk) wrote :

Hi Alexey,

The customer resolved the Java dependency by adding the packages java-1.8.0-openjdk-headless-1.8.0.151 and java-1.8.0-openjdk-1.8.0.151 to the contrail repo. The dependency errors shown by the 'openstack stack failure list overcloud' command were old failures. If you check the step 5.2 error, it is related to a failure while running the provision_control.py script.

The customer is able to run this command manually without any error after the upgrade steps have failed.

Thanks,
Shaju

shajuvk (shajuvk) wrote :

The customer still has the setup in the failed state. If needed, Vinay can help to get access.

shajuvk (shajuvk)
tags: added: rhosp
alexey-mr (alexey-morlang) wrote :

The provisioning calls have 100 retries with 3-second sleeps in between, which means the code keeps trying to provision for about 5 minutes. I increased that time to 8 minutes: https://review.opencontrail.org/#/c/47063/
Is it possible to try it?
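
The arithmetic behind those numbers can be sketched as follows. The parameter names `tries` and `try_sleep` mirror the attributes of Puppet's exec resource, but whether the manifests use exactly these is an assumption. The thread gives only the totals (5 and 8 minutes), so the count computed for 8 minutes is one hypothetical way to reach that window, not the value taken from the review.

```python
import math


def retry_window_seconds(tries, try_sleep):
    """Approximate time spent retrying, ignoring how long each attempt runs."""
    return tries * try_sleep


def tries_for_window(window_seconds, try_sleep):
    """Smallest number of tries whose sleeps cover window_seconds."""
    return math.ceil(window_seconds / try_sleep)


# Values quoted in the comment above: 100 retries, 3-second sleeps.
current_window = retry_window_seconds(100, 3)  # 300 seconds, i.e. 5 minutes

# Hypothetical: keep the 3-second sleep and stretch the window to 8 minutes.
tries_for_8_minutes = tries_for_window(8 * 60, 3)  # 160 tries
```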

Additionally, if they have the Java issue, it means they do not have the latest puppet-contrail (e.g. they don't have the change for https://bugs.launchpad.net/juniperopenstack/+bug/1779943). I am not sure whether it matters or not, but if possible it is better to try the latest module plus the review above.

OpenContrail Admin (ci-admin-f) wrote : [Review update] R4.1

Review in progress for https://review.opencontrail.org/47063
Submitter: alexey-mr (<email address hidden>)

shajuvk (shajuvk) wrote :

Alexey,

The bug fix mentioned above will be available only in R4.1.2.
The customer has R4.1.1 and we have yet to release R4.1.2 (https://review.opencontrail.org/#/c/46090/1/manifests/config/service.pp).

How can we apply this bug fix to resolve the customer's issue? After we modify all the provision files under /usr/share/openstack-puppet/modules/contrail/manifests, we need to upload them to swift, right?

Thanks,
Shaju

shajuvk (shajuvk) wrote :

Vinay,

Could you please ask the customer to try the change committed in the review request https://review.opencontrail.org/#/c/47063/ ? Click on each file to see the changes. It is a small numeric change.

The files can be found on the undercloud under the directory /usr/share/openstack-puppet/modules/contrail/manifests/

Copy the existing directory /home/stack/usr to /home/stack/usr-old1 before making any changes.

Change 1:
---
The files below need the change that increases the delay:
/home/stack/usr/share/openstack-puppet/modules/contrail/manifests/analytics/provision_analytics.pp
/home/stack/usr/share/openstack-puppet/modules/contrail/manifests/analyticsdatabase/provision_database.pp
/home/stack/usr/share/openstack-puppet/modules/contrail/manifests/config/provision_alarm.pp
/home/stack/usr/share/openstack-puppet/modules/contrail/manifests/config/provision_config.pp
/home/stack/usr/share/openstack-puppet/modules/contrail/manifests/config/provision_linklocal.pp
/home/stack/usr/share/openstack-puppet/modules/contrail/manifests/control/provision_control.pp
/home/stack/usr/share/openstack-puppet/modules/contrail/manifests/control/provision_encap.pp
/home/stack/usr/share/openstack-puppet/modules/contrail/manifests/control/provision_linklocal.pp
/home/stack/usr/share/openstack-puppet/modules/contrail/manifests/database/provision_database.pp
/home/stack/usr/share/openstack-puppet/modules/contrail/manifests/vrouter/provision_vrouter.pp

Change 2:
-----
One more change is needed: line 15 of the file below has to be commented out.

File location:
/home/stack/usr/share/openstack-puppet/modules/contrail/manifests/config/service.pp

#onlyif => 'contrail-status |grep contrail-api: |grep "Generic Connection:Keystone\[\] connection down"',
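
Commenting out a single line by number, as the second change asks, could be scripted along the lines of the sketch below. `comment_out_line` is a hypothetical helper, not a tool shipped with Contrail or TripleO; it assumes `#` is the comment marker (as in Puppet manifests) and keeps a `.bak` copy next to the file. Check that the target line really is the `onlyif` line before running anything like this, since line numbers can differ between module versions.

```python
import shutil
from pathlib import Path


def comment_out_line(path, line_number, marker="#"):
    """Prefix one 1-indexed line with a comment marker, keeping a .bak copy.

    Returns True if the file was changed, False if the line was already
    commented out.
    """
    path = Path(path)
    lines = path.read_text().splitlines(keepends=True)
    index = line_number - 1
    if not 0 <= index < len(lines):
        raise IndexError(f"{path} has no line {line_number}")
    if lines[index].lstrip().startswith(marker):
        return False
    # Back up the original file before modifying it.
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    lines[index] = marker + lines[index]
    path.write_text("".join(lines))
    return True
```

With the paths from this comment, the call would be `comment_out_line("/home/stack/usr/share/openstack-puppet/modules/contrail/manifests/config/service.pp", 15)`.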

After making both changes, upload the files to swift.

source stackrc

tar czvf puppet-modules.tgz ~/usr/
upload-swift-artifacts -f puppet-modules.tgz

Thanks,
Shaju

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/47063
Committed: http://github.com/Juniper/puppet-contrail/commit/dacdd70c9209e734aaef0c830b91000b744ca4b9
Submitter: Zuul (<email address hidden>)
Branch: R4.1

commit dacdd70c9209e734aaef0c830b91000b744ca4b9
Author: alexey-mr <email address hidden>
Date: Wed Oct 17 14:11:44 2018 +0300

Increase time for provisioning attemps

Change-Id: I27a8e4865b8626a740ebb507bf36b526cdcd5244
Closes-Bug: #1797981
