fab setup_all depends on connection between build_host and control_data ip of db node

Bug #1454980 reported by Jeba Paulaiyan
This bug affects 1 person

Affects: Juniper Openstack (status tracked in Trunk)

Series   Status          Importance   Assigned to
R2.20    Fix Committed   Critical     Ignatious Johnson Christopher
Trunk    Fix Committed   Critical     Ignatious Johnson Christopher

Bug Description

Image : R2.20 #18 (Juno) 14.04.1

fab setup_all fails with the trace below. The likely cause is fab trying to connect to the internal vIP (172.16.80.25), which is not reachable from the build_host. setup_all should provision the cluster using the external vIP.

2015-05-14 00:24:28:645918: [root@172.16.70.10] out: /usr/bin/openstack-config --set|--del config_file section [parameter] [value]
2015-05-14 00:24:28:654366: [root@172.16.70.10] out: [localhost] local: chkconfig nova-compute on
2015-05-14 00:24:28:819291: [root@172.16.70.10] out: [localhost] local: service nova-compute restart
2015-05-14 00:24:28:819945: [root@172.16.70.10] out: nova-compute stop/waiting
2015-05-14 00:24:28:820476: [root@172.16.70.10] out: nova-compute start/running, process 15468
2015-05-14 00:24:28:853088: [root@172.16.70.10] out: [localhost] local: chkconfig supervisor-vrouter on
2015-05-14 00:24:28:853736: [root@172.16.70.10] out: [localhost] local: python /opt/contrail/utils/provision_vrouter.py --host_name csol1-node10 --host_ip 172.16.80.10 --api_server_ip 172.16.80.25 --oper add --admin_user admin --admin_password c0ntrail123 --admin_tenant_name admin --openstack_ip 172.16.80.25
2015-05-14 00:24:28:854327: [root@172.16.70.10] out: Traceback (most recent call last):
2015-05-14 00:24:31:977413: [root@172.16.70.10] out: File "/opt/contrail/utils/provision_vrouter.py", line 186, in <module>
2015-05-14 00:24:31:978173: [root@172.16.70.10] out: main()
2015-05-14 00:24:31:978756: [root@172.16.70.10] out: File "/opt/contrail/utils/provision_vrouter.py", line 182, in main
2015-05-14 00:24:31:979354: [root@172.16.70.10] out: VrouterProvisioner(args_str)
2015-05-14 00:24:31:979891: [root@172.16.70.10] out: File "/opt/contrail/utils/provision_vrouter.py", line 32, in __init__
2015-05-14 00:24:31:980416: [root@172.16.70.10] out: auth_host=self._args.openstack_ip)
2015-05-14 00:24:31:980994: [root@172.16.70.10] out: File "/usr/lib/python2.7/dist-packages/vnc_api/vnc_api.py", line 200, in __init__
2015-05-14 00:24:31:981620: [root@172.16.70.10] out: retry_on_error=False)
2015-05-14 00:24:31:982209: [root@172.16.70.10] out: File "/usr/lib/python2.7/dist-packages/vnc_api/vnc_api.py", line 391, in _request
2015-05-14 00:24:31:982753: [root@172.16.70.10] out: raise ConnectionError
2015-05-14 00:24:31:983288: [root@172.16.70.10] out: requests.exceptions.ConnectionError
2015-05-14 00:24:31:983825: [root@172.16.70.10] out:
2015-05-14 00:24:31:984369: [root@172.16.70.10] out: Fatal error: local() encountered an error (return code 1) while executing 'python /opt/contrail/utils/provision_vrouter.py --host_name csol1-node10 --host_ip 172.16.80.10 --api_server_ip 172.16.80.25 --oper add --admin_user admin --admin_password c0ntrail123 --admin_tenant_name admin --openstack_ip 172.16.80.25'
2015-05-14 00:24:31:984868: [root@172.16.70.10] out:
2015-05-14 00:24:31:985114: [root@172.16.70.10] out: Aborting.
2015-05-14 00:24:31:985325: [root@172.16.70.10] out:
2015-05-14 00:24:31:987654:

2015-05-14 00:24:31:995849: Fatal error: sudo() received nonzero return code 1 while executing!
2015-05-14 00:24:31:995849:
2015-05-14 00:24:31:995849: Requested: setup-vnc-compute --self_ip 172.16.80.10 --cfgm_ip 172.16.80.25 --cfgm_user root --cfgm_passwd c0ntrail123 --ncontrols 3 --amqp_server_ip 172.16.80.25 --service_token aefc2944c765806ac7da --orchestrator openstack --hypervisor libvirt --non_mgmt_ip 172.16.80.10 --non_mgmt_gw 172.16.80.253 --keystone_ip 172.16.80.25 --openstack_mgmt_ip 172.16.70.2 --keystone_auth_protocol http --keystone_auth_port 35357 --quantum_service_protocol http --keystone_admin_user admin --keystone_admin_password c0ntrail123 --internal_vip 172.16.80.25 --external_vip 172.16.70.25 --contrail_internal_vip 172.16.80.25 --mgmt_self_ip 172.16.70.10
2015-05-14 00:24:31:995849: Executed: sudo -S -p 'sudo password:' /bin/bash -l -c "cd /opt/contrail/bin && setup-vnc-compute --self_ip 172.16.80.10 --cfgm_ip 172.16.80.25 --cfgm_user root --cfgm_passwd c0ntrail123 --ncontrols 3 --amqp_server_ip 172.16.80.25 --service_token aefc2944c765806ac7da --orchestrator openstack --hypervisor libvirt --non_mgmt_ip 172.16.80.10 --non_mgmt_gw 172.16.80.253 --keystone_ip 172.16.80.25 --openstack_mgmt_ip 172.16.70.2 --keystone_auth_protocol http --keystone_auth_port 35357 --quantum_service_protocol http --keystone_admin_user admin --keystone_admin_password c0ntrail123 --internal_vip 172.16.80.25 --external_vip 172.16.70.25 --contrail_internal_vip 172.16.80.25 --mgmt_self_ip 172.16.70.10"
2015-05-14 00:24:31:995849:
2015-05-14 00:24:31:995970: Aborting.
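
To see at a glance which arguments in a command like the one above point at the internal network, the addresses can be grouped by subnet. This is only a sketch: the two /24 networks are an assumption inferred from the addresses in the log (the report does not state the prefix lengths), and the names classify and addresses_by_network are made up for illustration.

```python
import ipaddress

# Assumed /24 networks, inferred from the addresses in the log above.
# The real prefix lengths are not stated in this report.
MGMT_NET = ipaddress.ip_network("172.16.70.0/24")
CONTROL_DATA_NET = ipaddress.ip_network("172.16.80.0/24")


def classify(address):
    """Return 'management', 'control_data', or 'other' for an IPv4 string."""
    ip = ipaddress.ip_address(address)
    if ip in MGMT_NET:
        return "management"
    if ip in CONTROL_DATA_NET:
        return "control_data"
    return "other"


def addresses_by_network(flags):
    """Group the flags of a {flag: address} mapping by the network of their address."""
    grouped = {}
    for flag, address in flags.items():
        grouped.setdefault(classify(address), []).append(flag)
    return grouped


if __name__ == "__main__":
    # A few of the flags from the setup-vnc-compute command in the log.
    print(addresses_by_network({
        "--cfgm_ip": "172.16.80.25",
        "--internal_vip": "172.16.80.25",
        "--external_vip": "172.16.70.25",
        "--mgmt_self_ip": "172.16.70.10",
    }))
```

Any flag that lands in the control_data group has to be reachable from wherever the command actually runs, which here is the compute node and not the build_host.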

Tags: provisioning
Jeba Paulaiyan (jebap)
tags: added: provisioning
Ignatious Johnson Christopher (ijohnson-x) wrote :

From the logs below, fab connected to [root@172.16.70.10] over the external network and executed "local: python /opt/contrail/utils/provision_vrouter.py" locally on that node.

logs:
---------
 2015-05-14 00:24:28:853736: [root@172.16.70.10] out: [localhost] local: python /opt/contrail/utils/provision_vrouter.py --host_name csol1-node10 --host_ip 172.16.80.10 --api_server_ip 172.16.80.25 --oper add --admin_user admin --admin_password c0ntrail123 --admin_tenant_name admin --openstack_ip 172.16.80.25

provision_vrouter.py connects to the API server through the internal network (--api_server_ip 172.16.80.25) and adds the vrouter node to the config. The logs below suggest that the connection from the compute node (root@172.16.70.10) to the API server was not established:

logs:
---------
2015-05-14 00:24:31:982209: [root@172.16.70.10] out: File "/usr/lib/python2.7/dist-packages/vnc_api/vnc_api.py", line 391, in _request
2015-05-14 00:24:31:982753: [root@172.16.70.10] out: raise ConnectionError
2015-05-14 00:24:31:983288: [root@172.16.70.10] out: requests.exceptions.ConnectionError
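
A quick way to confirm this kind of failure is to try a plain TCP connection from the compute node to the API server address before rerunning the provisioning. The helper below is a minimal sketch using only the Python standard library; the function name can_connect is hypothetical, and the port in the usage line (8082) is an assumption about the API server port, since the report does not state it.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Run on the compute node. Port 8082 is an assumption, not taken from this report.
    print(can_connect("172.16.80.25", 8082))
```

A False result from the compute node points at the network path (wiring, VLAN, vIP not up) and not at provision_vrouter.py itself.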

Jeba Paulaiyan (jebap) wrote :

The issue reported in the initial comment was a hardware wiring issue, so fab is not using the internal vIP. However, database provisioning tries to connect to the control_data IP of the database node from the build_host. This needs to be fixed.

2015-05-14 10:59:58:901148: [root@172.16.70.2] out:
2015-05-14 10:59:58:901677:
2015-05-14 10:59:58:902016: [root@172.16.80.2] sudo: python provision_database_node.py --api_server_ip 172.16.80.2 --host_name csol1-node2 --host_ip 172.16.80.2 --oper add --admin_user admin --admin_password c0ntrail123 --admin_tenant_name admin
2015-05-14 10:59:58:902432: Disconnecting from 172.16.70.2... done.
2015-05-14 11:00:09:014963: Disconnecting from 172.16.70.8... done.
2015-05-14 11:00:09:068885: Disconnecting from 172.16.70.4... done.
2015-05-14 11:00:09:075292: Disconnecting from 172.16.70.10... done.
2015-05-14 11:00:09:095855: Disconnecting from 172.16.70.7... done.
2015-05-14 11:00:09:184554: Disconnecting from 172.16.70.9... done.
2015-05-14 11:00:09:239264: Disconnecting from 172.16.70.3... done.
2015-05-14 11:00:09:265837: Disconnecting from 172.16.70.12... done.
2015-05-14 11:00:09:354918: Disconnecting from 172.16.70.11... done.
2015-05-14 11:00:09:439951: !
2015-05-14 10:59:34:285159:
2015-05-14 10:59:34:285159:
2015-05-14 10:59:34:500932: Warning: sudo() received nonzero return code 1 while executing 'rm /etc/init/supervisor-vrouter.override'!
2015-05-14 10:59:34:500932:
2015-05-14 10:59:34:500932:
2015-05-14 10:59:39:654776: Warning: sudo() received nonzero return code 2 while executing 'sudo sed -i 's/ENABLED=.*/ENABLED=1/g' /etc/default/haproxy'!
2015-05-14 10:59:39:654776:
2015-05-14 10:59:39:654776:
2015-05-14 10:59:39:852269: Warning: sudo() received nonzero return code 1 while executing 'rm /etc/init/supervisor-vrouter.override'!
2015-05-14 10:59:39:852269:
2015-05-14 10:59:39:852269:
2015-05-14 10:59:44:490045: Warning: sudo() received nonzero return code 2 while executing 'sudo sed -i 's/ENABLED=.*/ENABLED=1/g' /etc/default/haproxy'!
2015-05-14 10:59:44:490045:
2015-05-14 10:59:44:490045:
2015-05-14 10:59:44:685267: Warning: sudo() received nonzero return code 1 while executing 'rm /etc/init/supervisor-vrouter.override'!
2015-05-14 10:59:44:685267:
2015-05-14 10:59:44:685267:
2015-05-14 10:59:49:134741: Warning: sudo() received nonzero return code 2 while executing 'sudo sed -i 's/ENABLED=.*/ENABLED=1/g' /etc/default/haproxy'!
2015-05-14 10:59:49:134741:
2015-05-14 10:59:49:134741:
2015-05-14 10:59:54:112503: Warning: sudo() received nonzero return code 2 while executing 'sudo sed -i 's/ENABLED=.*/ENABLED=1/g' /etc/default/haproxy'!
2015-05-14 10:59:54:112503:
2015-05-14 10:59:54:112503:
2015-05-14 11:00:08:913355: Fatal error: Timed out trying to connect to 172.16.80.2 (tried 1 time)
2015-05-14 11:00:08:913355:
2015-05-14 11:00:08:913355: Underlying exception:
2015-05-14 11:00:08:913355: timed out
2015-05-14 11:00:08:913355:
2015-05-14 11:00:08:914114: Aborting.
2015-05-14 11:00:08:914114:
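
The log above shows the login target ([root@172.16.80.2]) and the --host_ip argument using the same control_data address. One way to keep the two apart is sketched below: log in with the management host string, and use the control_data address only as a command argument. This is an illustration under assumptions, not the code in contrail-fabric-utils. The CONTROL_DATA layout, the /24 suffix, and both function names are hypothetical, and the credential flags are left out for brevity.

```python
# Hypothetical testbed-style data: management host string mapped to the
# node's control_data address. Layout and prefix length are illustrative only.
CONTROL_DATA = {
    "root@172.16.70.2": {"ip": "172.16.80.2/24"},
}


def control_data_ip(host_string, control_data):
    """Return the control_data IP for a host, or its management IP if none is set."""
    entry = control_data.get(host_string)
    if entry:
        return entry["ip"].split("/")[0]
    return host_string.split("@", 1)[-1]


def database_provision_plan(host_string, host_name, api_server_ip, control_data):
    """Return (login_host, command).

    login_host is always the management host string, which the build_host can
    reach; the control_data IP appears only inside the command line.
    """
    command = (
        "python provision_database_node.py"
        " --api_server_ip {api} --host_name {name} --host_ip {ip} --oper add"
    ).format(
        api=api_server_ip,
        name=host_name,
        ip=control_data_ip(host_string, control_data),
    )
    return host_string, command


if __name__ == "__main__":
    login, command = database_provision_plan(
        "root@172.16.70.2", "csol1-node2", "172.16.80.2", CONTROL_DATA
    )
    print(login)
    print(command)
```

With this split the build_host only ever needs a route to the management network, while the provisioning script still registers the node under its control_data address.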

summary: - fab depends on connection to internal vIP for provisioning
+ fab setup_all depends on connection between build_host and control_data
+ ip of db node
OpenContrail Admin (ci-admin-f) wrote : master

Review in progress for https://review.opencontrail.org/10368
Submitter: Ignatious Johnson Christopher (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : R2.20

Review in progress for https://review.opencontrail.org/10369
Submitter: Ignatious Johnson Christopher (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/10368
Committed: http://github.org/Juniper/contrail-fabric-utils/commit/dffca5dbe4925f7014957f2b6380742dc35592a5
Submitter: Zuul
Branch: master

commit dffca5dbe4925f7014957f2b6380742dc35592a5
Author: Ignatious Johnson Christopher <email address hidden>
Date: Thu May 14 11:38:45 2015 -0700

Using Management ip to login to nodes when provisioning database, collector and control nodes.

Change-Id: I7ebb9bd01cea7f1b3c2d764972b0f1a1d57ad1ec
Closes-bug: 1454980

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/10369
Committed: http://github.org/Juniper/contrail-fabric-utils/commit/4eb0b6f2a0a865aaaf91c248d5a72da77dfdaf22
Submitter: Zuul
Branch: R2.20

commit 4eb0b6f2a0a865aaaf91c248d5a72da77dfdaf22
Author: Ignatious Johnson Christopher <email address hidden>
Date: Thu May 14 11:47:15 2015 -0700

Using Management ip to login to nodes when provisioning database, collector and control nodes.

Change-Id: Icc7910378323dbdedf908a576caa473afba8c181
Closes-bug: 1454980
