[10.0] [Tempest] Failed to deploy tempest environment

Bug #1648558 reported by Yury Tregubov
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
Fuel for OpenStack | Fix Released | Critical | Sergii Turivnyi |

Bug Description

Tried to run the Tempest suites for MOS 10.0, but the usual Tempest environment (Ceph, Sahara, DVR, and Ironic) failed to deploy.

The nodes are stuck in the discover state, and some errors are seen on the master node.

Snapshots used for deploy: 1069, 1080.

A diagnostic snapshot is attached.

[root@nailgun ~]# fuel nodes
id | status | name | cluster | ip | mac | roles | pending_roles | online | group_id
---+----------+---------------------------+---------+------------+-------------------+-------+-------------------+--------+---------
 3 | discover | slave-03_controller_mongo | 1 | 10.109.8.6 | 64:e0:47:90:d5:02 | | controller, mongo | 1 | 1
 1 | discover | slave-01_controller_mongo | 1 | 10.109.8.4 | 64:ff:91:87:ed:0c | | controller, mongo | 1 | 1
 2 | discover | slave-02_controller_mongo | 1 | 10.109.8.5 | 64:66:e4:f2:39:b3 | | controller, mongo | 1 | 1
 4 | discover | slave-05_compute_cinder | 1 | 10.109.8.8 | 64:e8:dd:c9:82:cb | | cinder, compute | 1 | 1
 6 | discover | slave-04_compute_cinder | 1 | 10.109.8.7 | 64:2a:9a:56:74:b1 | | cinder, compute | 1 | 1
 5 | discover | slave-06_ironic | 1 | 10.109.8.9 | 64:bf:34:ac:84:bf | | ironic | 1 | 1
[root@nailgun ~]# grep -R ERROR /var/log/*
/var/log/anaconda/syslog:13:50:32,609 CRIT firewalld: 2016-12-08 13:50:32 FATAL ERROR: No IPv4 and IPv6 firewall.
/var/log/anaconda/syslog:13:50:32,609 ERR firewalld: 2016-12-08 13:50:32 ERROR: Raising SystemExit in run_server
/var/log/anaconda/journal.log:Dec 08 13:52:00 nailgun.test.domain.local dracut[3048]: -rw-r--r-- 1 root root 191 Mar 6 2015 usr/lib/kbd/consolefonts/ERRORS
/var/log/anaconda/journal.log:Dec 08 13:52:42 nailgun.test.domain.local dracut[13123]: -rw-r--r-- 1 root root 191 Mar 6 2015 usr/lib/kbd/consolefonts/ERRORS
/var/log/anaconda/journal.log:Dec 08 13:53:14 nailgun.test.domain.local dracut[25563]: -rw-r--r-- 1 root root 191 Mar 6 2015 usr/lib/kbd/consolefonts/ERRORS
/var/log/fuel-bootstrap-image-build.log:modprobe: ERROR: ../libkmod/libkmod.c:586 kmod_search_moddep() could not open moddep file '/lib/modules/3.10.0-327.36.3.el7.x86_64/modules.dep.bin'
/var/log/mcollective.log:E, [2016-12-08T14:21:16.863080 #20910] ERROR -- : rabbitmq.rb:50:in `on_hbread_fail' Heartbeat read failed from 'stomp://mcollective@10.109.8.2:61613': {"ticker_interval"=>29.5, "read_fail_count"=>0, "lock_fail"=>true, "lock_fail_count"=>1}
/var/log/mcollective.log:E, [2016-12-08T16:51:55.697225 #20910] ERROR -- : rabbitmq.rb:50:in `on_hbread_fail' Heartbeat read failed from 'stomp://mcollective@10.109.8.2:61613': {"ticker_interval"=>29.5, "read_fail_count"=>0, "lock_fail"=>true, "lock_fail_count"=>1}
/var/log/nailgun/api.log:2016-12-08 14:24:03.913 ERROR [7fd2752d3880] (logger) Response code '500 Internal Server Error' for PUT /api/clusters/1/network_configuration/neutron/verify/ from 10.109.8.1:41906
/var/log/nailgun/api.log:2016-12-08 14:24:24.371 ERROR [7fd2752d3880] (logger) Response code '500 Internal Server Error' for PUT /api/clusters/1/network_configuration/neutron/verify/ from 10.109.8.1:41906
/var/log/nailgun/app.log:2016-12-08 14:24:03.907 ERROR [7fd2752d3880] (base) Unexpected exception occured
/var/log/nailgun/app.log:2016-12-08 14:24:24.366 ERROR [7fd2752d3880] (base) Unexpected exception occured
/var/log/remote/10.109.8.7/bootstrap/mcollective.log:2016-12-08T14:21:50.960975+00:00 err: 14:21:50.655407 #1404] ERROR -- : rabbitmq.rb:50:in `on_hbread_fail' Heartbeat read failed from 'stomp://mcollective@10.109.8.2:61613': {"ticker_interval"=>29.5, "read_fail_count"=>0, "lock_fail"=>true, "lock_fail_count"=>1}
/var/log/remote/10.109.8.9/bootstrap/mcollective.log:2016-12-08T16:52:22.876456+00:00 err: 16:52:22.751793 #1399] ERROR -- : rabbitmq.rb:50:in `on_hbread_fail' Heartbeat read failed from 'stomp://mcollective@10.109.8.2:61613': {"ticker_interval"=>29.5, "read_fail_count"=>0, "lock_fail"=>true, "lock_fail_count"=>1}
/var/log/remote/10.109.8.4/bootstrap/mcollective.log:2016-12-08T14:21:39.087455+00:00 err: 14:21:38.871774 #1363] ERROR -- : rabbitmq.rb:50:in `on_hbread_fail' Heartbeat read failed from 'stomp://mcollective@10.109.8.2:61613': {"ticker_interval"=>29.5, "read_fail_count"=>0, "lock_fail"=>true, "lock_fail_count"=>1}
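
The `fuel nodes` table above can also be checked programmatically. The following is a minimal sketch, assuming the pipe-separated layout shown in this report. `parse_fuel_nodes` and `stuck_in_discover` are hypothetical helper names, not part of the Fuel CLI, and the sample embeds two rows copied from the output above.

```python
# Two rows copied from the `fuel nodes` output in this report.
SAMPLE = """\
id | status | name | cluster | ip | mac | roles | pending_roles | online | group_id
---+----------+---------------------------+---------+------------+-------------------+-------+-------------------+--------+---------
 3 | discover | slave-03_controller_mongo | 1 | 10.109.8.6 | 64:e0:47:90:d5:02 | | controller, mongo | 1 | 1
 5 | discover | slave-06_ironic | 1 | 10.109.8.9 | 64:bf:34:ac:84:bf | | ironic | 1 | 1
"""


def parse_fuel_nodes(output):
    """Turn the pipe-separated table into a list of dicts keyed by column name."""
    header = None
    rows = []
    for line in output.splitlines():
        # Blank lines, shell prompts, and the ---+--- separator contain no "|".
        if "|" not in line:
            continue
        cells = [cell.strip() for cell in line.split("|")]
        if header is None:
            header = cells  # the first table line holds the column names
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def stuck_in_discover(rows):
    """Names of nodes still in 'discover' although roles are pending on them."""
    return [
        row["name"]
        for row in rows
        if row["status"] == "discover" and row["pending_roles"]
    ]


if __name__ == "__main__":
    for name in stuck_in_discover(parse_fuel_nodes(SAMPLE)):
        print("stuck in discover:", name)
```

A node that keeps `pending_roles` while staying in `discover` has been added to the environment but never provisioned, which is the symptom reported here.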

Changed in fuel:
importance: Undecided → Medium
Revision history for this message
Yury Tregubov (ytregubov) wrote :
description: updated
tags: added: blocker-for-qa
Changed in fuel:
importance: Medium → High
milestone: none → 10.1
Changed in fuel:
status: New → Confirmed
Roman Rufanov (rrufanov)
Changed in fuel:
importance: High → Critical
assignee: nobody → Alexey Shtokolov (ashtokolov)
Changed in fuel:
assignee: Alexey Shtokolov (ashtokolov) → Vladimir Kuklin (vkuklin)
Revision history for this message
Vladimir Kuklin (vkuklin) wrote :

This bug is caused by an incorrect test setup: node interfaces (except the admin one) are not assigned to any networks. Here is a snippet of the node-1 networking config:

- assigned_networks:
  - id: 1
    name: fuelweb_admin
  bus_info: '0000:00:03.0'
  current_speed: null
  driver: virtio_net
  id: 1
  interface_properties:
    disable_offloading: false
    dpdk:
      available: false
      enabled: false
    mtu: null
    numa_node: 0
    pci_id: 1af4:0001
    sriov:
      available: false
      enabled: false ...
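
A check for this kind of misconfiguration could look like the sketch below. It assumes the interface config has already been loaded into a list of dicts shaped like the snippet above. The interface `name` field, the `eth0`..`eth2` names, and the list of required networks are illustrative assumptions, not values taken from this environment.

```python
# Hypothetical data modelled on the snippet above: only the admin network
# is assigned, and the remaining interfaces carry no networks at all.
NODE_INTERFACES = [
    {"name": "eth0", "assigned_networks": [{"id": 1, "name": "fuelweb_admin"}]},
    {"name": "eth1", "assigned_networks": []},
    {"name": "eth2", "assigned_networks": []},
]

# Assumed set of networks the environment needs; adjust to the real template.
REQUIRED_NETWORKS = ["fuelweb_admin", "public", "management", "storage", "private"]


def unassigned_interfaces(interfaces):
    """Names of interfaces that have no network assigned."""
    return [
        iface["name"]
        for iface in interfaces
        if not iface.get("assigned_networks")
    ]


def missing_networks(interfaces, required):
    """Required networks that are not assigned to any interface, sorted by name."""
    assigned = {
        net["name"]
        for iface in interfaces
        for net in (iface.get("assigned_networks") or [])
    }
    return sorted(set(required) - assigned)


if __name__ == "__main__":
    print("interfaces without networks:", unassigned_interfaces(NODE_INTERFACES))
    print("networks not assigned:", missing_networks(NODE_INTERFACES, REQUIRED_NETWORKS))
```

Running such a check before triggering the deployment would flag the setup problem early instead of leaving the nodes in the discover state.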

Changed in fuel:
assignee: Vladimir Kuklin (vkuklin) → Fuel QA Team (fuel-qa)
Revision history for this message
Nastya Urlapova (aurlapova) wrote :

@Folks, fuel-qa doesn't support cases with OS components, so I would ask the MOS-QA team for assistance here.

Changed in fuel:
assignee: Fuel QA Team (fuel-qa) → MOS QA Team (mos-qa)
Revision history for this message
Yury Tregubov (ytregubov) wrote :

Tried to deploy the envs with these templates and the fuel-qa stable/newton branch:

https://github.com/Mirantis/mos-ci-deployment-scripts/blob/stable/9.0/templates/tempest/ironic_cinder.yaml

https://github.com/Mirantis/mos-ci-deployment-scripts/blob/stable/9.0/templates/stepler_tempest/ironic_cinder.yaml

In both cases I hit the network check problem at the very beginning of the deployment, as described above.

Then I tried to continue the deployment manually from a snapshot with empty slaves and hit the following problem: the repo mirror.fuel-infra.org is not reachable from the compute and ironic nodes, although it is accessible from the controllers and the Fuel master node. The network checks passed before the deployment started.

ytregubov@srv136-bud:~$ ssh root@10.109.8.2
root@10.109.8.2's password:
Last login: Mon Dec 12 09:31:04 2016 from 10.109.8.1
[root@nailgun ~]# ssh 10.109.8.8
Warning: Permanently added '10.109.8.8' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-53-generic x86_64)

 * Documentation: https://help.ubuntu.com
 * Management: https://landscape.canonical.com
 * Support: https://ubuntu.com/advantage
Last login: Mon Dec 12 09:34:41 2016 from 10.109.8.2
root@node-6:~# ping mirror.fuel-infra.org
PING seed.fuel-infra.org (74.217.65.25) 56(84) bytes of data.
From 10.109.7.9 icmp_seq=1 Destination Host Unreachable
From 10.109.7.9 icmp_seq=2 Destination Host Unreachable
From 10.109.7.9 icmp_seq=3 Destination Host Unreachable

Revision history for this message
Yury Tregubov (ytregubov) wrote :
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

Hi dev team, could you please take a look? It looks like MOS 10.0 cannot deploy some simple configurations that were used to run the Tempest test suite.

Changed in fuel:
assignee: MOS QA Team (mos-qa) → Fuel Sustaining (fuel-sustaining-team)
Revision history for this message
Vladimir Kuklin (vkuklin) wrote :

Gentlemen, the RCA has been conducted and the result is the following: the tests are creating an environment with an incorrect configuration. Once this configuration is fixed, everything will start working: just assign the interfaces to the networks and trigger the deployment. Until the networking config is fixed, there is nothing we can do about this failure.

Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → MOS QA Team (mos-qa)
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

Vladimir, thank you for the update!

We are going to fix the issue.

Changed in fuel:
assignee: MOS QA Team (mos-qa) → Sergii Turivnyi (sturivnyi)
Changed in fuel:
status: Confirmed → Fix Released
Revision history for this message
Sergii Turivnyi (sturivnyi) wrote :

We have updated the configuration. The env now deploys successfully.
