Ceph upgrade failed during minor (Rocky to Rocky) overcloud upgrade
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
tripleo | Fix Released | High | John Fulton | | |
Bug Description
Description:
"openstack overcloud external-update run --tags ceph" command fails when trying to do a minor Rocky to Rocky upgrade.
Steps:
1. Upgraded the undercloud
2. Prepared the update
openstack overcloud update prepare ...
3. Upgraded all overcloud nodes
openstack overcloud update run --nodes Controller
openstack overcloud update run --nodes Compute
openstack overcloud update run --nodes CephStorage
4. Ran "openstack overcloud external-update run --tags ceph"
Expected results:
A successfully updated containerized Red Hat Ceph Storage 3 cluster.
Actual results:
```
TASK [set facts for swift back up of ceph-ansible fetch directory] *************
Tuesday 11 June 2019 22:44:07 +0100 (0:00:00.068) 0:01:02.993 **********
ok: [undercloud] => {"ansible_facts": {"new_ceph_
: "temporary_
ible_fetch_
url": "https:/
url_sig=
TASK [attempt download of fetch directory tarball from swift backup] ***********
Tuesday 11 June 2019 22:44:07 +0100 (0:00:00.079) 0:01:03.073 **********
[WARNING]: Consider using the get_url or uri module rather than running curl.
If you need to use command because get_url or uri is insufficient you can add
warn=False to this command task or set command_
get rid of this message.
changed: [undercloud] => {"changed": true, "cmd": "curl -s -o /tmp/temporary_
5.5.2:13808/
646cc71b094dd6a
0, "start": "2019-06-11 22:44:07.490375", "stderr": "", "stderr_lines": [], "stdout": "401", "stdout_lines": ["401"]}
TASK [ensure we create a new fetch_directory or use the old fetch_directory] ***
Tuesday 11 June 2019 22:44:07 +0100 (0:00:00.409) 0:01:03.482 **********
fatal: [undercloud]: FAILED! => {"changed": false, "msg": "Received HTTP: 401 when attempting to GET from https:/
1/AUTH_
6a0ae8bc8df3d4b
NO MORE HOSTS LEFT *******
```
Environment:
openstack-
ceph-ansible-
python-
openstack-
ansible-
python2-
openstack-
ansible-
openstack-
puppet-
openstack-
python-
python2-
openstack-
puppet-
description: updated
Changed in tripleo:
importance: Undecided → High
milestone: none → train-2
tags: added: queens-backport-potential
Anton,
The Swift URLs which are generated by the workflow expire and that's why you got 401 unauthorized.
The tail of the URL has temp_url_expires=1560282811, which converts from epoch time to Tuesday, June 11, 2019 19:53:31 GMT. Your failed task occurred at Tuesday 11 June 2019 22:44:07 +0100, that is 21:44:07 GMT, about 1 hour 50 minutes after the URL had expired.
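The conversion can be double-checked with a short Python sketch. The two timestamps are the ones quoted from the log; the variable names are only illustrative.

```python
from datetime import datetime, timedelta, timezone

# Value of temp_url_expires taken from the failing URL in the log.
expires_epoch = 1560282811
expires_at = datetime.fromtimestamp(expires_epoch, tz=timezone.utc)

# Timestamp of the failed task as printed by Ansible (+0100 offset).
failed_at = datetime(2019, 6, 11, 22, 44, 7,
                     tzinfo=timezone(timedelta(hours=1)))

# How long after expiry the GET was attempted.
overdue = failed_at - expires_at

print(expires_at.isoformat())
print(failed_at.astimezone(timezone.utc).isoformat())
print(overdue)
```

A positive `overdue` value means the request was made after the URL had already expired, which is consistent with the 401.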
New URLs are generated when `openstack overcloud external-upgrade run` is executed and more details are in:
https://github.com/openstack/tripleo-common/commit/d1619ed9eac7ebbf8d8efae1476e1981d0a980e4
The URLs should be good for 1 hour:
https://github.com/openstack/tripleo-common/blob/master/workbooks/deployment.yaml#L633
So it must have taken >1 hour to get to this state. We should probably increase the timeout.
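As a rough illustration of how one could check a temp URL before using it, here is a minimal sketch that reads `temp_url_expires` from the query string and reports the seconds remaining. It assumes the expiry is an integer epoch value, as in the log above. The helper name `seconds_until_expiry` and the example URL (host, account, container, object and signature) are hypothetical; only the expiry value and the failure time come from this bug.

```python
import time
from urllib.parse import parse_qs, urlparse


def seconds_until_expiry(temp_url, now=None):
    """Return seconds left before a temp URL expires (negative if expired).

    Assumes temp_url_expires is an integer epoch timestamp.
    """
    query = parse_qs(urlparse(temp_url).query)
    expires = int(query["temp_url_expires"][0])
    if now is None:
        now = time.time()
    return expires - int(now)


# Hypothetical URL for illustration; only the expiry value is from the log.
example_url = (
    "https://undercloud.example:13808/v1/AUTH_example/container/object"
    "?temp_url_sig=0123abcd&temp_url_expires=1560282811"
)

# 1560289447 is 2019-06-11 21:44:07 GMT, the time of the failed task.
remaining = seconds_until_expiry(example_url, now=1560289447)
print(remaining)
```

A negative result means the URL is already expired and a new one has to be generated before the fetch directory can be downloaded.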