restore-backup doesn't complete when trying to decrease controller numbers.

Bug #1720737 reported by José Pekkarinen
Affects         Status     Importance  Assigned to  Milestone
Canonical Juju  Triaged    High        Unassigned
2.1             Won't Fix  Undecided   Unassigned

Bug Description

Steps to reproduce:

1) bootstrap a new controller.
2) take backup.
3) enable ha.
4) restore backup.
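
The steps above can be sketched as a small script. This is only a sketch under stated assumptions: the `run` and `reproduce` helpers and the `DRY_RUN` switch are hypothetical names introduced here, and reading the backup ID from the first line of `juju create-backup` output is assumed from the transcript in this report, not from documented behaviour. By default the script only prints the commands.

```shell
#!/bin/sh
# Reproduction sketch. DRY_RUN defaults to 1, so commands are only printed.
# Set DRY_RUN=0 to actually run them against a local LXD cloud.
DRY_RUN="${DRY_RUN:-1}"

# run (hypothetical helper): print a command, execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# reproduce CONTROLLER_NAME (hypothetical helper): the four steps above.
reproduce() {
    controller="$1"

    # 1) bootstrap a new controller (same invocation as in this report).
    run juju bootstrap --build-agent localhost "$controller"

    # 2) take backup. Assumption: the backup ID is the first line printed.
    if [ "$DRY_RUN" = "0" ]; then
        backup_id="$(juju create-backup -m controller | head -n 1)"
    else
        echo "+ juju create-backup -m controller"
        backup_id="<backup-id>"
    fi

    # 3) enable ha.
    run juju enable-ha

    # 4) restore backup; this is the step that fails in this report.
    run juju restore-backup -m controller --id="$backup_id"
}

reproduce ant-controller
```

Running it with `DRY_RUN=0` executes the commands for real; in practice one would probably wait for the new controller machines to come up after `juju enable-ha` before restoring.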

Expected: As stated in the documentation, the restore reverts to a non-HA Juju controller.
Got:

ERROR could not clean up after failed restore attempt: cannot complete restore: <nil>: Restore did not finish succesfuly
ERROR cannot perform restore: <nil>: restore failed: error restoring state from backup: setting special user permission in db: Role "oploger@admin" already exists

See the complete output below:

$ juju bootstrap --build-agent localhost ant-controller
Creating Juju controller "ant-controller" on localhost/localhost
Building local Juju agent binary version 2.1.4 for amd64
To configure your system to better support LXD containers, please see: https://github.com/lxc/lxd/blob/master/doc/production-setup.md
Launching controller instance(s) on localhost/localhost...
 - juju-a20416-0 (arch=amd64)
Fetching Juju GUI 2.9.2
Waiting for address
Attempting to connect to 192.168.0.3:22
Logging to /var/log/cloud-init-output.log on the bootstrap machine
Running apt-get update
Running apt-get upgrade
Installing curl, cpu-checker, bridge-utils, cloud-utils, tmux
Installing Juju machine agent
Starting Juju machine agent (service jujud-machine-0)
Bootstrap agent now started
Contacting Juju controller at 192.168.0.3 to verify accessibility...
Bootstrap complete, "ant-controller" controller now available.
Controller machines are in the "controller" model.
Initial model "default" added.

$ juju create-backup -m controller
20171002-062133.b56950f7-d04a-480a-8993-0b9871a20416
downloading to juju-backup-20171002-062133.tar.gz
pekkari@ant ~/workspace $ juju restore-backup -m controller --id=20171002-062133.b56950f7-d04a-480a-8993-0b9871a20416
restore from "20171002-062133.b56950f7-d04a-480a-8993-0b9871a20416" completed

$ juju enable-ha
maintaining machines: 0
adding machines: 1, 2

$ juju restore-backup -m controller --id=20171002-062133.b56950f7-d04a-480a-8993-0b9871a20416
ERROR could not clean up after failed restore attempt: cannot complete restore: <nil>: Restore did not finish succesfuly
ERROR cannot perform restore: <nil>: restore failed: error restoring state from backup: setting special user permission in db: Role "oploger@admin" already exists

Best regards.

José.

tags: added: 4010
John A Meinel (jameinel) wrote :

Is this specifically different from bug #1720740 ?

Changed in juju:
status: New → Incomplete
tags: added: cpe-onsite
José Pekkarinen (koalinux) wrote :

Yes, it is. In this one I am trying to decrease the number of controllers, as stated in the
documentation, and it finishes with an error message. The other one just tries to restore the
backup and gets stuck forever. It might be that the same fix works for both, or they might
each need a specific patch.

Changed in juju:
status: Incomplete → New
Ian Booth (wallyworld) wrote :

We plan to address HA issues in the upcoming 18.04 cycle.

Changed in juju:
milestone: none → 2.4-beta1
importance: Undecided → High
status: New → Triaged
tags: added: restore-backup
Heather Lanigan (hmlanigan) wrote :

@koalinux

Is it possible that you hit this bug: https://bugs.launchpad.net/juju/+bug/1740969, whereby attempting to restore a backup more than once produces an error saying that the mongo user has already been created?

José Pekkarinen (koalinux) wrote :

@hmlanigan, I'm afraid this was a fresh Juju HA deployment on my laptop, so the output you
see is from the very first run. It succeeded in creating the backup, but it never succeeded in
restoring it.

Changed in juju:
milestone: 2.4-beta1 → none
John A Meinel (jameinel) wrote :

In juju 2.4 you'll be able to just "juju remove-machine -m controller N" to remove a controller machine in order to change the number of controllers.

So the part about being able to restore to an HA controller is already tracked in the other bug, and the part about changing the number of controllers is already fixed as part of bug #1658033.
