virtlogd should be restarted on upgrade

Bug #1738834 reported by Vincent Bernat
This bug affects 1 person
Affects: libvirt (Ubuntu)
Status: Incomplete
Importance: Medium
Assigned to: Unassigned

Bug Description

Hello,

When upgrading from libvirt 1.3.1 (in Xenial) to libvirt 3.6.0 (in Bionic), virtlogd is not restarted on upgrade. It seems libvirt is unable to work with an old virtlogd. If not restarted, any spawned VM will just hang the whole libvirtd process because libvirt is not able to speak with virtlogd (one thread holds the lock in virLogManagerDomainOpenLogFile/virNetClientProgramCall/virNetClientSendWithReply). Nothing specific about that in the logs.

An easy way to test this is to just add Bionic sources to Xenial and upgrade libvirt0 and libvirt-bin. After restarting virtlogd, everything works as expected.

Vincent Bernat (vbernat) wrote :

Looking at postinst, I see virtlogd is reloaded. I have checked that a reload triggers a re-exec (as expected). Therefore, I don't know why this is not enough. I should investigate this further.

Andreas Hasenack (ahasenack) wrote :

It looks like there is no scenario in which this upgrade between versions 1.3.1 and 3.6.0 happens naturally other than a release upgrade in Ubuntu, right? And that would require a reboot at the end.

Given that, I'll mark this as "low" priority for now until further data about why reload/reexec isn't working surfaces.

Changed in libvirt (Ubuntu):
importance: Undecided → Low
status: New → Triaged
Christian Ehrhardt  (paelzer) wrote :

Another way to hit this upgrade path is Xenial using libvirt and then upgrading to Cloud-Archive Pike (Artful based).

Since we already have 18.04/Queens, which is also supported for longer, I tried with that.

This is most likely due to "--no-restart-on-upgrade", which is used for virtlogd.
That was actually added in Debian to avoid issues with guest logs caused by the restart.

I agree with Andreas that the most common case is a release upgrade, but I'd still love to see this solved.

This seems to be almost a no-way-out issue; quoting from the upstream changelog (a somewhat older entry, from 2016):

  virtlogd: Don't stop or restart along with libvirtd
  Commit 839a060 tied the lifecycle of virtlogd more
  closely to that of libvirtd. Unfortunately, while starting
  virtlogd when libvirtd is started is definitely a good idea,
  restarting virtlogd or shutting it down at any time outside
  of system poweroff is not.

  Revert part of that commit by removing the PartOf= lines,
  meaning that only startup requests will be propagated from
  libvirtd to virtlogd.

Somewhere in the back of my mind is something about getting virtlogd to reload/reopen files as needed, but I can't remember at the moment where/what to look for.

This needs some checks to find a way to make it
a) not fail due to a restart
b) not fail due to not being restarted
which seem contradictory until we have found a way to do both.

Christian Ehrhardt  (paelzer) wrote :

This was it:
The virtlogd daemon has the ability to re-exec() itself upon receiving SIGUSR1, to allow live upgrades without downtime.

Changed in libvirt (Ubuntu):
importance: Low → Medium
Christian Ehrhardt  (paelzer) wrote :

For that it has:
  ExecReload=/bin/kill -USR1 $MAINPID
defined and would do the right thing.
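
As a toy illustration of why a re-exec keeps the PID: the sketch below is not virtlogd's code, it only shows plain exec semantics in shell. The function name reexec_pids is made up for this example. It prints the PID of a process before and after that process replaces itself with a new program image.

```shell
# Hypothetical helper: print the PID of a shell, then exec a fresh shell
# in its place and print the PID again. Because exec replaces the program
# image without forking, both lines show the same PID.
reexec_pids() {
    sh -c 'echo "$$"; exec sh -c "echo \$\$"'
}

# Example call: prints two lines, one PID per line.
reexec_pids
```

This is the same property that makes "the main PID doesn't change" the expected outcome of a reload that re-execs.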

Christian Ehrhardt  (paelzer) wrote :

But that should be done in postinst:
  # Force virtlockd to reexec if enabled
  if [ -d /run/systemd/system ]; then
    ! systemctl is-active -q virtlogd || systemctl reload virtlogd.service >/dev/null
    ! systemctl is-active -q virtlockd || systemctl reload virtlockd.service >/dev/null
  fi

Also see https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=833745 which added it a long time ago.

Hmm, doing a reload manually gave me
  Process: 10228 ExecReload=/bin/kill -USR1 $MAINPID (code=exited, status=0/SUCCESS)
Obviously as intended the main PID doesn't change.

I wonder if something makes it skip the maintscript portion that is supposed to do that.
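
Since the PID stays the same across a re-exec, the PID alone cannot show whether the re-exec happened. One possible check, sketched here under the assumption of a Linux /proc filesystem: when a package upgrade replaces a binary on disk, a process still running the old image typically has a /proc/<pid>/exe link whose target ends in " (deleted)". The helper names exe_target_is_stale and report_exe_state are hypothetical.

```shell
# Hypothetical helper: succeeds if a /proc/<pid>/exe link target indicates
# that the binary on disk was replaced after the process started.
exe_target_is_stale() {
    case "$1" in
        *" (deleted)") return 0 ;;
        *)             return 1 ;;
    esac
}

# Hypothetical helper: prints a one-line verdict for a given PID.
# Reading another user's /proc/<pid>/exe usually needs root.
report_exe_state() {
    target=$(readlink "/proc/$1/exe" 2>/dev/null) || target=""
    if [ -z "$target" ]; then
        echo "pid $1: cannot read /proc/$1/exe"
    elif exe_target_is_stale "$target"; then
        echo "pid $1: still running a replaced binary: $target"
    else
        echo "pid $1: running the on-disk binary: $target"
    fi
}
```

It could be called with the main PID that systemd reports, for example report_exe_state "$(systemctl show -p MainPID virtlogd.service | cut -d= -f2)".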

Christian Ehrhardt  (paelzer) wrote :

Here is the log after an install of libvirt-daemon-system (which has the snippet to reload):

systemctl status virtlogd --no-pager
...
Jun 26 10:37:11 x systemd[1]: Started Virtual machine log manager.
Jun 26 10:37:26 x systemd[1]: Reloading Virtual machine log manager.
Jun 26 10:37:26 x systemd[1]: Reloaded Virtual machine log manager.
Jun 26 10:37:28 x systemd[1]: Started Virtual machine log manager.

Therefore this really is supposed to work.
It should stick with its main PID and intentionally not restart, but get reloaded via SIGUSR1.
And it should be visible in the logs.

So overall we came to the same conclusion, it does re-exec on reload and does so on install/upgrade.
I'll mark it incomplete until we know more about it, because as it stands I can't do any more for this bug.

Changed in libvirt (Ubuntu):
status: Triaged → Incomplete
Janåke Rönnblom (jan-ake) wrote :

This happens during the unattended-upgrade at night. The libvirt-bin is then not fully upgraded/installed.

Jul 18 06:12:07 x systemd[1]: Reloading Virtual machine log manager.
Jul 18 06:12:07 x systemd[1]: Reloaded Virtual machine log manager.
Jul 18 06:12:09 x systemd[1]: Dependency failed for Virtual machine log manager.
Jul 18 06:12:09 x systemd[1]: virtlogd.service: Job virtlogd.service/start failed with result 'dependency'.

It seems to me that virtlogd is reloaded, but seconds later something tries to start it. Could systemd, when installing libvirt-bin, try to start virtlogd even though it's already running? And what dependency failed?
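
To compare such log excerpts across machines, a small filter can reduce the journal to the sequence of events. This is only a sketch that matches on the message texts quoted in this report; the function name classify_unit_events and the output keywords are made up for the example.

```shell
# Hypothetical helper: reads journal lines on stdin and prints one keyword
# per recognised systemd message. Lines that match nothing are skipped.
classify_unit_events() {
    while IFS= read -r line; do
        case "$line" in
            *"Reloading "*)             echo reload-begin ;;
            *"Reloaded "*)              echo reload-done ;;
            *"Started "*)               echo start-ok ;;
            *"Dependency failed for "*) echo start-dependency-failed ;;
        esac
    done
}
```

Possible usage: journalctl -u virtlogd.service --no-pager | classify_unit_events. To see which units the service depends on, systemctl list-dependencies virtlogd.service may help narrow down the failed dependency.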

The only way I found to fix this is to stop virtlogd and virtlockd. Then libvirt-bin can be upgraded using apt-get.

Christian Ehrhardt  (paelzer) wrote :

Hmm, maybe another package in that upgrade set restarts it, which it should not do.

I prepared an upgrade from Xenial to the Cloud Archive versions as initially reported.
This comes down to the critical section being:
The following NEW packages will be installed:
  libvirt-clients libvirt-daemon libvirt-daemon-driver-storage-rbd libvirt-daemon-system
The following packages will be upgraded:
  libvirt-bin libvirt0

Experimenting with that I see similar (but no failure):
Jul 18 11:01:49 x-uca systemd[1]: Started Virtual machine log manager.
Jul 18 11:22:28 x-uca systemd[1]: Reloading Virtual machine log manager.
Jul 18 11:22:28 x-uca systemd[1]: Reloaded Virtual machine log manager.
Jul 18 11:22:29 x-uca systemd[1]: Started Virtual machine log manager.

Splitting this apart with dpkg --force-depends -i instead of apt I found that libvirt-daemon-system triggers a reload (ok as it has the new files).
None of the other packages do (of course in the past when libvirt-bin had the service this would trigger the reload).

That said it reloads just fine, and I'd in fact assume that this is no "extra" restart.
It triggers the reload, and once that is complete it reports "started", having taken over the file descriptors and kept the PID.
To test that I just ran "systemctl reload virtlogd.service", but that only gave me
  Jul 18 11:54:53 x-uca systemd[1]: Reloading Virtual machine log manager.
  Jul 18 11:54:53 x-uca systemd[1]: Reloaded Virtual machine log manager.

Hmm, running start gives me the start message.
But since this is not a restart it is safe.
The service retains its PID and FDs, and actually nothing happens.
 $ systemctl start virtlogd.service
 Log:
   Jul 18 11:55:56 x-uca systemd[1]: Started Virtual machine log manager.

In fact what triggers this is the SysV compat code.
The package still ships /etc/init.d/virtlogd (for backports and people who insist on running without systemd).
If that exists the postinst will call
  $ invoke-rc.d virtlogd start
That is the start action we see, because with systemd installed the old calls are mapped to systemd.

But again, it is no REstart so nothing changes.
The question is why the START action in your case fails.

If you just run "systemctl start virtlogd.service":
- does it work or fail as in your upgrade case?
- does it change the PID?
We need to find out why the "start" action fails in your case.
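
The two questions above could be answered in one go with something like the following sketch. The helper names unit_main_pid and check_start_keeps_pid are hypothetical; the MainPID property is parsed from the "MainPID=<n>" line that systemctl show prints.

```shell
# Hypothetical helper: prints the MainPID systemd reports for a unit.
unit_main_pid() {
    systemctl show -p MainPID "$1" | cut -d= -f2
}

# Hypothetical helper: runs "systemctl start" on the unit and reports
# whether the start succeeded and whether the main PID changed.
check_start_keeps_pid() {
    before=$(unit_main_pid "$1")
    if systemctl start "$1"; then
        result="start succeeded"
    else
        result="start failed"
    fi
    after=$(unit_main_pid "$1")
    if [ "$before" = "$after" ]; then
        echo "$result; MainPID unchanged ($after)"
    else
        echo "$result; MainPID changed ($before -> $after)"
    fi
}
```

Run as root, for example: check_start_keeps_pid virtlogd.service.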

Christian Ehrhardt  (paelzer) wrote :

Discussions revealed that it might also be interesting to check the status of the virtlogd.socket before the update.
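
A minimal sketch for capturing that state before the update; the function name snapshot_unit_states is made up, and the unit list in the usage note is only a suggestion based on the units mentioned in this report.

```shell
# Hypothetical helper: prints "<unit>: <state>" for each unit given,
# using the state word that "systemctl is-active" reports.
snapshot_unit_states() {
    for unit in "$@"; do
        # is-active exits non-zero for anything but "active", so ignore
        # the exit status and keep only the printed state.
        state=$(systemctl is-active "$unit" 2>/dev/null) || true
        echo "$unit: ${state:-unknown}"
    done
}
```

Possible usage before running the upgrade: snapshot_unit_states virtlogd.socket virtlogd.service virtlockd.service.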
