snapd refresh after 2.49.2 kills snap services

Bug #1924805 reported by Ian Johnson
Affects: snapd
Status: Fix Released
Importance: Critical
Assigned to: Ian Johnson

Bug Description

After a UC18 or UC20 system is refreshed to snapd 2.49.2, snapd adds the following to any new systemd units it generates (due to this PR: https://github.com/snapcore/snapd/pull/10054):

```
Requires=usr-lib-snapd.mount
After=usr-lib-snapd.mount
```

This was added to fix a race condition on certain devices where the device unexpectedly reboots during a snapd refresh.

This is problematic when snapd itself is later refreshed. Once some systemd units containing the Requires= line above have been written out (which can happen on any app snap refresh or new install after snapd 2.49.2 is installed), a snapd refresh unmounts/stops usr-lib-snapd.mount. Because of how Requires= works, stopping the mount forcibly stops every service that has the Requires= line in it. After snapd finishes refreshing itself, those services are left in the terminated state, so a snapd refresh effectively kills them.
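To see which services on a device are exposed to this, one can look for unit files that carry the Requires= line. The following is a minimal sketch, not part of snapd: the function name is hypothetical, and it assumes snap service units are named snap.*.service and live in /etc/systemd/system (pass another directory as the first argument if that does not hold).

```shell
#!/bin/sh
# Hypothetical helper (not part of snapd): print the snap service unit
# files in a directory that declare Requires= on usr-lib-snapd.mount.
# Assumption: snap service units are named snap.*.service.
list_units_requiring_snapd_mount() {
    unit_dir="${1:-/etc/systemd/system}"
    # -l prints only the names of files that contain a matching line.
    grep -l -E '^Requires=(.* )?usr-lib-snapd\.mount( |$)' \
        "$unit_dir"/snap.*.service 2>/dev/null
}

# Example usage:
#   list_units_requiring_snapd_mount /etc/systemd/system
```

Every unit this prints would be stopped along with usr-lib-snapd.mount on the next snapd refresh.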

Since snapd 2.49.2 has already been deployed, we need a solution in which the new snapd being refreshed to can work out how to restart the snap services that were killed when the old snapd stopped the usr-lib-snapd.mount unit before finalizing the refresh.
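As an illustration of what "restarting the killed services" amounts to, here is a rough manual sketch in shell. It is not the fix that went into snapd. The function name and the SYSTEMCTL override variable are hypothetical, and the unit directory and snap.*.service naming are assumptions. It would also start services that had legitimately exited on their own (for example oneshot services), so it is a blunt recovery aid rather than a faithful reconstruction of the pre-refresh state.

```shell
#!/bin/sh
# SYSTEMCTL is a hypothetical override hook so the logic can be dry-run
# against a stub; by default the real systemctl is used.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Hypothetical helper (not snapd's actual fix): start every enabled snap
# service that depends on usr-lib-snapd.mount and is currently not active.
restart_stopped_snap_services() {
    unit_dir="${1:-/etc/systemd/system}"
    for unit_file in "$unit_dir"/snap.*.service; do
        [ -e "$unit_file" ] || continue
        unit=$(basename "$unit_file")
        # Only units carrying the Requires= line can have been stopped
        # together with the mount unit.
        grep -q '^Requires=.*usr-lib-snapd\.mount' "$unit_file" || continue
        # Leave services alone that the admin has disabled.
        "$SYSTEMCTL" is-enabled --quiet "$unit" || continue
        if ! "$SYSTEMCTL" is-active --quiet "$unit"; then
            echo "starting $unit"
            "$SYSTEMCTL" start "$unit"
        fi
    done
}

# Example usage (as root, after the snapd refresh has finished):
#   restart_stopped_snap_services
```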

Changed in snapd:
assignee: nobody → Ian Johnson (anonymouse67)
importance: Undecided → Critical
status: New → In Progress
Ian Johnson (anonymouse67) wrote :

The root cause of snap services being killed but not restarted is the order of operations that snapd performs during a refresh of the snapd snap:

1) change usr-lib-snapd.mount
2) systemctl daemon-reload
3) systemctl enable usr-lib-snapd.mount
4) systemctl stop usr-lib-snapd.mount (!!!!!!!!!!!!!!!!!!!!)
5) systemctl start usr-lib-snapd.mount

Step 4 is the bit that kills the snap services, and then when we do step 5, the services that we killed in 4 are not restarted. If instead we did the following:

1) change usr-lib-snapd.mount
2) systemctl daemon-reload
3) systemctl enable usr-lib-snapd.mount
4) systemctl restart usr-lib-snapd.mount

then we would not have this situation where snap services get killed when snapd gets refreshed.
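The two sequences can be written side by side as shell functions. This is only a sketch of the steps listed above: snapd performs them internally rather than through a script, the function names and the SYSTEMCTL override variable are hypothetical, and step 1 (rewriting the mount unit file) is assumed to have happened already. The comments on stop and restart reflect the documented Requires= semantics in systemd, where dependents are stopped or restarted together with the required unit.

```shell
#!/bin/sh
# SYSTEMCTL is a hypothetical override hook for dry runs; defaults to the
# real systemctl.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"
MOUNT_UNIT="usr-lib-snapd.mount"

# Sequence with an explicit stop followed by a start.
refresh_mount_stop_start() {
    "$SYSTEMCTL" daemon-reload &&
    "$SYSTEMCTL" enable "$MOUNT_UNIT" &&
    # Explicit stop: units with Requires= on the mount are stopped too.
    "$SYSTEMCTL" stop "$MOUNT_UNIT" &&
    # Start brings back the mount only; the stopped services stay stopped.
    "$SYSTEMCTL" start "$MOUNT_UNIT"
}

# Proposed sequence with a single restart.
refresh_mount_restart() {
    "$SYSTEMCTL" daemon-reload &&
    "$SYSTEMCTL" enable "$MOUNT_UNIT" &&
    # Restart: units with Requires= on the mount are restarted with it.
    "$SYSTEMCTL" restart "$MOUNT_UNIT"
}
```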

We don't necessarily need to change this behavior: the snapd already in the wild does this, and we can't stop it from doing so when it refreshes to a newer snapd. Still, this makes it clearer what is going on.

Paweł Stołowski (stolowski) wrote :
Changed in snapd:
status: In Progress → Fix Committed
status: Fix Committed → Fix Released
Ian Johnson (anonymouse67) wrote :

To be clear, snapd 2.50 fixed both the bug where refreshing _from_ 2.49.2 killed snap services and the root cause that led to services being killed on refresh.

Changed in snapd:
milestone: none → 2.51