Unattended-Upgrade will upgrade 1020-oem kernel without nvidia-driver

Bug #1997505 reported by Bin Li
This bug affects 1 person
Affects                                        Status         Importance  Assigned to     Milestone
OEM Priority Project                           Fix Released   Critical    Bin Li
linux-meta-oem-5.14 (Ubuntu)                   New            Critical    Unassigned
  Focal                                        New            Undecided   Unassigned
  Jammy                                        New            Undecided   Unassigned
linux-meta-oem-5.17 (Ubuntu)                   Fix Released   Critical    Andy Whitcroft
  Focal                                        New            Undecided   Unassigned
  Jammy                                        New            Undecided   Unassigned
linux-restricted-modules-media-fixup (Ubuntu)  New            Undecided   Unassigned
  Focal                                        Fix Committed  Undecided   Unassigned
  Jammy                                        Fix Committed  Undecided   Unassigned

Bug Description

[ Impact ]

When a factory image is installed onto a new system, it ends up with the very old packages that were available when that image was frozen. Due to strict version clamps in some older linux-restricted-modules packages, these will not upgrade because removals are required; the associated kernel, however, can still upgrade. As a result, unattended-upgrades upgrades one without the other, leaving the latest kernel unable to drive the display: a very poor user experience on second boot.

[ Test Plan ]

Install a factory image into a VM and allow unattended-upgrades to upgrade the system; expect the kernel to upgrade and Nvidia components to be held-back. Then install this package and expect both to upgrade.

[ Where problems could occur ]

The new source provides updated packages for very old and now abandoned ABI specific packages. No current install should have the packages we are changing nor should they be installed by normal updates. Affected installs from frozen media should install a single package out of this set based on their frozen ABI version releasing the strict version clamp, and then immediately upgrade to the latest packages in the archive. We do not expect these packages to remain installed on any system.

[ Other Info ]

All included packages are pulled directly from the Launchpad Librarian.

===

If the GMed image uses a kernel earlier than 5.17.0-1020-oem, you will likely hit this issue.

The 1020-oem kernel is in the security channel. On jammy, unattended-upgrade installs security fixes by default.

On I+N platforms, the NVIDIA driver could not be installed for the 1020-oem kernel, so the user gets a black screen because the NVIDIA modules cannot be loaded.

Unattended-Upgrade::Allowed-Origins {
        "${distro_id}:${distro_codename}";
        "${distro_id}:${distro_codename}-security";
        // Extended Security Maintenance; doesn't necessarily exist for
        // every release and this system may not have it installed, but if
        // available, the policy for updates is such that unattended-upgrades
        // should also install from here by default.
        "${distro_id}ESMApps:${distro_codename}-apps-security";
        "${distro_id}ESM:${distro_codename}-infra-security";
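
The origin patterns above rely on the `${distro_id}` and `${distro_codename}` placeholders. The following is a minimal sketch of how they expand on a jammy system, assuming the values `Ubuntu` and `jammy`. The helper `expand_origins` is hypothetical and is not part of unattended-upgrades; it only mirrors the `o=...,a=...` form shown on the "Allowed origins are:" line of the `unattended-upgrade -v` output quoted further down in this report.

```python
from string import Template

# Patterns copied from the Unattended-Upgrade::Allowed-Origins block above.
PATTERNS = [
    "${distro_id}:${distro_codename}",
    "${distro_id}:${distro_codename}-security",
    "${distro_id}ESMApps:${distro_codename}-apps-security",
    "${distro_id}ESM:${distro_codename}-infra-security",
]


def expand_origins(patterns, distro_id, distro_codename):
    """Expand 'origin:archive' patterns into 'o=...,a=...' strings.

    Hypothetical helper for illustration only.
    """
    expanded = []
    for pattern in patterns:
        value = Template(pattern).substitute(
            distro_id=distro_id, distro_codename=distro_codename
        )
        origin, archive = value.split(":", 1)
        expanded.append(f"o={origin},a={archive}")
    return expanded


if __name__ == "__main__":
    for line in expand_origins(PATTERNS, "Ubuntu", "jammy"):
        print(line)
```

Note that `jammy-updates` is not among the expanded origins in this snippet, so only packages published to the listed archives are considered for unattended installation.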

Bin Li (binli) wrote :

$ apt-cache policy linux-oem-22.04a
linux-oem-22.04a:
  Installed: 5.17.0.1020.19
  Candidate: 5.17.0.1021.20
  Version table:
     5.17.0.1021.20 500
        500 http://us.archive.ubuntu.com/ubuntu jammy-updates/main amd64 Packages
 *** 5.17.0.1020.19 500
        500 http://security.ubuntu.com/ubuntu jammy-security/main amd64 Packages
        100 /var/lib/dpkg/status
     5.17.0.1003.3 500
        500 http://us.archive.ubuntu.com/ubuntu jammy/main amd64 Packages

$ sudo apt-cache policy linux-modules-nvidia-515-5.17.0-1020-oem
linux-modules-nvidia-515-5.17.0-1020-oem:
  Installed: (none)
  Candidate: 5.17.0-1020.21+1
  Version table:
     5.17.0-1020.21+1 500
        500 http://us.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 Packages
        500 http://security.ubuntu.com/ubuntu jammy-security/restricted amd64 Packages
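
To check many machines for the same state, the `Installed:` and `Candidate:` fields of `apt-cache policy` output can be extracted with a small parser. This is a rough sketch, assuming the output layout shown above; `parse_policy` is a hypothetical helper, not an apt API.

```python
import re

# Trimmed copy of the second policy output above.
SAMPLE = """\
linux-modules-nvidia-515-5.17.0-1020-oem:
  Installed: (none)
  Candidate: 5.17.0-1020.21+1
"""


def parse_policy(text):
    """Return package name, installed version and candidate version.

    A version reported as '(none)' is returned as None.
    """
    package = re.search(r"^(\S+):\s*$", text, re.MULTILINE)
    installed = re.search(r"^\s*Installed:\s*(\S+)", text, re.MULTILINE)
    candidate = re.search(r"^\s*Candidate:\s*(\S+)", text, re.MULTILINE)
    if not (package and installed and candidate):
        raise ValueError("text does not look like apt-cache policy output")

    def normalise(version):
        return None if version == "(none)" else version

    return {
        "package": package.group(1),
        "installed": normalise(installed.group(1)),
        "candidate": normalise(candidate.group(1)),
    }


if __name__ == "__main__":
    info = parse_policy(SAMPLE)
    if info["installed"] is None:
        print(f"{info['package']} is not installed")
```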

description: updated
tags: added: oem-priority originate-from-1995563 sutton
Changed in oem-priority:
importance: Undecided → Critical
assignee: nobody → Bin Li (binli)
status: New → In Progress
Bin Li (binli) wrote :

The NVIDIA driver is in the security channel; I'm not sure why it was not updated.

$ apt-cache policy linux-modules-nvidia-515-oem-22.04a
linux-modules-nvidia-515-oem-22.04a:
  Installed: (none)
  Candidate: 5.17.0-1021.22
  Version table:
     5.17.0-1021.22 500
        500 http://archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 Packages
     5.17.0-1020.21+1 500
        500 http://archive.ubuntu.com/ubuntu jammy-security/restricted amd64 Packages

Bin Li (binli) wrote :

According to the unattended-upgrade history, linux-modules-nvidia-515-oem-22.04a was not upgraded.

Start-Date: 2022-11-22 07:01:12
Commandline: /usr/bin/unattended-upgrade
Install: linux-oem-5.17-headers-5.17.0-1020:amd64 (5.17.0-1020.21, automatic), linux-image-5.17.0-1020-oem:amd64 (5.17.0-1020.21, automatic), linux-modules-5.17.0-1020-oem:amd64 (5.17.0-1020.21, automatic), linux-headers-5.17.0-1020-oem:amd64 (5.17.0-1020.21, automatic)
Upgrade: linux-image-oem-22.04a:amd64 (5.17.0.1015.14, 5.17.0.1020.19), linux-oem-22.04a:amd64 (5.17.0.1015.14, 5.17.0.1020.19), linux-headers-oem-22.04a:amd64 (5.17.0.1015.14, 5.17.0.1020.19)
End-Date: 2022-11-22 07:01:33

Start-Date: 2022-11-22 07:03:37
Commandline: /usr/bin/unattended-upgrade
Upgrade: libnvidia-common-515:amd64 (515.65.01-0ubuntu0.22.04.1, 515.76+really.515.65.01-0ubuntu0.22.04.2)
End-Date: 2022-11-22 07:03:37
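
The pattern in the first stanza above (a new `linux-image-*` package installed with no matching `linux-modules-nvidia-*` package) can be detected mechanically. This is a minimal sketch that inspects only the `Install:` field of a single stanza; the helper names are hypothetical, and it does not account for NVIDIA modules installed in a different run.

```python
import re

# Matches 'name:arch (' at the start of each entry in an apt history field.
PKG_RE = re.compile(r"([a-z0-9][a-z0-9+.-]*):[a-z0-9]+ \(")

# Install field of the first stanza above.
INSTALL_FIELD = (
    "linux-oem-5.17-headers-5.17.0-1020:amd64 (5.17.0-1020.21, automatic), "
    "linux-image-5.17.0-1020-oem:amd64 (5.17.0-1020.21, automatic), "
    "linux-modules-5.17.0-1020-oem:amd64 (5.17.0-1020.21, automatic), "
    "linux-headers-5.17.0-1020-oem:amd64 (5.17.0-1020.21, automatic)"
)


def package_names(field):
    """Return the package names listed in an Install:/Upgrade: field."""
    return PKG_RE.findall(field)


def kernels_missing_nvidia(install_field):
    """Return kernel ABIs installed without NVIDIA modules in this stanza."""
    names = package_names(install_field)
    prefix = "linux-image-"
    abis = [n[len(prefix):] for n in names if n.startswith(prefix)]
    missing = []
    for abi in abis:
        has_modules = any(
            n.startswith("linux-modules-nvidia-") and n.endswith("-" + abi)
            for n in names
        )
        if not has_modules:
            missing.append(abi)
    return missing


if __name__ == "__main__":
    print(kernels_missing_nvidia(INSTALL_FIELD))
```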

Bin Li (binli) wrote :

$ sudo unattended-upgrade -v
Could not figure out development release: Distribution data outdated. Please check for an update for distro-info-data. See /usr/share/doc/distro-info-data/README.Debian for details.
Starting unattended upgrades script
Allowed origins are: o=Ubuntu,a=jammy, o=Ubuntu,a=jammy-security, o=UbuntuESMApps,a=jammy-apps-security, o=UbuntuESM,a=jammy-infra-security
Initial blacklist:
Initial whitelist (not strict):
No packages found that can be upgraded unattended and no pending auto-removals
Package libnvidia-cfg1-515 is kept back because a related package is kept back or due to local apt_preferences(5).
Package libnvidia-compute-515 is kept back because a related package is kept back or due to local apt_preferences(5).
Package libnvidia-decode-515 is kept back because a related package is kept back or due to local apt_preferences(5).
Package libnvidia-encode-515 is kept back because a related package is kept back or due to local apt_preferences(5).
Package libnvidia-extra-515 is kept back because a related package is kept back or due to local apt_preferences(5).
Package libnvidia-fbc1-515 is kept back because a related package is kept back or due to local apt_preferences(5).
Package libnvidia-gl-515 is kept back because a related package is kept back or due to local apt_preferences(5).
Package linux-modules-nvidia-515-oem-22.04a is kept back because a related package is kept back or due to local apt_preferences(5).
Package nvidia-compute-utils-515 is kept back because a related package is kept back or due to local apt_preferences(5).
Package nvidia-driver-515 is kept back because a related package is kept back or due to local apt_preferences(5).
Package nvidia-kernel-common-515 is kept back because a related package is kept back or due to local apt_preferences(5).
Package nvidia-kernel-source-515 is kept back because a related package is kept back or due to local apt_preferences(5).
Package nvidia-utils-515 is kept back because a related package is kept back or due to local apt_preferences(5).
Package xserver-xorg-video-nvidia-515 is kept back because a related package is kept back or due to local apt_preferences(5).
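
The kept-back packages can be pulled out of the verbose log with a single regular expression. This is a small sketch, assuming the message wording shown above stays the same; `kept_back_packages` is a hypothetical helper.

```python
import re

KEPT_BACK_RE = re.compile(r"^Package (\S+) is kept back\b", re.MULTILINE)

# Three lines copied from the verbose output above.
SAMPLE_LOG = """\
Package libnvidia-gl-515 is kept back because a related package is kept back or due to local apt_preferences(5).
Package linux-modules-nvidia-515-oem-22.04a is kept back because a related package is kept back or due to local apt_preferences(5).
Package nvidia-driver-515 is kept back because a related package is kept back or due to local apt_preferences(5).
"""


def kept_back_packages(log_text):
    """Return the names of packages reported as kept back, in log order."""
    return KEPT_BACK_RE.findall(log_text)


if __name__ == "__main__":
    kept = kept_back_packages(SAMPLE_LOG)
    nvidia_modules = [p for p in kept if p.startswith("linux-modules-nvidia-")]
    print("kept back:", kept)
    print("NVIDIA kernel modules held:", nvidia_modules)
```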

Bin Li (binli) wrote :

https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-515/+bug/1996890/comments/3

I think the root cause is the same as lp:1996890.

Nvidia 515.76 was superseded by 515.76+really.515.65.01, and is, therefore, not available in the archive. The dependency is correct, since allowing you to install the modules from 515.76 when user-space is really from 515.65.01 will break things badly.

Bin Li (binli) wrote :

I could install the matching NVIDIA modules manually, but doing so removes linux-modules-nvidia-515-5.17.0-1016-oem.

$ sudo apt-get install linux-modules-nvidia-515-oem-22.04a=5.17.0-1020.21+1

Start-Date: 2022-11-23 03:02:00
Commandline: apt-get install linux-modules-nvidia-515-oem-22.04a=5.17.0-1020.21+1
Requested-By: u (1001)
Install: linux-objects-nvidia-515-5.17.0-1020-oem:amd64 (5.17.0-1020.21+1, automatic), linux-modules-nvidia-515-5.17.0-1020-oem:amd64 (5.17.0-1020.21+1, automatic), linux-signatures-nvidia-5.17.0-1020-oem:amd64 (5.17.0-1020.21+1, automatic)
Upgrade: libnvidia-fbc1-515:amd64 (515.65.01-0ubuntu0.22.04.1, 515.76+really.515.65.01-0ubuntu0.22.04.2), libnvidia-gl-515:amd64 (515.65.01-0ubuntu0.22.04.1, 515.76+really.515.65.01-0ubuntu0.22.04.2), linux-modules-nvidia-515-oem-22.04a:amd64 (5.17.0-1016.17, 5.17.0-1020.21+1), libnvidia-extra-515:amd64 (515.65.01-0ubuntu0.22.04.1, 515.76+really.515.65.01-0ubuntu0.22.04.2), nvidia-compute-utils-515:amd64 (515.65.01-0ubuntu0.22.04.1, 515.76+really.515.65.01-0ubuntu0.22.04.2), nvidia-driver-515:amd64 (515.65.01-0ubuntu0.22.04.1, 515.76+really.515.65.01-0ubuntu0.22.04.2), libnvidia-encode-515:amd64 (515.65.01-0ubuntu0.22.04.1, 515.76+really.515.65.01-0ubuntu0.22.04.2), nvidia-utils-515:amd64 (515.65.01-0ubuntu0.22.04.1, 515.76+really.515.65.01-0ubuntu0.22.04.2), xserver-xorg-video-nvidia-515:amd64 (515.65.01-0ubuntu0.22.04.1, 515.76+really.515.65.01-0ubuntu0.22.04.2), libnvidia-decode-515:amd64 (515.65.01-0ubuntu0.22.04.1, 515.76+really.515.65.01-0ubuntu0.22.04.2), nvidia-kernel-common-515:amd64 (515.65.01-0ubuntu0.22.04.1, 515.76+really.515.65.01-0ubuntu0.22.04.2), libnvidia-cfg1-515:amd64 (515.65.01-0ubuntu0.22.04.1, 515.76+really.515.65.01-0ubuntu0.22.04.2), nvidia-kernel-source-515:amd64 (515.65.01-0ubuntu0.22.04.1, 515.76+really.515.65.01-0ubuntu0.22.04.2), libnvidia-compute-515:amd64 (515.65.01-0ubuntu0.22.04.1, 515.76+really.515.65.01-0ubuntu0.22.04.2)

Remove: linux-modules-nvidia-515-5.17.0-1016-oem:amd64 (5.17.0-1016.17)

Alberto Milone (albertomilone) wrote :

Perhaps we should prevent unattended-upgrade from doing that when no compatible NVIDIA driver is available.

Bin Li (binli)
Changed in oem-priority:
status: In Progress → Triaged
Bin Li (binli) wrote :

The 1021-oem kernel in the security pocket did not fix this issue.

Alberto Milone (albertomilone) wrote :

The current workaround makes the linux-oem-22.04a meta package conflict with the older linux-modules-nvidia-515-oem-22.04a. This, in turn, prevents unattended-upgrades from upgrading the NVIDIA drivers.

This is in linux-oem-22.04a version 5.17.0.1021.21.

I am also attaching the relevant log from unattended-upgrades.

Here is the relevant changelog:

linux-meta-oem-5.17 (5.17.0.1021.21) jammy; urgency=medium

  * Unattended-Upgrade will upgrade 1020-oem kernel without nvidia-driver
    (LP: #1997505)
    - [Packaging] prevent upgrade without drivers if Nvidia is installed

 -- Andy Whitcroft <apw at canonical.com> Fri, 02 Dec 2022 12:21:55 +0000

Changed in linux-meta-oem-5.17 (Ubuntu):
importance: Undecided → Critical
assignee: nobody → Andy Whitcroft (apw)
status: New → Fix Released
Bin Li (binli) wrote :

Thanks all, the workaround works fine.

no longer affects: nvidia-graphics-drivers-515 (Ubuntu)
Changed in oem-priority:
status: Triaged → Fix Released
Bin Li (binli) wrote :

Hi Alberto & Andy,

Our customer has now hit this issue on focal too.
We released an image with the 5.14.0-1032.35 OEM kernel and NVIDIA 510.51-0ubuntu0.20.04.1. unattended-upgrade then upgraded the kernel to 5.14.0-1051.58, and the NVIDIA modules could not be installed.

Could you help apply the jammy fix to focal? Thanks!

$ sudo apt install linux-modules-nvidia-510-5.14.0-1051-oem
The following packages have unmet dependencies:
 linux-modules-nvidia-510-5.14.0-1051-oem : Depends: nvidia-kernel-common-510 (<= 510.85.02-1) but 510.108.03-0ubuntu0.20.04.1 is to be installed
E: Unable to correct problems, you have held broken packages.
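
The failure above comes from the strict version clamp: the modules package requires `nvidia-kernel-common-510 (<= 510.85.02-1)`, while the version to be installed is 510.108.03. The sketch below illustrates the check with a deliberately simplified comparison that looks only at the leading dotted-numeric part of each version. This is an assumption made for illustration and is not the full Debian version ordering; on a real system `dpkg --compare-versions` is authoritative.

```python
import re


def upstream_tuple(version):
    """Rough sort key: the leading dotted-numeric part of a version string.

    Simplification for illustration; ignores epochs, revisions and suffixes.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric prefix in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def satisfies_le(candidate, limit):
    """Return True if candidate <= limit under the simplified ordering."""
    return upstream_tuple(candidate) <= upstream_tuple(limit)


if __name__ == "__main__":
    limit = "510.85.02-1"                      # clamp from the Depends line
    candidate = "510.108.03-0ubuntu0.20.04.1"  # version apt wants to install
    if not satisfies_le(candidate, limit):
        print(f"{candidate} violates the clamp (<= {limit})")
```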

jeremyszu (os369510)
Changed in oem-priority:
status: Fix Released → Confirmed
Bin Li (binli) wrote :

I tested on focal platforms and found that this issue is not fixed yet.

By default the user runs the OEM kernel, which is selected by the oem-meta packages, and that works. If the user selects the generic kernel, however, the system boots to a black screen because the NVIDIA driver is missing.

Start-Date: 2023-05-31 06:04:11
Commandline: /usr/bin/unattended-upgrade -v
Requested-By: u (1001)
Install: linux-hwe-5.15-headers-5.15.0-73:amd64 (5.15.0-73.80~20.04.1, automatic), linux-headers-generic-hwe-20.04:amd64 (5.15.0.73.80~20.04.34, automatic), linux-modules-extra-5.15.0-73-generic:amd64 (5.15.0-73.80~20.04.1, automatic), linux-image-generic-hwe-20.04:amd64 (5.15.0.73.80~20.04.34, automatic), linux-generic-hwe-20.04:amd64 (5.15.0.73.80~20.04.34, automatic), linux-modules-5.15.0-73-generic:amd64 (5.15.0-73.80~20.04.1, automatic), linux-headers-5.15.0-73-generic:amd64 (5.15.0-73.80~20.04.1, automatic), linux-image-5.15.0-73-generic:amd64 (5.15.0-73.80~20.04.1, automatic)
Upgrade: linux-image-oem-20.04d:amd64 (5.14.0.1032.29, 5.15.0.73.80~20.04.34), linux-oem-20.04d:amd64 (5.14.0.1032.29, 5.15.0.73.80~20.04.34)
End-Date: 2023-05-31 06:04:45

Start-Date: 2023-05-31 06:18:38
Commandline: /usr/bin/unattended-upgrade -v
Requested-By: u (1001)
Remove: linux-image-oem-20.04d:amd64 (5.15.0.73.80~20.04.34)
End-Date: 2023-05-31 06:18:38

Start-Date: 2023-05-31 06:18:42
Commandline: /usr/bin/unattended-upgrade -v
Requested-By: u (1001)
Remove: linux-headers-oem-20.04d:amd64 (5.15.0.73.80~20.04.34)
End-Date: 2023-05-31 06:18:42

Removing linux-headers-oem-20.04d (5.15.0.73.80~20.04.34) ...
Packages that were successfully auto-removed: linux-headers-oem-20.04d linux-image-oem-20.04d
Packages that are kept back:
Package libnvidia-cfg1-510 is kept back because a related package is kept back or due to local apt_preferences(5).
Package libnvidia-compute-510 is kept back because a related package is kept back or due to local apt_preferences(5).
Package libnvidia-decode-510 is kept back because a related package is kept back or due to local apt_preferences(5).
Package libnvidia-encode-510 is kept back because a related package is kept back or due to local apt_preferences(5).
Package libnvidia-extra-510 is kept back because a related package is kept back or due to local apt_preferences(5).
Package libnvidia-fbc1-510 is kept back because a related package is kept back or due to local apt_preferences(5).
Package libnvidia-gl-510 is kept back because a related package is kept back or due to local apt_preferences(5).
Package linux-modules-nvidia-510-oem-20.04d is kept back because a related package is kept back or due to local apt_preferences(5).
Package nvidia-compute-utils-510 is kept back because a related package is kept back or due to local apt_preferences(5).
Package nvidia-driver-510 is kept back because a related package is kept back or due to local apt_preferences(5).
Package nvidia-kernel-common-510 is kept back because a related package is kept back or due to local apt_preferences(5).
Package nvidia-kernel-source-510 is kept back because a related package is kept back or due to local apt_preferences(5).
Package nvidia-utils-510 is kept back because a related package is kept bac...


Andy Whitcroft (apw)
description: updated
Changed in linux-meta-oem-5.14 (Ubuntu):
importance: Undecided → Critical
no longer affects: oem-priority/focal
jeremyszu (os369510) wrote :

By the way, the source package of linux-oem-20.04d is now linux-meta-hwe-5.15.

Steve Langasek (vorlon) wrote : Please test proposed package

Hello Bin, or anyone else affected,

Accepted linux-restricted-modules-media-fixup into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/linux-restricted-modules-media-fixup/22.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in linux-restricted-modules-media-fixup (Ubuntu Jammy):
status: New → Fix Committed
Changed in linux-restricted-modules-media-fixup (Ubuntu Focal):
status: New → Fix Committed
Steve Langasek (vorlon) wrote :

Hello Bin, or anyone else affected,

Accepted linux-restricted-modules-media-fixup into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/linux-restricted-modules-media-fixup/20.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Bin Li (binli)
Changed in oem-priority:
status: Confirmed → Fix Released