Removing legacy virtio-pci devices causes kernel panic
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Fix Released | Undecided | Unassigned |
Noble | Fix Released | Medium | Matthew Ruffell |
Bug Description
BugLink: https:/
[Impact]
If you detach a legacy virtio-pci device from a current Noble system, the kernel hits a NULL pointer dereference and panics. This affects you if you force Noble to use legacy virtio-pci devices, or if you run Noble on very old hypervisors that only support legacy virtio-pci devices, e.g. Trusty and older.
BUG: kernel NULL pointer dereference, address: 0000000000000000
...
CPU: 2 PID: 358 Comm: kworker/u8:3 Kdump: loaded Not tainted 6.8.0-31-generic #31-Ubuntu
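Workqueue: kacpi_hotplug acpi_hotplug_work_fn
RIP: 0010:0x0
...
Call Trace:
<TASK>
? show_regs+0x6d/0x80
? __die+0x24/0x80
? page_fault_oops+0x99/0x1b0
? do_user_addr_fault+0x2ee/0x6b0
? exc_page_fault+0x83/0x1b0
? asm_exc_page_fault+0x27/0x30
vp_del_vqs+0x6e/0x2a0
remove_vq_common+0x166/0x1a0
virtnet_remove+0x61/0x80
virtio_dev_remove+0x3f/0xc0
device_remove+0x40/0x80
device_release_driver_internal+0x20b/0x270
device_release_driver+0x12/0x20
bus_remove_device+0xcb/0x140
device_del+0x161/0x3e0
? pci_bus_generic_read_dev_vendor_id+0x2c/0x1a0
device_unregister+0x17/0x60
unregister_virtio_device+0x16/0x40
virtio_pci_remove+0x43/0xa0
pci_device_remove+0x36/0xb0
device_remove+0x40/0x80
device_release_driver_internal+0x20b/0x270
device_release_driver+0x12/0x20
pci_stop_bus_device+0x7a/0xb0
pci_stop_and_remove_bus_device+0x12/0x30
disable_slot+0x4f/0xa0
acpiphp_disable_and_eject_slot+0x1c/0xa0
hotplug_event+0x11b/0x280
? __pfx_acpiphp_hotplug_notify+0x10/0x10
acpiphp_hotplug_notify+0x27/0x70
acpi_device_hotplug+0xb6/0x300
acpi_hotplug_work_fn+0x1e/0x40
process_one_work+0x16c/0x350
worker_thread+0x306/0x440
? _raw_spin_lock_irqsave+0xe/0x20
? __pfx_worker_thread+0x10/0x10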
kthread+0xef/0x120
? __pfx_kthread+
ret_from_
? __pfx_kthread+
ret_from_
</TASK>
The issue was introduced in:
commit fd27ef6b44bec26
Author: Feng Liu <email address hidden>
Date: Tue Dec 19 11:32:40 2023 +0200
Subject: virtio-pci: Introduce admin virtqueue
Link: https:/
Modern virtio-pci devices are not affected. If the device is a legacy virtio device, the is_avq function pointer is not assigned in the virtio_pci_device structure of the legacy virtio device, resulting in a NULL pointer dereference when the code calls if (vp_dev-
There is no workaround. If you are affected, the only option for now is to avoid detaching legacy virtio-pci devices.
[Fix]
This was fixed in 6.9-rc1 by:
commit c8fae27d141a32a
From: Li Zhang <email address hidden>
Date: Sat, 16 Mar 2024 13:25:54 +0800
Subject: virtio-pci: Check if is_avq is NULL
Link: https:/
This is a clean cherry-pick to Noble. The commit adds a NULL check on the is_avq function pointer before the code calls through it.
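As a rough illustration of the pattern the fix uses, here is a minimal userspace sketch, not the kernel code. The struct and every `demo_` name are hypothetical stand-ins; only the `is_avq` member name comes from the description above. It assumes a teardown loop that skips the admin virtqueue, and shows that guarding the call lets a legacy device (pointer left NULL) pass through the loop safely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's virtio_pci_device. */
struct demo_pci_device {
    /* Assigned for modern devices; left NULL for legacy devices. */
    bool (*is_avq)(const struct demo_pci_device *dev, unsigned int index);
    unsigned int admin_vq_index;
};

/* Modern-device callback: true when the queue is the admin virtqueue. */
static bool demo_modern_is_avq(const struct demo_pci_device *dev,
                               unsigned int index)
{
    return index == dev->admin_vq_index;
}

/* Guarded check: test the pointer before calling through it. */
static bool demo_queue_is_avq(const struct demo_pci_device *dev,
                              unsigned int index)
{
    return dev->is_avq != NULL && dev->is_avq(dev, index);
}

/* Counts the queues a teardown loop would delete, skipping the admin one. */
static unsigned int demo_count_deleted_vqs(const struct demo_pci_device *dev,
                                           unsigned int nvqs)
{
    unsigned int deleted = 0;

    for (unsigned int i = 0; i < nvqs; i++) {
        if (demo_queue_is_avq(dev, i))
            continue; /* the admin virtqueue is handled elsewhere */
        deleted++;
    }
    return deleted;
}
```

With `is_avq` set to NULL, `demo_count_deleted_vqs` treats every queue as an ordinary one instead of jumping to address 0, which is the behaviour the unguarded call produced in the trace above.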
[Testcase]
Start a fresh Noble VM.
Edit the grub kernel command line:
1) sudo vim /etc/default/grub
GRUB_CMDLINE_
2) sudo update-grub
3) sudo reboot
Outside the VM, on the host:
$ qemu-img create -f qcow2 /root/share-
$ cat >> share-device.xml << EOF
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2' cache='writeback' io='threads'/>
<source file='/
<target dev='vdc' bus='virtio'/>
</disk>
EOF
$ sudo -s
# virsh attach-device noble-test share-device.xml --config --live
# virsh detach-device noble-test share-device.xml --config --live
A kernel panic should occur.
There is a test kernel available in:
https:/
If you install it, the panic should no longer occur.
[Where problems could occur]
We are adding a basic NULL pointer check immediately before the pointer is used, which is quite low risk.
If a regression were to occur, it would only affect VMs using legacy virtio-pci devices, which is not the default. It could have a large impact on fleets of very old hypervisors running Trusty, Precise or Lucid, but that is very unlikely in this day and age.
[Other Info]
Upstream mailing list discussion and author testcase:
https:/
tags: added: linux-image-generic
tags: added: kernel-bug; removed: linux-image-generic
Changed in linux (Ubuntu):
status: New → Fix Released
summary:
- remove virtio legacy device make kernel Oops
+ Removing legacy virtio-pci devices causes kernel panic
Changed in linux (Ubuntu Noble):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Matthew Ruffell (mruffell)
description: updated
Changed in linux (Ubuntu Noble):
status: In Progress → Fix Committed
tags: added: verification-done-noble-linux; removed: verification-needed-noble-linux
The backtrace is as follows:
[ 72.571019] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 72.571084] #PF: supervisor instruction fetch in kernel mode
[ 72.571128] #PF: error_code(0x0010) - not-present page
[ 72.571167] PGD 0 P4D 0
[ 72.571190] Oops: 0010 [#1] PREEMPT SMP NOPTI
[ 72.571225] CPU: 2 PID: 358 Comm: kworker/u8:3 Kdump: loaded Not tainted 6.8.0-31-generic #31-Ubuntu
[ 72.571344] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[ 72.571386] RIP: 0010:0x0
[ 72.571417] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[ 72.571468] RSP: 0018:ffffb0c880307a80 EFLAGS: 00010216
[ 72.571508] RAX: 0000000000000000 RBX: ffff8af8c1b08800 RCX: 0000000000000000
[ 72.571561] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8af8c1b08800
[ 72.571616] RBP: ffffb0c880307ab8 R08: 0000000000000000 R09: 0000000000000000
[ 72.571667] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8af8c550c700
[ 72.571717] R13: ffff8af8c1b08b28 R14: ffff8af8c550c200 R15: 0000000000000080
[ 72.571768] FS: 0000000000000000(0000) GS:ffff8af9e8100000(0000) knlGS:0000000000000000
[ 72.571825] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 72.571867] CR2: ffffffffffffffd6 CR3: 000000014f23c006 CR4: 00000000007706f0
[ 72.571921] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 72.571972] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 72.572023] PKRU: 55555554
[ 72.572046] Call Trace:
[ 72.572068] <TASK>
[ 72.572087] ? show_regs+0x6d/0x80
[ 72.572117] ? __die+0x24/0x80
[ 72.572144] ? page_fault_oops+0x99/0x1b0
[ 72.572177] ? do_user_addr_fault+0x2ee/0x6b0
[ 72.572211] ? exc_page_fault+0x83/0x1b0
[ 72.572244] ? asm_exc_page_fault+0x27/0x30
[ 72.572279] vp_del_vqs+0x6e/0x2a0
[ 72.572308] remove_vq_common+0x166/0x1a0
[ 72.572341] virtnet_remove+0x61/0x80
[ 72.572370] virtio_dev_remove+0x3f/0xc0
[ 72.572402] device_remove+0x40/0x80
[ 72.572433] device_release_driver_internal+0x20b/0x270
[ 72.572477] device_release_driver+0x12/0x20
[ 72.572510] bus_remove_device+0xcb/0x140
[ 72.572542] device_del+0x161/0x3e0
[ 72.572571] ? pci_bus_generic_read_dev_vendor_id+0x2c/0x1a0
[ 72.572617] device_unregister+0x17/0x60
[ 72.572648] unregister_virtio_device+0x16/0x40
[ 72.572684] virtio_pci_remove+0x43/0xa0
[ 72.572714] pci_device_remove+0x36/0xb0
[ 72.572746] device_remove+0x40/0x80
[ 72.572919] device_release_driver_internal+0x20b/0x270
[ 72.573083] device_release_driver+0x12/0x20
[ 72.573241] pci_stop_bus_device+0x7a/0xb0
[ 72.573394] pci_stop_and_remove_bus_device+0x12/0x30
[ 72.573552] disable_slot+0x4f/0xa0
[ 72.573705] acpiphp_disable_and_eject_slot+0x1c/0xa0
[ 72.573860] hotplug_event+0x11b/0x280
[ 72.574006] ? __pfx_acpiphp_hotplug_notify+0x10/0x10
[ 72.574159] acpiphp_hotplug_notify+0x27/0x70
[ 72.574304] acpi_device_hotplug+0xb6/0x300
[ 72.574452] acpi_hotplug_work_fn+0x1e/0x40
[ 72.574598] process_one_work+0x16c/0x350
[ 72.574742] worker_thread+0x306/0x440
[ 72.574878] ? _raw_spin_lock_irqsave+0xe/0x20
[ 72.575017] ? __pfx_worker_thread+0x10/0x10
[ 72.5...