makedumpfile falls back to cp with "__vtop4_x86_64: Can't get a valid pmd_pte."
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
makedumpfile (Ubuntu) | Confirmed | Undecided | Kellen Renshaw |
Bug Description
[Impact]
* On Focal with an HWE (>=5.12) kernel, makedumpfile can sometimes fail with "__vtop4_x86_64: Can't get a valid pmd_pte."
* makedumpfile falls back to cp for the dump, resulting in extremely large vmcores. This can impact both collection and analysis due to lack of space for the resulting vmcore.
* This is fixed by an upstream commit that is present in versions 1.7.0 and 1.7.1:
https:/
commit 646456862df8926
Author: Kazuhito Hagio <email address hidden>
Date: Wed May 26 14:31:26 2021 +0900
[PATCH] Increase SECTION_
* Required for kernel 5.12
Kernel commit 1f90a3477df3 ("mm: teach pfn_to_
ZONE_DEVICE section collisions") added a section flag
(SECTION_
some machines like this:
_
readmem: Can't convert a virtual address(
readmem: type_addr: 0, addr:ffffe2bdc2
_
create_
Increase SECTION_
been used until the change, so we can just increase the value.
Signed-off-by: Kazuhito Hagio <email address hidden>
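The reasoning in the commit message (a new section flag appears, so the mask constant has to grow) can be illustrated with a toy model. This is only a sketch: it assumes that a section's mem_map pointer carries flag bits in its low bits and that a tool must mask them off before using the pointer. The names `OLD_LAST_BIT` and `NEW_LAST_BIT`, the bit positions, and the pointer value are hypothetical and are not taken from the makedumpfile or kernel sources.

```python
# Toy model of decoding a pointer that carries flag bits in its low bits.
# All names, bit positions and values are assumptions for illustration only.
U64 = (1 << 64) - 1


def flag_mask(last_bit: int) -> int:
    """Return a 64-bit mask that clears every bit below `last_bit`."""
    return ~(last_bit - 1) & U64


def decode_mem_map(encoded: int, last_bit: int) -> int:
    """Strip the low flag bits from an encoded pointer."""
    return encoded & flag_mask(last_bit)


OLD_LAST_BIT = 1 << 4  # assumed: the tool expects flag bits 0..3 only
NEW_LAST_BIT = 1 << 5  # assumed: the kernel now also uses flag bit 4

mem_map = 0xFFFFEA0000000000  # made-up, suitably aligned pointer value
encoded = mem_map | 0b11111   # pointer with all five assumed flag bits set

stale = decode_mem_map(encoded, OLD_LAST_BIT)  # bit 4 leaks into the address
fixed = decode_mem_map(encoded, NEW_LAST_BIT)  # every flag bit is cleared

print(hex(stale))
print(hex(fixed))
```

With the smaller mask, the extra flag bit survives and the decoded value is no longer the real pointer, so any later address translation based on it can fail. Widening the mask restores the original pointer.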
[Test Plan]
* Confirm that makedumpfile works as expected by triggering a kdump.
* Confirm that the patched makedumpfile works as expected on a system known to experience the issue.
* Confirm that the patched makedumpfile is able to work with a cp-generated known affected vmcore to compress it. The unpatched version fails.
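The last step of the test plan can be scripted. The following is a minimal sketch, assuming that `makedumpfile` is on `PATH` and that the usual `-c` (zlib compression) and `-d` (dump level) options are wanted. The helper names and any file paths passed to them are hypothetical.

```python
import subprocess


def build_recompress_cmd(vmcore: str, output: str, dump_level: int = 31) -> list:
    """Build a makedumpfile command line that compresses an existing vmcore.

    -c selects zlib compression; -d sets the dump level (0..31), where 31
    excludes every page type that makedumpfile is able to filter out.
    """
    if not 0 <= dump_level <= 31:
        raise ValueError("dump_level must be between 0 and 31")
    return ["makedumpfile", "-c", "-d", str(dump_level), vmcore, output]


def recompress(vmcore: str, output: str, dump_level: int = 31) -> int:
    """Run makedumpfile on a cp-generated vmcore and return its exit status."""
    cmd = build_recompress_cmd(vmcore, output, dump_level)
    return subprocess.run(cmd, check=False).returncode
```

On an affected vmcore, the unpatched makedumpfile is expected to return a non-zero status from `recompress()`, while the patched one should return 0 and produce a much smaller output file.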
[Where problems could occur]
* This change could adversely affect the collection/
Changed in makedumpfile (Ubuntu): | |
assignee: | nobody → Kellen Renshaw (krenshaw) |
tags: | added: sts |
summary: |
- makedumpfile fails with __vtop4_x86_64: Can't get a valid pmd_pte.
+ makedumpfile falls back to cp with "__vtop4_x86_64: Can't get a valid pmd_pte." |
Hi Kellen, thanks a lot for reporting and fixing that!
I'd like to take the opportunity to discuss something related: no matter how many bugs we fix in makedumpfile / crash, more will come as the kernel version is bumped. The kernel has no stable internal ABI, so kernel developers can "break" compatibility with such tools, although the makedumpfile maintainer (and crash's as well!) is really good at keeping up with that and releases proactive fixes even before the kernel change is merged.
But the problem is this: in the Ubuntu ecosystem, although we have the HWE concept for the kernel, these packages are not part of the kernel HWE upgrades; hence, they get "stuck" and become subject to bugs when an HWE kernel is released. It happens all the time and will continue to happen...
We had discussions in the past (and I'm hereby CCing the interested parties: DannF, Dan Streetman, Heitor and Cascardo) about syncing makedumpfile and crash with the kernel HWE upgrades. So this might be a good opportunity to do it.
The idea was more or less like this: update makedumpfile/crash in a given release to keep them in sync with the following release (Release+1), until the next LTS. In the end, the versions in the LTS will match those in LTS+1, and at that point we stop upgrading/syncing these packages. The cycle then restarts for LTS+1, up to the release of LTS+2.
Hopefully this plan (or something similar) is eventually followed; I bet all users/customers would be glad not to face makedumpfile/crash bugs caused by kernel upgrades anymore!
Cheers, and thanks for the attention =D