System frozen after resume, on Ubuntu 16.04 kernel 4.4.0-22

Bug #1589379 reported by bastien
This bug affects 5 people
Affects: linux (Ubuntu)
Status: Triaged
Importance: Medium
Assigned to: Unassigned
Milestone:

Bug Description

Problem on an HP ProBook 450 G3 with a Radeon R7 M260/M265: the system is frozen after resume from suspend.
The last kernel on which resume works is 4.4.3-040403-generic.
I have the problem on all later kernels, including v4.6-rc7-wily.

Reverse bisect revealed fix commit:
24e8df6a6837d6cff182e84b838dc1d6971251fc drm/amd/powerplay: fix static checker warnings in smu7_hwmgr.c

ProblemType: KernelOops
DistroRelease: Ubuntu 16.04
Package: linux-image-4.4.0-22-generic 4.4.0-22.40
ProcVersionSignature: Ubuntu 4.4.0-22.40-generic 4.4.8
Uname: Linux 4.4.0-22-generic x86_64
Annotation: This occurred during a previous suspend, and prevented the system from resuming properly.
ApportVersion: 2.20.1-0ubuntu2.1
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: bastien 2164 F.... pulseaudio
Date: Mon Jun 6 07:50:40 2016
DuplicateSignature: suspend/resume:HP HP ProBook 450 G3:N78 Ver. 01.06
ExecutablePath: /usr/share/apport/apportcheckresume
Failure: suspend/resume
HibernationDevice: RESUME=UUID=1888eb9c-89d7-4478-b4bc-1ce90d714377
InstallationDate: Installed on 2016-05-05 (31 days ago)
InstallationMedia: Ubuntu 16.04 LTS "Xenial Xerus" - Release amd64 (20160420.1)
InterpreterPath: /usr/bin/python3.5
MachineType: HP HP ProBook 450 G3
ProcCmdline: /usr/bin/python3 /usr/share/apport/apportcheckresume
ProcEnviron:
 PATH=(custom, no user)
 LANG=fr_FR.UTF-8
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-22-generic.efi.signed root=UUID=b919e8bc-92b0-4fcc-b9d3-0d8f38803a2b ro quiet splash vt.handoff=7
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-4.4.0-22-generic N/A
 linux-backports-modules-4.4.0-22-generic N/A
 linux-firmware 1.157
SourcePackage: linux
Title: [HP HP ProBook 450 G3] suspend/resume failure
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

dmi.bios.date: 12/18/2015
dmi.bios.vendor: HP
dmi.bios.version: N78 Ver. 01.06
dmi.board.name: 8101
dmi.board.vendor: HP
dmi.board.version: KBC Version 16.20
dmi.chassis.asset.tag: 5CD6080WDD
dmi.chassis.type: 10
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrN78Ver.01.06:bd12/18/2015:svnHP:pnHPProBook450G3:pvr:rvnHP:rn8101:rvrKBCVersion16.20:cvnHP:ct10:cvr:
dmi.product.name: HP ProBook 450 G3
dmi.sys.vendor: HP

bastien (bastien-bernard) wrote :
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
penalvch (penalvch) wrote :

bastien, thank you for reporting this and helping make Ubuntu better.

In order to allow additional upstream developers to examine the issue, at your earliest convenience, could you please test the latest upstream kernel available from http://kernel.ubuntu.com/~kernel-ppa/mainline/?C=N;O=D ? Please keep in mind the following:
1) The kernel to test is the one on the very first line at the top of the page (not the daily folder).
2) The release names are irrelevant.
3) The folder time stamps aren't indicative of when the kernel actually was released upstream.
4) Install instructions are available at https://wiki.ubuntu.com/Kernel/MainlineBuilds .

If testing on your main install would be inconvenient, one may:
1) Install Ubuntu to a different partition and then test this there.
2) Backup, or clone the primary install.

If the latest kernel did not allow you to test for the issue (e.g. you couldn't boot into the OS), please make a comment in your report about this, and continue with the next most recent kernel version until you can test for the issue. Once you've tested the upstream kernel, please comment on which kernel version specifically you tested. If this issue is fixed in the mainline kernel, please add the following tags by clicking on the yellow circle with a black pencil icon, next to the word Tags, located at the bottom of the report description:
kernel-fixed-upstream
kernel-fixed-upstream-X.Y-rcZ

Where X, and Y are the first two numbers of the kernel version, and Z is the release candidate number if it exists.

If the mainline kernel does not fix the issue, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-X.Y-rcZ

Please note, a failure to install the kernel does not fit the criteria of kernel-bug-exists-upstream.
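
The tag naming rule above can be sketched in a few lines of Python. This is only an illustration: `upstream_tags` is a hypothetical helper, not part of Launchpad or any Ubuntu tool, and it assumes the kernel version string starts with the X.Y numbers and contains `rcZ` somewhere when it is a release candidate.

```python
import re
from typing import List


def upstream_tags(kernel_version: str, fixed: bool) -> List[str]:
    """Return the generic tag plus the version-specific tag for a tested kernel.

    Hypothetical helper, not part of Launchpad or any Ubuntu tool.
    """
    # X and Y are the first two numbers of the kernel version.
    head = re.match(r"v?(\d+)\.(\d+)", kernel_version)
    if head is None:
        raise ValueError(f"unrecognised kernel version: {kernel_version!r}")
    # Z is the release candidate number, if the version has one.
    rc = re.search(r"rc(\d+)", kernel_version)

    prefix = "kernel-fixed-upstream" if fixed else "kernel-bug-exists-upstream"
    suffix = f"{head.group(1)}.{head.group(2)}"
    if rc:
        suffix += f"-rc{rc.group(1)}"
    return [prefix, f"{prefix}-{suffix}"]


print(upstream_tags("4.7-rc2", fixed=False))
```

For a kernel that is not a release candidate, the sketch simply leaves out the `-rcZ` part of the second tag.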

Also, you don't need to apport-collect further unless specifically requested to do so.

Once testing of the latest upstream kernel is complete, please mark this report Status Confirmed. Please let us know your results.

Thank you for your understanding.

tags: added: bios-outdated-1.11
tags: added: regression-release
Changed in linux (Ubuntu):
importance: Undecided → Low
status: Confirmed → Incomplete
bastien (bastien-bernard) wrote :

I just tried on the latest upstream kernel, 4.7-rc2, and the bug is reproduced.

tags: added: kernel-bug-exists-upstream kernel-bug-exists-upstream-4.7-rc2
penalvch (penalvch)
tags: added: needs-bisect
bastien (bastien-bernard) wrote :

Thank you for bringing this to my attention.
I updated the BIOS to 1.11.
I still reproduce the problem: system completely frozen after resume.

1) The output of the command:
N78 Ver. 01.11
05/09/2016

2) No improvement with this BIOS.
3) Setting the bug to confirmed.

Thanks.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
penalvch (penalvch) wrote :

bastien, the next step is to fully commit bisect from kernel 4.4.3 to 4.4.8 in order to identify the last good kernel commit, followed immediately by the first bad one. This will allow for a more expedited analysis of the root cause of your issue. Could you please do this following https://wiki.ubuntu.com/Kernel/KernelBisection ?

Please note, finding adjacent kernel versions is not fully commit bisecting.

Also, the kernel release names are irrelevant for the purposes of bisecting.

After the offending commit (not kernel version) has been identified, then please mark this report Status Confirmed.
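
As an illustration of what a full commit bisect does, here is a minimal Python sketch of the underlying binary search. It is not how `git bisect` is implemented: it assumes a simple linear history in which every commit after the offending one is also bad, and the commit names and the `is_bad` test are placeholders for "build this commit, boot it, suspend, and check whether resume freezes".

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in an ordered history.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    once a commit is bad every later commit is bad too.
    """
    lo, hi = 0, len(commits) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the offending commit is at mid or earlier
        else:
            lo = mid  # the offending commit is after mid
    return commits[hi]


# Hypothetical history of ten commits where the regression enters at c6.
history = [f"c{i}" for i in range(10)]
print(first_bad(history, lambda c: int(c[1:]) >= 6))
```

The search ends with two adjacent commits, the last good one and the first bad one, which is why testing adjacent kernel versions alone is not enough.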

Thank you for your understanding.

Helpful bug reporting tips:
https://wiki.ubuntu.com/ReportingBugs

tags: added: latest-bios-1.11
removed: bios-outdated-1.11
Changed in linux (Ubuntu):
importance: Low → Medium
status: Confirmed → Incomplete
tags: removed: need-duplicate-check
bastien (bastien-bernard) wrote :

I tried the following kernels:
- 4.4.3: no problem, resume after suspend works
- 4.4.4: impossible to boot into the OS
- 4.4.5: the bug of this report (system frozen after resume) is reproduced

I did a commit bisect between 4.4.3 and 4.4.5, and at each step I observed either:
- no problem, resume after suspend works (git bisect good)
- or impossible to boot into the OS (git bisect bad)

At the end I got the following commit:
[b8b1ad305f8de05b241a57707d5b3de3692dbdfa] drm/amdgpu: don't load MEC2 on topaz

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
penalvch (penalvch) wrote :

bastien, to confirm, if you test one commit back (not bisection point) from the below, the issue is not reproducible:
[b8b1ad305f8de05b241a57707d5b3de3692dbdfa] drm/amdgpu: don't load MEC2 on topaz

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
bastien (bastien-bernard) wrote :

My mistake, I did not run the bisect to the end; one step remained.
So I now have the following bad commit:

4474b85771139f2da8f8f4f443e6fad08081e99e is the first bad commit
commit 4474b85771139f2da8f8f4f443e6fad08081e99e
Author: Alex Deucher <email address hidden>
Date: Tue Feb 2 16:24:20 2016 -0500

    drm/amdgpu: remove exp hardware support from iceland

If I test one commit back ([b8b1ad305f8de05b241a57707d5b3de3692dbdfa] drm/amdgpu: don't load MEC2 on topaz), I do not reproduce the problem.

I did another bisect to find the first commit from which I can boot into the OS and actually see the freeze after resume. The result is:

commit 53e609099daa023ad7771ec8351202f2a7bee1c1
Author: Alex Deucher <email address hidden>
Date: Mon Mar 7 18:40:45 2016 -0500

    drm/amdgpu: fix topaz/tonga gmc assignment in 4.4 stable

To sum up:
- before commit 4474b85771139f2da8f8f4f443e6fad08081e99e, resume was working OK (version 4.4.3)
- from commit 4474b85771139f2da8f8f4f443e6fad08081e99e (drm/amdgpu: remove exp hardware support from iceland), I cannot boot into the OS (version 4.4.4)
- from commit 53e609099daa023ad7771ec8351202f2a7bee1c1 (drm/amdgpu: fix topaz/tonga gmc assignment in 4.4 stable), I can boot into the OS and see the problem of system frozen on resume (this bug report) (version 4.4.5 and above).
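
The three phases summarised above can be written down as data and collapsed mechanically. The sketch below is only an illustration: the state labels (`resume-ok`, `no-boot`, `freeze-on-resume`) and the `phases` helper are made up for this example, the commit hashes are abbreviated, and the ordering of the entries is the one implied by the summary.

```python
from itertools import groupby
from typing import List, Tuple

# Ordered observations (oldest first), taken from the summary above.
# The state labels are illustrative names, not kernel or git terminology.
observations: List[Tuple[str, str]] = [
    ("v4.4.3", "resume-ok"),
    ("b8b1ad3", "resume-ok"),
    ("4474b85", "no-boot"),
    ("v4.4.4", "no-boot"),
    ("53e6090", "freeze-on-resume"),
    ("v4.4.5", "freeze-on-resume"),
]


def phases(obs: List[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Collapse consecutive equal states into (state, first, last) runs."""
    runs = []
    for state, group in groupby(obs, key=lambda o: o[1]):
        members = [commit for commit, _ in group]
        runs.append((state, members[0], members[-1]))
    return runs


for state, first, last in phases(observations):
    print(f"{state}: {first} .. {last}")
```

The first entry of each run after the first marks a transition point, which here corresponds to the two commits found by the two bisects.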

penalvch (penalvch) wrote :

bastien, to keep this relevant to upstream, could you please test the latest mainline kernel (4.7-rc6) and advise on the results?

tags: added: bisect-done
removed: needs-bisect
bastien (bastien-bernard) wrote :

I just tested 4.7-rc6, the bug is reproduced (system frozen on resume).

penalvch (penalvch)
tags: added: kernel-bug-exists-upstream-4.7-rc6
removed: kernel-bug-exists-upstream-4.7-rc2
penalvch (penalvch) wrote :

bastien, the issue you are reporting is an upstream one. Could you please report this problem via https://bugs.freedesktop.org/enter_bug.cgi?product=DRI :
Component: DRM/AMDgpu
Hardware: x86-64
OS: Linux (All)
CC: Alex Deucher

Please provide a direct URL to your bug report once you have made it so that it may be tracked.

Thank you for your help.

Changed in linux (Ubuntu):
status: Incomplete → Triaged
bastien (bastien-bernard) wrote :
manof (manof) wrote :

I have the same problem with Linux kernel 4.4.8 on my new (bought 4 days ago) HP notebook 17-y011nc, with AMD/ATI Topaz XT [Radeon R7 M260/M265].

The interesting thing is that I originally had the 4.4.0 kernel from Ubuntu 16.04.1 LTS, but it sometimes froze while suspending. Not every time, just sometimes. However, it happened way too often, so it was pretty annoying.

Then I read somewhere in the forums that this bug was resolved in newer kernels, so I switched from 4.4.0 to 4.4.8. The freeze during suspend was indeed resolved; however, the system now ALWAYS freezes after resume from suspend, so my problem in fact got much worse with the kernel upgrade 4.4.0 => 4.4.8.

penalvch (penalvch) wrote :

manof, it would help immensely if you filed a new report with the Ubuntu repository kernel (not mainline/upstream) via a terminal:
ubuntu-bug linux

Please feel free to subscribe me to it.

For more on why this is helpful, please see https://wiki.ubuntu.com/ReportingBugs.

bastien (bastien-bernard) wrote :

Good news, this seems to be fixed in the latest upstream kernel, 4.9.0-040900rc6-generic.

penalvch (penalvch) wrote :

bastien, the next step is to fully reverse commit bisect from kernel 4.7-rc6 to 4.9-rc6 in order to identify the last bad commit, followed immediately by the first good one. Once this good commit has been identified, it may be reviewed for backporting. Could you please do this following https://wiki.ubuntu.com/Kernel/KernelBisection#How_do_I_reverse_bisect_the_upstream_kernel.3F ?

Please note, finding adjacent kernel versions is not fully commit bisecting.

Also, the kernel release names are irrelevant for the purposes of bisecting.

It is most helpful that after the fix commit (not kernel version) has been identified, you then mark this report Status Confirmed.
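
One common way to run a reverse bisect is to keep the normal bisect procedure but swap the meaning of the labels, so that the "first bad commit" the tool reports is in fact the fix. The sketch below illustrates that label swap only; `bisect_label` is a hypothetical helper, and the linked wiki page remains the reference for the actual steps.

```python
def bisect_label(resume_works: bool, reverse: bool) -> str:
    """Return the label to give the bisect tool for one tested kernel.

    Forward bisect (looking for the regression): working kernels are "good".
    Reverse bisect (looking for the fix): labels are swapped, so a kernel
    that still freezes on resume is "good" and a working one is "bad".
    """
    if reverse:
        return "bad" if resume_works else "good"
    return "good" if resume_works else "bad"


# 4.7-rc6 still freezes on resume, 4.9-rc6 resumes correctly.
print("4.7-rc6:", bisect_label(resume_works=False, reverse=True))
print("4.9-rc6:", bisect_label(resume_works=True, reverse=True))
```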

Thank you for your help.

tags: added: kernel-fixed-upstream kernel-fixed-upstream-4.9-rc6 needs-reverse-bisect
removed: kernel-bug-exists-upstream
Changed in linux (Ubuntu):
status: Triaged → Incomplete
bastien (bastien-bernard) wrote :

This bug got solved in 4.9-rc2, from what I saw:

- up to commit 9faa6b0277fab4ab91db4d69bc47566fdfbae48b drm/amd/powerplay: fix static checker warnings in smu7_hwmgr.c
-> the bug is reproduced

- up to one commit above, eeb2fa0c97ba661f8b7fb210a1de10928b67a47b drm/amdgpu: potential NULL dereference in debugfs code
-> I have a black screen after resuming the system

- up to one commit above, 24e8df6a6837d6cff182e84b838dc1d6971251fc drm/amd/powerplay: fix static checker warnings in smu7_hwmgr.c
-> the bug is not reproduced, the resume is OK.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
penalvch (penalvch)
tags: added: reverse-bisect-done
removed: bisect-done needs-reverse-bisect
Changed in linux (Ubuntu):
status: Confirmed → Triaged
description: updated