HP x360 - Ryzen 2500U Locks up

Bug #1772081 reported by Stu
This bug affects 5 people
Affects: linux (Ubuntu)
Status: Incomplete
Importance: High
Assigned to: Unassigned

Bug Description

Hi,
  I have an HP ENVY x360 with a Ryzen 2500U. It locks up a couple of times a day. Before I disabled the C6 state using this Python script (https://github.com/r4m0n/ZenStates-Linux), it would lock up within minutes of booting.
---
ApportVersion: 2.20.9-0ubuntu7
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: stu 3022 F.... pulseaudio
 /dev/snd/controlC0: stu 3022 F.... pulseaudio
CurrentDesktop: ubuntu:GNOME
DistroRelease: Ubuntu 18.04
InstallationDate: Installed on 2018-04-19 (32 days ago)
InstallationMedia: Ubuntu 18.04 LTS "Bionic Beaver" - Beta amd64 (20180404)
MachineType: HP HP ENVY x360 Convertible 15-bq1xx
Package: linux (not installed)
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-20-generic root=UUID=5fbe26ec-88dd-4b0f-8e57-c577efacffc7 ro quiet splash vt.handoff=1
ProcVersionSignature: Ubuntu 4.15.0-20.21-generic 4.15.17
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-20-generic N/A
 linux-backports-modules-4.15.0-20-generic N/A
 linux-firmware 1.173
StagingDrivers: r8822be
Tags: bionic staging
Uname: Linux 4.15.0-20-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip docker kvm libvirt lpadmin plugdev sambashare sudo
_MarkForUpload: True
dmi.bios.date: 01/26/2018
dmi.bios.vendor: AMI
dmi.bios.version: F.16
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: 83C6
dmi.board.vendor: HP
dmi.board.version: 63.18
dmi.chassis.type: 31
dmi.chassis.vendor: HP
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnAMI:bvrF.16:bd01/26/2018:svnHP:pnHPENVYx360Convertible15-bq1xx:pvr:rvnHP:rn83C6:rvr63.18:cvnHP:ct31:cvrChassisVersion:
dmi.product.family: 103C_5335KV HP Envy
dmi.product.name: HP ENVY x360 Convertible 15-bq1xx
dmi.sys.vendor: HP

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1772081

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete

Kai-Heng Feng (kaihengfeng) wrote :

Same as LP: #1690085?

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: kernel-da-key

Stu (stu-axon) wrote : AlsaInfo.txt

apport information

tags: added: apport-collected bionic staging
description: updated

Stu (stu-axon) wrote : CRDA.txt

apport information

Stu (stu-axon) wrote : CurrentDmesg.txt

apport information

Stu (stu-axon) wrote : IwConfig.txt

apport information

Stu (stu-axon) wrote : Lspci.txt

apport information

Stu (stu-axon) wrote : Lsusb.txt

apport information

Stu (stu-axon) wrote : ProcCpuinfo.txt

apport information

Stu (stu-axon) wrote : ProcCpuinfoMinimal.txt

apport information

Stu (stu-axon) wrote : ProcEnviron.txt

apport information

Stu (stu-axon) wrote : ProcInterrupts.txt

apport information

Stu (stu-axon) wrote : ProcModules.txt

apport information

Stu (stu-axon) wrote : PulseList.txt

apport information

Stu (stu-axon) wrote : RfKill.txt

apport information

Stu (stu-axon) wrote : UdevDb.txt

apport information

Stu (stu-axon) wrote : WifiSyslog.txt

apport information

Freihut (freihut) wrote :

Have you tried
idle=nomwait
as a kernel parameter via GRUB?
It fixes the CPU-related freezes for me on the Ryzen 2500U.

If you run into amdgpu-related freezes, try:
amdgpu.audio=0
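
For reference, a minimal sketch of how such a parameter could be appended to the GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub. The helper name add_kernel_param is hypothetical, and the function only transforms text; it does not write the file. It assumes the value is double-quoted, as in Ubuntu's default file. After editing the real file you would still need to run "sudo update-grub" and reboot.

```python
import re


def add_kernel_param(grub_text, param):
    """Return grub_text with param appended to GRUB_CMDLINE_LINUX_DEFAULT.

    Hypothetical helper: it works on a string and never touches the file.
    Assumes the value is wrapped in double quotes.
    """
    pattern = re.compile(
        r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE
    )

    def repl(match):
        params = match.group(2).split()
        if param not in params:  # do not add the same parameter twice
            params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    new_text, count = pattern.subn(repl, grub_text)
    if count == 0:
        raise ValueError("GRUB_CMDLINE_LINUX_DEFAULT line not found")
    return new_text
```

Calling it with the file contents and "idle=nomwait" leaves the existing parameters (for example "quiet splash") in place and adds the new one at the end; calling it a second time changes nothing.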

Kai-Heng Feng (kaihengfeng) wrote :

Is there a "Typical Idle Current" option in your BIOS?

Stu (stu-axon) wrote :

Hi,
  I'm away from my laptop for a week. It has the default HP BIOS, so I don't think there is a "Typical Idle Current" option.

I'll try that option when I get back.

Stu (stu-axon) wrote :

Hi @kaihengfeng, the BIOS is the one provided by HP, and it doesn't seem to have that option.

@freihut

I will try both those options.

Currently I've been using
amd_iommu=on

and
pcie_aspm=off
(which *seemed* to reduce the number of lockups).

I'll try removing these and using idle=nomwait and amdgpu.audio=0 instead.

I guess that pcie_aspm=off and idle=nomwait will both reduce battery life?

Stu (stu-axon) wrote :

Hi,
   I just got a freeze 6 hours into using idle=nomwait amdgpu.audio=0.

It looks like GPU errors. These are the last messages in journalctl before I rebooted:

Jun 25 23:04:06.590348 computer kernel: gmc_v9_0_process_interrupt: 71 callbacks suppressed
Jun 25 23:04:06.601295 computer kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:3 pasid:32768)
Jun 25 23:04:06.601704 computer kernel: amdgpu 0000:04:00.0: at page 0x0000000104a02000 from 27
Jun 25 23:04:06.601941 computer kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00301031
Jun 25 23:04:06.602156 computer kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:3 pasid:32768)
Jun 25 23:04:06.602435 computer kernel: amdgpu 0000:04:00.0: at page 0x0000000104a04000 from 27
Jun 25 23:04:06.602663 computer kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jun 25 23:04:06.602949 computer kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:3 pasid:32768)
Jun 25 23:04:06.603189 computer kernel: amdgpu 0000:04:00.0: at page 0x0000000104a01000 from 27
Jun 25 23:04:06.603413 computer kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jun 25 23:04:06.603646 computer kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:3 pasid:32768)
Jun 25 23:04:06.603867 computer kernel: amdgpu 0000:04:00.0: at page 0x0000000104a06000 from 27
Jun 25 23:04:06.604108 computer kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jun 25 23:04:06.604326 computer kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:3 pasid:32768)
Jun 25 23:04:06.604569 computer kernel: amdgpu 0000:04:00.0: at page 0x0000000104a08000 from 27
Jun 25 23:04:06.604890 computer kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jun 25 23:04:06.605147 computer kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:3 pasid:32768)
Jun 25 23:04:06.605371 computer kernel: amdgpu 0000:04:00.0: at page 0x0000000104a06000 from 27
Jun 25 23:04:06.605606 computer kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jun 25 23:04:06.605821 computer kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:3 pasid:32768)
Jun 25 23:04:06.606050 computer kernel: amdgpu 0000:04:00.0: at page 0x0000000104a01000 from 27
Jun 25 23:04:06.606275 computer kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jun 25 23:04:06.606574 computer kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:3 pasid:32768)
Jun 25 23:04:06.606792 computer kernel: amdgpu 0000:04:00.0: at page 0x0000000104a0c000 from 27
Jun 25 23:04:06.607038 computer kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jun 25 23:04:06.607256 computer kernel: amdgpu 0000:04:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:3 pasid:32768)
Jun 25 23:04:06.607483 computer kernel: amdgpu 0000:04:00.0: at page 0x0000000104a0a000 from 27
Jun 25 23:04:06.607700 computer kernel: amdgpu 0000:04:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Jun 25 23:04:06.607927 computer kern...
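
A minimal sketch for summarizing messages like the ones above, for example to see how many faults were logged per hub and which page addresses were hit. The function name summarize_faults and both regular expressions are assumptions derived only from the log lines shown here, not from any amdgpu documentation.

```python
import re
from collections import Counter

# Patterns modelled on the log lines in this comment (an assumption, not a spec).
FAULT_RE = re.compile(
    r"\[(\w+)\] VMC page fault \(src_id:(\d+) ring:(\d+) vmid:(\d+) pasid:(\d+)\)"
)
PAGE_RE = re.compile(r"at page (0x[0-9a-fA-F]+) from (\d+)")


def summarize_faults(lines):
    """Return (faults per hub, hits per page address) as two Counters."""
    hubs = Counter()
    pages = Counter()
    for line in lines:
        fault = FAULT_RE.search(line)
        if fault:
            hubs[fault.group(1)] += 1
            continue
        page = PAGE_RE.search(line)
        if page:
            pages[int(page.group(1), 16)] += 1
    return hubs, pages
```

The input could be any iterable of lines, such as the saved output of "journalctl -k -b -1" (kernel messages from the previous boot).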


Freihut (freihut) wrote :

I can't answer your question about how those parameters affect battery life; I did not test that.

If you're running Raven Ridge on Ubuntu, you will have to fight both the CPU (Ryzen lockups: https://bugzilla.kernel.org/show_bug.cgi?id=196683 ) and amdgpu, because both are still being worked on.

1st: Get your CPU working:
AFAIK:
disable SMT (seems to have some weird clocking side effects – I didn't figure that out)
OR
set the CPU governor to "powersave" (this more or less pins the Ryzen's clock to a maximum of 1.6 or 2 GHz)
OR
idle=nomwait (my favorite solution so far)
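
To check which governor is currently active before and after changing it, a small sketch that reads the standard cpufreq sysfs files. The function name read_governors is hypothetical, and the sysfs_root parameter exists only so the function can be pointed at a test directory.

```python
from pathlib import Path


def read_governors(sysfs_root="/sys/devices/system/cpu"):
    """Map each CPU directory name (e.g. 'cpu0') to its scaling governor.

    Returns an empty dict if no cpufreq entries are found.
    """
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for path in sorted(Path(sysfs_root).glob(pattern)):
        cpu_name = path.parent.parent.name  # .../cpuN/cpufreq/scaling_governor
        governors[cpu_name] = path.read_text().strip()
    return governors
```

Printing the result of read_governors() on the affected machine shows one entry per logical CPU, which makes it easy to confirm that a governor change actually applied to all of them.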

While you're testing the CPU, disable amdgpu with the kernel parameter "nomodeset" to rule out any amdgpu issues.

My test setup was to let Pale Moon (a Firefox fork) autoplay random videos on YouTube, because that causes partial load and needs no user interaction. Freezes occurred after anywhere between 4 minutes and 3 hours.

2nd: Get amdgpu working:
Well. You'll have a lot of fun finding a solution that works for you!
* Start reading Vega-related articles on Phoronix.
* Then cry a lot.
* Then try out different versions of Mesa/LLVM ( https://launchpad.net/~paulo-miguel-dias ).
* Then cry a lot.

I was testing the Padoka stable PPA with kernels 4.15 and 4.17.2 (mainline, installed via ukuu) and still had freezes (the GUI was frozen, but sound kept playing and REISUB was possible) until I copied new firmware from https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/amdgpu?id=66e1b33efe227432956136dd46bdb2a9b7f38a27
Just download all raven_*.bin files, put them in /lib/firmware/amdgpu/ and reboot. My freezes went away, no matter which kernel (4.15 or 4.17) I use.
But if I run:
"update-initramfs -u"
booting freezes at:
"switching to amdgpudrmfb from EFI VGA"

The solution is to boot with "nomodeset" and run:
"apt remove linux-firmware"
"apt install linux-firmware"
"update-initramfs -u"
and then copy the new firmware again and reboot. But I only just ran into that issue, so that "solution" is brand new and not well tested. Better have backups, as you always should.
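
The copy step could be sketched roughly as follows, keeping a backup of every file that gets replaced so the change can be undone. This is only an illustration: the function name install_raven_firmware and the ".bak" suffix are hypothetical, the files must already have been downloaded into src_dir, and writing to /lib/firmware/amdgpu requires root.

```python
import shutil
from pathlib import Path


def install_raven_firmware(src_dir, dest_dir="/lib/firmware/amdgpu",
                           backup_suffix=".bak"):
    """Copy raven_*.bin from src_dir into dest_dir.

    Each file that would be overwritten is first saved once as
    <name><backup_suffix>. Returns the sorted names of the copied files.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    copied = []
    for blob in sorted(src.glob("raven_*.bin")):
        target = dest / blob.name
        backup = target.with_name(target.name + backup_suffix)
        if target.exists() and not backup.exists():
            shutil.copy2(target, backup)  # preserve the original only once
        shutil.copy2(blob, target)
        copied.append(blob.name)
    return copied
```

Because the first backup is never overwritten, running the function again after a later firmware download still leaves the distribution's original files recoverable.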

Running Raven Ridge on Ubuntu is a bit of a kerfuffle.

Hexawolf (hexawolf) wrote :

Confirming this issue, Ryzen 2500U on HP Notebook 15-db0229ur.
- Disabling C-State C6 partially solves the problem
- Latest BIOS from the manufacturer (F.11 Rev.A): unable to access any of the advanced settings!

Are maintainers even aware of this?

Also bug here: https://bugzilla.redhat.com/show_bug.cgi?id=1562530

Disabling the C6 state is not really a fix. C6 is still an idle state, and disabling it increases the CPU's idle power consumption by at least 80%. This is certainly a kernel issue. The freezes are random and can cause massive data loss, as all you can do is force-reboot the laptop. Disabling the FS cache isn't a solution either, as this will kill your drive faster.

Hexawolf (hexawolf) wrote :

Confirming this issue, Ryzen 2500U on HP Notebook 15-db0229ur.
- Disabling C-State C6 partially solves the problem
- Latest BIOS from the manufacturer (F.11 Rev.A): unable to access any of the advanced settings!

Also bug here: https://bugzilla.redhat.com/show_bug.cgi?id=1562530

Disabling the C6 state is not really a fix. C6 is still an idle state, and disabling it increases the CPU's idle power consumption by at least 80%. This is certainly a kernel issue. The freezes are random and can cause massive data loss, as all you can do is force-reboot the laptop.

AMD did publish an errata report indicating that this may be fixed at the software level (page 63).
https://www.amd.com/system/files/TechDocs/55449_Fam_17h_M_00h-0Fh_Rev_Guide.pdf

KennoVO (kenno-xs4all) wrote :

> disabling it increases power consumption of CPU at least by 80% in idle state

That doesn't say much at all. _Suppose_ the CPU's power consumption is 5W in C6 state, and it indeed increases by 80%, then it becomes 9W, or 4W more. I could live with that - at least until a better fix becomes available.
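
The arithmetic above, written out as a small sketch. The function name power_after_increase is hypothetical, and the 5 W baseline is the made-up figure from this comment, not a measurement.

```python
def power_after_increase(base_watts, relative_increase):
    """Return (new_watts, extra_watts) for a relative increase such as 0.80."""
    extra_watts = base_watts * relative_increase
    return base_watts + extra_watts, extra_watts


# Supposed 5 W in C6 plus 80% gives 9 W, i.e. 4 W more.
new_watts, extra_watts = power_after_increase(5.0, 0.80)
```

The same relative increase on a lower baseline gives a much smaller absolute number, which is why the percentage alone says little without a measured baseline.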

The problem is of course that those are imaginary numbers. Does anyone have some data on the increase in power consumption, in W, of a real system upon disabling C6 state?

Otherwise, I may attempt to measure it myself once I find some time. But that can take a while...

Brad Figg (brad-figg)
tags: added: cscc