function_graph tracer in ftrace related tests triggers kernel bug (arch/x86/xen/spinlock.c:62) and kernel panic on AWS cloud c3.xlarge
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
ubuntu-kernel-tests | New | Undecided | Unassigned |
linux-aws (Ubuntu) | New | Undecided | Unassigned |
Bionic | New | Undecided | Unassigned |
Focal | New | Undecided | Unassigned |
Jammy | New | Undecided | Unassigned |
Lunar | Won't Fix | Undecided | Unassigned |
Bug Description
Test:
* ftrace:
* test_enable_
This test crashes an AWS c3.xlarge instance when it switches to the "function_graph" tracer.
A similar issue is filed against Azure (bug 1882669). This is filed as a separate bug report because on AWS it affects the 5.4 through 6.2 AWS kernels, whereas on Azure newer kernels are not affected.
Take B-aws-5.4-1108 for example, with the ubuntu_
[ 211.675624] kernel BUG at /build/
[ 211.678258] invalid opcode: 0000 [#1] SMP PTI
[ 211.679596] CPU: 1 PID: 14 Comm: cpuhp/1 Not tainted 5.4.0-1108-aws #116~18.04.1-Ubuntu
[ 211.681825] Hardware name: Xen HVM domU, BIOS 4.11.amazon 08/24/2006
[ 211.683728] RIP: 0010:dummy_
[ 211.685042] Code: 8b 75 e4 74 d6 44 89 e7 e8 f9 88 61 00 eb d6 44 89 e7 e8 6f ab 61 00 eb cc 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 80 3d 59 d0 9f 01 00 75 02 f3
[ 211.690314] RSP: 0000:ffffaecd40
[ 211.691934] RAX: ffffffffb462e3e0 RBX: 000000000000003b RCX: 0000000000000000
[ 211.694036] RDX: 0000000000400e00 RSI: 0000000000000000 RDI: 000000000000003b
[ 211.696159] RBP: ffffaecd4000ee38 R08: ffff8aefa6c036c0 R09: ffff8aefa6c038c0
[ 211.698232] R10: 0000000000000000 R11: ffffffffb6064da8 R12: 0000000000000000
[ 211.700365] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8aefa34b8700
[ 211.702482] FS: 000000000000000
[ 211.704894] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 211.706598] CR2: 0000000000000000 CR3: 00000001b400a001 CR4: 00000000001606e0
[ 211.708731] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 211.710835] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 211.713448] Call Trace:
[ 211.714796] <IRQ>
[ 211.715996] __handle_
[ 211.717907] handle_
[ 211.719759] handle_
[ 211.721474] generic_
[ 211.723184] handle_
[ 211.724988] evtchn_
[ 211.726913] __xen_evtchn_
[ 211.728749] xen_evtchn_
[ 211.730520] xen_hvm_
[ 211.732271] </IRQ>
[ 211.733425] RIP: 0010:_raw_
[ 211.735470] Code: e8 70 3d 64 ff 4c 29 e0 4c 39 f0 76 cf 80 0b 08 eb 8a 90 90 90 0f 1f 44 00 00 55 48 89 e5 e8 a6 ad 66 ff 66 90 48 89 f7 57 9d <0f> 1f 44 00 00 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 c6 07
[ 211.741920] RSP: 0000:ffffaecd40
[ 211.744980] RAX: 0000000000000001 RBX: ffff8aefa34b8700 RCX: 000000000002cc00
[ 211.747420] RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246
[ 211.749882] RBP: ffffaecd400fbcf8 R08: ffff8aefa6c036c0 R09: ffff8aefa6c038c0
[ 211.752340] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000003b
[ 211.754785] R13: 0000000000000000 R14: ffff8aef95bdfa00 R15: ffff8aef95bdfaa4
[ 211.757249] __setup_
[ 211.758779] ? kmem_cache_
[ 211.760639] request_
[ 211.762358] bind_ipi_
[ 211.764124] ? xen_qlock_
[ 211.765734] ? snr_uncore_
[ 211.767496] xen_init_
[ 211.769135] ? snr_uncore_
[ 211.770864] xen_cpu_
[ 211.772500] cpuhp_invoke_
[ 211.774233] cpuhp_thread_
[ 211.775866] smpboot_
[ 211.777524] kthread+0x121/0x140
[ 211.778968] ? sort_range+
[ 211.780508] ? kthread_
[ 211.782073] ret_from_
[ 211.783622] Modules linked in: nls_iso8859_1 binfmt_misc serio_raw sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_
[ 211.796497] ---[ end trace bb7d4e9bb7f852cb ]---
[ 211.798303] RIP: 0010:dummy_
[ 211.800022] Code: 8b 75 e4 74 d6 44 89 e7 e8 f9 88 61 00 eb d6 44 89 e7 e8 6f ab 61 00 eb cc 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 80 3d 59 d0 9f 01 00 75 02 f3
[ 211.806379] RSP: 0000:ffffaecd40
[ 211.808356] RAX: ffffffffb462e3e0 RBX: 000000000000003b RCX: 0000000000000000
[ 211.810792] RDX: 0000000000400e00 RSI: 0000000000000000 RDI: 000000000000003b
[ 211.813263] RBP: ffffaecd4000ee38 R08: ffff8aefa6c036c0 R09: ffff8aefa6c038c0
[ 211.815713] R10: 0000000000000000 R11: ffffffffb6064da8 R12: 0000000000000000
[ 211.818162] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8aefa34b8700
[ 211.820637] FS: 000000000000000
[ 211.823821] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 211.825904] CR2: 0000000000000000 CR3: 00000001b400a001 CR4: 00000000001606e0
[ 211.828357] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 211.830829] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 211.833284] Kernel panic - not syncing: Fatal exception in interrupt
[ 211.835575] Kernel Offset: 0x33600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000
Test output:
+ echo 1
+ . /home/ubuntu/
+ test -f available_tracers
+ cat available_tracers
+ echo hwlat
+ echo blk
+ echo mmiotrace
+ echo function_graph
(Test interrupted here)
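Since the run is traced with `set -x`, the last `+ echo` line in a saved copy of the test output names the tracer that was being activated when the instance went down. A minimal sketch for pulling that out of a log file follows; the function name `last_tracer_attempted` and the idea of saving the output to a file are assumptions for illustration, not part of the test suite.

```shell
#!/bin/sh
# Hypothetical helper: print the argument of the last "+ echo ..." line
# found in a saved xtrace log of the test run.
last_tracer_attempted() {
    # $1: path to the saved test output (assumed to exist)
    sed -n 's/^+ echo //p' "$1" | tail -n 1
}
```

For the output shown above this would print function_graph, matching the point where the test was interrupted.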
To verify this, you can hack linux/tools/
#!/bin/sh
# SPDX-License-
# description: Basic test for tracers
# flags: instance
test -f available_tracers
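for t in $(cat available_tracers); do
    # Pause before each switch so the tracer that triggers the crash can be identified.
    # (read -p is not supported by every /bin/sh, so print the prompt separately.)
    printf 'testing %s ' "$t"
    read foo
    echo "$t" > current_tracer
done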
echo nop > current_tracer
Then run it manually with sudo ./ftracetest -vvv test.d/
For ubuntu_
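The manual check above waits for input before each tracer. When nobody is at the console, a non-interactive variant can record each tracer name on disk before activating it, so the last entry in the log identifies the offender after the panic and reboot. This is only a sketch under stated assumptions: the function name `probe_tracers`, its arguments, and the log file are hypothetical, and the tracing directory is passed in explicitly (on a real system it is typically the tracefs mount, e.g. /sys/kernel/tracing).

```shell
#!/bin/sh
# Hypothetical helper, not part of ftracetest: switch through every tracer
# listed in a tracing directory, logging each name before activating it.
probe_tracers() {
    tracing_dir=$1   # directory holding available_tracers and current_tracer
    log_file=$2      # file on persistent storage (assumed writable)

    [ -f "$tracing_dir/available_tracers" ] || return 1

    for t in $(cat "$tracing_dir/available_tracers"); do
        echo "$t" >> "$log_file"
        sync    # flush the log so the name is on disk if the kernel panics next
        echo "$t" > "$tracing_dir/current_tracer" || return 1
    done

    # Restore the default tracer if every switch succeeded.
    echo nop > "$tracing_dir/current_tracer"
}
```

Run as root against the real tracing directory, this is expected to take the instance down on the affected kernels, so it should only be used on a disposable test instance.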
This also affects the 4.15 AWS kernel. Here is the output from b-aws-4.15-2099:
Test output:
+ echo blk
+ echo mmiotrace
+ echo function_graph
(Test interrupted here)
dmesg:
[ 216.814155] kernel BUG at /build/linux-aws-fips-hcslsq/linux-aws-fips-4.15.0/arch/x86/xen/spinlock.c:69!
[ 216.816072] invalid opcode: 0000 [#1] SMP PTI
[ 216.816991] Modules linked in: nls_iso8859_1 binfmt_misc sb_edac intel_rapl_perf serio_raw sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc cirrus ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops aesni_intel drm aes_x86_64 crypto_simd i2c_piix4 glue_helper cryptd i2c_core ixgbevf pata_acpi
[ 216.827515] CPU: 1 PID: 13 Comm: cpuhp/1 Not tainted 4.15.0-2099-aws-fips #105-Ubuntu
[ 216.829132] Hardware name: Xen HVM domU, BIOS 4.11.amazon 08/24/2006
[ 216.830471] RIP: 0010:dummy_handler+0x4/0x10
[ 216.831362] RSP: 0000:ffffa0c827a43e38 EFLAGS: 00010046
[ 216.832477] RAX: ffffffff8302a9c0 RBX: 000000000000003b RCX: 000000000000003b
[ 216.833951] RDX: 0000000000400e00 RSI: 0000000000000000 RDI: 000000000000003b
[ 216.835425] RBP: ffffa0c827a43e38 R08: ffffa0c827000db0 R09: ffffa0c81b14ea00
[ 216.836902] R10: 0000000000000040 R11: 0000000000000000 R12: 0000000000000000
[ 216.838378] R13: 0000000000000000 R14: 0000000000000000 R15: ffffa0c82444f400
[ 216.839854] FS: 0000000000000000(0000) GS:ffffa0c827a40000(0000) knlGS:0000000000000000
[ 216.841533] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 216.842708] CR2: 0000000000000000 CR3: 000000003b80a001 CR4: 00000000001606e0
[ 216.844141] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 216.845631] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 216.847095] Call Trace:
[ 216.847640] <IRQ>
[ 216.848079] __handle_irq_event_percpu+0x44/0x1a0
[ 216.849070] handle_irq_event_percpu+0x32/0x80
[ 216.850009] handle_percpu_irq+0x3d/0x60
[ 216.850829] generic_handle_irq+0x28/0x40
[ 216.851687] handle_irq_for_port+0x8f/0xe0
[ 216.852556] evtchn_2l_handle_events+0x157/0x270
[ 216.853537] __xen_evtchn_do_upcall+0x76/0xe0
[ 216.854453] xen_evtchn_do_upcall+0x2b/0x50
[ 216.855327] xen_hvm_callback_vector+0x90/0xa0
[ 216.856266] </IRQ>
[ 216.856724] RIP: 0010:_raw_spin_unlock_irqrestore+0x15/0x20
[ 216.857890] RSP: 0000:ffffb95780d37d00 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff0c
[ 216.859472] RAX: 0000000000000001 RBX: ffffa0c82444f400 RCX: 000000000002cc00
[ 216.860907] RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246
[ 216.862353] RBP: ffffb95780d37d00 R08: ffffa0c827000db0 R09: ffffa0c81b14ea00
[ 216.863782] R10: 0000000000000040 R11: 0000000000000246 R12: 000000000000003b
[ 216.865233] R13: 0000000000000000 R14: ffffa0c81b14ea00 R15: ffffa0c81b14eaa4
[ 216.866662] __setup_irq+0x424/0x6e0
[ 216.867392] request_threaded_irq+0xf9/0x170
[ 216.868281] bind_ipi_to_irqhandler+0xc6/0x1f0
[ ...