zfs pool locks and see "INFO: task txg_sync:4307 blocked for more than 120 seconds. "

Bug #1889110 reported by Rob Babb
18
This bug affects 3 people
Affects Status Importance Assigned to Milestone
zfs-linux (Ubuntu)
Incomplete
Medium
Dimitri John Ledkov

Bug Description

ZFS filesystem becomes unresponsive and subsequent NFS shares unresponsive. ESXi sees all paths down.

See this error 3 times in a row.

[184383.479511] INFO: task txg_sync:4307 blocked for more than 120 seconds.
[184383.479565] Tainted: P IO 5.4.0-42-generic #46-Ubuntu
[184383.479607] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[184383.479655] txg_sync D 0 4307 2 0x80004000
[184383.479658] Call Trace:
[184383.479670] __schedule+0x2e3/0x740
[184383.479673] schedule+0x42/0xb0
[184383.479676] schedule_timeout+0x152/0x2f0
[184383.479683] ? __next_timer_interrupt+0xe0/0xe0
[184383.479685] io_schedule_timeout+0x1e/0x50
[184383.479697] __cv_timedwait_common+0x15e/0x1c0 [spl]
[184383.479702] ? wait_woken+0x80/0x80
[184383.479710] __cv_timedwait_io+0x19/0x20 [spl]
[184383.479816] zio_wait+0x11b/0x230 [zfs]
[184383.479905] ? __raw_spin_unlock+0x9/0x10 [zfs]
[184383.479983] dsl_pool_sync+0xbc/0x410 [zfs]
[184383.480069] spa_sync_iterate_to_convergence+0xe0/0x1c0 [zfs]
[184383.480156] spa_sync+0x312/0x5b0 [zfs]
[184383.480245] txg_sync_thread+0x27a/0x310 [zfs]
[184383.480334] ? txg_dispatch_callbacks+0x100/0x100 [zfs]
[184383.480344] thread_generic_wrapper+0x83/0xa0 [spl]
[184383.480347] kthread+0x104/0x140
[184383.480356] ? clear_bit+0x20/0x20 [spl]
[184383.480358] ? kthread_park+0x90/0x90
[184383.480361] ret_from_fork+0x35/0x40

Then nfsd hangs as well.

[184866.787445] INFO: task nfsd:6585 blocked for more than 120 seconds.
[184866.787485] Tainted: P IO 5.4.0-42-generic #46-Ubuntu
[184866.787526] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[184866.787573] nfsd D 0 6585 2 0x80004000
[184866.787575] Call Trace:
[184866.787578] __schedule+0x2e3/0x740
[184866.787675] ? __raw_spin_unlock+0x9/0x10 [zfs]
[184866.787678] schedule+0x42/0xb0
[184866.787685] cv_wait_common+0x133/0x180 [spl]
[184866.787688] ? wait_woken+0x80/0x80
[184866.787695] __cv_wait+0x15/0x20 [spl]
[184866.787764] dmu_tx_wait+0x1ee/0x210 [zfs]
[184866.787834] dmu_tx_assign+0x49/0x70 [zfs]
[184866.787929] zfs_write+0x461/0xd40 [zfs]
[184866.788025] ? atomic_sub_return.constprop.0+0xd/0x20 [zfs]
[184866.788033] ? atomic_dec+0xd/0x20 [spl]
[184866.788116] ? __raw_spin_unlock+0x9/0x10 [zfs]
[184866.788122] ? __d_obtain_alias+0x36/0x90
[184866.788217] zpl_write_common_iovec+0xad/0x120 [zfs]
[184866.788313] zpl_iter_write_common+0x8e/0xb0 [zfs]
[184866.788409] zpl_iter_write+0x56/0x90 [zfs]
[184866.788413] do_iter_readv_writev+0x14f/0x1d0
[184866.788416] do_iter_write+0x84/0x1a0
[184866.788418] vfs_iter_write+0x19/0x30
[184866.788442] nfsd_vfs_write+0xe0/0x480 [nfsd]
[184866.788454] nfsd_write+0x7a/0x160 [nfsd]
[184866.788458] ? kmem_cache_alloc+0x16d/0x230
[184866.788472] nfsd3_proc_write+0xc3/0x170 [nfsd]
[184866.788483] nfsd_dispatch+0xd6/0x220 [nfsd]
[184866.788508] svc_process_common+0x3af/0x700 [sunrpc]
[184866.788527] ? svc_sock_secure_port+0x16/0x30 [sunrpc]
[184866.788538] ? nfsd_svc+0x2d0/0x2d0 [nfsd]
[184866.788557] svc_process+0xd9/0x110 [sunrpc]
[184866.788568] nfsd+0xe8/0x150 [nfsd]
[184866.788570] kthread+0x104/0x140
[184866.788581] ? nfsd_destroy+0x60/0x60 [nfsd]
[184866.788583] ? kthread_park+0x90/0x90
[184866.788585] ret_from_fork+0x35/0x40

Linux zfs-01 5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

root@zfs-01:/# lsb_release -rd
Description: Ubuntu 20.04 LTS
Release: 20.04

Revision history for this message
Rob Babb (atlantauser) wrote :

root@zfs-01:/# dpkg -l |grep -i zfs
ii libzfs2linux 0.8.3-1ubuntu12.2 amd64 OpenZFS filesystem library for Linux
ii libzpool2linux 0.8.3-1ubuntu12.2 amd64 OpenZFS pool library for Linux
ii zfs-auto-snapshot 1.2.4-2 all ZFS automatic snapshot service
ii zfs-zed 0.8.3-1ubuntu12.2 amd64 OpenZFS Event Daemon
ii zfsutils-linux 0.8.3-1ubuntu12.2 amd64 command-line tools to manage OpenZFS filesystems

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

When this happens, do you see "arc prune" or "z_upgrade" processes in top, using a lot of cpu?

Revision history for this message
Rob Babb (atlantauser) wrote :

Neither of those ring a bell. I'm trying to figure out how to reproduce as the load was very minimal at the time. There might have been a copy of a few large video files (15-50GB?)from 1 zfs array to another.

Is there a way to trigger a snapshot of top next time this happens?

Revision history for this message
Rob Babb (atlantauser) wrote :

I was able to capture the attached log messages during the last time window. I was doing a scrub on 2 different pools simultaneously.

This may not be related to zfs, but may be related to the mpt3sas driver or the lsi card itself. It sort of looks to me like the mpt3sas driver is resetting the lsi card, which looks like a power-on event, which drops the connections to the sas switch. Is that a correct assessment?

Revision history for this message
Colin Ian King (colin-king) wrote :

@Rob, from the logs it does appear that you are seeing underlying issues with the devices and the error is not in ZFS.

For example:

Aug 7 20:59:54 zfs-01 kernel: [26266.626836] blk_update_request: I/O error, dev sdn, sector 5806167448 op 0x0:(READ) flags 0x700 phys_seg 1 prio class 0
Aug 7 20:59:54 zfs-01 kernel: [26266.628131] zio pool=Red-ZFS-RZ2-3TB vdev=/dev/sdn1 error=5 type=1 offset=2972756684800 size=24576 flags=1808b0

....
Aug 7 20:59:55 zfs-01 kernel: [26267.374865] sd 2:0:14:0: Power-on or device reset occurred
Aug 7 20:59:55 zfs-01 kernel: [26267.374882] sd 2:0:14:0: [sdn] tag#5309 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK
Aug 7 20:59:55 zfs-01 kernel: [26267.374888] sd 2:0:14:0: [sdn] tag#5309 CDB: Read(16) 88 00 00 00 00 01 5a 13 1a f8 00 00 00 58 00 00
Aug 7 20:59:55 zfs-01 kernel: [26267.374894] sd 2:0:14:0: [sdn] tag#5309 Sense Key : Unit Attention [current]
Aug 7 20:59:55 zfs-01 kernel: [26267.374899] sd 2:0:14:0: [sdn] tag#5309 Add. Sense: Power on, reset, or bus device reset occurred

So, it may be a physical device error, and/or a combination of issues with the drive and/or controller. It may be worth double checking the H/W at this point. It does appear ZFS gets blocked on device failures.

Changed in zfs-linux (Ubuntu):
status: New → Incomplete
Revision history for this message
Colin Ian King (colin-king) wrote :

Can you double check the H/W and let me know if this resolves the ZFS issue?

Changed in zfs-linux (Ubuntu):
assignee: nobody → Colin Ian King (colin-king)
Changed in zfs-linux (Ubuntu):
importance: Undecided → Medium
Revision history for this message
Rob Babb (atlantauser) wrote :

Hi Colin,

It seems to happen on 2 different controllers (both LSI). I'm using the 9201-16e at the moment and it performs much faster overall, but doing something like a zfs scrub on a pool still causes the resets and thus zfs locks as part of those. Basically seems to be under heavy IO load. Although, this controller can handle much heavier IO that just a single scrub. For example, it seems much more likely to happen on my 3.5" 2TB and 3TB HDD drives, as opposed to my 2.5" 1TB SSD's. The SSD's scream through a scrub in about an hour, whereas the HDD's take a day or more.

The only thing I can think of is maybe to increase direct cooling on the controller in case it's overheating. But this is a Dell R710 server chassis with lots of high volume airflow.

It's really hard to pinpoint the problem between controller, driver, and filesystem.

Revision history for this message
Colin Ian King (colin-king) wrote :

Hi Rob,

It may be worth exercising the raw device by performing a lot of random I/O reads rather that via zfs, that way we can eliminate zfs out of the equation to see if it's a controller or driver issue and not a zfs error. That way we can at least use occams razor to remove a lot of software complexity from the debugging process.

Revision history for this message
Colin Ian King (colin-king) wrote :

So it may be that the write-back-throttling (wbt) for the underlying devices is getting confused about the exact throttle rates are for these devices and somehow getting stuck. It maybe worth experimenting by disabling the throttling and seeing if this gets I/O working again.

For example, to disable wbt for a device /dev/sda use:

echo 0 | sudo tee /sys/block/sda/queue/wbt_lat_usec

and if you need to reset it back to the default:

echo -1 | sudo tee /sys/block/sda/queue/wbt_lat_usec

..use the appropriate block device name for the block devices you have attached. It may even be worth setting the wbt_lat_usec to 0 for all the block devices in your pool as early as possible after boot and see if this helps.

Revision history for this message
Colin Ian King (colin-king) wrote :

Hi Rob, did the suggestions in comments #8 and #9 help?

Revision history for this message
Rob Babb (atlantauser) wrote :

The write-back-throttling didn't do anything different.

I have not had an opportunity yet to power down the unit and install extra fans directly on the card to see if that helps.

Revision history for this message
Colin Ian King (colin-king) wrote :

Hi Rob, is this still an open issue?

Revision history for this message
Rob Babb (atlantauser) wrote : Re: [Bug 1889110] Re: zfs pool locks and see "INFO: task txg_sync:4307 blocked for more than 120 seconds. "
Download full text (4.5 KiB)

Probably not. System has been pretty stable lately and I haven't been
pushing it too hard either.

On Fri, Apr 9, 2021 at 10:35 AM Colin Ian King <email address hidden>
wrote:

> Hi Rob, is this still an open issue?
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1889110
>
> Title:
> zfs pool locks and see "INFO: task txg_sync:4307 blocked for more than
> 120 seconds. "
>
> Status in zfs-linux package in Ubuntu:
> Incomplete
>
> Bug description:
> ZFS filesystem becomes unresponsive and subsequent NFS shares
> unresponsive. ESXi sees all paths down.
>
> See this error 3 times in a row.
>
>
> [184383.479511] INFO: task txg_sync:4307 blocked for more than 120
> seconds.
>
> [184383.479565] Tainted: P IO 5.4.0-42-generic
> #46-Ubuntu
>
> [184383.479607] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
>
> [184383.479655] txg_sync D 0 4307 2 0x80004000
>
>
> [184383.479658] Call Trace:
>
>
> [184383.479670] __schedule+0x2e3/0x740
>
>
> [184383.479673] schedule+0x42/0xb0
>
>
> [184383.479676] schedule_timeout+0x152/0x2f0
>
>
> [184383.479683] ? __next_timer_interrupt+0xe0/0xe0
>
>
> [184383.479685] io_schedule_timeout+0x1e/0x50
>
>
> [184383.479697] __cv_timedwait_common+0x15e/0x1c0 [spl]
>
>
> [184383.479702] ? wait_woken+0x80/0x80
>
>
> [184383.479710] __cv_timedwait_io+0x19/0x20 [spl]
>
>
> [184383.479816] zio_wait+0x11b/0x230 [zfs]
>
>
> [184383.479905] ? __raw_spin_unlock+0x9/0x10 [zfs]
>
>
> [184383.479983] dsl_pool_sync+0xbc/0x410 [zfs]
>
>
> [184383.480069] spa_sync_iterate_to_convergence+0xe0/0x1c0 [zfs]
>
>
> [184383.480156] spa_sync+0x312/0x5b0 [zfs]
>
>
> [184383.480245] txg_sync_thread+0x27a/0x310 [zfs]
>
>
> [184383.480334] ? txg_dispatch_callbacks+0x100/0x100 [zfs]
>
>
> [184383.480344] thread_generic_wrapper+0x83/0xa0 [spl]
>
>
> [184383.480347] kthread+0x104/0x140
>
>
> [184383.480356] ? clear_bit+0x20/0x20 [spl]
>
>
> [184383.480358] ? kthread_park+0x90/0x90
>
>
> [184383.480361] ret_from_fork+0x35/0x40
>
>
>
> Then nfsd hangs as well.
>
>
> [184866.787445] INFO: task nfsd:6585 blocked for more than 120 seconds.
> [184866.787485] Tainted: P IO 5.4.0-42-generic
> #46-Ubuntu
> [184866.787526] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [184866.787573] nfsd D 0 6585 2 0x80004000
> [184866.787575] Call Trace:
> [184866.787578] __schedule+0x2e3/0x740
> [184866.787675] ? __raw_spin_unlock+0x9/0x10 [zfs]
> [184866.787678] schedule+0x42/0xb0
> [184866.787685] cv_wait_common+0x133/0x180 [spl]
> [184866.787688] ? wait_woken+0x80/0x80
> [184866.787695] __cv_wait+0x15/0x20 [spl]
> [184866.787764] dmu_tx_wait+0x1ee/0x210 [zfs]
> [184866.787834] dmu_tx_assign+0x49/0x70 [zfs]
> [184866.787929] zfs_write+0x461/0xd40 [zfs]
> [184866.788025] ? atomic_sub_return.constprop.0+0xd/0x20 [zfs]
> [184866.788033] ? atomic_dec+0xd/0x20 [spl]
> [184866.788116] ? __raw_spin_unlock+0x9/0x10 [zfs]
> [184866.788122...

Read more...

Revision history for this message
Peter Halicky (peto-halicky) wrote :

I would say this is still an open issue. There is a long discussion on github about it and seems like there's currently no solution (yet?)

https://github.com/openzfs/zfs/issues/9130

Changed in zfs-linux (Ubuntu):
assignee: Colin Ian King (colin-king) → Dimitri John Ledkov (xnox)
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.