lvcreate -s hangs when overlayfs is in use

Bug #1328595 reported by Dirk Fleischer
This bug affects 2 people
Affects          Status     Importance  Assigned to  Milestone
linux (Ubuntu)   Confirmed  High        Unassigned

Bug Description

I am using Xubuntu 14.04 LTS in combination with KVM, lxc, logical volumes and overlayfs.
After I started using overlayfs, I realized that my backup, which is based on LVM snapshots, made the system "go mad".

As described in the manual I tried the kernel from 13.10 (3.11.0-xx) with the same result.
A test against mainline does not help, since mainline does not include overlayfs support.

I tested the behaviour on a server as well as in a VirtualBox environment with the same result:
lvcreate -s -n backup -L512M /dev/VG/NAME hangs and does not come back.

This issue is easily reproducible:

sudo mkdir /mnt/test /mnt/delta
sudo mount -t overlayfs -o lowerdir=/usr,upperdir=/mnt/delta overlayfs /mnt/test
sudo lvcreate -s -n backup -L512M /dev/xubuntu-vg/root

-> Hang
df@feretti-virt:~$ uname -r
3.13.0-29-generic

df@feretti-virt:~$ cat /proc/version_signature
Ubuntu 3.13.0-29.53-generic 3.13.11.2

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.13.0-29-generic 3.13.0-29.53
ProcVersionSignature: Ubuntu 3.13.0-29.53-generic 3.13.11.2
Uname: Linux 3.13.0-29-generic x86_64
ApportVersion: 2.14.1-0ubuntu3.2
Architecture: amd64
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/controlC0', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/pcmC0D1c', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Tue Jun 10 18:02:13 2014
HibernationDevice: RESUME=UUID=6e68761a-6768-4e19-9c4e-f188538d6068
InstallationDate: Installed on 2014-02-18 (112 days ago)
InstallationMedia: Xubuntu 13.10 "Saucy Salamander" - Release amd64 (20131016)
Lsusb:
 Bus 001 Device 002: ID 80ee:0021 VirtualBox USB Tablet
 Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: innotek GmbH VirtualBox
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.13.0-29-generic root=/dev/mapper/xubuntu--vg-root ro quiet splash vt.handoff=7
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-29-generic N/A
 linux-backports-modules-3.13.0-29-generic N/A
 linux-firmware 1.127.2
RfKill:

SourcePackage: linux
UpgradeStatus: Upgraded to trusty on 2014-04-22 (49 days ago)
dmi.bios.date: 12/01/2006
dmi.bios.vendor: innotek GmbH
dmi.bios.version: VirtualBox
dmi.board.name: VirtualBox
dmi.board.vendor: Oracle Corporation
dmi.board.version: 1.2
dmi.chassis.type: 1
dmi.chassis.vendor: Oracle Corporation
dmi.modalias: dmi:bvninnotekGmbH:bvrVirtualBox:bd12/01/2006:svninnotekGmbH:pnVirtualBox:pvr1.2:rvnOracleCorporation:rnVirtualBox:rvr1.2:cvnOracleCorporation:ct1:cvr:
dmi.product.name: VirtualBox
dmi.product.version: 1.2
dmi.sys.vendor: innotek GmbH

Dirk Fleischer (dirk-fleischer) wrote :
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Did this issue occur in a previous version of Ubuntu, or is this a new issue?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.15 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.15-utopic/

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: kernel-da-key
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Dirk Fleischer (dirk-fleischer) wrote :

As mentioned, I already tested with a kernel from 13.10, with the same result.

To be sure, I installed a minimal Ubuntu 13.10 in a virtual machine and followed the steps to reproduce the issue.
-> The problem existed in 13.10 as well.
df@ubuntu-bug-check:~$ cat /etc/issue
Ubuntu 13.10 \n \l
sudo mount -t overlayfs -o lowerdir=/usr,upperdir=/mnt/delta overlayfs /mnt/test
sudo lvcreate -s -n backup -L512M /dev/ubuntu-vg/root
-> Hang
I will attach the dmesg output to this post.

A test using a mainline kernel does not work, because overlayfs is not (yet) in mainline:
df@feretti-virt:~$ uname -r
3.15.0-031500-generic
df@feretti-virt:~$ sudo mount -t overlayfs -o lowerdir=/usr,upperdir=/mnt/delta overlayfs /mnt/test
mount: unknown filesystem type 'overlayfs'

So I will set the tag to 'kernel-unable-to-test-upstream'.
Please change that in case there is a tag better fitting the situation.

Thanks for looking into it.

tags: added: kernel-unable-to-test-upstream
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote : Test with newer development kernel (3.13.0-24.46)

Thank you for taking the time to file a bug report on this issue.

However, given the number of bugs that the Kernel Team receives during any development cycle it is impossible for us to review them all. Therefore, we occasionally resort to using automated bots to request further testing. This is such a request.

With the recent release of this Ubuntu release, we would like to confirm whether this bug is still present. Please test again with the newer kernel and indicate in the bug whether this issue still exists.

You can update to the latest development kernel by simply running the following commands in a terminal window:

    sudo apt-get update
    sudo apt-get dist-upgrade

If the bug still exists, change the bug status from Incomplete to Confirmed. If the bug no longer exists, change the bug status from Incomplete to Fix Released.

If you want this bot to quit automatically requesting kernel tests, add a tag named: bot-stop-nagging.

 Thank you for your help, we really do appreciate it.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
tags: added: kernel-request-3.13.0-24.46
Dirk Fleischer (dirk-fleischer) wrote :

When I do the steps:
sudo apt-get update
sudo apt-get dist-upgrade

I do not get a new kernel.
Anyway, the kernel version on the system seems to be more recent than the one mentioned in the post above:
df@feretti-virt:~$ uname -r
3.13.0-29-generic

So either I am missing something, or the bug still exists.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Dirk Fleischer (dirk-fleischer) wrote :

Update:
I tried to find a workaround by using aufs.

To make a long story short: the same issue seems to exist there.

mkdir /tmp/dir1 /tmp/aufs-root
mount -t aufs -o br=/tmp/dir1:/home/test,xino=/dev/shm/aufs.xino none /tmp/aufs-root

lvcreate -s -L 5G /dev/virt/opt -n vm_backup

-> Hang

dmesg

[1213440.776036] INFO: task ypbind:9957 blocked for more than 120 seconds.
[1213440.776141] Not tainted 3.13.0-29-generic #53-Ubuntu
[1213440.776233] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1213440.776351] ypbind D ffff88042fc14440 0 9957 8265 0x20020000
[1213440.776357] ffff880414fe9dd8 0000000000000002 ffff880414c547d0 ffff880414fe9fd8
[1213440.776362] 0000000000014440 0000000000014440 ffff880414c547d0 ffff880413016000
[1213440.776366] 0000000000000001 0000000000000001 0000000000000000 ffff8804130162b0
[1213440.776370] Call Trace:
[1213440.776380] [<ffffffff8171e639>] schedule+0x29/0x70
[1213440.776386] [<ffffffff811bee43>] __sb_start_write+0x93/0xe0
[1213440.776393] [<ffffffff810aaea0>] ? prepare_to_wait_event+0x100/0x100
[1213440.776397] [<ffffffff811bcb5e>] compat_do_readv_writev+0x20e/0x260
[1213440.776452] [<ffffffffa03e2ec0>] ? xfs_file_buffered_aio_write+0x1a0/0x1a0 [xfs]
[1213440.776455] [<ffffffff811bc030>] ? do_sync_read+0x90/0x90
[1213440.776461] [<ffffffff8111140c>] ? acct_account_cputime+0x1c/0x20
[1213440.776465] [<ffffffff8109d77b>] ? account_user_time+0x8b/0xa0
[1213440.776468] [<ffffffff8109dd94>] ? vtime_account_user+0x54/0x60
[1213440.776472] [<ffffffff811bcc57>] compat_writev+0x37/0x70
[1213440.776476] [<ffffffff811bdd69>] compat_SyS_writev+0x49/0xa0
[1213440.776480] [<ffffffff8172c8ec>] sysenter_dispatch+0x7/0x21
[1213440.776497] INFO: task lvcreate:28589 blocked for more than 120 seconds.
[1213440.776603] Not tainted 3.13.0-29-generic #53-Ubuntu
[1213440.776695] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1213440.776811] lvcreate D ffff88042fcd4440 0 28589 24487 0x00000000
[1213440.776815] ffff8802ce4c1bc8 0000000000000002 ffff8802ce7b0000 ffff8802ce4c1fd8
[1213440.776819] 0000000000014440 0000000000014440 ffff8802ce7b0000 ffff880413016290
[1213440.776823] ffff8802ce4c1c00 ffff8802ce7b0000 ffff880413016218 0000000000000001
[1213440.776827] Call Trace:
[1213440.776831] [<ffffffff8171e639>] schedule+0x29/0x70
[1213440.776835] [<ffffffff811bef2d>] sb_wait_write+0x9d/0xb0
[1213440.776839] [<ffffffff810aaea0>] ? prepare_to_wait_event+0x100/0x100
[1213440.776842] [<ffffffff811bf128>] freeze_super+0x68/0x130
[1213440.776847] [<ffffffff811f6645>] freeze_bdev+0x75/0xd0
[1213440.776853] [<ffffffff815b6fca>] dm_suspend+0x11a/0x1e0
[1213440.776857] [<ffffffff815bc384>] dev_suspend+0x194/0x220
[1213440.776861] [<ffffffff815bc1f0>] ? table_load+0x350/0x350
[1213440.776864] [<ffffffff815bcbb5>] ctl_ioctl+0x255/0x500
[1213440.776868] [<ffffffff815bce73>] dm_ctl_ioctl+0x13/0x20
[1213440.776873] [<ffffffff811cf9c0>] do_vfs_ioctl+0x2e0/0x4c0
[1213440.776877] [<ffffffff8109dd94>] ? vtime_account_user+0x54/0x60
[1213440.776880] [<ffffffff811cfc21>] SyS_ioctl+0x81/0xa0
[1213440.776885] [<ffffffff8172adff>] tracesy...
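
Both traces above show a task in state D (uninterruptible sleep): ypbind waiting in __sb_start_write and lvcreate waiting in sb_wait_write. A minimal sketch for listing such tasks while the hang is in progress, assuming a procps-style `ps`; the function names `filter_d_state` and `list_blocked_tasks` are made up for this example and are not existing tools:

```shell
#!/bin/sh
# Keep only the rows of `ps` output whose STAT column starts with "D"
# (uninterruptible sleep). Expects a header line followed by rows in
# the order: PID STAT WCHAN COMMAND.
filter_d_state() {
    awk 'NR > 1 && $2 ~ /^D/ { print }'
}

# Hypothetical helper: list every task that is currently blocked in D state,
# together with the kernel function it is waiting in (WCHAN).
list_blocked_tasks() {
    ps -eo pid,stat,wchan:32,comm | filter_d_state
}
```

Calling `list_blocked_tasks` from a second terminal should show lvcreate and any writer that is stuck behind the frozen filesystem.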


Dirk Fleischer (dirk-fleischer) wrote :

I did a little more testing.

root@virt:~# uname -r
3.13.0-32-generic

Actually, the hang seems to happen only when the overlayfs directories are on the logical volume that is the snapshot source.
Let's say /opt is a logical volume in the volume group virt.

Case 1 (hanging):
mkdir /opt/test/
cd /opt/test/
mkdir upper lower file_system
touch lower/bla.dat

mount -t overlayfs -o lowerdir=/opt/test/lower,upperdir=/opt/test/upper overlayfs /opt/test/file_system

lvcreate -s -L 5G /dev/virt/opt -n vm_backup

-> Hang (see previous posts)

Case 2 (not hanging):
mkdir /opt/test/
lvcreate -n test -L10G virt
mkfs.xfs /dev/virt/test
mount /dev/virt/test /opt/test
cd /opt/test/
mkdir upper lower file_system
touch lower/bla.dat
mount -t overlayfs -o lowerdir=/opt/test/lower,upperdir=/opt/test/upper overlayfs /opt/test/file_system
lvcreate -s -L 5G /dev/virt/opt -n vm_backup

The snapshot is created even though overlayfs is in use (but on a different volume).

So the workaround for the moment is to use two types of logical volumes:
1. One type for use with overlayfs, of which nobody must ever take a snapshot (at least until this bug is fixed).
2. One type where overlayfs never gets used, but from which snapshots can be taken.

Maybe this helps someone running into this issue.
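
The rule above (never snapshot a volume that holds overlayfs directories) can be checked before calling lvcreate. The following is only a sketch under stated assumptions: the function names are hypothetical, only the filesystem type "overlayfs" from this report is recognised, aufs is not handled, and paths containing whitespace are not supported.

```shell
#!/bin/sh
# Print every lowerdir/upperdir path used by overlayfs mounts listed in a
# mounts-format file (default: /proc/mounts), one path per line.
overlay_dirs() {
    awk '$3 == "overlayfs" {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^(lowerdir|upperdir)=/) {
                sub(/^[^=]*=/, "", opts[i])
                print opts[i]
            }
    }' "${1:-/proc/mounts}"
}

# Resolve the block device that backs a directory, following symlinks so
# that /dev/VG/LV and /dev/mapper/VG-LV compare equal.
backing_device() {
    readlink -f "$(df -P "$1" | awk 'NR == 2 { print $1 }')"
}

# Return 0 if no overlayfs directory lives on the given LV, 1 otherwise.
# $1 = LV device path, $2 = optional mounts file (for testing).
safe_to_snapshot() {
    lv=$(readlink -f "$1")
    for dir in $(overlay_dirs "${2:-/proc/mounts}"); do
        if [ "$(backing_device "$dir")" = "$lv" ]; then
            echo "overlayfs directory $dir is on $1" >&2
            return 1
        fi
    done
    return 0
}
```

With the volume names from the cases above, it could be used as `safe_to_snapshot /dev/virt/opt && lvcreate -s -L 5G /dev/virt/opt -n vm_backup`.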

Robin Baumgartner (rbaumgartner) wrote :

I can confirm that this is still an issue in Ubuntu 14.04.4. I ran into this problem while trying to back up a server hosting lxc containers, some of which use overlayfs as the root fs. When the containers are running (and overlayfs is therefore mounted), lvcreate -s hangs forever. The debug output is:

root@host:~# lvcreate -n foobar -L 5G -s -d -v /dev/vg0/var
    Setting logging type to disk
    Setting chunksize to 8 sectors.
    Finding volume group "vg0"
    Archiving volume group "vg0" metadata (seqno 5091).
    Creating logical volume foobar
    Creating volume group backup "/etc/lvm/backup/vg0" (seqno 5092).
    Found volume group "vg0"
    activation/volume_list configuration setting not defined: Checking only host tags for vg0/foobar
    Creating vg0-foobar
    Loading vg0-foobar table (252:5)
    Resuming vg0-foobar (252:5)
    Clearing start of logical volume "foobar"
    Creating logical volume snapshot0
    Found volume group "vg0"
    Found volume group "vg0"
    Executing: /sbin/modprobe dm-snapshot
    Creating vg0-var-real
    Loading vg0-var-real table (252:6)
    Loading vg0-var table (252:3)
    Creating vg0-foobar-cow
    Loading vg0-foobar-cow table (252:7)
    Resuming vg0-foobar-cow (252:7)
    Loading vg0-foobar table (252:5)
    Suspending vg0-var (252:3) with filesystem sync with device flush

When the containers are stopped (and therefore overlayfs is no longer mounted), the snapshot is created without issues.
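
Based on that observation, a backup could stop the affected containers, take the snapshot, and start them again. The sketch below is an assumption about how this might be scripted, not a tested procedure: the function name is hypothetical, the container names have to be passed in by hand, and by default the commands are only printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper, not an existing tool.
# Usage: snapshot_with_containers_stopped ORIGIN_LV SNAP_NAME SIZE CONTAINER...
# RUN defaults to "echo", so the commands are printed only; call with RUN=
# (set but empty) to really execute them as root.
snapshot_with_containers_stopped() {
    origin=$1
    snap=$2
    size=$3
    shift 3
    run=${RUN-echo}
    # Stop the containers so that their overlayfs mounts go away.
    for c in "$@"; do
        $run lxc-stop -n "$c"
    done
    # Take the snapshot while no overlayfs is mounted on the origin volume.
    $run lvcreate -s -n "$snap" -L "$size" "$origin"
    # Start the containers again in the background, even if lvcreate failed.
    for c in "$@"; do
        $run lxc-start -n "$c" -d
    done
}
```

For example, `snapshot_with_containers_stopped /dev/vg0/var foobar 5G web db` prints the planned commands for two containers named web and db (both names are placeholders).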
