"File Open" dialogues for gedit, etc., take 25 seconds to appear
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
tracker (Ubuntu) |
In Progress
|
Medium
|
Unassigned |
Bug Description
Binary package hint: tracker
Ubuntu 9.04, tracker 0.6.93-0ubuntu1. trackerd and tracker-indexer
reached deadlock over a couple of ~/.cache/tracker files resulting in 25
second timeouts before the "File Open" dialogue of various programs
would appear.
A friend reported the "File Open" dialogues of various GUI programs,
e.g. gedit, Scite, ooffice, was very slow in opening, e.g. taking more
than 20 seconds. I strace'd gedit whilst he did a "File Open".
15520 22.5956 gettimeofday(
15520 22.5963 writev(15, [{"l\1\
15520 22.5966 gettimeofday(
15520 22.5967 writev(15, [{"l\1\
15520 22.5968 gettimeofday(
15520 22.5976 poll([{fd=15, events=POLLIN}], 1, 25000) = 1 ([{fd=15, revents=POLLIN}])
15520 22.5977 read(15, "l\2\1\
15520 22.5981 read(15, 0xf82e20, 2048) = -1 EAGAIN (Resource temporarily unavailable)
15520 22.5982 gettimeofday(
15520 22.5983 poll([{fd=15, events=POLLIN}], 1, 24999) = 0 (Timeout)
15520 47.6211 gettimeofday(
15520 47.6215 writev(15, [{"l\1\
15520 47.6219 writev(15, [{"l\1\
15520 47.6221 writev(15, [{"l\1\
Note the 25.0228 second gap between the start of the poll() and the next
system call; this matches the 25 second delay the user sees. gedit's
file descriptor 15 is a socket.
$ lsof -p 15520
gedit 15520 peter 15u unix 0xffff88005317ec00 2003719 socket
After a bit of poking around, I realised it was trying to read from
trackerd.
$ ps l $(pidof trackerd) $(pidof tracker-indexer)
F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
0 1000 5339 5196 39 19 310584 10184 hrtime SN ? 0:58 /usr/lib/
0 1000 5370 1 39 19 175772 18392 hrtime SNl ? 1:02 /usr/lib/
$
strace showed trackerd was endlessly attempting to get a read lock on
its FD 12.
$ strace -tt -p 5339
10:16:42.095702 fcntl(12, F_SETLK, {type=F_RDLCK, whence=SEEK_SET, start=1073741824, len=1}) = -1 EAGAIN (Resource temporarily unavailable)
10:16:42.096032 nanosleep({0, 100000000}, NULL) = 0
10:16:42.196494 fcntl(12, F_SETLK, {type=F_RDLCK, whence=SEEK_SET, start=1073741824, len=1}) = -1 EAGAIN (Resource temporarily unavailable)
10:16:42.196816 nanosleep({0, 100000000}, NULL) = 0
trackerd's FD 12 is file-meta.db.
$ lsof -p 5339
trackerd 5339 peter 12u REG 8,52 2077696 17186898 /home/peter/
And one other program, tracker-indexer, has that file open.
$ lsof /home/peter/
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
trackerd 5339 peter 12u REG 8,52 2077696 17186898 /home/peter/
tracker-i 5370 peter 15uw REG 8,52 2077696 17186898 /home/peter/
$
It's endlessly trying to get a write lock on its FD 13,
$ strace -tt -p 5370
10:18:46.653514 fcntl(13, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=1073741826, len=510}) = -1 EAGAIN (Resource temporarily unavailable)
10:18:46.653655 nanosleep({0, 100000000}, NULL) = 0
10:18:46.753864 fcntl(13, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=1073741826, len=510}) = -1 EAGAIN (Resource temporarily unavailable)
10:18:46.754004 nanosleep({0, 100000000}, NULL) = 0
which is common.db,
$ lsof -p 5370
tracker-i 5370 peter 13ur REG 8,52 84992 17186897 /home/peter/
and trackerd is the other program that has that file open.
$ lsof /home/peter/
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
trackerd 5339 peter 10ur REG 8,52 84992 17186897 /home/peter/
tracker-i 5370 peter 13ur REG 8,52 84992 17186897 /home/peter/
$
The inode numbers of the two files,
$ ls -i /home/peter/
> /home/peter/
17186898 /home/peter/
17186897 /home/peter/
which appear in the list of all the locks held.
$ cat /proc/locks
1: POSIX ADVISORY READ 8029 08:34:17187002 1073741826 1073742335
2: POSIX ADVISORY READ 15938 08:32:2810628 0 EOF
3: POSIX ADVISORY READ 15938 08:32:2771634 0 EOF
4: POSIX ADVISORY READ 15938 08:32:2771633 0 EOF
5: POSIX ADVISORY READ 15938 08:32:2810417 0 EOF
6: POSIX ADVISORY READ 15938 08:32:2810616 0 EOF
7: POSIX ADVISORY READ 15938 08:32:2810627 0 EOF
8: POSIX ADVISORY READ 15938 08:32:2810416 0 EOF
9: POSIX ADVISORY WRITE 8029 08:34:17186986 1073741824 1073742335
10: POSIX ADVISORY WRITE 8029 08:34:17186981 0 EOF
11: POSIX ADVISORY READ 5339 08:34:17186897 1073741826 1073742335
12: POSIX ADVISORY WRITE 5370 08:34:17186897 1073741824 1073741825
13: POSIX ADVISORY WRITE 5370 08:34:17186899 1073741824 1073742335
14: POSIX ADVISORY WRITE 5370 08:34:17186898 1073741824 1073742335
15: POSIX ADVISORY READ 5370 08:34:17186897 1073741826 1073742335
16: FLOCK ADVISORY WRITE 4885 00:12:10848 0 EOF
17: POSIX ADVISORY WRITE 4858 00:12:10817 0 EOF
$
Re-ording that a bit and adding in the lock attempts from above, we have
common.db 1073741824 len=2 WRITE tracker-indexer
common.db 1073741826 len=510 READ trackerd
common.db 1073741826 len=510 READ tracker-indexer
common.db 1073741826 len=510 F_WRLCK tracker-indexer
file-meta.db 1073741824 len=512 WRITE tracker-indexer
file-meta.db 1073741824 len=1 F_RDLCK trackerd
It's clear that both are trying to get a lock that's blocked by an
existing lock the other has on the same file. This bug could be
confirmed based solely on the above investigation IMHO, it does not need
someone else to suffer the same bug *and* get to the point where they
realise it's the above causing it. This thread,
http://
similar problems. Their solution is to purge the tracker package!
Changed in tracker (Ubuntu): | |
assignee: | Chris Coulson (chrisccoulson) → nobody |
Killing tracker-indexer frees some of the file locks, tracker-indexer gets restarted, and "File Open" dialogues then only take a few seconds to appear, like normal.