pt-sift does not work if pt-stalk did not collect due to a full disk

Bug #1172317 reported by Daniël van Eeden
This bug affects 1 person
Affects: Percona Toolkit (moved to https://jira.percona.com/projects/PT)
Status: Fix Released
Importance: Medium
Assigned to: Daniel Nichter

Bug Description

Percona Toolkit 2.2.1-2 (on Ubuntu 12.10)

When I use pt-stalk, files are collected, but pt-sift can't find them.

It looks like pt-stalk generates "*-disk-space" files where pt-sift expects "*-df" files.

tags: added: pt-sift pt-stalk
Changed in percona-toolkit:
status: New → Confirmed
milestone: none → 2.2.3
Brian Fraser (fraserbn) wrote :

Those two aren't quite the same. *-disk-space is the output of 'df -P -k $dir', where $dir is the --dest directory, while *-df is the output of 'df -k'. If I remember correctly, the only way you can get a -disk-space file but not a -df file is if the disk is out of space, which forces pt-stalk to quit early.
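
To make the difference concrete, here is a minimal sketch of the two invocations. The function names df_dest and df_all are made up for illustration; only the df commands themselves come from the comment above.

```shell
#!/bin/sh
# Sketch only: contrasts the two df invocations described above.
# df_dest and df_all are hypothetical names, not pt-stalk functions.

# Content of a *-disk-space file: the filesystem holding one directory,
# in POSIX output format (-P), sizes in 1K blocks (-k).
df_dest() {
   df -P -k "$1"
}

# Content of a *-df file: every mounted filesystem, sizes in 1K blocks.
df_all() {
   df -k
}
```

Calling df_dest on the --dest directory normally prints a header plus one filesystem line, which would explain why the -disk-space files are so small compared to the -df files.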

Changed in percona-toolkit:
status: Confirmed → Incomplete
Daniël van Eeden (dveeden) wrote :

This is what I tried:

root@daniel-thinkpad:~# pt-stalk --iterations 1 --nostalk --socket /tmp/mysql_sandbox5610.sock --password msandbox --user root
2013_05_07_11_30_44 Starting /usr/bin/pt-stalk --function=status --variable=Threads_running --threshold=25 --match= --cycles=0 --interval=1 --iterations=1 --run-time=30 --sleep=300 --dest=/var/lib/pt-stalk --prefix= --notify-by-email= --log=/var/log/pt-stalk.log --pid=/var/run/pt-stalk.pid --plugin=
2013_05_07_11_30_44 Not stalking; collect triggered immediately
2013_05_07_11_30_44 Collect 1 triggered
2013_05_07_11_30_44 Collect 1 PID 23940
2013_05_07_11_30_44 Collect 1 done
2013_05_07_11_30_44 Waiting up to 90 seconds for subprocesses to finish...
2013_05_07_11_31_15 Exiting because no more iterations
2013_05_07_11_31_15 /usr/bin/pt-stalk exit status 0
root@daniel-thinkpad:~# pt-sift /var/lib/pt-stalk
Error: There are no pt-stalk files in /var/lib/pt-stalk
For more information, 'man pt-sift' or 'perldoc /usr/bin/pt-sift'.
root@daniel-thinkpad:~# pt-sift
Error: is not a directory, and there are no pt-stalk files in the curent working directory (/var/lib/pt-stalk) with a prefix.
For more information, 'man pt-sift' or 'perldoc /usr/bin/pt-sift'.
root@daniel-thinkpad:~# ls /var/lib/pt-stalk
2013_05_07_11_30_44-disk-space 2013_05_07_11_30_44-lsof 2013_05_07_11_30_44-opentables2 2013_05_07_11_30_44-top
2013_05_07_11_30_44-hostname 2013_05_07_11_30_44-mutex-status1 2013_05_07_11_30_44-output 2013_05_07_11_30_44-trigger
2013_05_07_11_30_44-innodbstatus1 2013_05_07_11_30_44-mutex-status2 2013_05_07_11_30_44-pmap 2013_05_07_11_30_44-variables
2013_05_07_11_30_44-innodbstatus2 2013_05_07_11_30_44-mysqladmin 2013_05_07_11_30_44-ps 2013_05_07_11_30_44-vmstat
2013_05_07_11_30_44-log_error 2013_05_07_11_30_44-opentables1 2013_05_07_11_30_44-sysctl 2013_05_07_11_30_44-vmstat-overall

Daniël van Eeden (dveeden) wrote :

It seems like:
- pt-sift without options does not correctly use the /var/lib/pt-stalk default dir.
- if the /var/lib/pt-stalk dir is specified as argument for pt-sift it works.
- pt-stalk seems to somehow not collect "-df" files. (time < 12:00 ?)

root@daniel-thinkpad:~# ls -ltr /var/lib/pt-stalk/*-disk-space /var/lib/pt-stalk/*-df
-rw-r--r-- 1 root root 125 May 7 11:30 /var/lib/pt-stalk/2013_05_07_11_30_44-disk-space
-rw-r--r-- 1 root root 125 May 7 11:34 /var/lib/pt-stalk/2013_05_07_11_34_42-disk-space
-rw-r--r-- 1 root root 125 May 7 12:35 /var/lib/pt-stalk/2013_05_07_12_34_46-disk-space
-rw-r--r-- 1 root root 19410 May 7 12:35 /var/lib/pt-stalk/2013_05_07_12_34_46-df
-rw-r--r-- 1 root root 125 May 7 12:37 /var/lib/pt-stalk/2013_05_07_12_37_20-disk-space
-rw-r--r-- 1 root root 19410 May 7 12:37 /var/lib/pt-stalk/2013_05_07_12_37_20-df
-rw-r--r-- 1 root root 125 May 7 12:42 /var/lib/pt-stalk/2013_05_07_12_42_22-disk-space
-rw-r--r-- 1 root root 19410 May 7 12:42 /var/lib/pt-stalk/2013_05_07_12_42_22-df
-rw-r--r-- 1 root root 125 May 7 12:47 /var/lib/pt-stalk/2013_05_07_12_46_42-disk-space
-rw-r--r-- 1 root root 19410 May 7 12:47 /var/lib/pt-stalk/2013_05_07_12_46_42-df
root@daniel-thinkpad:~# pt-sift
Error: is not a directory, and there are no pt-stalk files in the curent working directory (/var/lib/pt-stalk) with a prefix.
For more information, 'man pt-sift' or 'perldoc /usr/bin/pt-sift'.
root@daniel-thinkpad:~# pt-sift /var/lib/pt-stalk

  2013_05_07_12_34_46 2013_05_07_12_37_20 2013_05_07_12_42_22
  2013_05_07_12_46_42

Select a timestamp from the list [2013_05_07_12_46_42] ^CCaught signal, exiting
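
A quick way to spot the affected collections is to look for prefixes that have a -disk-space file but no matching -df file. This is a minimal sketch; missing_df is a hypothetical helper, not part of Percona Toolkit.

```shell
#!/bin/sh
# Sketch only: print collection prefixes that have a -disk-space file
# but no -df file. missing_df is a hypothetical helper name.
missing_df() {
   dir=$1
   for f in "$dir"/*-disk-space; do
      [ -e "$f" ] || continue               # glob matched nothing
      prefix=${f%-disk-space}               # path without the suffix
      # Report the bare prefix when the sibling -df file is absent.
      [ -e "$prefix-df" ] || printf '%s\n' "${prefix##*/}"
   done
}
```

Run against the directory listed above, missing_df /var/lib/pt-stalk would be expected to report the 11:30 and 11:34 collections, the two that pt-sift does not offer.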

Daniël van Eeden (dveeden) wrote :

Okay, now I found what happens:

pt-stalk skips the df collection when there isn't enough disk space, and pt-sift needs the resulting -df file to 'detect' timestamps.
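
The effect can be illustrated with a small sketch that derives timestamps from whichever file suffix is used as the marker. This is not pt-sift's actual code; list_timestamps is a hypothetical helper, and it assumes the suffix contains no glob characters.

```shell
#!/bin/sh
# Sketch only, not pt-sift's implementation: list the collection
# timestamps found in a directory, using files that end in "-<suffix>"
# as the marker. list_timestamps is a hypothetical helper name.
list_timestamps() {
   dir=$1
   suffix=$2
   for f in "$dir"/*-"$suffix"; do
      [ -e "$f" ] || continue      # glob matched nothing
      name=${f##*/}                # strip the directory part
      prefix=${name%-$suffix}      # strip the "-<suffix>" part
      printf '%s\n' "$prefix"
   done
}
```

With df as the marker, a collection that stopped before writing its -df file is invisible; with a marker that every collection writes, it is listed.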

Changed in percona-toolkit:
status: Incomplete → Confirmed
summary: - pt-sift not compatible with pt-stalk (-df vs -disk-space)
+ pt-sift not compatible with pt-stalk (if disk full)
Changed in percona-toolkit:
importance: Undecided → Medium
Daniel Nichter (daniel-nichter) wrote : Re: pt-sift not compatible with pt-stalk (if disk full)

Correct:

-disk-space files are from:

            disk_space "$OPT_DEST" > "$OPT_DEST/$prefix-disk-space"
            check_disk_space \
               "$OPT_DEST/$prefix-disk-space" \
               "$OPT_DISK_BYTES_FREE" \
               "$OPT_DISK_PCT_FREE" \
               "$margin"

-df files are from a collection:

      (echo $ts; df -k) >> "$d/$p-df" &

So the attached branch looks good: there may not be -df or other collection files if other things go wrong, but there should always be an -output file for every collection because of:

               (
                  collect "$OPT_DEST" "$prefix"
               ) >> "$OPT_DEST/$prefix-output" 2>&1 &

I.e. collect = -output file even if the collection immediately dies. So I'll merge the branch. Thanks for the fix, Daniel!
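
The reason the -output file is always there can be shown with a small sketch: the shell opens the redirection target before the subshell body runs, so the file exists even if the collection fails at once. Here collect is a stand-in that always fails, and run_collect is a hypothetical wrapper; neither is pt-stalk's real code.

```shell
#!/bin/sh
# Sketch only: a stand-in collect that dies immediately, to show that
# the -output file is still created by the redirection.
collect() {
   echo "collect failed for $2" >&2
   return 1
}

# run_collect is a hypothetical wrapper around the same redirection
# pattern quoted above.
run_collect() {
   dest=$1
   prefix=$2
   (
      collect "$dest" "$prefix"
   ) >> "$dest/$prefix-output" 2>&1 &
   # Wait for the background subshell; ignore its failing exit status.
   wait "$!" || true
}
```

After run_collect returns, "$dest/$prefix-output" exists and contains the error message, which is what makes the -output file a more reliable timestamp marker than the -df file.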

summary: - pt-sift not compatible with pt-stalk (if disk full)
+ pt-sift does not work if pt-stalk did not collect due to a full disk
Changed in percona-toolkit:
status: Confirmed → Fix Committed
assignee: nobody → Daniel Nichter (daniel-nichter)
Changed in percona-toolkit:
status: Fix Committed → Fix Released
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-612
