Comment 0 for bug 1435706

bugproxy (bugproxy) wrote :

Problem Description
=========================================
DevLossTO, FastIoFailTO settings do not match multipath.conf expected values

---uname output---
Linux ilp1fc85apA4.tuc.stglabs.ibm.com 3.13.0-24-generic #46-Ubuntu SMP Thu Apr 10 19:09:21 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = p7 8247

Steps to Reproduce
===================================
Verify that the DevLossTO and FastIoFailTO settings match the values expected from multipath.conf

== Comment: #31 - Thadeu Lima De Souza Cascardo <email address hidden> - 2015-03-20 10:57:20 ==
OK.

From the point of view of multipathd, everything seems correct, by looking at the logs.

I even parsed syslog and the output of getHBAInfo to look for inconsistencies. The inconsistency is between what multipathd logged as configured for a given target and what its rport reports in getHBAInfo.

So, either multipathd is not configuring the timeouts even though it has the right configuration, or something else is changing those timeouts.
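
For anyone who wants to repeat the comparison without getHBAInfo, here is a minimal Python sketch. It assumes the rport timeouts are readable as dev_loss_tmo and fast_io_fail_tmo under /sys/class/fc_remote_ports/rport-*/, which is the usual FC transport class location. The expected values in the example are placeholders, not the ones from this system's multipath.conf.

```python
import glob
import os


def read_rport_timeouts(base="/sys/class/fc_remote_ports"):
    """Return {rport_name: {attr: value_string_or_None}} for each FC remote port."""
    result = {}
    for rport_dir in sorted(glob.glob(os.path.join(base, "rport-*"))):
        values = {}
        for attr in ("dev_loss_tmo", "fast_io_fail_tmo"):
            try:
                with open(os.path.join(rport_dir, attr)) as f:
                    values[attr] = f.read().strip()
            except OSError:
                # Attribute missing or unreadable on this rport.
                values[attr] = None
        result[os.path.basename(rport_dir)] = values
    return result


def find_mismatches(actual, expected):
    """Return a list of (rport, attr, actual_value, expected_value) tuples that differ."""
    mismatches = []
    for rport, values in actual.items():
        for attr, want in expected.items():
            got = values.get(attr)
            if got != str(want):
                mismatches.append((rport, attr, got, str(want)))
    return mismatches


if __name__ == "__main__":
    # Placeholder values; substitute what multipath.conf is supposed to set.
    expected = {"dev_loss_tmo": 120, "fast_io_fail_tmo": 5}
    for rport, attr, got, want in find_mismatches(read_rport_timeouts(), expected):
        print("%s: %s is %s, expected %s" % (rport, attr, got, want))
```

Running this before and after a multipathd reconfigure should show whether the values in sysfs change at all, which helps separate "multipathd never writes them" from "something else overwrites them".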

The other problem is that multipathd does not include the dev_loss_tmo configuration for 2145, as can be seen from list config. So either it is not parsing the configuration correctly, or there is a problem with the configuration itself.
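
A rough way to check the dumped configuration for this is sketched below in Python. It assumes the dump has been saved as text in the usual multipath.conf syntax (device { ... } blocks of "keyword value" lines), and the sample text in the usage note is made up for illustration, not taken from this system.

```python
import re


def device_sections(config_text):
    """Yield one {keyword: value} dict per `device { ... }` block in the text."""
    # Matches innermost `device { ... }` blocks; `devices {` is not matched
    # because the keyword must be followed directly by whitespace or `{`.
    for block in re.findall(r"\bdevice\s*\{([^{}]*)\}", config_text):
        attrs = {}
        for line in block.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            value = parts[1].strip().strip('"') if len(parts) > 1 else ""
            attrs[parts[0]] = value
        yield attrs


def missing_attrs(config_text, product,
                  wanted=("dev_loss_tmo", "fast_io_fail_tmo")):
    """Return {product_string: [wanted keywords absent from that device block]}."""
    report = {}
    for attrs in device_sections(config_text):
        if product in attrs.get("product", ""):
            report[attrs["product"]] = [a for a in wanted if a not in attrs]
    return report
```

For example, calling missing_attrs(text, "2145") on a dump whose 2145 device block sets fast_io_fail_tmo but not dev_loss_tmo would list dev_loss_tmo as missing for that product.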

At this point, to move forward, I would like to take a look at your system, try a reconfigure, and look at some strace output of multipathd to check for writes into sysfs.

== Comment: #34 - Thadeu Lima De Souza Cascardo <email address hidden> - 2015-03-20 15:56:46 ==
OK, so I investigated on the system, read some of the code, and checked the changelog.

It looks like Ubuntu is shipping a fairly old version of multipath-tools. That is understandable: multipath-tools is not very good at making frequent releases, so a distribution needs to either ship a version closer to upstream git or carry its own large set of patches.

One of the missing patches is the one attached next. Without it, any device included in the built-in hardware table will have some of its attributes from the config file ignored. That is the case with 2145. So, we lose the dev_loss_tmo setting for that device.

Cascardo.

== Comment: #38 - Thadeu Lima De Souza Cascardo <email address hidden> - 2015-03-20 16:25:39 ==
The bug this patch fixes would explain why fast_io_fail_tmo is not correctly set in some cases, but not dev_loss_tmo. So, probably, there is another missing patch here. I would like to experiment with the two patches I mentioned, however. Let's try to do this on Monday?

Cascardo.