Comment 4 for bug 1836913

Christian Ehrhardt (paelzer) wrote :

I tried to recreate this on x86 with a 128G guest and 64 CPUs.
I do see numad taking action:

Thu Jul 18 10:51:22 2019: Advising pid 13197 (qemu-system-x86) move from nodes (0-1) to nodes (1)
Thu Jul 18 10:51:23 2019: PID 13197 moved to node(s) 1 in 0.19 seconds

Running stressapptest [1] on the host and in the guest for a while triggered more of those moves, without crashes (as expected).

Restarting numad did not break it on this system.
After the shutdown, the new numad instance seems to re-evaluate and then carry on as usual:
Thu Jul 18 11:00:54 2019: Shutting down numad
Thu Jul 18 11:00:54 2019: Registering numad version 20150602 PID 15629
Thu Jul 18 11:01:01 2019: Advising pid 15500 (stressapptest) move from nodes (0-1) to nodes (0-1)
Thu Jul 18 11:01:01 2019: PID 15500 moved to node(s) 0-1 in 0.0 seconds
Thu Jul 18 11:01:06 2019: Advising pid 13197 (qemu-system-x86) move from nodes (0-1) to nodes (0-1)
Thu Jul 18 11:01:06 2019: PID 13197 moved to node(s) 0-1 in 0.0 seconds
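
To compare such numad logs between this x86 system and the affected P9, a small parser can help. The following is only a sketch: it assumes the "Advising pid ..." line format shown in the excerpts above, and the names ADVISE_RE and parse_advice are made up for illustration, not part of numad.

```python
import re

# Matches numad "Advising" lines in the format seen in the excerpts above.
# Assumption: other numad versions may format these lines differently.
ADVISE_RE = re.compile(
    r"Advising pid (?P<pid>\d+) \((?P<comm>[^)]+)\) "
    r"move from nodes \((?P<src>[^)]*)\) to nodes \((?P<dst>[^)]*)\)"
)


def parse_advice(lines):
    """Yield (pid, command, source_nodes, target_nodes, is_noop) per advice line."""
    for line in lines:
        match = ADVISE_RE.search(line)
        if match:
            src, dst = match["src"], match["dst"]
            # A move is a no-op when source and target node sets are identical.
            yield (int(match["pid"]), match["comm"], src, dst, src == dst)


# Sample input copied from the log excerpts in this comment.
SAMPLE = [
    "Thu Jul 18 10:51:22 2019: Advising pid 13197 (qemu-system-x86) move from nodes (0-1) to nodes (1)",
    "Thu Jul 18 10:51:23 2019: PID 13197 moved to node(s) 1 in 0.19 seconds",
    "Thu Jul 18 11:01:01 2019: Advising pid 15500 (stressapptest) move from nodes (0-1) to nodes (0-1)",
    "Thu Jul 18 11:01:06 2019: Advising pid 13197 (qemu-system-x86) move from nodes (0-1) to nodes (0-1)",
]

if __name__ == "__main__":
    for pid, comm, src, dst, noop in parse_advice(SAMPLE):
        kind = "no-op" if noop else "real move"
        print(f"{pid} {comm}: ({src}) -> ({dst}) [{kind}]")
```

With the sample lines, only the first advice changes the node set; the two after the restart keep the process on nodes 0-1.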

So the assumption for now is that this is either ppc64el-specific or even specific to our particular P9 machine (dradis).

Lowering the importance, as this does not seem to be a general issue.
I'll ping Frank to ask whether he wants to reverse mirror this to IBM.

[1]: https://github.com/stressapptest/stressapptest/releases