Nova scheduler randomly fails to schedule CPU-pinned instance flavors with hugepages; failures increase as the running instance count grows

Bug #1738501 reported by Trygve Vea
Affects: OpenStack Compute (nova)
Status: Confirmed
Importance: Low
Assigned to: Unassigned

Bug Description

Description
===========
Isolated to a single hypervisor.
The Nova scheduler randomly fails to schedule CPU-pinned instance flavors with hugepages; failures increase as the running instance count grows.

Steps to reproduce
==================

1) Hypervisor with two NUMA nodes: 2x Intel Xeon Gold 6126, 256 GB RAM (128 GB per NUMA node), 61440 2M hugepages per node. The hypervisor runs nothing other than OpenStack. One possible way to preallocate the pages is sketched below.
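
For reference, one way to preallocate 61440 2 MiB pages on each NUMA node is via sysfs; the reporter's exact method (kernel command line vs. sysfs) is not stated, so treat this as an assumption:

  # Assumed setup, not from the report: preallocate 2M hugepages per node
  echo 61440 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
  echo 61440 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages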

2) Flavor specified with (a CLI sketch follows the list):
 - 4 vCPUs
 - 20480 MB RAM
 - hw:cpu_policy dedicated
 - hw:cpu_thread_policy require
 - hw:mem_page_size 2MB
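
A CLI sketch of such a flavor; the flavor name and disk size are placeholders, not taken from the report:

  # Hypothetical flavor name and disk size
  openstack flavor create --vcpus 4 --ram 20480 --disk 20 m1.pinned.hugepages
  openstack flavor set m1.pinned.hugepages \
    --property hw:cpu_policy=dedicated \
    --property hw:cpu_thread_policy=require \
    --property hw:mem_page_size=2MB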

3) Try to schedule 12 instances of the mentioned flavor (example command below).
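
For example, with the legacy nova client (image and network are placeholders):

  # Boot all 12 instances in one request; <image> and <net-uuid> are placeholders
  nova boot --flavor m1.pinned.hugepages --image <image> \
    --nic net-id=<net-uuid> --min-count 12 --max-count 12 hp-test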

Expected result
===============

12 instances running on the hypervisor, neatly packed, using up all hugepages.
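
(Arithmetic check: 12 instances × 20480 MB = 245760 MB = 122880 pages × 2 MiB = 2 × 61440 pages, exactly the hugepages configured across both NUMA nodes.)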

Actual result
=============

NUMA node 0 is full, while NUMA node 1 ends up with only 2-3 instances. The exact count varies from attempt to attempt.
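
One way to confirm that node 1 still has free hugepages while the scheduler refuses to place instances there (an assumed diagnostic, not part of the original report):

  # Per-node free 2M hugepages; node 1 should still show plenty free
  cat /sys/devices/system/node/node*/hugepages/hugepages-2048kB/free_hugepages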

Workaround
==========

Leave all running instances as they are and keep scheduling more until the desired number of instances has been created successfully. (It took 32 create attempts to fill all 12 slots for me.)
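
A sketch of that retry loop, assuming the hypothetical flavor name from above and placeholder image/network values:

  # Keep creating servers until 12 of this flavor are ACTIVE;
  # failed builds land in ERROR and can be cleaned up between attempts.
  while [ "$(openstack server list --flavor m1.pinned.hugepages \
             -f value -c Status | grep -c ACTIVE)" -lt 12 ]; do
    openstack server create --flavor m1.pinned.hugepages --image <image> \
      --nic net-id=<net-uuid> --wait "hp-$(date +%s)" || true
  done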

The problem does not occur if hugepages are disabled on both the flavor and the hypervisor.
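
For instance (again using the hypothetical flavor name):

  # Drop the hugepage request from the flavor...
  openstack flavor unset --property hw:mem_page_size m1.pinned.hugepages
  # ...and release the hypervisor's preallocated pages
  echo 0 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
  echo 0 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages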

Environment
===========
Running OpenStack Ocata, RDO packages on CentOS 7.4.
Linux 3.10.0-514.10.2.el7.x86_64
nova 15.0.7

Compute:
openstack-nova-compute-15.0.7-1.el7.noarch

Ctrl:
openstack-nova-conductor-15.0.7-1.el7.noarch
python2-novaclient-7.1.2-1.el7.noarch
python-nova-15.0.7-1.el7.noarch
openstack-nova-novncproxy-15.0.7-1.el7.noarch
openstack-nova-placement-api-15.0.7-1.el7.noarch
openstack-nova-common-15.0.7-1.el7.noarch
openstack-nova-api-15.0.7-1.el7.noarch
openstack-nova-scheduler-15.0.7-1.el7.noarch
openstack-nova-console-15.0.7-1.el7.noarch

Using libvirt + KVM

libvirt 3.2.0-14.el7_4 (ev)
qemu 2.9.0-16.el7_4 (ev)

Storage is pure qcow2 on /var/lib/nova

Neutron with linuxbridge-agent for networking.

Tags: libvirt numa
David Hill (david-hill-ubisoft) wrote:

Quick question here: does it solve the problem if you set a count lower than or equal to the number of computes that have available resources? I.e., you have 6 computes with available resources and you use "--max-count 6" or "--count 6"?

Trygve Vea (trygve-vea-gmail) wrote:

No, that didn't help at the time. I remember that when there was a single slot left, I had to retry multiple times; eventually it did work.

To the extent I'm unsure about this: I know I tested on a 2x6126 system, which has 48 threads spread over two sockets. If at least one VM had built successfully on every attempt, I would have needed only 12 tries to fill all the slots, not the 32 stated above.

I don't have any servers available for testing right now, but we're considering using VPP for networking, which does require hugepages on the VM flavors.
