In the “Mitaka” release of OpenStack, we ran into an issue with CPU pinning: when placing pinned instances, the scheduler did not take into account that some CPU cores were already occupied by other pinned instances. To find the affected hosts so that they can be fixed, we use a Python script that connects to the libvirt API and queries the pinned CPUs of every running domain.
The script is run on all compute nodes via Ansible and exits with exit code 1 if a host has a double CPU pin. Every machine on which the script “fails” therefore needs manual intervention.
#!/usr/bin/env python2
# Script to check if pinned CPUs are used more than once
import sys

import libvirt

# One slot per host CPU; assumes the host has at most 100 CPU cores
cpus_used = [None] * 100

conn = libvirt.openReadOnly(None)
for domain_id in conn.listDomainsID():
    domain = conn.lookupByID(domain_id)
    pins = domain.vcpuPinInfo()
    for cpu in pins:
        # cpu is a bitfield (sequence of bools) with the length of available CPU cores.
        # If a vCPU is pinned, exactly one value is True; its index indicates which
        # host CPU we are talking about.
        if cpu.count(True) == 1:
            # index of the single True value, as an int
            cpu_id = cpu.index(True)
            if not cpus_used[cpu_id]:
                # haven't seen it used before, mark it as used
                cpus_used[cpu_id] = True
            else:
                # CPU was already used: print a message and exit with error
                print("CPU used twice: %d on VM %s" % (cpu_id, domain.name()))
                # no need to check further, manual intervention required
                sys.exit(1)
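The core check can also be tried out without a libvirt connection. The following is only a minimal sketch, not part of the script we ran: `find_double_pins` and the sample data are hypothetical, and the input is assumed to have the same shape the script iterates over (one sequence of bools per vCPU). Unlike the script, it collects every conflict instead of stopping at the first one.

```python
# Sketch: the double-pin check as a pure function, independent of libvirt.
# find_double_pins and the sample data below are hypothetical.


def find_double_pins(pin_maps):
    """Return (cpu_id, first_vm, second_vm) for every host CPU pinned twice.

    pin_maps maps a VM name to a list of per-vCPU tuples of bools,
    one bool per host CPU.
    """
    owner = {}      # host CPU index -> name of the first VM seen pinned to it
    conflicts = []
    for vm_name in sorted(pin_maps):
        for cpu in pin_maps[vm_name]:
            # Only a vCPU tied to exactly one host CPU counts as pinned;
            # a map with several True values is a floating vCPU.
            if cpu.count(True) != 1:
                continue
            cpu_id = cpu.index(True)
            if cpu_id in owner:
                conflicts.append((cpu_id, owner[cpu_id], vm_name))
            else:
                owner[cpu_id] = vm_name
    return conflicts


# Made-up host with four CPUs: vm-a and vm-b both pin a vCPU to CPU 1.
sample = {
    "vm-a": [(True, False, False, False), (False, True, False, False)],
    "vm-b": [(False, True, False, False), (False, False, True, False)],
    "vm-c": [(True, True, True, True)],  # not pinned, ignored
}

for cpu_id, first, second in find_double_pins(sample):
    print("CPU used twice: %d on VM %s (already pinned by %s)"
          % (cpu_id, second, first))
```

Keeping track of the first owner in a dictionary removes the fixed upper bound on the number of host CPUs and makes it possible to name both VMs involved, which helps when deciding which instance to move.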
— Feb 10, 2019