Live migration doesn't work properly for Windows VM

Bug #1657708 reported by Oleksandr Liemieshko
This bug affects 1 person
Affects: Mirantis OpenStack (status tracked in 10.0.x)

Series   Status    Importance   Assigned to            Milestone
10.0.x   Invalid   High         MOS Nova               -
8.0.x    Invalid   High         Oleksandr Liemieshko   -
9.x      Invalid   High         MOS Maintenance        -

Bug Description

When attempting a live migration of a Windows instance from one compute node to another, I ran into a situation where the instance failed to migrate properly.
The instance has "SHUTOFF" status in nova, is in "shut off" state on the old compute node, and is "running" on the new one.
Moreover, a VNC connection opened to the instance before the migration keeps working despite the states described above, but any new VNC connection to the instance fails with a "Failed to connect to server (code: 1006)" error.

In nova:
root@node-7:~# nova show dcd2d83c-2670-470c-8544-f7f858023162
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | AUTO |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | node-9.domain.tld |
| OS-EXT-SRV-ATTR:hypervisor_hostname | node-9.domain.tld |
| OS-EXT-SRV-ATTR:instance_name | instance-0000000c |
| OS-EXT-STS:power_state | 4 |
| OS-EXT-STS:task_state | - |
| OS-EXT-STS:vm_state | stopped |
| OS-SRV-USG:launched_at | 2017-01-18T14:55:34.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| admin_internal_net network | 192.168.111.12 |
| config_drive | |
| created | 2017-01-18T14:53:27Z |
| flavor | m1.medium (3) |
| hostId | ce0eff158351c1c9ccb5742b968fe95700b9cfc9c4665a0795b28629 |
| id | dcd2d83c-2670-470c-8544-f7f858023162 |
| image | win2012r2 (7d7f4c95-ed75-4897-b7f3-21c5a93a94ed) |
| key_name | - |
| metadata | {} |
| name | orig |
| os-extended-volumes:volumes_attached | [] |
| security_groups | default |
| status | SHUTOFF |
| tenant_id | 3bdaac4958d248d3a775294181962df5 |
| updated | 2017-01-18T15:42:53Z |
| user_id | 6a8a2c38b7144e648aaa094464194dde |
+--------------------------------------+----------------------------------------------------------+

On an old compute node:
root@node-9:~# virsh list --all | grep "instance-0000000c"
 - instance-0000000c shut off

On a new compute node:
root@node-8:~# virsh list --all | grep "instance-0000000c"
 61 instance-0000000c running

After "Hard Reboot Instance" it is "running" on both compute nodes

root@node-8:~# date
Thu Jan 19 11:07:19 UTC 2017
root@node-8:~# virsh list --all | grep "instance-0000000c"
 61 instance-0000000c running

root@node-9:~# date
Thu Jan 19 11:07:13 UTC 2017
root@node-9:~# virsh list --all | grep "instance-0000000c"
 142 instance-0000000c running

root@node-7:~# nova show dcd2d83c-2670-470c-8544-f7f858023162
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | AUTO |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | node-9.domain.tld |
| OS-EXT-SRV-ATTR:hypervisor_hostname | node-9.domain.tld |
| OS-EXT-SRV-ATTR:instance_name | instance-0000000c |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | - |
| OS-EXT-STS:vm_state | active |
| OS-SRV-USG:launched_at | 2017-01-18T14:55:34.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| admin_internal_net network | 192.168.111.12 |
| config_drive | |
| created | 2017-01-18T14:53:27Z |
| flavor | m1.medium (3) |
| hostId | ce0eff158351c1c9ccb5742b968fe95700b9cfc9c4665a0795b28629 |
| id | dcd2d83c-2670-470c-8544-f7f858023162 |
| image | win2012r2 (7d7f4c95-ed75-4897-b7f3-21c5a93a94ed) |
| key_name | - |
| metadata | {} |
| name | orig |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | ACTIVE |
| tenant_id | 3bdaac4958d248d3a775294181962df5 |
| updated | 2017-01-19T11:02:21Z |
| user_id | 6a8a2c38b7144e648aaa094464194dde |
+--------------------------------------+----------------------------------------------------------+
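If the domain ends up defined and running on both hypervisors as shown above, one possible cleanup (not from the original report) is to stop and undefine the stale copy on the host that nova does not consider the owner (node-8 here, since OS-EXT-SRV-ATTR:host points at node-9); verify which copy actually serves the Ceph disks before doing this.

root@node-8:~# virsh destroy instance-0000000c     # power off the stale domain
root@node-8:~# virsh undefine instance-0000000c    # drop its libvirt definition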

Steps to reproduce:
    - Fuel 8.0
    - 1 controller, 2 computes
    - VLAN
    - Ceph for all
    - image for Windows (https://cloudbase.it/windows-cloud-images/)

Scenario:
1. Create a new VM from the Windows image
2. Open VNC connection to the instance before migration
3. Trigger "Live Migrate Instance" a few times (see the CLI sketch below)
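For clarity, the scenario above maps roughly to the following CLI calls. This is only a sketch: the instance name "orig", image "win2012r2" and flavor "m1.medium" are taken from the nova show output above, the target host argument is optional, and a network may need to be specified with --nic depending on the environment.

root@node-7:~# nova boot --image win2012r2 --flavor m1.medium orig        # 1. create the Windows VM
root@node-7:~# nova get-vnc-console orig novnc                            # 2. open a VNC console before migrating
root@node-7:~# nova live-migration orig node-8.domain.tld                 # 3. live-migrate; repeat a few times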

Changed in mos:
importance: Undecided → High
assignee: nobody → MOS Nova (mos-nova)
tags: added: cus
tags: added: customer-found
removed: cus
tags: added: support
tags: added: area-nova
Revision history for this message
Timofey Durakov (tdurakov) wrote :

Alexander, could you please provide nova logs from both compute nodes?

Changed in mos:
status: New → Incomplete
assignee: MOS Nova (mos-nova) → Alexander Lemeshko (oliemieshko)
Revision history for this message
Oleksandr Liemieshko (oliemieshko) wrote :
Changed in mos:
assignee: Alexander Lemeshko (oliemieshko) → Timofey Durakov (tdurakov)
tags: added: ct1
Revision history for this message
Denis Meltsaykin (dmeltsaykin) wrote :

Cannot reproduce on 9.2, marking as Invalid for 9.x-series.

Revision history for this message
Denis Meltsaykin (dmeltsaykin) wrote :

Alexander, I've tried to reproduce the issue on 8.0-MU-4 and failed. I did ~15 migrations in a row and there was no problem with this particular image. However, I remember there was a bug regarding live migrations of instances with huge ephemeral disks: https://bugs.launchpad.net/nova/+bug/1644248. Live migration can fail because the default migration progress timeout is only 150 seconds and nova has a problem counting down this timeout, so it may end up failing to migrate the instance. The fix is to increase the live_migration_progress_timeout option or to disable it entirely by setting it to 0.

If you're still facing the issue, please try to update to the latest MU and collect more information.
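For reference, a minimal nova.conf sketch of the workaround described above. The [libvirt] section name and the restart step are assumptions based on the Mitaka-era libvirt driver, not part of the original comment; check the exact option location for your release.

[libvirt]
# Abort the live migration when no progress is observed for this many seconds.
# The default is 150; setting the option to 0 disables the progress timeout.
live_migration_progress_timeout = 0

Apply the change on every compute node and restart the nova-compute service for it to take effect.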

Revision history for this message
Denis Meltsaykin (dmeltsaykin) wrote :

Moving to Invalid after more than a month in Incomplete without feedback. Please feel free to reopen it if you face the issue again and have clear steps to reproduce.
