Inconsistency of determining IP addresses in MAAS environment between hosts and LXC containers

Bug #1473069 reported by Darryl Weaver
This bug affects 6 people
Affects           Status        Importance  Assigned to  Milestone
Canonical Juju    Fix Released  High        Unassigned
Landscape Server  Invalid       High        Unassigned

Bug Description

I am seeing inconsistency with LXC containers and hosts in a MAAS environment with multiple subnets configured.

There is a management network that all the machines use to obtain IP addresses via MAAS DHCP; these become the private addresses for all the machines and charms in the juju deployment, e.g. 192.168.92.0/24.

Additional network subnets are added either pre- or post-deployment, e.g. 172.16.x.y.
Multiple subnets are added.
The hosts (either physical or virtual) continue to report the original private address as their private address, i.e. 192.168.92.x.

But the LXC containers' private address changes to one of the additional networks that were added; in fact the lowest-numbered subnet, i.e. 172.16.20.0/24, is used instead.

This is inconsistent and can break complex deployments such as Openstack when deploying with multiple physical networks and LXC containers, depending on how those networks are isolated.

For example, we see juju status output like this:

  "6":
    agent-state: started
    agent-version: 1.24.2
    dns-name: halfrunt.hv.dazcloud.com
    instance-id: /MAAS/api/1.0/nodes/node-28b38eb4-e781-11e4-a7bd-52540073f449/
    series: trusty
    hardware: arch=amd64 cpu-cores=4 mem=32768M
  "7":
    agent-state: started
    agent-version: 1.24.2
    dns-name: priv9.hv.dazcloud.com
    instance-id: /MAAS/api/1.0/nodes/node-ccb1de24-ea0d-11e4-8d2c-52540073f449/
    series: trusty
    containers:
      7/lxc/0:
        agent-state: started
        agent-version: 1.24.2
        dns-name: 172.16.20.60
        instance-id: juju-machine-7-lxc-0
        series: trusty
        hardware: arch=amd64
      7/lxc/1:
        agent-state: started
        agent-version: 1.24.2
        dns-name: 172.16.20.62
        instance-id: juju-machine-7-lxc-1
        series: trusty
        hardware: arch=amd64
      7/lxc/2:
        agent-state: started
        agent-version: 1.24.2
        dns-name: 172.16.20.63
        instance-id: juju-machine-7-lxc-2
        series: trusty
        hardware: arch=amd64
      7/lxc/3:
        agent-state: started
        agent-version: 1.24.2
        dns-name: 172.16.20.64
        instance-id: juju-machine-7-lxc-3
        series: trusty
        hardware: arch=amd64
      7/lxc/4:
        agent-state: started
        agent-version: 1.24.2
        dns-name: 172.16.20.65
        instance-id: juju-machine-7-lxc-4
        series: trusty
        hardware: arch=amd64
      7/lxc/5:
        agent-state: started
        agent-version: 1.24.2
        dns-name: 172.16.20.66
        instance-id: juju-machine-7-lxc-5
        series: trusty
        hardware: arch=amd64

Revision history for this message
Darryl Weaver (dweaver) wrote :
tags: added: addressability lxc maas-provider network
Revision history for this message
Darryl Weaver (dweaver) wrote :

The inconsistency I am seeing appears to be caused by juju's preference for selecting a hostname rather than an IP address when one is configured, falling back to the lowest IP address when no hostname is configured.

MAAS configures a DNS hostname for each machine listed in MAAS, but it hands out an IP address to a container without adding it to the DNS server as a hostname, so containers only have an IP address and no DNS hostname.

Excerpt from network/address.go:

// sortOrder calculates the "weight" of the address when sorting,
// taking into account the preferIPv6 flag:
// - public IPs first;
// - hostnames after that, but "localhost" will be last if present;
// - cloud-local next;
// - machine-local next;
// - link-local next;
// - non-hostnames with unknown scope last.
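
For illustration only, here is a minimal Go sketch of the weighting that comment describes (this is not the actual Juju implementation; the type names and weight values are made up for the example), with lower weights sorting first:

package main

import (
	"fmt"
	"sort"
)

// addrScope mirrors the scopes named in the sortOrder comment above.
type addrScope int

const (
	scopePublic addrScope = iota
	scopeHostname
	scopeCloudLocal
	scopeMachineLocal
	scopeLinkLocal
	scopeUnknown
)

type address struct {
	value string
	scope addrScope
}

// weight follows the comment: public IPs first, hostnames next (with
// "localhost" after other hostnames), then cloud-local, machine-local,
// link-local, and non-hostnames with unknown scope last.
func weight(a address) int {
	w := int(a.scope) * 10
	if a.scope == scopeHostname && a.value == "localhost" {
		w += 5
	}
	return w
}

func main() {
	addrs := []address{
		{"172.16.20.60", scopeCloudLocal},
		{"priv9.hv.dazcloud.com", scopeHostname},
		{"127.0.0.1", scopeMachineLocal},
	}
	sort.SliceStable(addrs, func(i, j int) bool {
		return weight(addrs[i]) < weight(addrs[j])
	})
	for _, a := range addrs {
		fmt.Println(a.value) // the hostname prints before the cloud-local IP
	}
}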

Revision history for this message
Darryl Weaver (dweaver) wrote :

An immediate fix would therefore be to configure an A record and a PTR record for all the IP addresses in the dynamic range that MAAS uses for the management network.
e.g. 10.x.y.z A dynamic-host-10-x-y-z

Revision history for this message
Darryl Weaver (dweaver) wrote :

Actually, that is wrong.
The container addresses already have a PTR and A record in the MAAS DNS nameserver, so we are back to a juju problem with ordering of the addresses.
But it seems that the hostname returned by MAAS DNS is being ignored by Juju, which selects an IP address instead of the hostname.

So, in /etc/bind9/maas/zone.hv.dazcloud.com:
$GENERATE 51-99 192-168-92-$ IN A 192.168.92.$

and in /etc/bind9/maas/zone.92.168.192.in-addr.arpa:
$GENERATE 51-99 $.92.168.192.in-addr.arpa. IN PTR 192-168-92-$.hv.dazcloud.com.

Testing this from a juju machine, the hostname lookup works as expected, but juju status still returns a different IP address as both the private and public address and no hostname.
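
For reference, the same forward and reverse lookups can be scripted; a minimal Go sketch, where 192.168.92.60 stands in for one of the container addresses in the dynamic range (the specific address is an assumption, not a value from the report):

package main

import (
	"fmt"
	"net"
)

func main() {
	// Reverse (PTR) lookup for a container address in the dynamic range.
	names, err := net.LookupAddr("192.168.92.60")
	fmt.Println("PTR:", names, err)

	// Forward (A) lookup for the corresponding generated name.
	addrs, err := net.LookupHost("192-168-92-60.hv.dazcloud.com")
	fmt.Println("A:", addrs, err)
}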

Curtis Hovey (sinzui)
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
milestone: none → 1.25.0
Revision history for this message
Andy Jeffries (8-andy) wrote :

+1

Revision history for this message
Darryl Weaver (dweaver) wrote :

Experimenting further, I can see that it is not the lowest numerical address but the first address reported by lxc-ls that juju selects as the private address. I deployed another configuration with different IP addresses (192.168.x.x addresses only); juju selected an address that was on eth2 and in the middle of the range, but it was the first address listed when running:
lxc-ls --fancy
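
To make "the first address reported by lxc-ls" concrete, here is a toy Go sketch (not Juju's code) that takes the first entry of the IPV4 column from one data row of lxc-ls --fancy output:

package main

import (
	"fmt"
	"strings"
)

// firstIPv4 takes one data row of `lxc-ls --fancy` output and returns the
// first address from the comma-separated IPV4 column.
func firstIPv4(row string) string {
	fields := strings.Fields(row)
	if len(fields) < 3 {
		return ""
	}
	// Columns are NAME STATE IPV4 ...; the IPV4 column is itself
	// comma-separated, so the third field is the first address.
	return strings.TrimSuffix(fields[2], ",")
}

func main() {
	row := "juju-machine-5-lxc-0 RUNNING 192.168.140.24, 192.168.160.64, 192.168.92.72 - - YES"
	fmt.Println(firstIPv4(row)) // prints 192.168.140.24
}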

Curtis Hovey (sinzui)
tags: added: bug-squad
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.25-alpha1 → 1.25-beta1
Gavin Panella (allenap)
tags: added: networking
removed: network
Changed in juju-core:
milestone: 1.25-beta1 → 1.25-beta2
Revision history for this message
Cheryl Jennings (cherylj) wrote :

I believe this has been fixed by bug 1435283, which is in the proposed 1.24.7. Could you test with that level to verify that it has been resolved?

Revision history for this message
Darryl Weaver (dweaver) wrote :

Yes, I can test with 1.24.7 once there is a package available to test with.

Revision history for this message
Darryl Weaver (dweaver) wrote :

Tested with 1.24.7 and the issue persists.

The order of deployment may matter.
In my example deployment:

1) I deploy the machines I need using juju add-machine or a suitable bundle, and I add 1 container to each server.
2) I then terminate-machine each created container.
3) I modify the NIC configuration to add bridges for each of the physical NICs.
4) I then modify the juju-trusty-lxc-template config file to add the additional NICs to /var/lib/lxc/juju-trusty-lxc-template/config (see the sketch after this list).
5) I then also add a DHCP configuration for each additional NIC in /var/lib/lxc/juju-trusty-lxc-template/rootfs/etc/network/interfaces.d/
6) I then continue the juju deployment with a bundle to deploy the containers on each host.
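
For illustration, steps 4 and 5 amount to additions along these lines; the bridge names (br-eth1, br-eth2) and the interfaces.d file name are assumptions about this particular environment, not values taken from the report:

# Additions to /var/lib/lxc/juju-trusty-lxc-template/config (step 4),
# one stanza per extra NIC:
lxc.network.type = veth
lxc.network.link = br-eth1
lxc.network.flags = up

lxc.network.type = veth
lxc.network.link = br-eth2
lxc.network.flags = up

# For step 5, a DHCP stanza per extra NIC under
# /var/lib/lxc/juju-trusty-lxc-template/rootfs/etc/network/interfaces.d/,
# e.g. in a file such as eth1.cfg:
auto eth1
iface eth1 inet dhcp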

The deployment succeeds, but the addresses selected are not on the management network; they are on one of the additional networks.

As this is an Openstack deployment, this ends up with containers on the data network and the physical and KVM virtual machines on the management network, preventing internal communication of Openstack components.

So, my deployment example adds the additional NIC configuration before deploying the LXC containers, rather than adding the configuration post-deployment.

The selection of which IP is primary on a container in a MAAS environment should be based on which IP is associated with eth0, not determined by the nature of the IP address itself. The management network in a MAAS environment would typically be a private network address, not a publicly accessible IP address.
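
As a sketch of that suggestion (this is not how Juju selects the address today), picking the IPv4 address bound to a named interface is straightforward with the Go standard library; "eth0" here stands in for whichever device carries the management network:

package main

import (
	"fmt"
	"net"
)

// addressOn returns the first IPv4 address configured on the named
// interface, e.g. the management NIC inside a container.
func addressOn(name string) (string, error) {
	iface, err := net.InterfaceByName(name)
	if err != nil {
		return "", err
	}
	addrs, err := iface.Addrs()
	if err != nil {
		return "", err
	}
	for _, a := range addrs {
		if ipnet, ok := a.(*net.IPNet); ok && ipnet.IP.To4() != nil {
			return ipnet.IP.String(), nil
		}
	}
	return "", fmt.Errorf("no IPv4 address on %s", name)
}

func main() {
	addr, err := addressOn("eth0")
	fmt.Println(addr, err)
}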

Revision history for this message
Darryl Weaver (dweaver) wrote :
Download full text (7.8 KiB)

I also experimented with adding additional NICs to LXC containers post deployment and that was not consistent either.
I deployed 2 machines as described above with multiple NICs in the template before deployment and they consistently selected the first IP address from lxc-ls --fancy.

But if I deployed the containers with only 1 NIC and then added additional NICs to the specific containers and rebooted them, they would obtain new IP addresses, but juju would sometimes report the previous address, i.e. it did not switch to a different address.

However, some containers did switch to a different address on one particular host, and they switched to addresses on two different networks, meaning one physical machine had containers whose primary address, as reported by juju, was on the management network, the data network, or the storage cluster network.

The final deployment status output is shown here: http://pastebin.ubuntu.com/12887890/
Machine 1 and 2 had the template modified before deploying any containers.
Machine 5 and 6 had the container configuration modified post deployment.

1/lxc/X and 2/lxc/X containers show addresses as 192.168.140.x

5/lxc/X containers show addresses as 192.168.140.x
except
5/lxc/7 shows address as 192.168.160.90

6/lxc/X containers show addresses as 192.168.92.x which are correct.

However, output from lxc-ls --fancy on each host always shows 192.168.140.x, then 192.168.160.x then 192.168.92.x.
e.g.:
juju ssh 5 sudo lxc-ls --fancy
Warning: Permanently added '192.168.92.100' (ECDSA) to the list of known hosts.
Warning: Permanently added '192.168.92.104' (ECDSA) to the list of known hosts.
NAME STATE IPV4 IPV6 GROUPS AUTOSTART
---------------------------------------------------------------------------------------------------------
juju-machine-5-lxc-0 RUNNING 192.168.140.24, 192.168.160.64, 192.168.92.72 - - YES
juju-machine-5-lxc-1 RUNNING 192.168.140.27, 192.168.160.60, 192.168.92.75 - - YES
juju-machine-5-lxc-10 RUNNING 192.168.140.87, 192.168.160.85, 192.168.92.93 - - YES
juju-machine-5-lxc-11 RUNNING 192.168.140.23, 192.168.160.99, 192.168.92.94 - - YES
juju-machine-5-lxc-2 RUNNING 192.168.140.88, 192.168.160.86, 192.168.92.76 - - YES
juju-machine-5-lxc-3 RUNNING 192.168.140.90, 192.168.160.88, 192.168.92.78 - - YES
juju-machine-5-lxc-4 RUNNING 192.168.140.84, 192.168.160.82, 192.168.92.81 - - YES
juju-machine-5-lxc-5 RUNNING 192.168.140.85, 192.168.160.83, 192.168.92.82 - - YES
juju-machine-5-lxc-6 RUNNING 192.168.140.89, 192.168.160.87, 192.168.92.84 - - YES
juju-machine-5-lxc-7 RUNNING 192.168.140.21, 192.168.160.90, 192.168.92.86 - - YES
juju-machine-5-lxc-8 RUNNING 192.168.140.86, 192.168.160.84, 192.168.92.88 - - YES
juju-machine-5-lxc-9 RUNNING 192.168.140.60, 192.168.160.89, 192.168.92.90 - - YES
juju-trusty-lxc-template STOPPED - ...


Revision history for this message
Darryl Weaver (dweaver) wrote :

I have not yet experimented with public IP addresses, which may add to the complexity of which IP is the primary one for juju communications. In a MAAS environment that should be the private management network.

Changed in juju-core:
milestone: 1.25-beta2 → 1.25.1
Revision history for this message
Darryl Weaver (dweaver) wrote :

I actually ran that test incorrectly: I was still using 1.24.6 agents.
So I re-deployed using 1.24.6 agents and then upgraded the deployment to 1.24.7.
All agents were then running 1.24.7, and the containers on physical hosts had only one NIC, with one address on the management network.

I then modified the lxc configuration file for each container to add 2 additional NICs bridged to physical interfaces.
I restarted each of the containers and they obtained new addresses.

However, juju definitely switched the address on about 7 containers.

So it would seem that even when I plug in the new networks after deployment, the address can switch.

Final deployment status is here:
http://pastebin.ubuntu.com/12896894/

Revision history for this message
Cheryl Jennings (cherylj) wrote :

Thanks for all your analysis on this, dweaver. I'll see if I can raise the priority within the team.

Revision history for this message
Darryl Weaver (dweaver) wrote :

I have confirmed that in 1.25.0, if you add additional NICs in the lxc template so that a container deploys with multiple NICs initially, the IP address selected is not consistent with the rest of the deployment.

If, however, the containers are deployed first with only one NIC, left for a period of time, and then additional NICs are configured on the existing LXC containers and the containers rebooted, the address does not appear to change in Juju.

I think my previous test was done too quickly, and so some containers flipped their address to the other network.

It would still be useful for juju to stick to one network (or space) that all units use for the private, non-publicly-routed address, and for the preferred addresses to be chosen deterministically rather than depending on the ordering from various tools, such as lxc-ls.

Revision history for this message
Dimiter Naydenov (dimitern) wrote :

Darryl, can you paste your environments.yaml for that environment used in comment #14 (scrubbed of secrets, ofc) ?

We aim to provide native support for multi-NIC LXC container deployments (likely in 1.27).

Revision history for this message
Darryl Weaver (dweaver) wrote :
Download full text (4.0 KiB)

Here is a copy of the ~/.juju/environments/dazcloud.jenv file for my environment with modified secrets:

user: admin
password: ********
environ-uuid: 248031ec-2d9d-4f27-8b90-4f780b9ad4f6
server-uuid: 248031ec-2d9d-4f27-8b90-4f780b9ad4f6
state-servers:
- 192.168.92.100:17070
server-hostnames:
- bootstrap.hv.dazcloud.com:17070
- 192.168.92.100:17070
ca-cert: |
  -----BEGIN CERTIFICATE-----
  MIICYTCCAcygAwIBAgIBADALBgkqhkiG9w0BAQUwRjENMAsGA1UEChMEanVqdTE1
  MDMGA1UEAwwsanVqdS1nZW5lcmF0ZWQgQ0EgZm9yIGVudmlyb25tZW50ICJkYXpj
  bG91ZCIwHhcNMTUxMDI4MjIxNTM5WhcNMjUxMTA0MjIxNTM5WjBGMQ0wCwYDVQQK
  EwRqdWp1MTUwMwYDVQQDDCxqdWp1LWdlbmVyYXRlZCBDQSBmb3IgZW52aXJvbm1l
  bnQgImRhemNsb3VkIjCBnzANBgkqhkiG9w0BAQEFAAOBjQAwgYkCgYEAqcjzmQPl
  XkjLSB6lBIe2mPMfzquzzTbkvYKOjX4k2dxJxxeHhfzXY63TW0PnFCc3uZs5ZQ2m
  G966jqw7Jjh486BzPVppNh44h8BBYS2YDncODJNhBU2hCrNJVZ6Ku/t2eqzGpcTJ
  BCEPWuBZPbixvYG7mw+C/hdPyIcPHGznK00CAwEAAaNjMGEwDgYDVR0PAQH/BAQD
  AgCkMA8GA1UdEwEB/wQFMAMBAf8wHQYDVR0OBBYEFL0jIl/elkMfgds6fFS8oCEG
  5SFsMB8GA1UdIwQYMBaAFL0jIl/elkMfgds6fFS8oCEG5SFsMAsGCSqGSIbDIQEB
  BQOBgQBwDzK4YfCmltdzK3sdAJFIDRuQtjW8FRfTBYocFJ69ZY4qrlan2LUYeAJG
  XZxzr1A0qYc7aoVdjcnl4yj9PNR/zTb1MgKUDbgMGBzPvjv8o2dUMJmiRMOUR1ZI
  zTb+6tI0+8gs/8DExfx2GQaLNfUlCFodESJtN8eoc1EQNpcRUw==
  -----END CERTIFICATE-----
bootstrap-config:
  admin-secret: *********
  agent-metadata-url: ""
  allow-lxc-loop-mounts: false
  api-port: 17070
  authorized-keys: |
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDKChYBFpM776lKYpx4fev36eYaH8fZkhppvhCvp5OYKL3kWV/hDigUkyC1hcAoHfUHPXueSLgdLvLnx9G51PTed5Zv9EW6ya9m0AzwVR7DqePRZO2fcZK2Rmht/2a244pKipoL8FIhmRXGm35+/gT5K5aA0bHXMB1UtSoOYdxPOwi9KLhGRui1WN0OcsejaHtRnXPZEQ5l63+eo0m8oYBcw4aQPduutIqhNDu5RL0jkYGcZ03Sbjb/E0Rop7NDxkVbO5e11D93pu1Zf4A5sOcAzWdIECTMZjCu/2fXdmG1qPJ1psf6bRcl30JW4QqfgVpB/YZQtzo73Fr4ZNKt0YMj juju-client-key
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDV+tdtzh+SRVU4sy1vdeS0Ac6yJo3R33eDeMDLidTicSu7K1mkg9UzmRHNIrXXJ89rvfCxZeVLiVpdCAJsvkxUtz5Z0hYQJUctvNn8oFtxfnLMxc1+sSGGV/pLXXbaCVlEJl89wsUh5LJ83gYZU+7cUCJ+ss7P0+QOFXFwOK2Wy8rTr9WLnn0AOai1Mkiod36/uT8EInH7lqBytjRrzd0fa5i7kw8KUGRebzTRlHjFGidqEAf6m07ztVXxopgLqQJp6Djm82dmggawqagWDPTd4Iv8NKD2jQtDsOI//vUsEIZbOdljssanHpSMJRLIF0eSrCSBDGOLgcQJCML9xvcV ubuntu@maas
  bootstrap-addresses-delay: 10
  bootstrap-retry-delay: 5
  bootstrap-timeout: 1800
  ca-cert: |
    -----BEGIN CERTIFICATE-----
    MIICYTCCAcygAwIBAgIBADALBgkqhkiG9w0BAQUwRjENMAsGA1UEChMEanVqdTE1
    MDMGA1UEAwwsanVqdS1nZW5lcmF0ZWQgQ0EgZm9yIGVudmlyb25tZW50ICJkYXpj
    bG91ZCIwHhcNMTUxMDI4MjIxNTM5WhcNMjUxMTA0MjIxNTM5WjBGMQ0wCwYDVQQK
    EwRqdWp1MTUwMwYDVQQDDCxqdWp1LWdlbmVyYXRlZCBDQSBmb3IGZW52aXJvbm1l
    bnQgImRhemNsb3VkIjCBnzANBgkqhkiG9w0BAQEFAAOBjQAwgYkCgYEAqcjzmQPl
    XkjLSB6lBIe2mPMfzquzzTbkvYKOjX4k2dxJxxeHhfzXY63TW0PnFCc3uZs5ZQ2m
    G966jqw7Jjh486BzPVppNh44h8BBYS2YDncODJNhBU2hCrNJVZ6Ku/t2eqzGpcTJ
    BCEPWuBZPbixvYG7mw+C/hdPyIcPHGznK00CAwEAAaNjMGEwDgYDVR0PAQH/BAQD
    AgCkMA8GA1UdEwEB/wQFMAMBAf8wHQYDVR0OBBYEFL0jIl/elkMfgds6fFS8oCEG
    5SFsMB8GA1UdIwQYMBaAFL0jIl/elkMfgds6fFS8oCEG5SFsMAsGCSqGSIb3DQEB
    BQOBgQBwDzK4YfCmltdzK3sdAJFIDRuQtjW8FRfTBYocFJ69ZY4qrlan2LUYeAJG
    XZxzr1A0qYc7aoVdjcnl4yj9PNR/zTb1MgKUDbgMGBzPvjv8o2dUMJmiRMOUR1ZI
    zTb+6tI0+8gs/8DExfx2GQa...


Revision history for this message
Darryl Weaver (dweaver) wrote :

Checking a new deployment, it would seem that addresses remain consistent if you deploy with a single network only, then add additional interfaces post deployment.

If you add new interfaces before new units have been initialised, then they may select an IP inconsistent with the rest of the deployment.

Changed in juju-core:
milestone: 1.25.1 → 1.26.0
no longer affects: maas
Changed in juju-core:
milestone: 1.26.0 → 2.0-beta5
Changed in juju-core:
milestone: 2.0-beta5 → 2.0-beta4
Revision history for this message
Cheryl Jennings (cherylj) wrote :

Multi-NIC support for containers has landed and should be out in the next juju2 release (currently aiming for next week).

Changed in juju-core:
milestone: 2.0-beta4 → 2.0-rc1
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 2.0-beta5 → 2.0-rc1
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 2.0-beta6 → 2.0-beta7
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 2.0-beta7 → 2.0-beta8
Changed in juju-core:
milestone: 2.0-beta8 → 2.0.0
tags: added: 2.0
affects: juju-core → juju
Changed in juju:
milestone: 2.0.0 → none
milestone: none → 2.0.0
Revision history for this message
Alexis Bruemmer (alexis-bruemmer) wrote :

Darryl, please re-open if this is still an issue on juju 2.0.

Changed in juju:
status: Triaged → Incomplete
Curtis Hovey (sinzui)
Changed in juju:
milestone: 2.0-rc3 → 2.0.0
Changed in juju:
milestone: 2.0.0 → none
Revision history for this message
Darryl Weaver (dweaver) wrote :

This is still a problem in Juju 2.0.1.
I deploy a complex network bundle; Juju does not default to any particular network, and I get different network subnets on different machines/containers.

Even worse, in Juju 2.0.1 the bridges to eth0/1 fail and the containers start on the lxdbr0 interface on an isolated 10.x network, which means the deployments now fail entirely.

Changed in juju:
status: Incomplete → New
Changed in juju:
status: New → Triaged
milestone: none → 2.2.0
Changed in juju:
milestone: 2.2.0 → 2.1.0
assignee: nobody → Richard Harding (rharding)
Revision history for this message
David Britton (dpb) wrote :

For concrete reproductions of this (especially when using MAAS to deploy openstack), please see:

- https://bugs.launchpad.net/juju/+bug/1646329
- https://bugs.launchpad.net/juju/+bug/1646322
- https://bugs.launchpad.net/juju/+bug/1644429

tags: added: landscape
Ryan Beisner (1chb1n)
tags: added: uosci
Chris Gregan (cgregan)
tags: added: cdo-qa-blocker
Changed in landscape:
milestone: none → 16.12
Revision history for this message
Andreas Hasenack (ahasenack) wrote :

This bug is crazy :)

juju ssh <n>, where <n> is a remote node in a MAAS cluster far far away, tries to connect to my local virbr0 interface:
andreas@nsn7:~$ juju ssh --debug 5
17:24:41 INFO juju.cmd supercommand.go:63 running juju [2.1-beta3 gc go1.6.2]
17:24:41 DEBUG juju.cmd supercommand.go:64 args: []string{"juju", "ssh", "--debug", "5"}
17:24:41 INFO juju.juju api.go:72 connecting to API addresses: [10.245.202.4:17070]
17:24:41 INFO juju.api apiclient.go:570 dialing "wss://10.245.202.4:17070/model/7a97887e-34b0-4d7f-82c2-53e1cba1d8d2/api"
17:24:42 INFO juju.api apiclient.go:501 connection established to "wss://10.245.202.4:17070/model/7a97887e-34b0-4d7f-82c2-53e1cba1d8d2/api"
17:24:43 DEBUG juju.juju api.go:263 API hostnames unchanged - not resolving
17:24:43 DEBUG juju.cmd.juju.commands ssh_common.go:263 proxy-ssh is false
17:24:43 INFO juju.network hostport.go:274 dialed "192.168.122.1:22" successfully
17:24:43 DEBUG juju.cmd.juju.commands ssh_common.go:367 using target "5" address "192.168.122.1"
17:24:44 DEBUG juju.utils.ssh ssh.go:292 using OpenSSH ssh client
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
SHA256:r34Eo3QVARHyMofRFeWv5ETsgykBh1jJMx2a5A8cCcM.
Please contact your system administrator.
Add correct host key in /tmp/ssh_known_hosts181812550 to get rid of this message.
Offending RSA key in /tmp/ssh_known_hosts181812550:7
  remove with:
  ssh-keygen -f "/tmp/ssh_known_hosts181812550" -R 192.168.122.1
ECDSA host key for 192.168.122.1 has changed and you have requested strict checking.
Host key verification failed.
17:24:44 DEBUG juju.api monitor.go:35 RPC connection died
17:24:44 INFO cmd supercommand.go:465 command finished

andreas@nsn7:~$ ip a show dev virbr0
5: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever

/var/log/auth:
Jan 6 17:33:46 nsn7 sshd[32461]: Did not receive identification string from 192.168.122.1
Jan 6 17:33:46 nsn7 sshd[32463]: Connection closed by 192.168.122.1 port 43590 [preauth]

machine 5 in juju status:
Machine State DNS Inst id Series AZ
1 started 10.245.200.40 4y3h8m xenial budapest
1/lxd/0 started 10.245.202.15 juju-a1d8d2-1-lxd-0 xenial
2 started 10.245.200.36 4y3h8k xenial budapest
2/lxd/0 started 10.245.200.41 juju-a1d8d2-2-lxd-0 xenial
5 started 10.245.200.37 4y3ha8 xenial prague
...

Changed in landscape:
status: New → Triaged
importance: Undecided → High
Changed in landscape:
milestone: 16.12 → 17.01
Revision history for this message
John A Meinel (jameinel) wrote :

For the VIRBR0 portion of the bug, that should be addressed as bug #1644429 which has a patch in 2.1 and 2.2 right now.

For the SSH keys being rejected, we have bug #1646329 which also should have a patch in 2.1 (there is a proposed fix, which is being reviewed.)

As for the overall "what IP address should be reported for machines that have >1 address", that is still a bit more up in the air, because any sort of heuristic feels like it will be wrong when you ask from a different point of view.

I haven't fully understood what the address ordering is, because some of the original statements (the lowest-valued IP address) don't fit what I've seen. Namely, any given MAAS node will return the same order (surviving a destroy-controller && bootstrap cycle), but the order between nodes is not the same (node 2 gave ens3 and then ens4, node 3 gave ens4 and then ens3).

Some of this gets better with containers only joining spaces they have explicitly been requested to join, as then they won't have IP addresses for all spaces (which is a 2.1 patch). They are still likely to have more than one IP address. Using a heuristic like "what is the PXE network" is particularly poor once you are dealing with containers, because there is no particular reason to expect that they will even need to be on the PXE network.

It is possible that we could model some sort of "space priority", so that for a given space, addresses from that space sort earlier/later than addresses from other spaces.
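
A hypothetical sketch of that idea (the names, types and priorities below are invented for illustration and are not part of Juju's model): give each space a priority and sort a machine's addresses by the priority of the space they belong to:

package main

import (
	"fmt"
	"sort"
)

// spacePriority is a hypothetical per-model setting: lower values sort first.
var spacePriority = map[string]int{
	"management": 0,
	"data":       1,
	"storage":    2,
}

type spaceAddress struct {
	value string
	space string
}

func sortBySpace(addrs []spaceAddress) {
	sort.SliceStable(addrs, func(i, j int) bool {
		return spacePriority[addrs[i].space] < spacePriority[addrs[j].space]
	})
}

func main() {
	addrs := []spaceAddress{
		{"192.168.140.24", "data"},
		{"192.168.160.64", "storage"},
		{"192.168.92.72", "management"},
	}
	sortBySpace(addrs)
	fmt.Println(addrs) // the management-space address now sorts first
}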

One thing we're trying to address for 2.1 is that "juju show-machine X" will show you all of the IP addresses for that machine, clustered by what space they are in. I'm not sure if that patch will land in time (tracked as bug #1653997).

That still doesn't handle the case of "what does tabular status show".

Changed in juju:
assignee: Richard Harding (rharding) → nobody
Revision history for this message
Anastasia (anastasia-macmood) wrote :

Removing 2.1 milestone as we will not be addressing this issue completely in 2.1.

Changed in juju:
milestone: 2.1.0 → none
Chad Smith (chad.smith)
Changed in landscape:
milestone: 17.01 → 17.02
Revision history for this message
Chad Smith (chad.smith) wrote :

We've fixed log-collector to work around this issue, but there is nothing to fix in landscape proper.

Changed in landscape:
status: Triaged → Invalid
Revision history for this message
Anastasia (anastasia-macmood) wrote :

This bug has morphed considerably in the 2 years since it was originally reported.
The Juju team has addressed several aspects that improve this experience; see comment #23.
I am marking this as Fix Committed in juju.
Should you experience further surprises, please file a separate, dedicated report with your observations as well as your expectations.

Changed in juju:
status: Triaged → Fix Committed
milestone: none → 2.2-beta1
Changed in juju:
status: Fix Committed → Fix Released