In SYSTEM mode, VM IPs are not automatically discovered by CC or NC on switched networks

Bug #347622 reported by Daniel Nurmi
This bug affects 2 people
Affects              Status        Importance  Assigned to         Milestone
Eucalyptus           In Progress   Low         graziano obertelli
eucalyptus (Ubuntu)  Fix Released  Critical    Soren Hansen
  Jaunty             Fix Released  Critical    Soren Hansen

Bug Description

When a CC is configured in SYSTEM mode, VMs are attached to an Ethernet bridge and depend on an external DHCP server (not controlled by Eucalyptus) to get an IP address. On a switched network, the NC and CC cannot automatically discover the IP of a VM unless the VM uses the network and the NC/CC ARP tables are populated (allowing the NC or CC to resolve the known MAC address to the allocated IP).

Possible solution: add an iptables rule on the NC that logs DHCP traffic to syslog, periodically inspect syslog to discover IP addresses, send a single ICMP packet to each discovered IP to populate the NC ARP table, then parse the ARP table to discover the MAC/IP mapping for VMs.
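
The last step of the proposal, resolving a known VM MAC address to an IP from the ARP table, could look roughly like the sketch below. It assumes the table is read as text in the Linux /proc/net/arp layout; the function names `parse_arp_table` and `lookup_vm_ip` are hypothetical and are not taken from the Eucalyptus sources.

```python
def parse_arp_table(text):
    """Return {mac: ip} for complete entries in /proc/net/arp-style text.

    Assumed layout: a header row, then whitespace-separated columns
    IP address, HW type, Flags, HW address, Mask, Device.
    """
    mapping = {}
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue
        ip, _hw_type, flags, mac = fields[:4]
        # Flags 0x0 marks an incomplete entry whose MAC is all zeros.
        if int(flags, 16) == 0 or mac == "00:00:00:00:00:00":
            continue
        mapping[mac.lower()] = ip
    return mapping


def lookup_vm_ip(arp_text, vm_mac):
    """Return the IP recorded for vm_mac, or None if it is not in the table."""
    return parse_arp_table(arp_text).get(vm_mac.lower())
```

On an NC this would be fed the contents of /proc/net/arp; a `None` result corresponds to the case where the VM has not yet exchanged traffic with the host.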

Daniel Nurmi (nurmi)
Changed in eucalyptus:
assignee: nobody → nurmi
importance: Undecided → High
status: New → Confirmed
Daniel Nurmi (nurmi) wrote :

The fix is as described above: the NC now adds a (benign) iptables rule that logs incoming DHCP responses. If an instance has a '0.0.0.0' address, the NC calls an external helper script (/usr/share/eucalyptus/populate_arp.pl) to send single ICMP packets to DHCP REPLY IPs until the ARP cache is populated with the VM's MAC/IP mapping. The NC then picks up the IP and no longer runs the pinger.
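
A minimal sketch of the loop described above, written in Python rather than the Perl of the real helper, whose actual logic is not shown in this report. It assumes the logged lines use the usual netfilter LOG key=value format (DST=, SPT=, DPT=); `dhcp_reply_ips`, `discover_ip`, and the `ping` and `read_arp` callables are hypothetical names introduced only for illustration.

```python
import re

# Server-to-client DHCP traffic goes from UDP port 67 to port 68.
_REPLY_RE = re.compile(
    r"\bDST=(\d{1,3}(?:\.\d{1,3}){3})\b.*\bSPT=67\b.*\bDPT=68\b"
)


def dhcp_reply_ips(log_lines):
    """Collect unique unicast destination IPs of logged DHCP replies."""
    found = []
    for line in log_lines:
        match = _REPLY_RE.search(line)
        if not match:
            continue
        ip = match.group(1)
        # Broadcast replies carry no usable candidate address here.
        if ip in ("255.255.255.255", "0.0.0.0") or ip in found:
            continue
        found.append(ip)
    return found


def discover_ip(vm_mac, candidates, ping, read_arp, max_rounds=3):
    """Ping each candidate, then look for vm_mac in the ARP mapping.

    ping(ip) sends one ICMP packet; read_arp() returns {mac: ip}.
    Both are injected so the sketch stays independent of the host.
    """
    vm_mac = vm_mac.lower()
    for _ in range(max_rounds):
        for ip in candidates:
            ping(ip)  # the result is ignored; the ARP exchange is the point
        found = read_arp().get(vm_mac)
        if found:
            return found
    return None
```

Passing the pinger and the ARP reader in as callables keeps the control flow visible: the pinger only needs to run while the instance still reports '0.0.0.0', and it stops as soon as the mapping appears.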

fix is in r249

Changed in eucalyptus:
status: Confirmed → Fix Committed
Rick Clark (dendrobates)
Changed in eucalyptus (Ubuntu Jaunty):
importance: Undecided → Critical
status: New → Confirmed
Daniel Nurmi (nurmi) wrote :

made a minor tweak to 'populate_arp.pl' for r250: added 'use Net::Ping;' (somehow this change got lost in the commit chain and it is required for that helper script to run)

Steve Langasek (vorlon) wrote :

As we now have version 1.5~bzr266-0ubuntu1 in jaunty, I believe this bug is fixed; marking accordingly.

Changed in eucalyptus (Ubuntu Jaunty):
assignee: nobody → soren
status: Confirmed → Fix Released
Soren Hansen (soren) wrote :

Yes, sorry, I somehow omitted this bug from the changelog entry.

Lukas Mesani (lukas-prozeta) wrote :

Still not working under jaunty :(

root@x-dom0-01:/etc/eucalyptus# dpkg -l|grep iptables
ii iptables 1.4.1.1-4ubuntu3 administration tools for packet filtering an
root@x-dom0-01:/etc/eucalyptus# dpkg -l|grep eucalyptus
ii eucalyptus-common 1.5.2-0euca424ga68 Elastic Utility Computing Architecture - Com
ii eucalyptus-gl 1.5.2-0euca424ga68 Elastic Utility Computing Architecture - Log
ii eucalyptus-nc 1.5.2-0euca424ga68 Elastic Utility Computing Architecture - Nod

Chris (djchrishart) wrote :

It seems that this still isn't resolved. We're running Ubuntu 10.04 Enterprise Cloud and our scenario fits this one perfectly.

------------------------------------------
uecadmin@cloudnc:~$ uname -a
Linux cloudnc 2.6.32-21-generic-pae #32-Ubuntu SMP Fri Apr 16 09:39:35 UTC 2010 i686 GNU/Linux

------------------------------------------
uecadmin@cloudnc:~$ dpkg -l | grep iptables
ii iptables 1.4.4-2ubuntu2 administration tools for packet filtering an
uecadmin@cloudnc:~$ dpkg -l | grep eucalyptus
ii eucalyptus-common 1.6.2-0ubuntu30.2 Elastic Utility Computing Architecture - Com
ii eucalyptus-gl 1.6.2-0ubuntu30.2 Elastic Utility Computing Architecture - Log
ii eucalyptus-nc 1.6.2-0ubuntu30.2 Elastic Utility Computing Architecture - Nod

------------------------------------------
uecadmin@cloudnc:~$ sudo iptables -L
[sudo] password for uecadmin:
Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT udp -- anywhere anywhere udp dpt:domain
ACCEPT tcp -- anywhere anywhere tcp dpt:domain
ACCEPT udp -- anywhere anywhere udp dpt:bootps
ACCEPT tcp -- anywhere anywhere tcp dpt:bootps

Chain FORWARD (policy ACCEPT)
target prot opt source destination
ACCEPT all -- anywhere 192.168.122.0/24 state RELATED,ESTABLISHED
ACCEPT all -- 192.168.122.0/24 anywhere
ACCEPT all -- anywhere anywhere
REJECT all -- anywhere anywhere reject-with icmp-port-unreachable
REJECT all -- anywhere anywhere reject-with icmp-port-unreachable
LOG udp -- anywhere anywhere udp spts:bootps:bootpc dpts:bootps:bootpc LOG level info

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

------------------------------------------
We checked the /usr/share/eucalyptus/populate_arp.pl file and indeed see the 'use Net::Ping;' in it.

Any thoughts?

Eric Woodlief (ewoodlief) wrote :

We've experienced a similar problem with neither the CC nor the NCs consistently discovering the public/private IP addresses of instances.

Here is how to reproduce our particular scenario:
The cloud is in SYSTEM mode and a DHCP service is running on the CC. The DHCP service can assign IP addresses from many possible subnets. If an NC (that happens to have its br0 bound somewhere in, say, subnet1) starts an instance that is told by DHCP to use an IP address in subnet2, this NC's ARP table will never populate with that particular instance IP and it will go undiscovered. If the instance happens to receive an IP address on its NC's subnet, the ARP entry is set and the instance IP is discovered.

(Somewhat of a) Solution:
One could have all NCs bound to all subnets, but that would be a waste of IP addresses and possibly introduce routing problems. The simplest solution is to have the CC bind to all subnets, for example: "ifconfig eth0:1 <subnet2_ip> netmask ...", and create /etc/network/interfaces entries so that the aliases persist across restarts. Since the DHCP service is on the CC, its ARP table will populate across all subnets, and the CC will always discover and report IP addresses in euca-describe-instances. However, the NCs will still show 0.0.0.0 in their nc.log, but that is acceptable for now.
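
To decide which aliases the CC would need under this workaround, one could compare the subnets the DHCP service hands out against the addresses already bound on the host. The sketch below is only an illustration under that assumption: `uncovered_subnets` is a hypothetical helper, and the example addresses are made up.

```python
import ipaddress


def uncovered_subnets(bound_addresses, dhcp_subnets):
    """Return DHCP subnets that no bound interface address falls inside.

    bound_addresses: CIDR strings for addresses on the host, e.g. "10.0.1.5/24".
    dhcp_subnets: CIDR strings for the pools the DHCP service can assign from.
    Assumes all values are IPv4.
    """
    bound = [ipaddress.ip_interface(addr).network for addr in bound_addresses]
    missing = []
    for subnet in dhcp_subnets:
        net = ipaddress.ip_network(subnet)
        # A pool is covered when it sits inside a network the host is on.
        if not any(net.subnet_of(b) for b in bound):
            missing.append(str(net))
    return missing
```

For a host bound only to 10.0.1.5/24 with pools 10.0.1.0/24 and 10.0.2.0/24, the function reports 10.0.2.0/24, which is the subnet whose instances would otherwise stay at 0.0.0.0.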

graziano obertelli (graziano.obertelli) wrote :

Targeting this bug for milestone 2.0.4: we probably need to document this, since we cannot bind to a subnet we do not know exists.

Changed in eucalyptus:
milestone: none → 2.0.4
graziano obertelli (graziano.obertelli) wrote :

Also dropping importance since the original issue was resolved: we just need to document the multi-subnet behavior.

Changed in eucalyptus:
status: Fix Committed → In Progress
importance: High → Low
assignee: Daniel Nurmi (nurmi) → graziano obertelli (graziano.obertelli)
Andy Grimm (agrimm) wrote :

This issue is now being tracked upstream at http://eucalyptus.atlassian.net/browse/EUCA-2638

Please watch that issue for further updates.
