Routes lost on DHCP lease renewal (breaks VPN)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
NetworkManager | Fix Released | Medium | |
network-manager (Ubuntu) | Fix Released | Medium | Unassigned |
Intrepid | Won't Fix | Medium | Unassigned |
Bug Description
Binary package hint: network-manager
When NetworkManager renews the DHCP lease, some routes are lost. So far I've noticed these being affected:
- the Zeroconf link-local route to 169.254.0.0/16 (I don't really see the point in having this route when a Zeroconf address is not assigned to the interface anyway, so this isn't too big a problem).
- the host-route to the OpenVPN gateway that's added when a VPN connection is established. This is a big problem for me, as the OpenVPN gateway resides within one of the networks that are routed through the tunnel, so loss of this route leads to a catch-22 situation where the OpenVPN packets themselves get routed into the tunnel, and nothing works.
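To illustrate the catch-22, here is a minimal sketch of longest-prefix route selection in Python. The 172.16.x.x addresses are made-up stand-ins for the anonymised VPN.VPN.* networks and for VPN-GW, the assumption that the host-route points out of wlan0 is mine, and `pick_route` is a hypothetical helper, not anything taken from NetworkManager or the kernel.

```python
import ipaddress

def pick_route(routes, destination):
    """Return the (network, device) pair with the longest matching prefix."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, dev) for net, dev in routes if dest in net]
    if not matches:
        return None
    return max(matches, key=lambda item: item[0].prefixlen)

def table(entries):
    """Turn (prefix, device) string pairs into (ip_network, device) pairs."""
    return [(ipaddress.ip_network(prefix), dev) for prefix, dev in entries]

# Hypothetical stand-in for the anonymised VPN-GW address.
VPN_GW = "172.16.40.1"

with_host_route = table([
    ("172.16.40.1/32", "wlan0"),   # host-route to the VPN gateway (assumed device)
    ("10.0.0.0/24", "wlan0"),      # local LAN
    ("172.16.32.0/19", "tun0"),    # stand-in for VPN.VPN.VPN.0/19
    ("172.16.0.0/12", "tun0"),     # stand-in for VPN.VPN.0.0/12
    ("0.0.0.0/0", "wlan0"),        # default route
])

# The same table with the /32 host-route removed, as after a lease renewal.
after_renewal = [r for r in with_host_route if r[0].prefixlen != 32]

print(pick_route(with_host_route, VPN_GW)[1])  # wlan0
print(pick_route(after_renewal, VPN_GW)[1])    # tun0
```

With the host-route present, packets to the gateway leave via wlan0. Once it is gone, the /19 route is the best match, so the tunnel's own packets are sent into tun0.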
Magnus Svensson confirmed the bug in #269071; he observed it using vpnc.
I've also confirmed that the Zeroconf route is lost even if I do not start the OpenVPN connection, so the problem has to be in the core of NetworkManager.
There's more info in bug #269071; I'm quoting the relevant parts of one comment below:
tore@envy:~$ ip r
10.8.0.5 dev tun0 proto kernel scope link src 10.8.0.6
VPN-GW.
10.0.0.0/24 dev wlan0 proto kernel scope link src 10.0.0.51 metric 2
VPN.VPN.VPN.0/19 dev tun0 proto static scope link
169.254.0.0/16 dev wlan0 scope link metric 1000
VPN.VPN.0.0/12 dev tun0 proto static scope link
default via 10.0.0.1 dev wlan0 proto static
I've added the routes to VPN.VPN.* manually using the workaround I mentioned in another comment, since the default route redirection doesn't work. One thing missing here is that there should be a route to 10.8.0.1 (the OpenVPN server's internal OpenVPN address). It used to be automatically added before (and indeed, it pushes that address as the DNS dhcp-option too), so that's another regression.
You can see the host-route added to VPN-GW x 4, which is necessary for the OpenVPN connection to work, since that particular address is part of my employer's VPN.VPN.VPN.0/19 network (and the OpenVPN packets can't very well be routed inside of the OpenVPN tunnel).
Then the VPN connection stops working. In my logs I see the following:
Oct 14 23:10:46 envy dhclient: DHCPREQUEST of 10.0.0.51 on wlan0 to 10.0.0.1 port 67
Oct 14 23:10:46 envy dhclient: DHCPACK of 10.0.0.51 from 10.0.0.1
Oct 14 23:10:46 envy NetworkManager: <info> DHCP: device wlan0 state changed bound -> renew
Oct 14 23:10:46 envy NetworkManager: <info> address 10.0.0.51
Oct 14 23:10:46 envy NetworkManager: <info> prefix 24 (255.255.255.0)
Oct 14 23:10:46 envy NetworkManager: <info> gateway 10.0.0.1
Oct 14 23:10:46 envy NetworkManager: <info> nameserver '217.13.7.140'
Oct 14 23:10:46 envy NetworkManager: <info> nameserver '217.13.4.24'
Oct 14 23:10:46 envy NetworkManager: <info> domain name 'lan'
Oct 14 23:10:46 envy NetworkManager: <info> (wlan0): writing resolv.conf to /sbin/resolvconf
Oct 14 23:10:46 envy dhclient: bound to 10.0.0.51 -- renewal in 2735 seconds.
Oct 14 23:10:46 envy avahi-daemon[5342]: Withdrawing address record for 10.0.0.51 on wlan0.
Oct 14 23:10:46 envy avahi-daemon[5342]: Leaving mDNS multicast group on interface wlan0.IPv4 with address 10.0.0.51.
Oct 14 23:10:46 envy avahi-daemon[5342]: Interface wlan0.IPv4 no longer relevant for mDNS.
Oct 14 23:10:46 envy avahi-daemon[5342]: Joining mDNS multicast group on interface wlan0.IPv4 with address 10.0.0.51.
Oct 14 23:10:46 envy avahi-daemon[5342]: New relevant interface wlan0.IPv4 for mDNS.
Oct 14 23:10:46 envy avahi-daemon[5342]: Registering new address record for 10.0.0.51 on wlan0.IPv4.
Oct 14 23:10:47 envy NetworkManager: <info> (wlan0): writing resolv.conf to /sbin/resolvconf
Oct 14 23:10:47 envy NetworkManager: <info> Policy set (wlan0) as default device for routing and DNS.
Oct 14 23:12:36 envy nm-openvpn[23085]: [openvpn-
Oct 14 23:12:36 envy nm-openvpn[23085]: SIGUSR1[
Oct 14 23:12:38 envy nm-openvpn[23085]: WARNING: No server certificate verification method has been enabled. See http://
Oct 14 23:12:38 envy nm-openvpn[23085]: NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
Oct 14 23:12:38 envy nm-openvpn[23085]: Re-using SSL/TLS context
Oct 14 23:12:38 envy nm-openvpn[23085]: LZO compression initialized
Oct 14 23:12:38 envy nm-openvpn[23085]: UDPv4 link local: [undef]
Oct 14 23:12:38 envy nm-openvpn[23085]: UDPv4 link remote: VPN-GW.
Oct 14 23:13:25 envy NetworkManager: <info> (wlan0): supplicant connection state change: 7 -> 6
Oct 14 23:13:25 envy NetworkManager: <info> (wlan0): supplicant connection state change: 6 -> 7
Oct 14 23:13:38 envy nm-openvpn[23085]: TLS Error: TLS key negotiation failed to occur within 60 seconds (check your networkconnecti
Oct 14 23:13:38 envy nm-openvpn[23085]: TLS Error: TLS handshake failed
Oct 14 23:13:38 envy nm-openvpn[23085]: SIGUSR1[
Oct 14 23:13:40 envy nm-openvpn[23085]: WARNING: No server certificate verification method has been enabled. See http://
[...repeated...]
And the new routing table:
tore@envy:~$ ip r
10.8.0.5 dev tun0 proto kernel scope link src 10.8.0.6
10.0.0.0/24 dev wlan0 proto kernel scope link src 10.0.0.51 metric 2
VPN.VPN.VPN.0/19 dev tun0 proto static scope link
VPN.VPN.0.0/12 dev tun0 proto static scope link
default via 10.0.0.1 dev wlan0 proto static
So what just happened? It seems that the DHCP lease (given out by a lame residential CPE/NAT device which terminates my DSL circuit) was up for renewal. As part of this procedure n-m removed the route to VPN-GW, which of course breaks the tunnel: the VPN routes themselves are left in place, so those packets are now routed inside the VPN tunnel (a catch-22). The same problem would occur if the redirection of the default route worked, since VPN-GW is within the default route too. So the host (/32) route to the VPN GW needs to be left intact for the VPN connection to work, but right now this too seems to be broken.
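For anyone who wants to check their own machine, the lost routes can be picked out by comparing the `ip r` output from before and after a renewal. What follows is only a rough sketch: `lost_routes` is a hypothetical helper, the two dumps are the ones quoted above (including the truncated VPN-GW line, left as it is), and the comparison is a plain line-by-line text match, so it assumes unchanged routes are printed identically both times.

```python
BEFORE = """\
10.8.0.5 dev tun0 proto kernel scope link src 10.8.0.6
VPN-GW.
10.0.0.0/24 dev wlan0 proto kernel scope link src 10.0.0.51 metric 2
VPN.VPN.VPN.0/19 dev tun0 proto static scope link
169.254.0.0/16 dev wlan0 scope link metric 1000
VPN.VPN.0.0/12 dev tun0 proto static scope link
default via 10.0.0.1 dev wlan0 proto static
"""

AFTER = """\
10.8.0.5 dev tun0 proto kernel scope link src 10.8.0.6
10.0.0.0/24 dev wlan0 proto kernel scope link src 10.0.0.51 metric 2
VPN.VPN.VPN.0/19 dev tun0 proto static scope link
VPN.VPN.0.0/12 dev tun0 proto static scope link
default via 10.0.0.1 dev wlan0 proto static
"""

def lost_routes(before, after):
    """Return the route lines present in `before` but missing from `after`."""
    remaining = {line.strip() for line in after.splitlines() if line.strip()}
    return [
        line.strip()
        for line in before.splitlines()
        if line.strip() and line.strip() not in remaining
    ]

for route in lost_routes(BEFORE, AFTER):
    print(route)
# VPN-GW.
# 169.254.0.0/16 dev wlan0 scope link metric 1000
```

On the dumps above this reports exactly the two routes described in the bug: the host-route to the VPN gateway and the Zeroconf link-local route.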
Tore
Changed in network-manager:
status: New → Confirmed
Changed in network-manager:
importance: Undecided → Medium
status: Confirmed → Triaged
Changed in network-manager:
status: Unknown → Fix Released
Changed in network-manager:
importance: Unknown → Medium
This bug is now fixed upstream by Dan Williams, in SVN r4277.
Tore