baremetal devstack kubelet can't do network probes to pods

Bug #1693378 reported by Antoni Segura Puimedon
This bug affects 1 person
Affects: kuryr-kubernetes
Status: Fix Released
Importance: High
Assigned to: Unassigned

Bug Description

On baremetal devstack:
- The kubelet runs on the host and uses the host's networking
- The host networking can only reach Neutron ports via floating IPs (FIPs)
- Pods that are not marked with hostNetwork: true run on Neutron ports in the 'private-network'

Taken together, these facts mean that every network readiness and liveness probe from the kubelet to the pods will fail. As a result, the containers in the pods are restarted endlessly and end up in CrashLoopBackOff.

This problem also affects non-devstack baremetal deployments.

As a workaround you can:

1. Create a Neutron port in the pod subnet for each kubelet in your baremetal cluster.
2. Create an OVS port on br-int on each worker node and bind the port created in step 1 to the newly created OVS port (if your deployment uses the hybrid firewall driver, that means creating a bridge and one veth pair, and adding one side of the veth pair as a port to OVS br-int).
3. Make sure the port is up and add the Neutron port's IP address to it.

These steps create a link-scoped route to the internal network, which the kubelet starts using automatically.
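To make the link-scoped route concrete, here is a minimal sketch that computes the on-link network the kernel would derive from an address such as 10.0.0.10/26 (the address used in the commands later in this report). The function name `link_scoped_net` is hypothetical and only for illustration; it assumes a well-formed IPv4 address with a prefix length.

```shell
# Print the on-link IPv4 network for ADDR/PREFIX, e.g. 10.0.0.10/26.
# Assumption: input is a well-formed dotted quad with a prefix of 0 to 32.
link_scoped_net() {
    addr=${1%/*}      # part before the slash
    prefix=${1#*/}    # part after the slash
    # Split the dotted quad into four positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $addr
    IFS=$old_ifs
    ip=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    if [ "$prefix" -eq 0 ]; then
        mask=0
    else
        mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    fi
    net=$(( ip & mask ))
    echo "$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))/$prefix"
}
```

Pod IPs inside the printed network should then be reachable from the host without a FIP. Running `ip route show dev <interface>` on the worker node is one way to check that a route for that network actually exists.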

Antoni Segura Puimedon (celebdor) wrote :

As for a proper resolution to this bug, I suppose that we should do better than the workaround. My late-night thought is as follows:

1. Create a subnet for baremetal worker nodes with its own security group (SG)
2. Create and bind a port for each worker node
3. Add a Probe handler that watches pod events and adds SG rules for the liveness and readiness probes, allowing the whole SG of the worker nodes to access that specific port.
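As a rough illustration of step 3, the rule such a handler might add for one TCP probe port could look like the output of the sketch below. This is not the implementation that was merged; `probe_rule_cmd` and its three arguments are hypothetical, and the function only prints the `openstack` command rather than running it.

```shell
# Print (do not run) a security group rule that lets every member of the
# worker-node SG reach one probe port on pods in the pod SG.
# All names are hypothetical placeholders supplied by the caller.
probe_rule_cmd() {
    # $1: pod security group, $2: worker-node security group, $3: TCP probe port
    printf 'openstack security group rule create --ingress --protocol tcp --dst-port %s --remote-group %s %s\n' "$3" "$2" "$1"
}
```

For example, `probe_rule_cmd pod-sg worker-sg 8080` prints a rule opening TCP port 8080 on `pod-sg` to members of `worker-sg`. Using `--remote-group` instead of a CIDR keeps the rule valid as worker nodes are added or removed.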

Changed in kuryr-kubernetes:
importance: Undecided → High
milestone: none → pike-3
status: New → Triaged
Antoni Segura Puimedon (celebdor) wrote :

For reference, here's what I did in my system for the workaround:

  # Create the Neutron port for the kubelet on the pod network.
  openstack port create --network private --security-group=a6bf3c24-3101-43aa-8fce-7a7561c2fe6e kubelet
  # Hybrid firewall plumbing: a Linux bridge plus a veth pair.
  sudo brctl addbr qbr9fdfeb37-44
  sudo ip link add type veth
  sudo ip link set dev veth0 name qvo9fdfeb37-44
  sudo ip link set dev veth1 name qvb9fdfeb37-44
  sudo ovs-vsctl add-port br-int qvo9fdfeb37-44 tag=1
  sudo brctl addif qbr9fdfeb37-44 qvb9fdfeb37-44
  sudo brctl setfd qbr9fdfeb37-44 0
  sudo brctl stp qbr9fdfeb37-44 off
  sudo ip link set qvo9fdfeb37-44 up
  sudo ip link set qvb9fdfeb37-44 up
  sudo ip link set qbr9fdfeb37-44 up
  # Bind the OVS interface to the Neutron port (iface-id is the port ID).
  sudo ovs-vsctl set interface qvo9fdfeb37-44 external_ids:iface-id="9fdfeb37-444d-4166-9977-2891ee18b6c7" external_ids:attached-mac=fa:16:3e:89:27:c3 external_ids:owner="compute:kuryr" external_ids:iface-status=active
  # Assign the port's IP address to the bridge.
  sudo ip addr add 10.0.0.10/26 dev qbr9fdfeb37-44
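The device names in the commands above all share the suffix 9fdfeb37-44, which is the first 11 characters of the port ID passed as iface-id. A small sketch for deriving those names on another node follows; the helper name `hybrid_dev_name` is hypothetical, and the naming rule is inferred from the names used in this report.

```shell
# Derive a hybrid-plug device name from a Neutron port UUID.
# Assumption: the name is <prefix> followed by the first 11 characters
# of the UUID, matching the names in the workaround commands.
hybrid_dev_name() {
    # $1: prefix (qbr, qvb or qvo); $2: Neutron port UUID
    printf '%s%s\n' "$1" "$(printf '%s' "$2" | cut -c1-11)"
}
```

With the port from this report, `hybrid_dev_name qbr 9fdfeb37-444d-4166-9977-2891ee18b6c7` gives the bridge name used above, and the same call with `qvb` or `qvo` gives the two veth ends.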

OpenStack Infra (hudson-openstack) wrote : Fix merged to kuryr-kubernetes (master)

Reviewed: https://review.openstack.org/468100
Committed: https://git.openstack.org/cgit/openstack/kuryr-kubernetes/commit/?id=d85481386e8df646d9d2a544c19e97eda1174cdc
Submitter: Jenkins
Branch: master

commit d85481386e8df646d9d2a544c19e97eda1174cdc
Author: Antoni Segura Puimedon <email address hidden>
Date: Thu May 25 18:39:41 2017 +0200

    devstack: Add configuration for kubelet probes

    Most pods for real workloads come with a good amount of readiness and
    liveness probes that are performed via either http or tcp. The problem
    that we have in devstack baremetal is that the kubelet sits on the host
    networking and has no way to reach the pods that are on the overlay.

    This patch adds a configuration option that, if enabled, makes devstack
    create a port for the kubelet to use and binds it in OVS.

    Closes-bug: #1693378
    Partially Implements: devstack-support-api-accessing-pods
    Signed-off-by: Antoni Segura Puimedon <email address hidden>
    Change-Id: I8929ca43d162d63efc5b5e5fdf601cd336dd5bc6

Changed in kuryr-kubernetes:
status: Triaged → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kuryr-kubernetes 0.2.0

This issue was fixed in the openstack/kuryr-kubernetes 0.2.0 release.
