Kubectl get pods timeout with openstack-integrator

Bug #1868062 reported by Stamatis Katsaounis
This bug affects 1 person
Affects: Openstack Integrator Charm
Status: Fix Released
Importance: Medium
Assigned to: Stamatis Katsaounis
Milestone: 1.19

Bug Description

Hi all,

During the last step of the installation, using openstack-integrator instead of kubeapi-load-balancer, I am facing the following issue:

While everything is active and in a ready state, the kubernetes-master units remain stuck waiting for the kube-system pods to be ready.

I tried to troubleshoot and found out that the CLI call times out because it uses /root/.kube/config explicitly. If I delete this part in charm/lib/charms/layer/kubernetes_common.py#219, then kubectl succeeds and the charm proceeds.

This is the offending line:
command = ['kubectl', '--kubeconfig=' + kubeclientconfig_path] + list(args)

And this is my modified working line:
command = ['kubectl'] + list(args)
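
For reference, the timeout can also be reproduced by hand with the same kubeconfig the charm uses (a rough reproduction, assuming the default /root/.kube/config path mentioned above):

kubectl --kubeconfig=/root/.kube/config get pods -n kube-system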

PS. I am using the default options of charmed-kubernetes bundle.yaml.

Kind regards,
Stamatis

Revision history for this message
Stamatis Katsaounis (skatsaounis) wrote :

An update to the original report:

It seems that .kube/config does not work at all: not only when Juju runs kubectl commands, but also when the user tries to use kubectl manually. Something inside the config is wrong. Note that without openstack-integrator (only the charmed-kubernetes bundle.yaml), everything works as expected.

Revision history for this message
George Kraft (cynerva) wrote :

Can you share the output of `juju status --format yaml` and `juju debug-log --replay`?

> And this is my modified working line:
> command = ['kubectl'] + list(args)

Just as a heads-up, this workaround will not work in Kubernetes 1.18, since kubectl's default behavior of connecting to localhost is being removed [1].

[1]: https://github.com/kubernetes/kubernetes/pull/86173

summary: - Kubectl get pods timeout
+ Kubectl get pods timeout with openstack-integrator
Changed in charm-kubernetes-master:
status: New → Incomplete
Revision history for this message
Stamatis Katsaounis (skatsaounis) wrote :

Here you are

Revision history for this message
George Kraft (cynerva) wrote :

Thank you. Looks like this is a deployment that uses openstack-integrator as the loadbalancer instead of kubeapi-load-balancer. I suspect there is something wrong with the connection between kubernetes-master and the loadbalancer; kubernetes-master uses the loadbalancer IP when rendering /root/.kube/config, but that seems to not be working here.
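
A rough way to confirm this (just a sketch, assuming you can reach a kubernetes-master unit; the VIP placeholder is whatever address the kubeconfig points at) is to compare the server address in the rendered kubeconfig with the loadbalancer VIP and test the connection directly:

juju ssh kubernetes-master/0 -- sudo grep 'server:' /root/.kube/config
juju ssh kubernetes-master/0 -- curl -k --connect-timeout 10 https://<loadbalancer-vip>:6443/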

One other quick question: what version of openstack are you running?

Changed in charm-kubernetes-master:
status: Incomplete → Confirmed
Revision history for this message
Stamatis Katsaounis (skatsaounis) wrote :

I am using OpenStack Train.

I have not set any of the openstack-integrator configs yet (lb-floating-network, floating-network-id, lb-subnet), and the Juju controller and model are on an internal OpenStack network.

Maybe something from the above is related (or not).

Revision history for this message
Stamatis Katsaounis (skatsaounis) wrote :

Hi George,

I managed to overcome the problem and I want to share the solution and the issues I faced:

1. The original problem was caused by missing security groups. More specifically, the Octavia Load Balancer could not communicate with the kubernetes-master Nova instances. The reason was that the kubernetes-master instances were missing a security group (or a rule in their existing groups, depending on the design you want to follow) allowing ingress traffic to port 6443. As a result, despite being in the Load Balancer's pool, the Load Balancer could not reach them.

2. I could not use kubectl from my Juju jumphost because I was hitting the Load Balancer VIP (this IP is written in .kube/config) and the Load Balancer could not talk to the kubernetes-master instances.

My solution was to add a new security group to the kubernetes-master instances allowing ingress to port 6443 from the private network I am using for the Juju units, along the lines of the example below.
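
The names below are placeholders for illustration, not the exact groups or networks from my deployment:

openstack security group create k8s-master-api-access
openstack security group rule create --ingress --protocol tcp --dst-port 6443 --remote-ip <private-network-cidr> k8s-master-api-access
openstack server add security group <kubernetes-master-instance> k8s-master-api-access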

3. Another issue I faced was the following: during my tests I tried to use the option manage-security-groups=true, but this cannot work if the provided OpenStack credentials belong to a member of a project. The reason is that the charm code tries to apply a security group to the OpenStack port which represents the Load Balancer VIP. This cannot be done by a member, so the code receives an error.

My solution to the problem above was to run the command manually as my admin user, comment out the line in the charm code, restart the Juju agent of openstack-integrator, and apply a change to a config option to trigger the config-changed hook.
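
For completeness, the admin-side step amounts to attaching the security group to the Load Balancer VIP port, roughly like this (port ID and group name are placeholders, an approximation rather than the exact command I ran):

openstack port show <lb-vip-port-id>
openstack port set --security-group <security-group> <lb-vip-port-id>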

George Kraft (cynerva)
Changed in charm-kubernetes-master:
importance: Undecided → High
Changed in charm-openstack-integrator:
importance: Undecided → High
Changed in charm-kubernetes-master:
status: Confirmed → Triaged
Changed in charm-openstack-integrator:
status: New → Triaged
Revision history for this message
Stamatis Katsaounis (skatsaounis) wrote :

Fix for point 1): Add a security group for Kubernetes Master nodes to be able to receive traffic from Octavia Load Balancer when Port Security is enabled: https://github.com/juju-solutions/charm-openstack-integrator/pull/34

George Kraft (cynerva)
Changed in charm-kubernetes-master:
importance: High → Medium
Changed in charm-openstack-integrator:
importance: High → Medium
tags: added: review-needed
Changed in charm-openstack-integrator:
status: Triaged → In Progress
Changed in charm-openstack-integrator:
assignee: nobody → Stamatis Katsaounis (skatsaounis)
Cory Johns (johnsca)
tags: removed: review-needed
Changed in charm-openstack-integrator:
status: In Progress → Fix Committed
milestone: none → 1.19
no longer affects: charm-kubernetes-master
Revision history for this message
Edward Hope-Morley (hopem) wrote :

Please see bug 1893512. The patch that landed here has broken the charm for anyone using the openstack-integrator charm with a non-admin tenant (which, as far as I know, is everybody).

Revision history for this message
Edward Hope-Morley (hopem) wrote :

I've dug into this a bit more (see bug 1868062 for details), but in short, this patch alone is not the cause of the problem, although under the right conditions it could also fail. The problem is caused by manage-security-groups=true, which results in the charm trying to access the LB VIP port's security group, which a non-admin tenant cannot modify or even access, since it is created (if using Octavia) by Octavia itself. If all is well, the code in this patch would never execute anyway, since the port should have been added to the SG when the LB was created. Since Octavia is really the only supported way to create LBs (neutron lbaasv2 is deprecated), I can't imagine a scenario where the code added here would be needed.
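
For anyone affected in the meantime, the setting in question can be inspected or switched off with the usual charm config commands (shown only as an illustration, adjust to your deployment):

juju config openstack-integrator manage-security-groups
juju config openstack-integrator manage-security-groups=false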

Revision history for this message
Edward Hope-Morley (hopem) wrote :

Correction, the bug I meant to refer to in my previous comment is bug 1893512

Changed in charm-openstack-integrator:
status: Fix Committed → Fix Released