Dear all,
I am slowly moving towards a possible new production configuration of my stable/xena cluster.
I have successfully installed the overall ecosystem with openstack-ansible stable/xena and later added several additional services (I have already opened some bugs about what I found).
Yesterday I tried to launch my first Kubernetes cluster using Magnum.
During the creation of the kube-master virtual machine I hit an annoying issue with the verification of the self-signed SSL certificates that were used when installing OpenStack.
The cluster is initialized and the Heat stack is created. The master node comes up and is provisioned properly with all the software on board, but the stack then hangs until the 60-minute timeout and cluster creation fails.
After some investigation, and after reading five-year-old discussions online, I found someone pointing out a possible SSL issue between the master node and Keystone...
During this 60-minute window I can SSH into the Fedora CoreOS 35 image I downloaded for the purpose and run journalctl -xef to monitor what is happening, and I see:
Jun 26 12:00:05 c3-upfijbzzjaut-master-0 systemd[1]: Started Hostname Service.
░░ Subject: A start job for unit systemd-hostnamed.service has finished successfully
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit systemd-hostnamed.service has finished successfully.
░░
░░ The job identifier is 1160.
Jun 26 12:00:05 c3-upfijbzzjaut-master-0 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jun 26 12:00:05 c3-upfijbzzjaut-master-0 audit[2532]: USER_START pid=2532 uid=0 auid=1000 ses=1 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=login id=1000 exe="/usr/sbin/sshd" hostname=? addr=192.168.2.6 terminal=ssh res=success'
Jun 26 12:00:05 c3-upfijbzzjaut-master-0 audit[2532]: CRYPTO_KEY_USER pid=2532 uid=0 auid=1000 ses=1 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=destroy kind=server fp=SHA256:d3:4c:a5:d7:0e:5b:c2:6e:3a:8f:84:e4:75:f7:40:97:b9:11:e0:70:e8:b8:36:3e:e9:33:68:3f:22:6d:dd:0c direction=? spid=2571 suid=1000 exe="/usr/sbin/sshd" hostname=? addr=? terminal=? res=success'
Jun 26 12:00:05 c3-upfijbzzjaut-master-0 audit[2532]: USER_END pid=2532 uid=0 auid=1000 ses=1 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=login id=1000 exe="/usr/sbin/sshd" hostname=? addr=192.168.2.6 terminal=ssh res=success'
Jun 26 12:00:12 c3-upfijbzzjaut-master-0 conmon[2386]: Authorization failed: SSL exception connecting to https://10.0.0.10:5000/v3/auth/tokens: HTTPSConnectionPool(host='10.0.0.10', port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123)')))
Jun 26 12:00:12 c3-upfijbzzjaut-master-0 conmon[2386]: Source [heat] Unavailable.
Jun 26 12:00:12 c3-upfijbzzjaut-master-0 podman[2346]: Authorization failed: SSL exception connecting to https://10.0.0.10:5000/v3/auth/tokens: HTTPSConnectionPool(host='10.0.0.10', port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123)')))
Jun 26 12:00:12 c3-upfijbzzjaut-master-0 podman[2346]: Source [heat] Unavailable.
Jun 26 12:00:12 c3-upfijbzzjaut-master-0 podman[2346]: /var/lib/os-collect-config/local-data not found. Skipping
Jun 26 12:00:12 c3-upfijbzzjaut-master-0 conmon[2386]: /var/lib/os-collect-config/local-data not found. Skipping
Jun 26 12:00:28 c3-upfijbzzjaut-master-0 conmon[2386]: Authorization failed: SSL exception connecting to https://10.0.0.10:5000/v3/auth/tokens: HTTPSConnectionPool(host='10.0.0.10', port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123)')))
Jun 26 12:00:28 c3-upfijbzzjaut-master-0 podman[2346]: Authorization failed: SSL exception connecting to https://10.0.0.10:5000/v3/auth/tokens: HTTPSConnectionPool(host='10.0.0.10', port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123)')))
Jun 26 12:00:28 c3-upfijbzzjaut-master-0 podman[2346]: Source [heat] Unavailable.
Jun 26 12:00:28 c3-upfijbzzjaut-master-0 podman[2346]: /var/lib/os-collect-config/local-data not found. Skipping
Jun 26 12:00:28 c3-upfijbzzjaut-master-0 conmon[2386]: Source [heat] Unavailable.
Jun 26 12:00:28 c3-upfijbzzjaut-master-0 conmon[2386]: /var/lib/os-collect-config/local-data not found. Skipping
Jun 26 12:00:35 c3-upfijbzzjaut-master-0 systemd[1]: systemd-hostnamed.service: Deactivated successfully.
This confirms that the master node is not able to validate the SSL certificate of the Keystone service.
I have also tried to manually install on the machine the same SSL certificates used by my HAProxy nodes (HA config), since that should unlock the situation, but without any luck.
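For context, the check that fails is the standard Python one: the agent (the os-collect-config container, judging from the journal above) talks to Keystone over HTTPS, and the TLS context rejects any certificate whose issuer is not in its trust store. A minimal stdlib sketch of that behaviour (the CA bundle path is illustrative, not from my deployment):

```python
import ssl

# A default client-side context behaves like the agent on the master node:
# it requires a certificate that chains to a CA it already trusts.
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True

# Installing the deployment's CA amounts to loading it into this context;
# with a self-signed certificate this is exactly the step that is missing.
# Illustrative path, commented out because the file does not exist here:
# ctx.load_verify_locations(cafile="/etc/pki/ca-trust/source/anchors/haproxy-ca.crt")
```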
Has anyone created a Kubernetes cluster with Magnum recently, after Xena was released?
Do you have any workaround for this problem?
I have also tried to find out whether the software making the request is a script that happens to use curl, but I could not find it. If it were a curl call I could simply add the -k parameter to the request to skip the SSL certificate check.
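Since the traceback in the journal comes from Python (_ssl.c) rather than curl, the -k equivalent would be disabling verification on the TLS context. A minimal stdlib sketch of what that would mean, as a debugging aid only, not a real fix (the URL is the one from the journal; the request itself is not executed here):

```python
import ssl
import urllib.request

# Python equivalent of `curl -k`: a context that skips certificate
# verification entirely. Useful only to confirm the diagnosis.
insecure = ssl.create_default_context()
insecure.check_hostname = False
insecure.verify_mode = ssl.CERT_NONE

# A request built this way would accept the self-signed Keystone cert
# (commented out: not reachable outside the cluster):
# urllib.request.urlopen("https://10.0.0.10:5000/v3/auth/tokens",
#                        context=insecure)
```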
Thanks for any help.
Davide
I am starting to think this is a Heat problem.
I restarted a new cluster installation, and when I log into the master node (kube-master VM) I see printed:
[systemd]
Failed Units: 1
  heat-container-agent.service
[core@c1-2ancod3fmckw-master-0 ~]$
and journalctl always shows the same error.
I think this is not related to openstack-ansible itself.
Does anyone agree with me?