OpenShift Reissue Certificate Manually
#Red Hat , #container , #certificate , #error
OpenShift Reissue Certificate Manually
I recently ran the redeploy certificates playbook on my 3.11 cluster and found it broke apps that rely on the certificate signer ca, as it issues a new certificate signer ca but does not retrigger new certificates to be generated from it (at least not for all of the apps). In my case, it killed the latest Prometheus deployment and I got service unavailable messages from the router.
To diagnose the problem, I checked the logs on the grafana app:
$ oc logs -f grafana-5ff4bb48f5-q46ff -c grafana-proxy
...
2018/12/13 15:49:29 server.go:2923: http: TLS handshake error from 10.129.2.1:52306: remote error: tls: unknown certificate authority
So I decided to check the tls certificate, which is usually mounted from a secret:
$ oc get secrets | grep tls
alertmanager-main-tls kubernetes.io/tls 2 38m
grafana-tls kubernetes.io/tls 2 42m
kube-state-metrics-tls kubernetes.io/tls 2 38m
node-exporter-tls kubernetes.io/tls 2 38m
prometheus-k8s-tls kubernetes.io/tls 2 38m
It is far more useful to be able to actually read the certificate contents in a legible format rather than stare at either the PEM or base64 encoding.
$ oc get secret grafana-tls -n openshift-monitoring -o yaml | grep tls.crt | awk '{print $2}' | base64 -d - | openssl x509 -in - -text -noout
Looking at my certificate, everything seemed to be fine. It had not expired. But I did recall the logs specifically stating that the certificate authority was unknown, so let’s focus on that.
$ oc get secret grafana-tls -n openshift-monitoring -o yaml | grep tls.crt | awk '{print $2}' | base64 -d - | openssl x509 -in - -text -noout | grep Issuer Issuer: CN = openshift-service-serving-signer@1536343562
Let’s compare it to one of the apps that is working correctly:
$ oc get secret console-serving-cert -n openshift-console -o yaml | grep tls.crt | awk '{print $2}' | base64 -d - | openssl x509 -in - -text -noout | grep Issuer Issuer: CN = openshift-service-serving-signer@1544190532
Aha! The redeploy certificates playbook didn’t update my Prometheus deployment. That seems like a big miss. Let’s fix it.
Delete the old secret since it’s worthless now.
$ oc delete secret grafana-tls -n openshift-monitoring
Remove the certificate signing annotations by manually editing the service responsible for the secret or using these commands.
$ oc annotate service grafana-tls \ service.alpha.openshift.io/serving-cert-secret-name- \ service.alpha.openshift.io/serving-cert-signed-by-
$ oc annotate service grafana-tls \ service.alpha.openshift.io/serving-cert-secret-name=grafana-tls
You should be able to see a new secret just created, and you can check it for the correct CA.
$ oc get secret grafana-tls -n openshift-monitoring -o yaml | grep tls.crt | awk '{print $2}' | base64 -d - | openssl x509 -in - -text -noout | grep Issuer Issuer: CN = openshift-service-serving-signer@1544190532
Now delete all the pods to recreate new ones with the new certificate:
$ oc delete pods --all -n openshift-monitoring
Give it a few minutes and all is back up and running.