ERROR Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects: Get "https://api.ocp4.xxx.local:6443/apis/config.openshift.io/v1/clusteroperators": dial tcp xxx.xxx.xxx.xxx:6443: connect: connection refused #9034

Open
uriworkaccount opened this issue Sep 18, 2024 · 2 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@uriworkaccount

Version

./openshift-install version
./openshift-install 4.14.10
built from commit a57f47f37ad5615dadeafd66118a4cbeebd075ec
release image registry-bastion.xxx.local:5000/ocp4@sha256:03cc63c0c48b2416889e9ee53f2efc2c940323c15f08384b439c00de8e66e8aa
release architecture amd64

Platform:

vsphere

IPI

What happened?

When installing OpenShift 4.14.10 on vSphere with IPI, the installation fails with the following errors:

DEBUG Still waiting for the cluster to initialize: Cluster operators authentication, console, ingress, monitoring, openshift-samples are not available
ERROR Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects: Get "https://api.ocp4.xxx.local:6443/apis/config.openshift.io/v1/clusteroperators": dial tcp xxx.xxx.xxx.xxx:6443: connect: connection refused
ERROR Cluster initialization failed because one or more operators are not functioning properly.
ERROR The cluster should be accessible for troubleshooting as detailed in the documentation linked below,
ERROR https://docs.openshift.com/container-platform/latest/support/troubleshooting/troubleshooting-installations.html
ERROR The 'wait-for install-complete' subcommand can then be used to continue the installation
ERROR failed to initialize the cluster: Cluster operators authentication, console, ingress, monitoring, openshift-samples are not available
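
The list of unavailable operators in these messages is easier to compare across retries when it is split out one per line. A minimal sketch, assuming the installer output was saved to a log file; `unavailable_operators` is a hypothetical helper name, not part of openshift-install, and it only matches the plural "Cluster operators ... are not available" wording shown above:

```shell
# Hypothetical helper: read installer log text on stdin and print each
# operator named in a "Cluster operators ... are not available" message,
# one per line, de-duplicated.
unavailable_operators() {
  sed -n 's/.*Cluster operators \(.*\) are not available.*/\1/p' |
    tr ',' '\n' |
    sed 's/^ *//; s/ *$//' |
    sort -u
}

# Example (the log path is an assumption; adjust to where the installer ran):
# unavailable_operators < .openshift_install.log
```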

etcd and kube-apiserver are both running on the master, and their logs show nothing unusual:

crictl ps | grep etcd
a2b8b4d37cc55       abc100ec44000b4c92f4b62ba46503bfcdefcf316ec0404a14ae7b3ea9c884f7                                                         2 hours ago         Running             etcd-readyz                                   0                   b2d7589a9e67d       etcd-ocp4-cpmbw-master-0
6f10bb56a0d00       e6c5281b78cf86b70ce52d6d1549dad5c649a70d1a0896cf9f1d5eaabad98c9a                                                         2 hours ago         Running             etcd-metrics                                  0                   b2d7589a9e67d       etcd-ocp4-cpmbw-master-0
b971ba893acf4       e6c5281b78cf86b70ce52d6d1549dad5c649a70d1a0896cf9f1d5eaabad98c9a                                                         2 hours ago         Running             etcd                                          0                   b2d7589a9e67d       etcd-ocp4-cpmbw-master-0
636ed022b11aa       e6c5281b78cf86b70ce52d6d1549dad5c649a70d1a0896cf9f1d5eaabad98c9a                                                         2 hours ago         Running             etcdctl                                       0                   b2d7589a9e67d       etcd-ocp4-cpmbw-master-0
0ad042db6a16d       abc100ec44000b4c92f4b62ba46503bfcdefcf316ec0404a14ae7b3ea9c884f7                                                         2 hours ago         Running             guard                                         0                   4ee2d84f5b0b6       etcd-guard-ocp4-cpmbw-master-0
crictl ps | grep kube-api
88f6cb2f4548d       d5f77abcb012a5585c20e4da0a56de07e3be903ae56a08a350b98bf2ad027458                                                         2 hours ago         Running             kube-apiserver-check-endpoints                0                   3fd18afe5d6b6       kube-apiserver-ocp4-cpmbw-master-0
0a333532d0f5c       d5f77abcb012a5585c20e4da0a56de07e3be903ae56a08a350b98bf2ad027458                                                         2 hours ago         Running             kube-apiserver-insecure-readyz                0                   3fd18afe5d6b6       kube-apiserver-ocp4-cpmbw-master-0
660c25c5014f5       d5f77abcb012a5585c20e4da0a56de07e3be903ae56a08a350b98bf2ad027458                                                         2 hours ago         Running             kube-apiserver-cert-regeneration-controller   0                   3fd18afe5d6b6       kube-apiserver-ocp4-cpmbw-master-0
c48a5731ff80a       d5f77abcb012a5585c20e4da0a56de07e3be903ae56a08a350b98bf2ad027458                                                         2 hours ago         Running             kube-apiserver-cert-syncer                    0                   3fd18afe5d6b6       kube-apiserver-ocp4-cpmbw-master-0
39f8040aeb83b       b3c28dd0f0a94032e38819463fee5333c5f4367c14da02c913197f1d00539357                                                         2 hours ago         Running             kube-apiserver                                0                   3fd18afe5d6b6       kube-apiserver-ocp4-cpmbw-master-0

Running oc get co fails with connection refused:

oc get co
E0918 18:09:15.304236   45203 memcache.go:265] couldn't get current server API group list: Get "https://api.ocp4.xxx.local:6443/api?timeout=32s": dial tcp xxx.xxx.xxx.xxx:6443: connect: connection refused
E0918 18:09:15.304936   45203 memcache.go:265] couldn't get current server API group list: Get "https://api.ocp4.xxx.local:6443/api?timeout=32s": dial tcp xxx.xxx.xxx.xxx:6443: connect: connection refused
E0918 18:09:15.306659   45203 memcache.go:265] couldn't get current server API group list: Get "https://api.ocp4.xxx.local:6443/api?timeout=32s": dial tcp xxx.xxx.xxx.xxx:6443: connect: connection refused
E0918 18:09:15.308383   45203 memcache.go:265] couldn't get current server API group list: Get "https://api.ocp4.xxx.local:6443/api?timeout=32s": dial tcp xxx.xxx.xxx.xxx:6443: connect: connection refused
E0918 18:09:15.309417   45203 memcache.go:265] couldn't get current server API group list: Get "https://api.ocp4.xxx.local:6443/api?timeout=32s": dial tcp xxx.xxx.xxx.xxx:6443: connect: connection refused
The connection to the server api.ocp4.xxx.local:6443 was refused - did you specify the right host or port?

I can ping the API endpoint, but telnet to port 6443 is refused:

ping api.ocp4.xxx.local
PING api.ocp4.xxx.local (xxx.xxx.xxx.xxx) 56(84) bytes of data.
64 bytes from xxx.xxx.xxx.xxx (xxx.xxx.xxx.xxx): icmp_seq=1 ttl=64 time=0.423 ms
64 bytes from xxx.xxx.xxx.xxx (xxx.xxx.xxx.xxx): icmp_seq=2 ttl=64 time=0.260 ms

telnet api.ocp4.xxx.local 6443
Trying xxx.xxx.xxx.xxx...
telnet: connect to address xxx.xxx.xxx.xxx: Connection refused
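
A successful ping together with "Connection refused" suggests the address is reachable but nothing is accepting connections on port 6443 there, which is different from a firewall silently dropping packets (that usually shows up as a timeout). The sketch below makes that distinction explicit. It assumes bash (for its /dev/tcp redirection) and coreutils `timeout` are available; `tcp_probe` is a hypothetical helper name:

```shell
# Hypothetical helper: classify a TCP connection attempt to HOST PORT.
# Relies on bash's /dev/tcp pseudo-device and coreutils `timeout`.
tcp_probe() {
  # $0 and $1 inside the bash -c script are the host and port passed below.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
  case $? in
    0)   echo "open" ;;                    # handshake completed
    124) echo "timeout" ;;                 # no answer within 3 seconds
    *)   echo "refused-or-unreachable" ;;  # connect failed immediately
  esac
}

# Example: run from the bastion and from a master, then compare.
# tcp_probe api.ocp4.xxx.local 6443
```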

When I replay the same request from the bastion, the connection is refused. From one of the masters, using the IP address directly, the TCP connection succeeds, although curl then fails certificate verification.

From the bastion:

curl -v -XGET  -H "Accept: application/json;g=apidiscovery.k8s.io;v=v2beta1;as=APIGroupDiscoveryList,application/json" -H "User-Agent: oc/4.14.0 (linux/amd64) kubernetes/286cfa5" 'https://api.ocp4.xxx.local:6443/api?timeout=32s'
Note: Unnecessary use of -X or --request, GET is already inferred.
*   Trying xxx.xxx.xxx.xxx...
* TCP_NODELAY set
* connect to xxx.xxx.xxx.xxx port 6443 failed: Connection refused
* Failed to connect to api.ocp4.xxx.local port 6443: Connection refused
* Closing connection 0
curl: (7) Failed to connect to api.ocp4.xxx.local port 6443: Connection refused

From a master node:

curl -v -XGET  -H "Accept: application/json;g=apidiscovery.k8s.io;v=v2beta1;as=APIGroupDiscoveryList,application/json" -H "User-Agent: oc/4.14.0 (linux/amd64) kubernetes/286cfa5" 'https://xxx.xxx.xxx.xxx:6443/api?timeout=32s'
Note: Unnecessary use of -X or --request, GET is already inferred.
*   Trying xxx.xxx.xxx.xxx:6443...
* Connected to xxx.xxx.xxx.xxx (xxx.xxx.xxx.xxx) port 6443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
*  CAfile: /etc/pki/tls/certs/ca-bundle.crt
* TLSv1.0 (OUT), TLS header, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS header, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS header, Finished (20):
* TLSv1.2 (IN), TLS header, Unknown (23):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.2 (IN), TLS header, Unknown (23):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.2 (IN), TLS header, Unknown (23):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS header, Unknown (21):
* TLSv1.3 (OUT), TLS alert, unknown CA (560):
* SSL certificate problem: self-signed certificate in certificate chain
* Closing connection 0
curl: (60) SSL certificate problem: self-signed certificate in certificate chain
More details here: https://curl.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
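
The two curl runs fail at different layers: exit code 7 means the TCP connection was never established, while exit code 60 means the connection and TLS handshake were reached and only certificate verification failed. A small sketch for labelling the result when repeating the check from several hosts; `curl_exit_meaning` is a hypothetical helper, and the wording of each label is my own summary of curl's documented exit codes:

```shell
# Hypothetical helper: map a curl exit code to the layer that failed.
curl_exit_meaning() {
  case "$1" in
    0)  echo "request completed" ;;
    6)  echo "could not resolve host" ;;
    7)  echo "TCP connect failed" ;;
    28) echo "timed out" ;;
    60) echo "TLS certificate not trusted (TCP connect succeeded)" ;;
    *)  echo "other curl error $1" ;;
  esac
}

# Example, using the URL from this report:
# curl -sS -o /dev/null 'https://api.ocp4.xxx.local:6443/api?timeout=32s'
# curl_exit_meaning $?
```

For a pure reachability check, curl's `-k` option skips certificate verification, so a listening API server would no longer fail with code 60.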

What you expected to happen?

The installation finishes successfully.

How to reproduce it (as minimally and precisely as possible)?

./openshift-install create cluster --log-level=debug

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 18, 2024
@openshift-bot
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 17, 2025