Can anyone point me in the right direction for troubleshooting this?
Akito
February 28, 2022, 4:26pm
**Rancher Server Setup**
- Rancher version: 2.6.0
- Installation option (Docker install/Helm Chart): Helm Chart
- If Helm Chart, Kubernetes Cluster and version (RKE1, RKE2, k3s, EKS, etc): RKE2 1.18.20
- Proxy/Cert Details: Using cert-manager with rancher certs
**Information about the Cluster**
- Kubernetes version: RKE2 1.18.20
- Cluster Type (Local/Downstream): Local
- If downstream, what type of cluster? (Custom/Imported or specify provider for Hosted/Infrastructure Provider):
**Describe the bug**
[ERROR] Unknown error: timeout trying to get secret for service account: cattle-system/pod-impersonation-helm-op-z5hlx
**To Reproduce**
Trying to install Monitoring from Cluster Tools page with default values/options generates the above error.
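For reference, one way to see what is happening during the install is to watch the short-lived helm-operation pods Rancher creates for it (pod names below are placeholders):

```bash
# Watch the pods Rancher spawns in cattle-system during the install
kubectl -n cattle-system get pods -w

# Once a helm-operation pod appears, inspect its logs and events
kubectl -n cattle-system logs helm-operation-xxxxx
kubectl -n cattle-system describe pod helm-operation-xxxxx
```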
**Result**
Unable to install Monitoring, failed installation
**Expected Result**
Successful installation, as in previous versions of Rancher.
**Screenshots**

SURE-4021
**Is this a request for help?** (If yes, you should use our troubleshooting guide and community support channels, see https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/.):
**Note:** Please file issues for subcomponents under the appropriate repo
| Component | Repo |
| --------- | ------------------------------------------------------------------ |
| kubectl | [kubernetes/kubectl](https://github.com/kubernetes/kubectl/issues/new) |
| kubeadm | [kubernetes/kubeadm](https://github.com/kubernetes/kubeadm/issues/new) |
**What keywords did you search in Kubernetes issues before filing this one?** (If you have found any duplicates, you should instead reply there.):
Mount failed unexisted secret serviceaccount
---
**Is this a BUG REPORT or FEATURE REQUEST?** (choose one):
BUG REPORT
Affected component: Calico node DaemonSet in self-hosted mode.
**Kubernetes version** (use `kubectl version`):
v1.6.1
**Environment**:
- **Cloud provider or hardware configuration**: x86_64
- **OS** (e.g. from /etc/os-release): DISTRIB_DESCRIPTION="Ubuntu 16.04.1 LTS"
- **Kernel** (e.g. `uname -a`): Linux 4.4.0-36-generic #55-Ubuntu SMP Thu Aug 11 18:01:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
- **Install tools**: ansible
- **Others**:
**What happened**:
I deployed a Kubernetes cluster with kube-apiserver, kube-controller-manager, kube-scheduler, and kube-proxy running as static Pods, then set up the Calico network in self-hosted mode. Everything worked fine.
But when I restart the Docker engine on my master node, the Calico node pod cannot start, because it tries to mount a default service account secret that no longer exists. I found out that kube-controller-manager re-creates the secret for each service account in each namespace after the restart, but DaemonSet pods such as the Calico node pod do not get updated to the newly created secret. That might be a race condition.
**What you expected to happen**:
All Pods start successfully after the Docker engine is restarted.
**How to reproduce it** (as minimally and precisely as possible):
Restart the Docker engine on the master node. It does not reproduce every time, which also suggests a race condition.
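A minimal reproduction attempt, assuming systemd manages Docker on the master node:

```bash
# On the master node: restart the container runtime
sudo systemctl restart docker

# Then watch whether all pods come back up cleanly
kubectl get pods --all-namespaces -w
```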
**Anything else we need to know**:
None.
Check if the secret exists and is available.
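For the error above, that check would look something like this (the service account name comes from the error message; yours will differ):

```bash
# Does the service account from the error still exist?
kubectl -n cattle-system get serviceaccount pod-impersonation-helm-op-z5hlx

# Is there a token secret for it? (auto-created on clusters before v1.24)
kubectl -n cattle-system get secrets | grep pod-impersonation-helm-op
```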
Thank you. Got that fixed.
Akito
March 1, 2022, 11:42am
I’m sure it would be helpful for other people having the same issue if you could explain your solution.
We had Istio running in the cluster; removing it fixed the issue. That is probably because we hadn’t configured any Istio VirtualServices. We also saw a condition with the helm-operation pods where they would time out almost immediately while the Istio sidecar was still booting up.
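If removing Istio outright is not an option, a possible workaround (untested here, based on standard Istio sidecar-injection behavior) is to exclude cattle-system from injection so the helm-operation pods start without a sidecar:

```bash
# Disable automatic sidecar injection for the cattle-system namespace
kubectl label namespace cattle-system istio-injection=disabled --overwrite
```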