
Troubleshooting

Kubernetes-based debugging

When the Opstrace instance or parts of it appear to be unhealthy, debugging should start with gaining insight into the underlying Kubernetes cluster and its deployments.

Connect kubectl to your Opstrace instance

Make sure to use the same AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY that were used for creating the Opstrace instance. Then run:

aws eks update-kubeconfig --name <clustername> --region <awsregion>

For example:

aws eks update-kubeconfig --name testcluster --region us-west-2
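To verify that kubectl now points at the correct cluster, a quick sanity check could look like this (the output will vary with your instance's node count):

kubectl config current-context
kubectl get nodes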

Get an overview of all container states

A good starting point is an overview of all Kubernetes deployments and individual container states, which can be obtained with the following command:

kubectl describe all --all-namespaces > describe_all.out

This reveals, for example, when a certain container is in a crash loop, or when it never started in the first place because of an error while pulling the corresponding container image.
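To quickly narrow things down to problematic pods before digging through describe_all.out, it can help to list all pods and filter out the healthy ones, for example:

kubectl get pods --all-namespaces | grep -viE 'running|completed'

Any pod remaining in that output (for example in CrashLoopBackOff, ImagePullBackOff, or Pending state) is a good candidate for a closer look with kubectl describe pod.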

Fetch Opstrace controller logs

When the Opstrace controller (a deployment running in the Opstrace instance) is suspected to have run into a problem, it is important to fetch and inspect its logs:

kubectl logs deployment/opstrace-controller \
--all-containers=true --namespace=kube-system > controller.log
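To watch the controller logs live while reproducing a problem, the same command can be used with --follow:

kubectl logs deployment/opstrace-controller \
--all-containers=true --namespace=kube-system --follow

In the saved controller.log, lines logged at warning or error level are usually the most telling starting point, for example via grep -iE 'error|warn' controller.log.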