Investigate Windows agent crashes due to invalid rendering of a Helm value, and RBAC for MetricsConfiguration when the operator is enabled for Windows using Hubble #1122

Open
BeegiiK opened this issue Dec 11, 2024 · 0 comments
BeegiiK commented Dec 11, 2024

Describe the bug
The Windows agent crashes because a Helm value is rendered incorrectly. After correcting that value, I noticed that the Windows agent still crashes with timeouts when the operator is enabled in the Windows ConfigMap, following the Hubble Helm values. The log below shows the agent's service account being denied permission to list MetricsConfiguration resources.

Also, the kubeconfig file cannot be found when the legacy path is used.

ts=2024-12-10T16:58:48.634Z level=info caller=hnsstats/hnsstats_windows.go:212 msg="Start hnsstats plugin..."
W1210 16:58:49.990792    7108 reflector.go:547] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:232: failed to list *v1alpha1.MetricsConfiguration: metricsconfigurations.retina.sh is forbidden: User "system:serviceaccount:kube-system:retina-agent" cannot list resource "metricsconfigurations" in API group "retina.sh" at the cluster scope
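
The denial in the log line above can be checked directly against the cluster. The following is a minimal sketch, assuming `kubectl` is configured for the affected cluster and that the caller is allowed to impersonate service accounts (required for `--as`). The service account, namespace, and resource names are copied from the log line; the `check_rbac` helper name is hypothetical.

```shell
#!/bin/sh
# Identity and resource taken from the "forbidden" log line above.
SA_NAMESPACE="kube-system"
SA_NAME="retina-agent"
RESOURCE="metricsconfigurations.retina.sh"
AS_USER="system:serviceaccount:${SA_NAMESPACE}:${SA_NAME}"

# Hypothetical helper: ask the API server whether the agent's service
# account may list and watch the resource across all namespaces.
# kubectl prints "yes" or "no" for each verb.
check_rbac() {
    for verb in list watch; do
        printf '%s %s: ' "$verb" "$RESOURCE"
        kubectl auth can-i "$verb" "$RESOURCE" --as="$AS_USER" --all-namespaces
    done
}

# Only query the cluster when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
    check_rbac || true
fi
```

If `list` reports `no`, the agent's ClusterRole is likely missing a rule for `metricsconfigurations` in the `retina.sh` API group, which would match the reflector error.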

To Reproduce
Install Retina on an AKS cluster using the Hubble Helm chart:

helm upgrade --install retina ./deploy/hubble/manifests/controller/helm/retina/ \
        --namespace kube-system \
        --set os.windows=true \
        --set operator.enabled=true \
        --set operator.repository=ghcr.io/beegiik/retina/retina-operator \
        --set operator.tag=v0.0.20-14-gc46b678 \
        --set agent.enabled=true \
        --set agent.repository=ghcr.io/beegiik/retina/retina-agent \
        --set agent.tag=v0.0.20-14-gc46b678 \
        --set agent.init.enabled=true \
        --set agent.init.repository=ghcr.io/beegiik/retina/retina-init \
        --set agent.init.tag=v0.0.20-14-gc46b678 \
        --set logLevel=info \
        --set hubble.tls.enabled=true \
        --set hubble.relay.tls.server.enabled=true \
        --set hubble.tls.auto.enabled=true \
        --set hubble.tls.auto.method=cronJob \
        --set hubble.tls.auto.certValidityDuration=1 \
        --set hubble.tls.auto.schedule="*/10 * * * *"   

The Windows agent then crashes with the error shown above.

Expected behavior
All agents, including the Windows agent, should be stable and running.

Platform (please complete the following information):

  • OS: Windows
  • Kubernetes Version:
  • Host: AKS
  • Retina Version: v0.0.20
