fix(healthcheck): Update Hubble Readiness & Liveliness probes #1048
deploy/hubble/manifests/controller/helm/retina/templates/agent/daemonset.yaml
- hubble
- status
initialDelaySeconds: 30
periodSeconds: 30
How long does the hubble status command take to respond at scale? If the command is generally quick and fits within the 30s window, perfect. But if a healthy response frequently takes longer than 30s, restarting the pod may do more harm than good.
Going to run some scale tests. TBC.
The status should be fairly static with respect to scale. Its upper limit should be the ring buffer size, which is fixed.
Also, some adjustments are needed to run scale tests for Hubble at the moment.
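One detail bears on the timing question in this thread: `periodSeconds` only sets how often the kubelet runs the check, while the time a single `hubble status` call may take is governed by `timeoutSeconds`, which defaults to 1 second in Kubernetes. The fragment below is a sketch of making both limits explicit. The `timeoutSeconds` value of 10 is an illustrative assumption, not a measured figure or a value from this PR.

```yaml
# Sketch only: timing fields for the exec probe discussed above.
livenessProbe:
  exec:
    command:
      - hubble
      - status
  initialDelaySeconds: 30
  periodSeconds: 30      # from the diff: run the check every 30s
  timeoutSeconds: 10     # assumption: illustrative value; the Kubernetes default is 1
  failureThreshold: 3    # Kubernetes default: act after 3 consecutive failures
```

With values like these, the container is restarted only after three consecutive failed or timed-out checks, which leaves some margin for a slow but healthy agent.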
Do any conditions exist with Hubble status where the exit code is 0 but the retina process itself is in a bad state?
Following the code of
If retina is deadlocked, how does the current approach determine this?
Description
Update the readiness probe to use the hubble status command
Add a liveness probe that uses the hubble status command
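As a rough sketch of the two changes listed above, the probes on the agent container might look like the following. The `exec` command and the 30-second values come from the diff excerpt reviewed in this PR. Applying the same values to both probes is an assumption, and this is not a copy of the merged manifest.

```yaml
# Sketch only: container-level probe fields, assuming both probes share
# the command and timings shown in the reviewed diff.
readinessProbe:
  exec:
    command:
      - hubble
      - status
  initialDelaySeconds: 30
  periodSeconds: 30
livenessProbe:
  exec:
    command:
      - hubble
      - status
  initialDelaySeconds: 30
  periodSeconds: 30
```

The two probes have different consequences: a failing readiness probe marks the pod as not ready, while a failing liveness probe causes the kubelet to restart the container.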
Related Issue
#1047
Checklist
git commit -S -s ...
). See this documentation on signing commits.
Screenshots (if applicable) or Testing Completed
Checks function as expected. Tested by using an invalid TCP address for the gRPC connection made by hubble status; the containers were recycled as expected.
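One way to reproduce this kind of failure test, offered as an assumption about the setup rather than the author's exact steps, is to point the probe's `hubble status` call at an address where nothing is listening. The sketch below uses the Hubble CLI's `--server` flag; the address `127.0.0.1:1` is a hypothetical placeholder.

```yaml
# Temporary, test-only override (assumed setup); revert after verifying restarts.
livenessProbe:
  exec:
    command:
      - hubble
      - status
      - --server
      - 127.0.0.1:1   # hypothetical address with no Hubble server listening
  initialDelaySeconds: 30
  periodSeconds: 30
```

With an override like this, every check should fail, and the container's restart count in `kubectl get pods` should rise once the failure threshold is reached.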
Additional Notes