kubelet: volumeManager.WaitForUnmount does not wait for emptydir to be unmounted successfully #113563
Comments
/sig storage /cc @msau42 |
/kind bug |
/triage accepted doesn't look like a regression /cc @xmcqueen |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs.
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/remove-lifecycle stale |
This issue has not been updated in over 1 year, and should be re-triaged.
For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/ /remove-triage accepted |
/lifecycle stale |
/lifecycle rotten |
/remove lifecycle-rotten |
Another one for kubernetes/test-infra#32957. This issue looks like it may be tricky to solve, but it remains a source of flakes and should stay tracked. |
/triage accepted |
@msau42 Would you be willing to see if this issue is still relevant? |
Maybe #125070 will resolve this? |
What happened?
Kubelet has an internal `syncTerminatedPod` function, which is called after pods are terminated. The function performs some final pod cleanup and is responsible for ensuring that volumes mounted to the pod are unmounted. The function calls `volumeManager.WaitForUnmount`:
kubernetes/pkg/kubelet/kubelet.go
Lines 1881 to 1885 in 7d9c0e0
As part of doing some testing for a different issue, I came across a problem with how emptydir unmounting is handled: it looks like `volumeManager.WaitForUnmount` will report success even if the emptydir was not unmounted successfully.
Chatted with @msau42 about this issue and it seems this is because, during `WaitForUnmount`, it is checking for mounted state:
kubernetes/pkg/kubelet/volumemanager/cache/actual_state_of_world.go
Line 965 in 7d9c0e0
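A minimal Go sketch of why a check that only looks for the "mounted" state can report completion too early. This is a simplified model, not the kubelet code: the types, function names, and the volume name `cache-volume` are hypothetical, and the only behavior assumed is the one described in this report, namely that a failed unmount moves a volume from "mounted" to "uncertain".

```go
package main

import "fmt"

// mountState is a simplified, hypothetical stand-in for the per-pod volume state.
type mountState string

const (
	stateMounted   mountState = "mounted"
	stateUncertain mountState = "uncertain"
)

// actualState maps a volume name to its state for a single pod.
type actualState map[string]mountState

// mountedVolumes returns only the volumes whose state is exactly "mounted".
func (a actualState) mountedVolumes() []string {
	var out []string
	for name, st := range a {
		if st == stateMounted {
			out = append(out, name)
		}
	}
	return out
}

// unmountDone reports whether a wait that only inspects the "mounted"
// state would consider all volumes unmounted.
func (a actualState) unmountDone() bool {
	return len(a.mountedVolumes()) == 0
}

func main() {
	state := actualState{"cache-volume": stateMounted}
	fmt.Println(state.unmountDone()) // false: the volume is still mounted

	// Assumption from the report: a failed unmount marks the volume "uncertain".
	state["cache-volume"] = stateUncertain
	fmt.Println(state.unmountDone()) // true: the mounted-only check no longer sees it
}
```

In this model the second check passes even though nothing was actually unmounted, which matches the symptom reported here.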
However, if there is an error during unmounting, the volume is marked as "uncertain" (
kubernetes/pkg/volume/util/operationexecutor/operation_generator.go
Line 879 in 7d9c0e0
), which results in `WaitForUnmount` succeeding despite an error during unmounting. It's unclear if this is expected behavior.
What did you expect to happen?
I expected that `volumeManager.WaitForUnmount` would block (or return an error) if the emptydir had an error unmounting.
How can we reproduce it (as minimally and precisely as possible)?
Create the following pod:
Get the pod uid:
Enter the kind-worker (node) and run `chattr +i` on the emptydir. This makes the emptydir volume immutable and prevents it from being unmounted (and deleted).
Now delete the pod:
Here are the logs:
Kubelet logs - https://gist.github.com/4f9b1fa0edda8d260c90a1f18c9dc6e5
Kubelet logs for test-pd pod: https://gist.github.com/d63b8713b71bef1e712ca138fcb5d602
The notable logs:
Termination (`syncTerminatingPod`):
But `syncTerminatedPod` succeeded (despite the volume not actually unmounting). This is because `WaitForUnmount` succeeded incorrectly.
Volume continues to try to be unmounted later
Anything else we need to know?
No response
Kubernetes version
1.25.2
Cloud provider
n/a
OS version
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)