-
Notifications
You must be signed in to change notification settings - Fork 241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: let Windows CNS use the InClusterConfig #3248
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 2 out of 3 changed files in this pull request and generated no suggestions.
Files not reviewed (1)
- cns/Dockerfile: Language not supported
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no powershell.exe needed I think
@@ -32,8 +32,6 @@ EXPOSE 10090 | |||
|
|||
# skopeo inspect docker://mcr.microsoft.com/oss/kubernetes/windows-host-process-containers-base-image:v1.0.0 --format "{{.Name}}@{{.Digest}}" | |||
FROM mcr.microsoft.com/oss/kubernetes/windows-host-process-containers-base-image@sha256:b4c9637e032f667c52d1eccfa31ad8c63f1b035e8639f3f48a510536bf34032b as windows | |||
COPY --from=builder /azure-container-networking/cns/kubeconfigtemplate.yaml kubeconfigtemplate.yaml | |||
COPY --from=builder /azure-container-networking/npm/examples/windows/setkubeconfigpath.ps1 setkubeconfigpath.ps1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we may want to leave these around in the dockerfile for a bit in case the release of cns which has this removed gets ahead of the overlaymgr release that will remove them from the manifest (or overlaymgr rolls back)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we won't be releasing CNS urgently, the production impact can be mitigated entirely through a daemonset change in AKS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a need to keep cns/kubeconfigtemplate.yaml
and npm/examples/windows/setkubeconfigpath.ps1
in the repo any longer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
likely no but they are referenced in the NPM build, and we need to make sure nothing we're building out of this branch can land on k8s <1.28
/azp run Azure Container Networking PR |
Azure Pipelines successfully started running 1 pipeline(s). |
# Description As of [AKS 1.30](https://github.com/Azure/AKS/releases/tag/2024-06-09), service account tokens refresh every ~1 hour when OIDC is enabled. They were previously valid for a year. [This setkubeconfigpath.ps1 script](https://github.com/Azure/azure-container-networking/blob/47b243c42fd16119a96ab6d06eb602ac2ce40e7d/npm/examples/windows/setkubeconfigpath.ps1) was at some point necessary on Windows to create a valid kubeconfig for Retina WIndows. It copies the token from the token file to create a custom kubeconfig from a template at startup and then we pass Retina Windows that file via --kubeconfig. The script runs at startup and never re-runs, so the token that exists at Pod start is the token CNS will try to use forever. Yearly token lifespans were long enough that no Retina Windows Pod was ever up long enough to hit token expiration. This becomes an issue with hourly token lifespans. An hour after Pod start, the token becomes invalid and Retina Windows can no longer auth to the API server. For PodSubnet clusters, this permanently prevents Retina Windows from being able to scale the IPAM pool and provide more Pod IPs. Fix for CNS which referenced: Azure/azure-container-networking#3248 ## Related Issue If this pull request is related to any issue, please mention it here. Additionally, make sure that the issue is assigned to you before submitting this pull request. ## Checklist - [x] I have read the [contributing documentation](https://retina.sh/docs/contributing). - [x] I signed and signed-off the commits (`git commit -S -s ...`). See [this documentation](https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification) on signing commits. - [x] I have correctly attributed the author(s) of the code. - [x] I have tested the changes locally. - [x] I have followed the project's style guidelines. - [ ] I have updated the documentation, if necessary. - [ ] I have added tests, if applicable. ## Screenshots (if applicable) or Testing Completed Please add any relevant screenshots or GIFs to showcase the changes made. ## Additional Notes Add any additional notes or context about the pull request here. --- Please refer to the [CONTRIBUTING.md](../CONTRIBUTING.md) file for more information on how to contribute to this project.
# Description As of [AKS 1.30](https://github.com/Azure/AKS/releases/tag/2024-06-09), service account tokens refresh every ~1 hour when OIDC is enabled. They were previously valid for a year. [This setkubeconfigpath.ps1 script](https://github.com/Azure/azure-container-networking/blob/47b243c42fd16119a96ab6d06eb602ac2ce40e7d/npm/examples/windows/setkubeconfigpath.ps1) was at some point necessary on Windows to create a valid kubeconfig for Retina WIndows. It copies the token from the token file to create a custom kubeconfig from a template at startup and then we pass Retina Windows that file via --kubeconfig. The script runs at startup and never re-runs, so the token that exists at Pod start is the token CNS will try to use forever. Yearly token lifespans were long enough that no Retina Windows Pod was ever up long enough to hit token expiration. This becomes an issue with hourly token lifespans. An hour after Pod start, the token becomes invalid and Retina Windows can no longer auth to the API server. For PodSubnet clusters, this permanently prevents Retina Windows from being able to scale the IPAM pool and provide more Pod IPs. Fix for CNS which referenced: Azure/azure-container-networking#3248 ## Related Issue If this pull request is related to any issue, please mention it here. Additionally, make sure that the issue is assigned to you before submitting this pull request. ## Checklist - [x] I have read the [contributing documentation](https://retina.sh/docs/contributing). - [x] I signed and signed-off the commits (`git commit -S -s ...`). See [this documentation](https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification) on signing commits. - [x] I have correctly attributed the author(s) of the code. - [x] I have tested the changes locally. - [x] I have followed the project's style guidelines. - [ ] I have updated the documentation, if necessary. - [ ] I have added tests, if applicable. ## Screenshots (if applicable) or Testing Completed Please add any relevant screenshots or GIFs to showcase the changes made. ## Additional Notes Add any additional notes or context about the pull request here. --- Please refer to the [CONTRIBUTING.md](../CONTRIBUTING.md) file for more information on how to contribute to this project.
Signed-off-by: Evan Baker <[email protected]>
Signed-off-by: Evan Baker <[email protected]>
765c550
to
c5b0288
Compare
/azp run Azure Container Networking PR |
Azure Pipelines successfully started running 1 pipeline(s). |
Reason for Change:
As of AKS 1.30, service account tokens refresh every ~1 hour when OIDC is enabled. They were previously valid for a year.
This
setkubeconfigpath.ps1
script was at some point necessary on Windows to create a valid kubeconfig for CNS (and NPM and others), but based on my testing we don't need it today. It copies the token from the token file to create a custom kubeconfig from a template at startup and then we pass CNS that file via--kubeconfig
.The script runs at startup and never re-runs, so the token that exists at Pod start is the token CNS will try to use forever. Yearly token lifespans were long enough that no CNS Pod was ever up long enough to hit token expiration.
This becomes an issue with hourly token lifespans. An hour after Pod start, the token becomes invalid and CNS can no longer auth to the API server. For PodSubnet clusters, this permanently prevents CNS from being able to scale the IPAM pool and provide more Pod IPs.
Issue Fixed:
Azure/AKS#4679
Requirements:
Notes: