Actions runner controller gets stuck when pods get evicted #3840

Open
4 tasks done
PythonCoderAS opened this issue Dec 10, 2024 · 1 comment
Labels
bug (Something isn't working), gha-runner-scale-set (Related to the gha-runner-scale-set mode), needs triage (Requires review from the maintainers)

Comments

@PythonCoderAS

Checks

Controller Version

0.9.3

Deployment Method

Helm

Checks

  • This isn't a question or user support case (For Q&A and community support, go to Discussions).
  • I've read the Changelog before submitting this issue and I'm sure it's not due to any recently introduced backward-incompatible changes.

To Reproduce

1. Trigger a large number of jobs that require runners (for example, a matrix with many combinations).
2. Trigger a disk pressure taint on the node (for example, by running out of PVC space on the storage allocator).
3. Remove any other nodes from the cluster.

Describe the bug

My cluster hit a disk pressure taint and 15 pods were evicted. After the taint was removed, the evicted pods stayed in place and the controller launched no new pods, so no new jobs ran on the self-hosted runners until I manually deleted the evicted pods.
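For anyone hitting the same state, here is a minimal sketch of the manual cleanup described above. It is not part of the controller and has not been run against this cluster. It assumes `kubectl` is on the PATH and already points at the affected cluster, and it relies on the kubelet reporting evicted pods with `status.phase` set to `Failed` and `status.reason` set to `Evicted`. The function names and the namespace argument are hypothetical.

```python
import json
import subprocess


def evicted_pod_names(pod_list: dict) -> list[str]:
    """Return the names of evicted pods from a parsed `kubectl get pods -o json` result.

    Evicted pods are reported with status.phase == "Failed" and
    status.reason == "Evicted".
    """
    names = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") == "Failed" and status.get("reason") == "Evicted":
            names.append(pod["metadata"]["name"])
    return names


def delete_evicted_pods(namespace: str) -> list[str]:
    """Delete every evicted pod in `namespace` and return the deleted names.

    `namespace` is an assumption: pass the namespace the runner scale set
    is installed in.
    """
    raw = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    names = evicted_pod_names(json.loads(raw))
    if names:
        # Delete all evicted pods in a single kubectl call.
        subprocess.run(
            ["kubectl", "delete", "pod", "-n", namespace, *names],
            check=True,
        )
    return names
```

Calling `delete_evicted_pods("<runner-namespace>")` with the real namespace substituted would remove only pods that were evicted. Running or pending runner pods are left alone.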

Describe the expected behavior

The controller should either delete the evicted pods automatically or create new pods alongside them.

Additional Context

githubConfigUrl: "https://github.com/HARP-research-Inc"
githubConfigSecret:
  github_token: "<redacted>"
containerMode:
  type: "kubernetes"
  kubernetesModeWorkVolumeClaim:
    accessModes: ["ReadWriteOnce"]
    storageClassName: "local-path"
    resources:
      requests:
        storage: 2Gi
minRunners: 3

Controller Logs

https://gist.github.com/PythonCoderAS/19d8fc2f3e6a0623ca8311bede0b027e

Runner Pod Logs

https://gist.github.com/PythonCoderAS/1f39f9769e9ce8e3060f1e4791d7508d
PythonCoderAS added the bug, gha-runner-scale-set, and needs triage labels on Dec 10, 2024
Contributor

Hello! Thank you for filing an issue.

The maintainers will triage your issue shortly.

In the meantime, please take a look at the troubleshooting guide for bug reports.

If this is a feature request, please review our contribution guidelines.
