Actions Job on a self-hosted Runners fails with "Interrupt" and "POST request - HTTP Status: BadRequest" #3856

Open
4 tasks done
arseny-zinchenko opened this issue Dec 18, 2024 · 1 comment
Labels: bug, gha-runner-scale-set, needs triage

Checks

Controller Version

0.9.3

Deployment Method

Helm

Checks

  • This isn't a question or user support case (For Q&A and community support, go to Discussions).
  • I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes

To Reproduce

The issue occurs randomly, so a specific reproduction method has not been identified.

Describe the bug

A GitHub Actions job fails intermittently with the following error:

KeyboardInterrupt
make: *** [Makefile:24: activate-venv-nexus] Interrupt
Error: Process completed with exit code 130.
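
For context, exit code 130 follows the common shell convention of 128 plus the signal number, so it corresponds to signal 2 (SIGINT), which matches the `KeyboardInterrupt` and `Interrupt` messages above. The snippet below is a minimal sketch of that decoding, assuming a POSIX system; the helper name `describe_exit_code` is hypothetical and not part of any runner tooling.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Map a shell-style exit code to a signal name using the 128+N convention."""
    if code > 128:
        try:
            # Codes above 128 conventionally mean "terminated by signal (code - 128)".
            return signal.Signals(code - 128).name
        except ValueError:
            return f"unknown signal {code - 128}"
    return "no signal (plain exit status)"


print(describe_exit_code(130))  # SIGINT
print(describe_exit_code(137))  # SIGKILL
```

An OOM kill usually surfaces as 137 (SIGKILL), so 130 suggests the process received an interrupt from somewhere else rather than being killed for memory.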

The corresponding Pod's CPU and memory usage stays below its Kubernetes limits, and the OOM killer was not triggered.
The Persistent Volume is not full either.

The runner Pod's logs contain the following error:

[RUNNER 2024-12-18 12:39:52Z ERR  GitHubActionsService] POST request to https://pipelinesghubeus6.actions.githubusercontent.com/***/_apis/oauth2/token failed. HTTP Status: BadRequest

The controller logs in the github-controller namespace show:

EphemeralRunner	Runner does not exist in GitHub service	{"version": "0.9.3", "ephemeralrunner": {"name":"kraken-eks-runners-hjn6j-runner-cv6qd","namespace":"ops-github-runners-ns"}, "runnerId": 4103}
2024-12-18 14:39:53.532	
EphemeralRunner	Checking if runner exists in GitHub service	{"version": "0.9.3", "ephemeralrunner": {"name":"kraken-eks-runners-hjn6j-runner-cv6qd","namespace":"ops-github-runners-ns"}, "runnerId": 4103}
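
To line these controller entries up with a specific runner, the JSON payload at the end of each line can be parsed and grouped by runner name. This is a minimal sketch, assuming the fields are tab-separated as in the excerpt above and that the payload always starts at the first `{`. The function names are hypothetical helpers, not part of the controller.

```python
import json
from collections import defaultdict

# Sample lines copied from the controller log excerpt above.
SAMPLE_LINES = [
    'EphemeralRunner\tRunner does not exist in GitHub service\t'
    '{"version": "0.9.3", "ephemeralrunner": {"name":"kraken-eks-runners-hjn6j-runner-cv6qd",'
    '"namespace":"ops-github-runners-ns"}, "runnerId": 4103}',
    '2024-12-18 14:39:53.532',
    'EphemeralRunner\tChecking if runner exists in GitHub service\t'
    '{"version": "0.9.3", "ephemeralrunner": {"name":"kraken-eks-runners-hjn6j-runner-cv6qd",'
    '"namespace":"ops-github-runners-ns"}, "runnerId": 4103}',
]


def parse_controller_line(line):
    """Return (message, fields) for a log line, or None if it has no JSON payload."""
    start = line.find("{")
    if start == -1:
        return None
    try:
        fields = json.loads(line[start:])
    except json.JSONDecodeError:
        return None
    # Assumption: the message is the last non-empty tab-separated field before the payload.
    parts = [p.strip() for p in line[:start].split("\t") if p.strip()]
    message = parts[-1] if parts else ""
    return message, fields


def messages_by_runner(lines):
    """Group (runnerId, message) pairs by ephemeral runner name."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_controller_line(line)
        if parsed is None:
            continue  # skip bare timestamps and other non-JSON lines
        message, fields = parsed
        name = fields.get("ephemeralrunner", {}).get("name", "<unknown>")
        grouped[name].append((fields.get("runnerId"), message))
    return dict(grouped)


for runner, events in messages_by_runner(SAMPLE_LINES).items():
    print(runner, events)
```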

Re-running the job usually helps, but it sometimes has to be restarted several times.

Describe the expected behavior

The job completes without errors.

Additional Context

containerMode:
  type: "dind"

template:
  spec:
    initContainers:
    - name: kube-init
      image: ghcr.io/actions/actions-runner:latest
      command: ["sudo", "chown", "-R", "1001:123", "/home/runner/_work"]
      volumeMounts:
        - name: work
          mountPath: /home/runner/_work  
    containers:
      - name: dind
        image: 492***148.dkr.ecr.us-east-1.amazonaws.com/github-runners/docker-dind:latest
        args:
          - dockerd
          - --host=unix:///var/run/docker.sock
          - --group=$(DOCKER_GROUP_GID)
        env:
          - name: DOCKER_GROUP_GID
            value: "123"        
      - name: runner
        image: 492***148.dkr.ecr.us-east-1.amazonaws.com/github-runners/kraken:0.14
        command: ["/home/runner/run.sh"]
        env:
        - name: RUNNER_EKS
          value: "true"
        securityContext:
          capabilities:
            add: ["SYS_PTRACE"]
        allowPrivilegeEscalation: true
        resources:
          requests:
            cpu: 2
            memory: 4Gi
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app.kubernetes.io/name: kraken-eks-runners
    volumes:
      - name: work
        ephemeral:
          volumeClaimTemplate:
            spec:
              accessModes: [ "ReadWriteOnce" ]
              storageClassName: "gp3-iops"
              resources:
                requests:
                  storage: 40Gi

The 492***148.dkr.ecr.us-east-1.amazonaws.com/github-runners/kraken:0.14 Docker image is built from the latest runner image:

FROM ghcr.io/actions/actions-runner:2.321.0


Controller Logs

https://gist.github.com/arseny-zinchenko/5aacf7174840ba3d4e63287f749fcb4e

Runner Pod Logs

https://gist.github.com/arseny-zinchenko/06e6e99f3b0884d60370e3d67d78af85

Hello! Thank you for filing an issue.

The maintainers will triage your issue shortly.

In the meantime, please take a look at the troubleshooting guide for bug reports.

If this is a feature request, please review our contribution guidelines.
