Access to the path '/proc/<ID>/oom_score_adj' is denied #3132
Comments
Hello! Thank you for filing an issue. The maintainers will triage your issue shortly. In the meantime, please take a look at the troubleshooting guide for bug reports. If this is a feature request, please review our contribution guidelines.
Hey @romanvogman, this issue is related to the runner. However, can you please confirm that the job executes without issues? I know the runner raises this exception, but it usually does not influence the execution of the job. Is this exception affecting your job, or are you reporting that the runner throws the exception?
Hi @nikola-jokic!
Oh, from this report, it is definitely not causing the job to fail. The output that you provided showed that the hook execution failed. We should include better error reporting in the hook. The It is possible that node pressure is causing this kind of issue. The job pod needs to land on the runner node, so that may be causing issues with the hook implementation. I will close this issue here, since it is not ARC related, but feel free to comment on it!
Hi @romanvogman, we have encountered the same issue as you described (using our GKE cluster for ARC). Just wondering: have you managed to solve it?
Same here, GKE + ARC.
@nikola-jokic wrote:
@nikola-jokic Where is the right place to open this issue so it gets addressed? I remain confused about where the source code is for the runners used for Runner Controller Sets and where to open issues about them. This is still happening in version 0.8.2
Hey, just to clarify: the "access to the path is denied" issue should not influence the workings of the runner at all. It is just an annoying exception that the runner throws. If you want to report it, you can create an issue in the runner repo. As for error reporting in the hook, we are hoping to publish a new 0.5.1 release soon and re-publish the image. That can help troubleshoot the hook setup. However, the
@nikola-jokic wrote:
See, this is what I'm talking about with regard to confusion. That repo (
@nikola-jokic Could you advise how to troubleshoot the hook problems? As described above, we are both running in GKE (standard, not Autopilot). Regular GitHub jobs work fine; the problem is with the containerized ones and kubernetes mode in ARC. The runner pod should start a second workflow pod for the container, but this is not happening. I can see inside the runner pod that the hook process is running, but I do not see any relevant logs, even when I tried to provide
Hey @zerola, of course. Currently, debugging the hook is almost impossible, since the information about the error is hidden in the exception and not logged anywhere. This has been changed starting with the 0.5.0 release, but that release introduced a bug on Alpine containers, so we ended up rolling back the hook version added to runner version 2.312.0. We have a PR ready that should be released, but for now you would have to build your own hook and provide it to the runner. If you decide to go with that approach, please use the branch where this PR is, or, if you don't use Alpine containers in your workflow, you can probably safely use the 0.5.0 release.
Hi @nikola-jokic, thanks for the instructions. I have built my own runner image based on your https://github.com/actions/runner/blob/main/images/Dockerfile and provided the
Can you please turn on debugging and see the output in the workflow?
Could you please advise how to turn on debugging? I found only
Does the step output that can be seen in the UI show the reason for the failure? Based on this issue, it does seem to help, so I'm trying to understand how we are missing the HTTP response log on the latest 0.5.0 version.
No, the output in the UI is still this:
Have you guys managed to find a solution for this? I'm running into the same problem.
I'm not able to work out what the problem is. I'm getting the same logs as @zerola.
@caiocsgomes Unfortunately no. In the end we decided to use Docker-in-Docker mode for containerized workflows, and that works. In any case, I will keep an eye on this PR in case someone manages to solve it.
So, is this related to / caused by this upstream issue?
Hey @zerola, sorry for the late reply. As @nikola-jokic mentioned, the issue wasn't related to From our side, we were running containerized tasks, which required ARC to run in dind mode. After changing to dind (with a few other unrelated fixes), the issue was resolved. Hope this helps anyone who encounters this issue.
Hi, can someone definitively confirm that kubernetes mode does not support containerized tasks?
We should probably report the error better on the hook side. There is definitely room to improve. @remidebette, I'm sorry, I don't understand: what do you mean by containerized task? Are you referring to the container step?
Hi @nikola-jokic, we have been trying a "vanilla" install of the scaleset helm chart in kubernetes mode, switched our CI jobs to containers and are encountering the issue that I see in several tickets:
In my understanding, this script is not stable, and people in the discussions online have issues with it and switch back to dind instead. For example What is specific to us is that we are using an on-premises Rancher cluster, the PVC storage class is ceph-rbd, and the helm charts are installed with Flux.
The script should be fine, but the error reported does not give you any clue about what is going on. If you can, please let me know whether the workflow pod is created but something is incorrect there. If there is an example workflow I can run to see what is going on, that would also be helpful. One thing to note: if you are using private images, the container hook will not inherit the pull policy of the runner pod.
@nikola-jokic Hi, this error comes from the runner. I'm not sure why.
Hi, I've encountered the same error while trying to run ARC in kubernetes mode and was following the same guide as @romanvogman.
  containerMode:
    type: "kubernetes"  ## type can be set to dind or kubernetes
    ## the following is required when containerMode.type=kubernetes
    kubernetesModeWorkVolumeClaim:
      accessModes: ["ReadWriteOnce"]
      # For local testing, use https://github.com/openebs/dynamic-localpv-provisioner/blob/develop/docs/quickstart.md to provide dynamic provision volume with storageClassName: openebs-hostpath
      storageClassName: "openebs-hostpath"
      resources:
        requests:
          storage: 1Gi
+   kubernetesModeServiceAccount:
+     annotations:
This wasn't part of the video that we both followed. I see that this already existed in 0.7.0. So this change was introduced somewhere between
Checks
Controller Version
0.7.0
Deployment Method
Helm
Checks
To Reproduce
Describe the bug
Trying to set up a self-hosted runner in kubernetes mode that will replace a self-hosted runner which currently runs on a dedicated VM.
When running a GitHub action that runs a Jenkins action, I'm seeing a permissions error:
Access to the path '/proc/<ID>/oom_score_adj' is denied
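For context on why an error like this can be harmless: on Linux, lowering a process's oom_score_adj requires the CAP_SYS_RESOURCE capability, which an unprivileged container typically does not have. The following is a minimal Python sketch of a best-effort write that downgrades the failure to a warning. It is not the runner's actual code (the runner is a .NET application), and the function name `try_set_oom_score_adj` and its `proc_root` parameter are hypothetical, introduced only for illustration.

```python
from pathlib import Path


def try_set_oom_score_adj(pid: int, value: int, proc_root: str = "/proc") -> bool:
    """Best-effort write of oom_score_adj for a process.

    Returns True on success and False if the write is denied or the
    process directory does not exist. `proc_root` is a hypothetical
    parameter that exists only so the sketch can be tried against a
    temporary directory instead of the real /proc.
    """
    # The kernel accepts values in the range -1000..1000.
    if not -1000 <= value <= 1000:
        raise ValueError("oom_score_adj must be between -1000 and 1000")

    target = Path(proc_root) / str(pid) / "oom_score_adj"
    try:
        target.write_text(str(value))
        return True
    except (PermissionError, FileNotFoundError) as exc:
        # Treat the failure as non-fatal: the caller can continue its work.
        print(f"warning: could not adjust {target}: {exc}")
        return False
```

A caller that treats a False return as non-fatal carries on regardless of the outcome, which matches the behavior described in this thread, where the job itself is not affected by the exception.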
Describe the expected behavior
Expect it to run in a Kubernetes cluster the same way it runs on a self-hosted runner in a dedicated VM.
Additional Context
Followed the following video to set it up
Controller Logs
Runner Pod Logs