Add an option for ephemeral runners to be deleted if they fail to fetch a job #109

Open
opened 2026-04-17 13:03:29 +00:00 by BtbN · 1 comment

How to use this feature request

  • Please describe your first hand experience in a comment to show why you are interested in the same feature.
  • Please don't comment if you have no relevant information to add. It's just extra noise for everyone subscribed to this issue.
  • Subscribe to receive notifications on status change and new comments.

First hand experience

The environment I run our Windows/OSX runners in does not allow me to use --wait: it bills per minute of VM uptime, and its TOS forbids keeping VMs blocked longer than necessary.
So my ephemeral runners in that environment simply exit if they find no job to run.
But this leaves them registered with the Forgejo instance indefinitely. Since the VM that ran them and held their token is long gone, they will never be able to fetch and run a job.

Needs and benefits

Ideally, a runner that fails to find a job to run should be deleted just as if it had run one.

Feature Description

There should be some option to either have an ephemeral runner able to self-delete with its token, or flag it to be auto-deleted when failing to fetch a task.
Since I am currently still stuck using the old "./runner register" method, any such flag could also hopefully be added to it.

What needs to happen before a feature request is ready to be implemented?

Users can complete the first step (accumulating first hand experience) on their own, even if this feature request did not catch the eye of someone with the necessary skills to implement it. And when it reaches that point, it will stand out and have a much higher chance of being implemented.

  1. A few other users contributed their own first hand experience.
    To fully grasp the scope of a feature request, and to brainstorm possible solutions, a feature request will generally wait until several users have provided their perspective.
    Thumbs-up reactions help gauge popularity, but do not provide the same amount of useful information.
  2. The "Needs and benefits" and "Feature Description" are finalized.
    Results from discussions and additional user experiences are incorporated into a final summary to provide a single reference for the developers working on this change.
    This can be done by the author of the issue or anyone else in a followup comment.
  3. The label Stage/Idea is changed to Stage/Ready.
  4. Feature request is created in the repository where the code resides.
    Depending on the feature request it can be in Forgejo or Forgejo runner.
    A copy/paste of the "Needs and benefits" and "Feature Description" should be used, with a link to this issue so the developer knows where to find more details if they need to.
Member

Very rough sketch:

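type EphemeralMode int

const (
    Wait   EphemeralMode = iota // default: remove the runner after it completes its task
    NoWait                      // also remove the runner if its single fetch returns no task
)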

The mode could be included when registering the runner: forgejo-runner register --mode NoWait ... Same for the HTTP API.
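
To make the registration idea a bit more concrete, here is a minimal sketch of parsing such a mode from the command line. It uses Go's standard `flag` package purely for illustration and is not how forgejo-runner parses its arguments. The `--mode` flag is only the proposal from this comment, and `parseMode` is a hypothetical helper.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// EphemeralMode is the proposed (not yet existing) registration mode.
type EphemeralMode int

const (
	Wait   EphemeralMode = iota // proposed default
	NoWait                      // proposed new mode
)

// String and Set together implement flag.Value.
func (m EphemeralMode) String() string {
	if m == NoWait {
		return "NoWait"
	}
	return "Wait"
}

func (m *EphemeralMode) Set(s string) error {
	switch s {
	case "Wait":
		*m = Wait
	case "NoWait":
		*m = NoWait
	default:
		return fmt.Errorf("unknown ephemeral mode %q", s)
	}
	return nil
}

// parseMode is a hypothetical helper: it reads a --mode flag from args
// and falls back to Wait when the flag is absent.
func parseMode(args []string) (EphemeralMode, error) {
	mode := Wait
	fs := flag.NewFlagSet("register", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text off stderr in this sketch
	fs.Var(&mode, "mode", "ephemeral mode: Wait or NoWait")
	err := fs.Parse(args)
	return mode, err
}

func main() {
	mode, err := parseMode([]string{"--mode", "NoWait"})
	fmt.Println(mode, err) // NoWait <nil>
}
```

Rejecting unknown values at parse time means a typo in the mode name fails at registration rather than silently falling back to the default.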

Wait (default) would work like it does today: The ephemeral runner will be removed after it has completed its task. NoWait would allow at most one FetchTask() call. If a task is waiting, it would be returned and the ephemeral runner would be deleted after it has been completed. If no task is waiting, the runner would be removed right away.
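
The semantics described above could be modelled roughly as follows. This is a toy in-memory sketch, not Forgejo's actual data model or API: `Registry`, `runnerState`, `Register`, `Enqueue`, `Registered` and `CompleteTask` are hypothetical names introduced only to illustrate when a runner would be removed in each mode.

```go
package main

import "fmt"

// EphemeralMode is the proposed (not yet existing) registration mode.
type EphemeralMode int

const (
	Wait   EphemeralMode = iota // remove the runner after its task completes
	NoWait                      // additionally remove it after one empty fetch
)

// runnerState is a hypothetical record for one registered ephemeral runner.
type runnerState struct {
	mode    EphemeralMode
	fetched bool // true once a task has been handed to this runner
}

// Registry is a hypothetical stand-in for the server's runner table and task queue.
type Registry struct {
	runners map[string]*runnerState
	pending []string
}

func NewRegistry() *Registry {
	return &Registry{runners: make(map[string]*runnerState)}
}

func (r *Registry) Register(name string, mode EphemeralMode) {
	r.runners[name] = &runnerState{mode: mode}
}

func (r *Registry) Registered(name string) bool {
	_, ok := r.runners[name]
	return ok
}

func (r *Registry) Enqueue(task string) { r.pending = append(r.pending, task) }

// FetchTask returns the next pending task, if any. A NoWait runner that
// finds the queue empty is removed right away.
func (r *Registry) FetchTask(name string) (string, bool) {
	st, ok := r.runners[name]
	if !ok || st.fetched {
		return "", false // unknown runner, or it already received its one task
	}
	if len(r.pending) == 0 {
		if st.mode == NoWait {
			delete(r.runners, name)
		}
		return "", false
	}
	task := r.pending[0]
	r.pending = r.pending[1:]
	st.fetched = true
	return task, true
}

// CompleteTask removes the ephemeral runner once its task is done (both modes).
func (r *Registry) CompleteTask(name string) {
	delete(r.runners, name)
}

func main() {
	reg := NewRegistry()
	reg.Register("vm-1", NoWait)
	reg.Register("vm-2", Wait)

	_, ok := reg.FetchTask("vm-1")
	fmt.Println(ok, reg.Registered("vm-1")) // false false
	_, ok = reg.FetchTask("vm-2")
	fmt.Println(ok, reg.Registered("vm-2")) // false true
}
```

In this model the only difference between the two modes is the empty-fetch branch, which matches the description that a returned task is handled the same way as today.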

I don't think it would interfere with the idempotency key. If no task is returned, it doesn't matter if Forgejo Runner doesn't receive the reply. And if a task is returned, it would work as it does today.

I seriously dislike the naming. It suggests that the mode would switch --wait on or off, which isn't the case. RemoveAfterTaskCompletion (instead of Wait) and RemoveAfterUnsuccessfulFetch (instead of NoWait) are both a mouthful.

Reference
forgejo/forgejo-actions-feature-requests#109