Prometheus metrics endpoint for monitoring runners #96

Open
opened 2026-01-22 13:48:22 +00:00 by francorbacho · 3 comments

How to use this feature request

  • Please describe your first-hand experience in a comment to show why you are interested in the same feature.
  • Please don't comment if you have no relevant information to add. It's just extra noise for everyone subscribed to this issue.
  • Subscribe to receive notifications on status change and new comments.

First hand experience

Sometimes my self-hosted instance receives a lot of jobs at once. Since I only have a few runners, they cannot keep up and jobs start queuing up.

Needs and benefits

It would provide:

  • Real-time visibility into runner health, job queue status, and performance metrics
  • Health checks
  • Alerts for failing and dying runners via industry-standard solutions (e.g. Grafana)
  • Insights into job congestion

Feature Description

The following metrics:

  • Number of currently running runners
  • Number of currently running jobs
  • Number of currently queued jobs
  • Total number of jobs run

I am open to ideas for other metrics.
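
To make the list above more concrete, here is a minimal sketch of how these four values might look when rendered in the Prometheus text exposition format. The metric names (`forgejo_actions_*`) and the `RunnerStats` structure are made up for illustration; nothing here is taken from Forgejo or the runner.

```python
from dataclasses import dataclass


@dataclass
class RunnerStats:
    """Snapshot of the four requested values (hypothetical structure)."""
    runners_running: int
    jobs_running: int
    jobs_queued: int
    jobs_total: int


def render_metrics(stats: RunnerStats) -> str:
    """Render the snapshot in the Prometheus text exposition format."""
    # (name, type, help text, value); all metric names are assumptions.
    metrics = [
        ("forgejo_actions_runners_running", "gauge",
         "Number of currently running runners.", stats.runners_running),
        ("forgejo_actions_jobs_running", "gauge",
         "Number of currently running jobs.", stats.jobs_running),
        ("forgejo_actions_jobs_queued", "gauge",
         "Number of currently queued jobs.", stats.jobs_queued),
        # A value that only ever increases is a counter; by convention
        # counter names end in "_total".
        ("forgejo_actions_jobs_total", "counter",
         "Total number of jobs run.", stats.jobs_total),
    ]
    lines = []
    for name, kind, help_text, value in metrics:
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} {kind}")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_metrics(RunnerStats(2, 3, 7, 120)), end="")
```

The first three are gauges because they can go up and down, while the total is a counter. Whether these are the right names and types is exactly the kind of thing that would need to be settled in the discussion.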

What needs to happen before a feature request is ready to be implemented?

Users can complete the first step (accumulating first-hand experience) on their own, even if this feature request has not caught the eye of someone with the necessary skills to implement it. When it reaches that point, it will stand out and have a much higher chance of being implemented.

  1. A few other users contributed their own first hand experience.
    To fully grasp the scope of a feature request, and to brainstorm possible solutions, a feature request will generally wait until several users have provided their perspective.
    Thumbs-up reactions help gauge popularity, but do not provide the same amount of useful information.
  2. The "Needs and benefits" and "Feature description" are finalized.
    Results from discussions and additional user experiences are incorporated into a final summary to provide a single reference for the developers working on this change.
    This can be done by the author of the issue or anyone else in a followup comment.
  3. The label Stage/Idea is changed to Stage/Ready.
  4. Feature request is created in the repository where the code resides.
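    Depending on the feature request it can be in Forgejo (https://codeberg.org/forgejo/forgejo/issues/new?template=.forgejo%2fissue_template%2ffeature-request.yaml) or Forgejo runner (https://code.forgejo.org/forgejo/runner/issues/new?template=.forgejo%2fissue_template%2ffeature-request.yaml).
    A copy/paste of the "Needs and benefits" and "Feature description" should be used, with a link to this issue so the developer knows where to find more details if they need to.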
Member

This would be great. In our setup I ended up creating another service that queries https://code.forgejo.org/api/swagger#/admin/adminSearchRunJobs and transforms the result into Prometheus metrics. It would be great if it was native.
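
As a rough illustration of the transformation step in such a service, the sketch below counts jobs per status and emits one labelled gauge. It assumes the job list has already been fetched and that each job is a dict with a `status` field; that field name, its values, and the metric name `forgejo_actions_jobs` are assumptions for the example, not the documented response schema.

```python
from collections import Counter


def jobs_by_status_metrics(jobs: list[dict]) -> str:
    """Turn a list of job records into a labelled Prometheus gauge."""
    # Jobs without a "status" field are grouped under "unknown".
    counts = Counter(job.get("status", "unknown") for job in jobs)

    name = "forgejo_actions_jobs"  # hypothetical metric name
    lines = [
        f"# HELP {name} Number of jobs by status.",
        f"# TYPE {name} gauge",
    ]
    # Sort so the output is stable between scrapes.
    # Status values are assumed to be plain words; a real exporter would
    # have to escape backslashes, quotes and newlines in label values.
    for status in sorted(counts):
        lines.append(f'{name}{{status="{status}"}} {counts[status]}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [{"status": "running"}, {"status": "waiting"}, {"status": "waiting"}]
    print(jobs_by_status_metrics(sample), end="")
```

Using a single metric with a `status` label instead of one metric per state keeps the queue depth and the running count queryable from the same series.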

Member

Makes sense.

I expect that this feature will need a lot of research/planning.

I haven't checked whether Forgejo has any metrics capabilities. If it doesn't, we need a suitable tool. Is there something like Micrometer (https://micrometer.io/) for Go?

How forgiving are tools like Prometheus when it comes to format/variable changes?

What would really help a lot: examples. Develop the charts/alerts you want to have. What data is required to generate them? How does it have to be structured? Let us know. Without that information, this feature will take forever.

Member

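@aahlenst Prometheus is already supported in Forgejo, at least according to the cheat sheet (https://forgejo.org/docs/latest/admin/config-cheat-sheet/#metrics-metrics). I think we would just have to make a new collector (https://codeberg.org/forgejo/forgejo/src/commit/26d6b484feb9330921bc57ccffee7a71418a1736/modules/metrics/collector.go#L52).

The biggest question is which metrics are actually generally applicable.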

Reference
forgejo/forgejo-actions-feature-requests#96