fix: implement idempotent FetchTask API calls to reduce risk of lost tasks #1393
When the FetchTask() API is invoked to create a task, unpreventable environmental errors may occur; for example, network disconnects and timeouts. It's possible that these errors occur after the server-side has assigned a task to the runner during the API call, in which case the error would cause that task to be lost between the two systems -- the server will think it's assigned to the runner, and the runner never received it. This can cause jobs to appear stuck at "Set up job" (#1391).
The solution implemented here is idempotency in the FetchTask() API call, which means that the "same" FetchTask() API call is expected to return the same values. Specifically, the runner creates a unique identifier, requestKey, which is transmitted to the server along with each FetchTask() invocation and defines the sameness of the call. The runner retains the requestKey value until the API call receives a successful response. If the server implements idempotency, it can use this key to identify repeated invocations of FetchTask(), and when the same request is received, the same response is provided.
The runner's responsibility is to send the same request key consistently if any error occurs, and to change it to a new key once a successful response is received.
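To make the key lifecycle concrete, here is a minimal sketch of the retry behaviour described above. It is not the code in this PR: the Task, fakeServer, runner, and newRequestKey names are hypothetical, the server is an in-memory stand-in that always has a task to hand out, and the real FetchTask() request and response types are not modelled. It only illustrates that a response lost after assignment is recovered by retrying with the same key.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
)

// Task is a hypothetical stand-in for the payload returned by FetchTask().
type Task struct{ ID int }

var errNetwork = errors.New("simulated network error")

// fakeServer is an assumed, in-memory model of a server that implements
// idempotency: it remembers the response produced for each request key.
type fakeServer struct {
	nextID    int
	responses map[string]Task
	failNext  bool // when true, the next response is lost after assignment
}

func (s *fakeServer) FetchTask(requestKey string) (Task, error) {
	task, seen := s.responses[requestKey]
	if !seen {
		// First time this key is seen: assign a new task and remember it.
		s.nextID++
		task = Task{ID: s.nextID}
		s.responses[requestKey] = task
	}
	if s.failNext {
		// The task is already assigned, but the runner never receives it.
		s.failNext = false
		return Task{}, errNetwork
	}
	return task, nil
}

// runner keeps the request key across failed calls.
type runner struct {
	server     *fakeServer
	requestKey string
}

// newRequestKey returns a random 128-bit key encoded as hex.
func newRequestKey() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

func (r *runner) fetch() (Task, error) {
	if r.requestKey == "" {
		r.requestKey = newRequestKey()
	}
	task, err := r.server.FetchTask(r.requestKey)
	if err != nil {
		// Keep the key so the retry counts as the "same" call.
		return Task{}, err
	}
	// Success: drop the key so the next call gets a fresh one.
	r.requestKey = ""
	return task, nil
}

func main() {
	srv := &fakeServer{responses: map[string]Task{}, failNext: true}
	r := &runner{server: srv}

	_, err := r.fetch()
	fmt.Println("first attempt error:", err)

	task, _ := r.fetch()
	fmt.Println("retry received task:", task.ID)

	next, _ := r.fetch()
	fmt.Println("next call received task:", next.ID)
}
```

In this sketch the retry returns the task that was assigned during the failed call, and only the call after a successful response uses a new key and receives a different task.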
A separate Forgejo PR will be coming to implement the server-side portion of this, and I expect to hold this PR until the server-side is reviewed to ensure the system design doesn't change during that review and invalidate any work here.
cf9dba7ba57b3facef737b3facef73a57a4664e3 cascading-pr updated at actions/setup-forgejo#903
runner.capacity w/ multiple Forgejo connections and idempotent requests #1453