SIGSEGV on forgejo-runner daemon -c /etc/forgejo-runner.yaml
#146
Reference: forgejo/runner#146
Observed
Running forgejo-runner daemon -c /etc/forgejo-runner.yaml (which points to the default file; same result without -c) on my server crashes every time, and at the same line. It does that for all tried versions, on bare metal, and even in a fully virtualized qemu. I am thinking of hiring an exorcist.
If I run the same command and config on another system, it starts up flawlessly.
What and where
Versions used
forgejo-runner 3.0.0 / forgejo-runner 3.10 / forgejo-runner 3.20 / forgejo-runner 3.3.0
Environments
On this specific server all attempts fail in the same way and 100% of the time.
"Bare metal" (Ubuntu Mantic)
podman
hardware virtualized: Tried both Ubuntu Jammy and Mantic, running the plain binary, a container in podman, or a container in docker-ce.
fully virtualized: Running on a fully virtualized qemu with an emulated CPU. This was my last try to rule out CPU specifics. 😄
It should gracefully show an error message instead of crashing like that. The root cause is that one of resp.Msg.Runner.Name, resp.Msg.Runner.Version, resp.Msg.Runner.Labels has an invalid address somewhere. But the code does not verify any of that; it assumes resp is always good.
Since it runs well on other machines, could it be that the network is interfering? These are the very first network packets exchanged between the runner and the server, so it may be worth taking a quick look at what tcpdump or wireshark sees.
I would recommend trying to recompile on the machine to be 100% sure it is not a binary generation problem and rule that out entirely. Note that the binary is static and does not rely on any shared library. But it does rely on the kernel ABI so it is worth a shot.
After recompiling, if that still fails, you will have the option of adding extra verification to get more clues.
I must say it is puzzling.
Looking into networking was the solution!
The server delivered different data depending on where I connected from. The test cases for curling / show that:
curl from my laptop gets HTTP 200 with content (cookies, html)
curl from the server itself gets HTTP 200 with 0 bytes of data
An HTTP 200 response without content is parsed by forgejo-runner in a way that leaves one of these as nil, subsequently crashing the formatter:
resp
resp.Msg
resp.Msg.Runner
resp.Msg.Runner.Name
resp.Msg.Runner.Version
resp.Msg.Runner.Labels
Where did the empty response come from?
The forgejo instance runs in a container on a server and is exposed via a reverse proxy. In this case the proxy was Caddy.
This Caddy instance had geo blocking configured (any other filtering would have done the same): it only proxied to the forgejo instance for certain countries.
Caddy has a quite peculiar behavior: if a server is configured (my.example.com) but has no rule on what to do with a request, it returns an empty HTTP 200 response.
In the cases where forgejo-runner crashed, it accessed the instance either from the host IP (which is in one of the blocked countries - don’t ask) or, when running from a VM, from an rfc1918 address, which also does not map to an allowed country. This made Caddy return the empty HTTP 200 response, which made forgejo-runner crash.
Solution for my situation
I added an IP filter that allowed the local addresses to the Caddy configuration.
Fixing the root cause
The response object needs to be validated for empty fields. I have no golang experience (and no compiler installed), but I would think of something like this:
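A minimal sketch of that kind of validation follows. It is not the runner's actual code: Response, Msg and Runner are hypothetical stand-ins that only mirror the resp.Msg.Runner chain named in this thread, and validate is an illustrative helper name. The idea is to check each pointer in the chain before any field is read, and to return an error instead of dereferencing nil.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the real response types; only the
// resp.Msg.Runner.{Name,Version,Labels} shape is taken from the issue.
type Runner struct {
	Name    string
	Version string
	Labels  []string
}

type Msg struct {
	Runner *Runner
}

type Response struct {
	Msg *Msg
}

// validate walks the pointer chain and returns an error at the first nil
// link, so callers never dereference a missing part of the response.
func validate(resp *Response) (*Runner, error) {
	switch {
	case resp == nil:
		return nil, errors.New("empty response")
	case resp.Msg == nil:
		return nil, errors.New("response has no message")
	case resp.Msg.Runner == nil:
		return nil, errors.New("response message has no runner")
	}
	return resp.Msg.Runner, nil
}

func main() {
	// Assumption for illustration: an empty reply decodes to a message
	// whose Runner was never set.
	runner, err := validate(&Response{Msg: &Msg{}})
	if err != nil {
		fmt.Println("invalid response:", err)
		return
	}
	fmt.Println(runner.Name, runner.Version, runner.Labels)
}
```

With a check like this in front of the formatter, the empty HTTP 200 case would surface as an error message rather than a SIGSEGV.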