Running GitHub Actions on RISC-V64: A Production Journey with Docker Builds

Setting up continuous integration for RISC-V64 is not as straightforward as it is for x86_64 or ARM64. The official GitHub Actions runner doesn’t support RISC-V64, which led to a fascinating journey of finding alternatives, configuring hardware, and debugging production issues. This article documents my complete experience setting up a self-hosted runner on a BananaPi F3 for automated Docker Engine builds, including the real-world bugs I discovered and fixed after weeks of production use.

The Problem: No Official RISC-V64 Support

When I started building Docker Engine binaries for RISC-V64, I quickly hit a wall. GitHub Actions, the de facto standard for CI/CD, doesn’t provide RISC-V64 runners. The official runner relies on .NET, which only has experimental RISC-V64 support. For a production build pipeline, “experimental” isn’t good enough.

I needed:

  • Reliable automation: Weekly builds without manual intervention
  • Native compilation: Cross-compilation for Docker is complex and error-prone
  • GitHub integration: Seamless workflow with existing GitHub Actions
  • Production stability: Must survive reboots and run continuously

The solution? A self-hosted runner on real RISC-V64 hardware.

Why github-act-runner?

After researching alternatives, I discovered github-act-runner, a Go-based implementation of the GitHub Actions runner protocol. This was perfect because:

  • Go has excellent RISC-V64 support: No experimental flags, it just works
  • Drop-in replacement: Compatible with GitHub Actions workflows
  • Lightweight: ~25MB binary, minimal resource overhead
  • Active development: Community-maintained with regular updates

The key insight: while the official runner requires the entire .NET runtime, github-act-runner only needs Go, Node.js (for JavaScript actions), and Docker. All three have first-class RISC-V64 support.

Hardware: BananaPi F3

I chose the BananaPi F3 as my build server:

  • SpacemiT K1 processor: eight RISC-V64 X60 cores (RV64GCVB)
  • 16GB RAM: LPDDR4 memory providing ample headroom for Docker builds
  • Armbian Trixie: Debian-based with excellent RISC-V support
  • Gigabit Ethernet: Stable connectivity for 24/7 operation
  • Low power consumption: ~10W under load

This isn’t bleeding-edge hardware, but that’s the point. I wanted to prove that CI/CD on RISC-V64 doesn’t require exotic setups.

Initial Setup: The Happy Path

Prerequisites

First, I installed the essential dependencies on the BananaPi F3:

# Update system
sudo apt-get update
sudo apt-get upgrade -y

# Install Node.js (required for JavaScript-based GitHub Actions)
sudo apt-get install -y nodejs npm

# Verify versions
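# (Go and Docker were installed earlier; only Node.js is new in this step)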
node --version  # v20.19.2
npm --version   # 9.2.0
go version      # go1.24.4 linux/riscv64
docker --version # 28.5.2

The Node.js requirement surprised me initially. Many GitHub Actions use JavaScript under the hood (like actions/checkout@v4), so the runner needs Node.js to execute them.

Building github-act-runner

cd ~
git clone https://github.com/ChristopherHX/github-act-runner.git github-act-runner-test
cd github-act-runner-test

# Build takes ~2-3 minutes on BananaPi F3
go build -v -o github-act-runner .

# Verify
./github-act-runner --help

The build was surprisingly fast. Go’s cross-platform nature really shines here - no special flags, no configuration, it just compiles.
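
As an aside, the same property means the runner can be cross-compiled on a faster x86_64 machine and copied over; this is standard Go tooling, nothing specific to github-act-runner:

# Cross-compile for RISC-V64 from any Go-supported host
GOOS=linux GOARCH=riscv64 go build -v -o github-act-runner .

# Confirm the target architecture
file github-act-runner  # ELF 64-bit LSB executable, UCB RISC-V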

Runner Configuration

Getting the registration token from GitHub:

# Visit: https://github.com/YOUR_USERNAME/docker-for-riscv64/settings/actions/runners/new
# Copy the token from the displayed command

./github-act-runner configure \
  --url https://github.com/gounthar/docker-for-riscv64 \
  --token YOUR_TOKEN_HERE \
  --name bananapi-f3-runner \
  --labels riscv64,self-hosted,linux \
  --work _work

The labels are crucial. Workflows target self-hosted runners with:

jobs:
  build:
    runs-on: [self-hosted, riscv64]

This ensures jobs only run on my RISC-V64 runner, not on GitHub’s x86_64 hosts.
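
Labels are only declarative, though, so I like a cheap sanity check as the first step of a job. A minimal guard script (my own addition, not something the runner requires):

# Fail fast if the job somehow landed on the wrong architecture
if [ "$(uname -m)" != "riscv64" ]; then
  echo "Expected riscv64, got $(uname -m)" >&2
  exit 1
fi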

Critical: Systemd Service Configuration

Here’s where I learned an important lesson. Initially, I ran the runner manually in a terminal. This worked fine until the first power outage. When the BananaPi rebooted, no runner. Builds failed. Users complained.

The solution: a proper systemd service.

sudo tee /etc/systemd/system/github-runner.service << 'EOF'
[Unit]
Description=GitHub Actions Runner (RISC-V64)
After=network.target docker.service
Wants=network.target

[Service]
Type=simple
User=poddingue
WorkingDirectory=/home/poddingue/github-act-runner-test
ExecStart=/home/poddingue/github-act-runner-test/github-act-runner run
Restart=always
RestartSec=10
KillMode=process
KillSignal=SIGTERM
TimeoutStopSec=5min

[Install]
WantedBy=multi-user.target
EOF

# Enable and start
sudo systemctl daemon-reload
sudo systemctl enable github-runner
sudo systemctl start github-runner

# Verify
sudo systemctl status github-runner

Key configuration choices:

  • After=docker.service: Don’t start until Docker is ready
  • Restart=always: Auto-restart on failures
  • TimeoutStopSec=5min: Give builds time to clean up
  • User=poddingue: Never run as root (security)

After this change, the runner survived reboots, network hiccups, and even Docker daemon restarts.
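
It’s worth verifying Restart=always rather than trusting it. A quick check using only standard systemd commands:

# Simulate a crash, then confirm systemd brought the runner back
sudo systemctl kill --signal=SIGKILL github-runner
sleep 15  # RestartSec=10, plus a little margin
systemctl is-active github-runner       # should print "active"
sudo journalctl -u github-runner -n 20  # shows the restart in the logs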

Production Workflows: The Real Test

With the runner configured, I created three automated workflows:

Weekly Docker Engine Builds

name: Weekly Docker RISC-V64 Build

on:
  schedule:
    - cron: '0 2 * * 0'  # Every Sunday at 02:00 UTC
  workflow_dispatch:
    inputs:
      moby_ref:
        description: 'Moby ref to build'
        required: false
        default: 'master'

jobs:
  build-docker:
    runs-on: [self-hosted, riscv64]

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          submodules: true

      - name: Build Docker binaries
        run: |
          cd moby
          docker build \
            --build-arg BASE_DEBIAN_DISTRO=trixie \
            --build-arg GO_VERSION=1.25.3 \
            --target=binary \
            -f Dockerfile .

      # ... containerd, runc builds ...

      - name: Create release
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          gh release create "${RELEASE_VERSION}" \
            --title "${RELEASE_TITLE}" \
            --notes-file release-notes.md \
            release-$DATE/*

This workflow runs every Sunday morning, building:

  • dockerd: Docker Engine daemon
  • docker-proxy: Network proxy
  • containerd: Container runtime (v1.7.28)
  • runc: OCI runtime (v1.3.0)
  • containerd-shim-runc-v2: Containerd shim

Build time: 35-40 minutes on the BananaPi F3. Not blazing fast, but acceptable for weekly automation.

Release Tracking

name: Track Moby Releases

on:
  schedule:
    - cron: '0 6 * * *'  # Daily at 06:00 UTC
  workflow_dispatch:

jobs:
  check-releases:
    runs-on: ubuntu-latest  # No native hardware needed!

    steps:
      - name: Get latest Moby release
        run: |
          LATEST=$(gh api repos/moby/moby/releases/latest --jq .tag_name)
          # Check if we've already built it...

Notice this workflow uses ubuntu-latest, not the self-hosted runner. Why? Because it’s just GitHub API calls - no compilation needed. This reduces load on my BananaPi and provides faster execution.
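
The elided comparison is simple. A minimal sketch of the rest of that step, assuming the v<version>-riscv64 tag convention used throughout this repository:

LATEST=$(gh api repos/moby/moby/releases/latest --jq .tag_name)

# Have we already published a matching RISC-V64 release?
if gh release view "${LATEST}-riscv64" --repo gounthar/docker-for-riscv64 > /dev/null 2>&1; then
  echo "Already built ${LATEST}, nothing to do"
else
  echo "New Moby release ${LATEST}, triggering a build"
  gh workflow run docker-weekly-build.yml -f moby_ref="${LATEST}"
fi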

APT Repository Updates

The final piece of automation: after building binaries, automatically create Debian packages and update the APT repository hosted on GitHub Pages.

name: Update APT Repository

on:
  workflow_run:
    workflows: ["Build Debian Package", "Build Docker Compose Debian Package", "Build Docker CLI Debian Package"]
    types: [completed]

This workflow downloads all packages (Engine, CLI, Compose), signs them with GPG, and updates the repository using reprepro. The result: users can install Docker with a simple apt-get install.
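
For anyone unfamiliar with reprepro: the signing key and accepted architectures live in a small config file, and a single command ingests packages and regenerates signed indices. A minimal sketch with a placeholder key ID:

# One-time repository configuration (replace the placeholder key ID)
mkdir -p conf
cat > conf/distributions << 'EOF'
Codename: trixie
Components: main
Architectures: riscv64
SignWith: YOUR_GPG_KEY_ID
EOF

# Ingest packages; reprepro rebuilds and signs the indices
reprepro -b . includedeb trixie *.deb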

Weeks Later: Production Issues Emerge

After three weeks of smooth operation, I started noticing strange behavior. Users reported that apt-get upgrade sometimes worked, sometimes didn’t. The APT repository seemed to randomly “forget” packages. And when I checked my latest release, I found duplicate RPM files - why did v28.5.2-riscv64 contain both moby-engine-28.5.1 AND moby-engine-28.5.2?

Time to debug.

Bug #1: The Vanishing Packages Mystery

Symptom: the APT repository would update successfully, but only one package type would be present. Publishing a new docker-cli build made docker.io disappear.

Investigation: I examined the workflow logs:

# In update-apt-repo.yml
gh release download "$RELEASE_TAG" -p 'docker.io_*.deb'
reprepro -b . includedeb trixie docker.io_*.deb

Ah. The workflow only downloaded packages from the triggering release. If the Docker CLI build triggered the workflow, it only downloaded docker-cli_*.deb. Previous packages (docker.io, containerd, runc) were ignored.

Root cause: Each package type has its own release tag:

  • Engine releases: v28.5.2-riscv64
  • CLI releases: cli-v28.5.2-riscv64
  • Compose releases: compose-v2.40.1-riscv64

When the APT workflow ran, it would:

  1. Download packages from the triggering release
  2. Rebuild the repository from scratch
  3. Upload - but only with the newly downloaded packages

Solution: Download ALL packages on every run.

- name: Download all latest .deb packages
  env:
    GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  run: |
    # Find latest Engine release
    DOCKER_RELEASE=$(gh release list --repo gounthar/docker-for-riscv64 \
      --limit 50 --json tagName | \
      jq -r '.[] | select(.tagName | test("^v[0-9]+\\.[0-9]+\\.[0-9]+-riscv64$")) | .tagName' | \
      head -1)

    # Find latest CLI release
    CLI_RELEASE=$(gh release list --repo gounthar/docker-for-riscv64 \
      --limit 50 --json tagName | \
      jq -r '.[] | select(.tagName | test("^cli-v[0-9]+\\.[0-9]+\\.[0-9]+-riscv64$")) | .tagName' | \
      head -1)

    # Find latest Compose release
    COMPOSE_RELEASE=$(gh release list --repo gounthar/docker-for-riscv64 \
      --limit 50 --json tagName | \
      jq -r '.[] | select(.tagName | test("^compose-v[0-9]+\\.[0-9]+\\.[0-9]+-riscv64$")) | .tagName' | \
      head -1)

    # Download from each
    gh release download "$DOCKER_RELEASE" -p 'docker.io_*.deb' --clobber
    gh release download "$DOCKER_RELEASE" -p 'containerd_*.deb' --clobber
    gh release download "$DOCKER_RELEASE" -p 'runc_*.deb' --clobber
    gh release download "$CLI_RELEASE" -p 'docker-cli_*.deb' --clobber
    gh release download "$COMPOSE_RELEASE" -p 'docker-compose-plugin_*.deb' --clobber

Now the repository always contains all packages, regardless of which build triggered the update.

Verification step added:

- name: Verify all packages present
  run: |
    EXPECTED_PACKAGES=(
      "containerd"
      "docker-cli"
      "docker.io"
      "runc"
    )

    MISSING_PACKAGES=()
    for pkg in "${EXPECTED_PACKAGES[@]}"; do
      if reprepro -b . list trixie | grep -q "^trixie|main|riscv64: $pkg "; then
        echo "✅ $pkg found"
      else
        echo "❌ $pkg MISSING"
        MISSING_PACKAGES+=("$pkg")
      fi
    done

    if [ ${#MISSING_PACKAGES[@]} -gt 0 ]; then
      echo "⚠️  ${#MISSING_PACKAGES[@]} package(s) missing!"
      exit 1
    fi

This catches regressions immediately.

Bug #2: The jq Syntax Catastrophe

After fixing the package downloading, I ran into a new error:

Error: jq parse error: Invalid escape at line 1, column 45

Investigation: I had recently “fixed” a line length issue by adding a backslash:

CLI_RELEASE=$(gh release list --repo gounthar/docker-for-riscv64 \
              --limit 50 --json tagName | \
              jq -r '.[] | select(.tagName | test("^cli-v[0-9]+\\.[0-9]+\\.[0-9]+-riscv64$")) | \  # ← BAD!
              .tagName' | \
              head -1)

The backslash was inside the jq expression. jq interpreted it as an escape sequence, not as a shell line continuation.

Solution: Move the backslash outside the jq expression:

CLI_RELEASE=$(gh release list --repo gounthar/docker-for-riscv64 \
              --limit 50 --json tagName | \
              jq -r '.[] | select(.tagName | test("^cli-v[0-9]+\\.[0-9]+\\.[0-9]+-riscv64$")) | .tagName' | \  # ← GOOD!
              head -1)

Lesson learned: when piping to jq, keep the entire jq expression on one logical line, even if you split the bash command with backslashes.
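
An even safer pattern is to keep the jq program in its own shell variable, so line continuations never come near it (a small refactor, same behavior):

# Single quotes preserve the backslashes for jq
FILTER='.[] | select(.tagName | test("^cli-v[0-9]+\\.[0-9]+\\.[0-9]+-riscv64$")) | .tagName'

CLI_RELEASE=$(gh release list --repo gounthar/docker-for-riscv64 \
  --limit 50 --json tagName | jq -r "$FILTER" | head -1)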

Bug #3: The Persistent RPM Problem

The most subtle bug involved RPM packaging. Users reported that the v28.5.2 release shipped an RPM for the previous version (28.5.1) alongside the new one, so grabbing “the latest moby-engine RPM” could install 28.5.1 instead of 28.5.2.

Investigation: I checked the release assets:

$ gh release view v28.5.2-riscv64
...
moby-engine-28.5.1-1.riscv64.rpm  25MB
moby-engine-28.5.2-1.riscv64.rpm  25MB
containerd-1.7.28-1.riscv64.rpm   30MB
runc-1.3.0-1.riscv64.rpm          8MB

Two versions of moby-engine! But why?

The RPM build workflow runs on the self-hosted runner. Unlike GitHub’s ephemeral runners, my BananaPi has persistent state. The ~/rpmbuild/RPMS/riscv64/ directory survives between builds.

Timeline:

  1. Build v28.5.1 → creates moby-engine-28.5.1-1.riscv64.rpm
  2. Upload all files in ~/rpmbuild/RPMS/riscv64/
  3. Two weeks later: build v28.5.2 → creates moby-engine-28.5.2-1.riscv64.rpm
  4. Upload all files in ~/rpmbuild/RPMS/riscv64/ → uploads BOTH versions!

Solution: Clean the build directory before building.

Added to all RPM workflows:

- name: Clean previous RPM builds
  if: steps.release.outputs.has-new-release == 'true'
  run: |
    # Remove any existing RPM files to prevent uploading old versions
    rm -f ~/rpmbuild/RPMS/riscv64/moby-engine-*.rpm
    rm -f ~/rpmbuild/RPMS/riscv64/containerd-*.rpm
    rm -f ~/rpmbuild/RPMS/riscv64/runc-*.rpm
    echo "Cleaned previous Engine RPM files"

This is specific to self-hosted runners. On GitHub’s ephemeral runners, each build starts with a clean filesystem. On self-hosted runners, you are responsible for cleanup.

Manual cleanup: I also had to manually remove the duplicate files from the existing releases:

# List all assets
gh release view v28.5.2-riscv64 --json assets --jq '.assets[].name'

# Delete the old versions
gh release delete-asset v28.5.2-riscv64 moby-engine-28.5.1-1.riscv64.rpm
gh release delete-asset v28.5.2-riscv64 docker-cli-28.5.1-1.riscv64.rpm

Performance Characteristics

After weeks of production use, here are the real-world performance numbers:

Build Times (BananaPi F3)

| Component | Time | Notes |
|-----------|------|-------|
| Test workflow | ~5 seconds | Simple architecture checks |
| Docker Engine (complete) | 35-40 minutes | Includes dockerd, containerd, runc |
| Docker CLI | 12-15 minutes | Lighter build, fewer dependencies |
| Docker Compose | 8-10 minutes | Pure Go, fast compilation |
| Tini | 2-3 minutes | Small C project |

Resource Usage

During a full Docker build:

  • CPU: 8 cores at 80-95% utilization
  • RAM: Peak 3.5GB (out of 16GB total)
  • Disk I/O: ~200MB/s read during image builds
  • Network: ~50Mbps for downloading Go modules

The BananaPi F3 handles these builds comfortably. It’s not fast by modern standards, but it’s reliable.
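
Numbers like these are easy to reproduce with stock tools while a build runs; no special instrumentation needed (iostat ships in the sysstat package):

htop         # per-core CPU utilization
free -h      # memory usage
iostat -x 5  # disk throughput, every 5 seconds
vmstat 5     # combined CPU/memory/IO snapshot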

Reliability Metrics

Since implementing the systemd service (3 weeks ago):

  • Uptime: 99.2% (only offline during power outages)
  • Successful builds: 47 out of 48 (one failure due to disk space)
  • Average weekly builds: 4-5 (one scheduled, others manual/tracking)

Lessons Learned

Self-Hosted Runners Are Different

The biggest mental shift: self-hosted runners have state. Every assumption you have from using GitHub’s ephemeral runners needs to be re-examined:

  • Cleanup is your responsibility: Files persist between runs
  • Dependencies don’t auto-update: You manage Node.js, Go, Docker versions
  • Disk space accumulates: Docker images, build caches, logs (see the cron sketch after this list)
  • Reboots happen: Systemd services are mandatory
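
For the disk-space point, a scheduled prune keeps growth in check. One possible cron entry (my own suggestion; tune the schedule and retention to taste):

# Weekly Docker cleanup: Saturdays at 03:00, keep anything newer than 7 days
0 3 * * 6 docker system prune -f --filter "until=168h"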

Architecture-Specific Challenges

Some issues are unique to RISC-V64:

  • Limited pre-built images: Many Docker images don’t have riscv64 variants
  • Longer build times: 8 RISC-V cores at lower clock speeds vs 8+ x86_64 cores at 3+GHz
  • Beta software: Some tools (like the runner itself) are community projects
  • Documentation gaps: Fewer people have solved these problems before

But none of these are dealbreakers. They just require more attention.

Automation Complexity

The more automated your pipeline, the more places for subtle bugs:

  • Multi-package repositories: Need careful orchestration to avoid race conditions
  • Concurrent workflows: APT repository updates can conflict if two packages build simultaneously
  • Release tag conventions: Different prefixes (v*, cli-v*, compose-v*) require regex matching
  • Error handling: Silent failures are worse than loud failures

I added retry logic to the APT repository update:

# Push with retry logic for concurrent workflow handling
MAX_RETRIES=5
RETRY_COUNT=0
while [ $RETRY_COUNT -lt $MAX_RETRIES ]; do
  if git push origin apt-repo; then
    echo "✅ Successfully pushed changes"
    break
  else
    RETRY_COUNT=$((RETRY_COUNT + 1))
    echo "Push rejected, rebasing (attempt $RETRY_COUNT/$MAX_RETRIES)"
    git fetch origin apt-repo
    git rebase origin/apt-repo
  fi
done

if [ "$RETRY_COUNT" -eq "$MAX_RETRIES" ]; then
  echo "❌ Failed to push after $MAX_RETRIES attempts"
  exit 1
fi

This handles the case where two packages finish building within seconds of each other.

Testing Is Critical

After the “vanishing packages” bug, I added comprehensive verification:

  1. Package presence checks: Verify all expected packages exist
  2. Version checks: Ensure new versions actually update
  3. Installation tests: Actually install and run the packages
  4. Regression tests: Keep test cases for past bugs

The verification step catches 90% of issues before users see them.
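
Of the four, the installation test earns its keep the most. A minimal version against the published repository (GPG key setup omitted for brevity, hence the trusted=yes shortcut):

# Smoke-test the public APT repository on a clean riscv64 system
echo "deb [trusted=yes] https://gounthar.github.io/docker-for-riscv64 trixie main" | \
  sudo tee /etc/apt/sources.list.d/docker-riscv64.list
sudo apt-get update
sudo apt-get install -y docker.io
sudo docker run --rm hello-world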

Recommendations for Others

If you’re setting up RISC-V64 CI/CD, here’s what I’d recommend:

Hardware Choices

Minimum viable:

  • 4 cores (RISC-V64)
  • 4GB RAM
  • 64GB storage
  • Gigabit Ethernet

Ideal (BananaPi F3 with 16GB):

  • 8 cores (RISC-V64)
  • 16GB RAM
  • 128GB eMMC + NVMe expansion via M.2
  • UPS for power stability

The BananaPi F3, with its 8 cores and 16GB of RAM, comfortably exceeds the minimum and leaves enough headroom that compilation never becomes the bottleneck, even with concurrent builds.

Software Stack

Required:

  • Debian Trixie or Ubuntu 24.04 (riscv64)
  • Go 1.24+
  • Node.js 18+
  • Docker 28+
  • github-act-runner

Recommended:

  • Systemd for runner management
  • Monitoring (Prometheus, Grafana)
  • Log aggregation (journalctl is enough to start)
  • Backup strategy for runner state

Workflow Design Principles

  1. Use ubuntu-latest for non-compilation tasks: Save your RISC-V64 runner for actual builds
  2. Add cleanup steps: Especially for RPM/DEB builds
  3. Implement verification: Check that automation actually worked
  4. Handle concurrency: Retry logic for repository updates
  5. Document everything: Future you will thank you

Monitoring and Alerts

I monitor:

  • Runner online/offline status (GitHub API)
  • Disk space (alert at 80% full)
  • Build success rate (alert on 2+ failures)
  • Docker daemon health

Simple monitoring catches problems early.
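
All of that fits in one cron-driven script. A minimal sketch (swap the echo calls for whatever alerting you use):

#!/bin/bash
# Minimal health check: runner status plus disk space

# Runner status via the GitHub API (same endpoint as in the appendix)
STATUS=$(gh api repos/gounthar/docker-for-riscv64/actions/runners \
  --jq '.runners[] | select(.name == "bananapi-f3-runner") | .status')
[ "$STATUS" = "online" ] || echo "ALERT: runner status is '$STATUS'"

# Disk usage on the home filesystem, alert at 80% full
USED=$(df --output=pcent ~ | tail -1 | tr -dc '0-9')
[ "$USED" -lt 80 ] || echo "ALERT: disk ${USED}% full"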

Current Status and Future Plans

Today, the build infrastructure is solid:

  • 3 automated workflows: Docker Engine, CLI, Compose
  • 2 package formats: DEB (APT) and RPM
  • 3 weeks uptime: 99%+ reliability
  • Zero manual intervention: Fully automated builds

But there’s more to do:

Short Term

  • Gentoo overlay generation: Create ebuilds automatically
  • Binary verification: Add checksums and signatures to releases
  • Build caching: Reduce builds from 40 minutes to 20 minutes

Long Term

  • Multi-architecture support: Add ARM64, x86_64 for comparison
  • Runner auto-update: Detect new github-act-runner versions
  • High availability: Second runner for redundancy
  • Performance profiling: Identify bottlenecks in build process

Conclusion

Setting up CI/CD for RISC-V64 is more complex than mainstream architectures, but it’s absolutely achievable. The key insights:

  1. Use the right tools: github-act-runner works where official runner fails
  2. Embrace self-hosted: Persistent state requires different thinking
  3. Test thoroughly: Automation bugs hide in production for weeks
  4. Monitor everything: Catch problems before users do

The RISC-V64 ecosystem is maturing rapidly. A year ago, this setup would have been significantly harder. Today, it’s straightforward if you know the gotchas.

Most importantly: after three weeks of production use, with 47 successful builds serving real users upgrading their Docker installations, I can confidently say that RISC-V64 is ready for production CI/CD. Not “experimental.” Not “beta.” Actually ready.

Now go build something.

Appendix: Complete Systemd Service File

[Unit]
Description=GitHub Actions Runner (RISC-V64)
After=network.target docker.service
Wants=network.target

[Service]
Type=simple
User=poddingue
WorkingDirectory=/home/poddingue/github-act-runner-test
ExecStart=/home/poddingue/github-act-runner-test/github-act-runner run
Restart=always
RestartSec=10
KillMode=process
KillSignal=SIGTERM
TimeoutStopSec=5min

# Environment variables (optional)
# Environment="RUNNER_WORKDIR=/home/poddingue/github-act-runner-test/_work"

# Logging
StandardOutput=journal
StandardError=journal
SyslogIdentifier=github-runner

[Install]
WantedBy=multi-user.target

Appendix: Useful Maintenance Commands

# Check runner status
systemctl status github-runner

# View runner logs (real-time)
sudo journalctl -u github-runner -f

# View recent logs (last 100 lines)
sudo journalctl -u github-runner -n 100

# Restart runner
sudo systemctl restart github-runner

# Update runner
cd ~/github-act-runner-test
git pull
go build -v -o github-act-runner .
sudo systemctl restart github-runner

# Check disk space
df -h ~
docker system df

# Clean Docker
docker system prune -a -f

# Check GitHub runner status via API
gh api repos/gounthar/docker-for-riscv64/actions/runners --jq '.runners[] | {name, status, busy}'

# List recent workflow runs
gh run list --limit 10

# View specific workflow run
gh run view RUN_ID

# Manually trigger workflow
gh workflow run docker-weekly-build.yml

Appendix: Troubleshooting Common Issues

Runner Shows Offline

Check service:

systemctl status github-runner

Check logs:

sudo journalctl -u github-runner -n 50

Common causes:

  • Network connectivity lost
  • Docker daemon not running
  • Authentication token expired (after 90 days)
  • Disk full

Solution:

# Restart
sudo systemctl restart github-runner

# If token expired, reconfigure
cd ~/github-act-runner-test
./github-act-runner remove
./github-act-runner configure --url ... --token NEW_TOKEN
sudo systemctl start github-runner

Build Failures

Check workflow logs:

gh run list --limit 5
gh run view RUN_ID --log

Common causes:

  • Disk space exhausted
  • Docker image pull failures
  • Network timeout
  • Missing dependencies

Solution:

# Clean disk
docker system prune -a
rm -rf ~/github-act-runner-test/_work/_temp/*

# Check available space
df -h ~

# Verify Docker works
docker run --rm hello-world

Duplicate Package Versions

Symptom: Release contains multiple versions of same package.

Cause: Self-hosted runner persistence.

Solution:

# Clean RPM build directory
rm -f ~/rpmbuild/RPMS/riscv64/*.rpm

# For Debian packages
rm -f ~/docker-for-riscv64/debian-build/*.deb

# Add cleanup to workflow (see Bug #3 above)

APT Repository Missing Packages

Symptom: apt-get install docker.io fails, package not found.

Diagnosis:

# Check repository contents
gh api repos/gounthar/docker-for-riscv64/contents/dists/trixie/main/binary-riscv64 --jq '.[] | .name'

# Check what packages exist
curl -s https://gounthar.github.io/docker-for-riscv64/dists/trixie/main/binary-riscv64/Packages | grep "Package:"

Solution: See Bug #1 - ensure all packages are downloaded on every repository update.