Dealing with concurrent bridge-network creates & host-port races in Docker

I’ve been moving our CI off GitHub-hosted runners onto our own arm64 hardware. The plan was straightforward: a pool of ephemeral runners on a dedicated CI box, and each test shard spins up its own MySQL, Redis, and ClickHouse as service containers, all under rootless Docker.

Then I tried to run sixteen of those shards at the same time.

TL;DR: rootless Docker on a single host can’t reliably stand up sixteen service stacks at once when each one publishes host ports. You hit two separate races: one creating the bridge networks, one binding the published host ports. The network one I could tame with config. The port one I couldn’t, until I stopped publishing host ports at all.

Two races, not one

Each CI job, the way GitHub Actions models service containers, does roughly this: create a dedicated bridge network, then start MySQL/Redis/ClickHouse on it, publishing each service’s port to a host port so the job (running on the host) can reach it on localhost. Multiply that by sixteen jobs landing within the same second and two things start failing.

The first is the host port. With dynamically assigned ports, two containers race for the same one:

Error response from daemon: failed to set up container networking: driver failed
programming external connectivity on endpoint ..._mysql80_...: error while calling
RootlessKit PortManager.AddPort(): listen tcp4 0.0.0.0:32768: bind: address already in use

32768 is the first port in the ephemeral range, so it’s the one everyone reaches for first. The daemon hands the same port to two containers before either has actually bound it.

The second is the network. Creating sixteen bridge networks at once trips RootlessKit’s network-namespace handling:

failed to restore thread's network namespace error="operation not permitted"
libnetwork: restoring thread network namespace failed error="operation not permitted"

And once you’ve created enough networks without cleaning them up, you run out of address space entirely:

Error response from daemon: all predefined address pools have been fully subnetted

None of this shows up at one, two, or even eight concurrent jobs. It shows up at sixteen, which is exactly the number that made self-hosting worth the effort in the first place.

The mitigations that only half worked

I went down the list of obvious fixes. Each one helped but none of them really fixed it.

Dynamic ports instead of fixed ports. Fixed 6379:6379-style mappings collide instantly across jobs, so dynamic ports are mandatory. But dynamic ports still race on that first ephemeral port. Necessary, not sufficient.

Publish fewer ports. The ClickHouse image exposes three ports (8123, 9000, 9009) and our service config published all three, but the tests only use the HTTP one. Cutting it to just 8123 is two fewer host-port allocations per job, or thirty-two fewer per sixteen-job wave. It moved the needle. It didn’t stop the race.

Give Docker more networks to work with. The “fully subnetted” error is real exhaustion: the default address pool only carves out around 31 networks. In the rootless daemon’s daemon.json:

{
  "default-address-pools": [
    { "base": "10.200.0.0/16", "size": 24 }
  ]
}

That’s 256 /24 networks instead of ~31. The exhaustion error went away. The namespace race did not.
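The arithmetic behind those two numbers, as a quick sketch. The `pool_count` helper is made up for illustration, and the default pool list in the second half is an assumption taken from Docker's documented defaults (three /16 and three /14 bases carved at size 16, plus 192.168.0.0/16 carved at size 20); check your own daemon before relying on it.

```shell
# Subnets one pool yields: 2^(size - base_prefix).
# Hypothetical helper, not part of Docker.
pool_count() {
  echo $(( 1 << ($2 - $1) ))
}

# The custom pool above: 10.200.0.0/16 carved into /24s.
pool_count 16 24    # prints 256

# Assumed defaults: 3 x /16 and 3 x /14 bases at size 16,
# plus 192.168.0.0/16 at size 20.
default_total=$(( 3 * $(pool_count 16 16) + 3 * $(pool_count 14 16) + $(pool_count 16 20) ))
echo "$default_total"    # prints 31
```

One of those default subnets normally goes to the built-in bridge, so the number of networks you can actually create is slightly lower than the total.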

Serialize the network creates. Rootless can’t do sixteen docker network create calls at once, so I stopped it from trying. A small shim in front of docker that takes a flock only around the contended subcommands:

#!/usr/bin/env bash
# Serialize the rootless-contended docker operations (bridge-network creation)
# across all CI runners; rootless can't handle the simultaneous burst.
real=/usr/bin/docker
lock="/run/user/$(id -u)/docker-serialize.lock"
if [ "$1 $2" = "network create" ]; then
  exec flock "$lock" "$real" "$@"
fi
exec "$real" "$@"

This fixed the network-create race; the “operation not permitted” errors stopped. The port race didn’t budge, though. The host port gets allocated and bound by the daemon, on its own port-add path, not by the docker CLI call I was wrapping a lock around. A lock on the client only orders the calls it wraps; it doesn’t serialize what the daemon does for calls that never take the lock. Different layer.
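To see what the lock does guarantee, here is a small Docker-free sketch: four background workers take the same `flock` and each writes a start and an end line. The `worker` function and the temp files are invented for the demo, and a `sleep` stands in for the `docker network create` call.

```shell
# Demo of flock serialization; sleep stands in for the contended command.
lock="$(mktemp)"
log="$(mktemp)"

worker() {
  # flock holds the lock for the whole lifetime of the child command.
  flock "$lock" sh -c 'echo "start $1" >>"$2"; sleep 0.2; echo "end $1" >>"$2"' _ "$1" "$log"
}

for i in 1 2 3 4; do
  worker "$i" &
done
wait

# Every "start N" is immediately followed by its own "end N":
# the critical sections never interleave, though their order is arbitrary.
cat "$log"
```

That is all the shim buys: commands that go through the lock run one at a time. Anything that doesn't take the lock is untouched by it.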

So I had a stack of real, defensible fixes, and CI was still losing two or three of sixteen shards on most runs. At this point, I went looking for ways of avoiding ports altogether.

The actual fix: stop publishing host ports

The port race comes from one specific thing: publishing a service’s port onto the host so a host-side process can reach it on localhost. (The network-create and address-pool errors are a separate, IPAM-and-namespace problem; the flock and the bigger address pool already dealt with those.) Rootless Docker’s port forwarding is what’s racing here, and it only exists because the job runs on the host and needs localhost:<port> to work.

So don’t run the job on the host. Run it inside a container on the same network as the services, and reach them by name. No published ports means no RootlessKit port forwarding, so the port race is simply gone.

In GitHub Actions terms, that’s adding a container: key to the job and dropping the ports: key from every service:

jobs:
  run-tests:
    runs-on: [self-hosted, linux, ARM64]
    container:
      image: localhost:5000/ci-toolchain:php8.4
    services:
      mysql:
        image: mysql:8.0
        # no `ports:` block at all
      redis:
        image: redis
      clickhouse:
        image: clickhouse/clickhouse-server:latest
    steps:
      - run: vendor/bin/pest
        env:
          DB_HOST: mysql          # the service name, not 127.0.0.1
          DB_PORT: 3306           # the container port, not a host port
          REDIS_HOST: redis
          CLICKHOUSE_HOST: clickhouse

The job container and the service containers share a network, so mysql:3306 resolves to the MySQL container directly. Nothing gets published to the host, so there’s no host port to bind and nothing for RootlessKit to race on. The jobs still each create a bridge network, so you keep the flock and the wider address pool; this kills the port half of the problem, not the network half.
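If you want the job to fail fast when name resolution isn't set up the way you expect, a first step along these lines can help. It's a sketch: `check_services` is a made-up helper, and it assumes the job image ships `getent` (glibc-based images usually do).

```shell
# Hypothetical sanity check: do the service names resolve from inside the job?
check_services() {
  local name rc=0
  for name in "$@"; do
    if getent hosts "$name" >/dev/null; then
      echo "ok   $name"
    else
      echo "FAIL $name"
      rc=1
    fi
  done
  return "$rc"
}

# In the job container, before the test step:
#   check_services mysql redis clickhouse
```

It only checks that the names resolve, not that the services are accepting connections yet; readiness is a separate question.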

Proving it

I didn’t want to trust this on vibes, so I wrote a small harness that does what the runner does (create a network, then create and start three published-port service containers) and runs sixteen of them concurrently. With host ports, on a freshly restarted daemon:

=== N=16 host-ports, burst=20s ===
     14 OK
      2 START_FAIL_m   (RootlessKit PortManager.AddPort: address already in use)

Two out of sixteen, reliably, run after run. Now the same sixteen with no published ports, services reachable by name only:

=== N=16 noports=1 burst=3s ===
     16 OK

Sixteen out of sixteen, every time. And notice the burst time: three seconds instead of twenty, because there’s no port-forwarding setup to do. Removing the host ports didn’t just fix the race, it made the whole thing faster.
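A minimal sketch of a harness in this shape, for anyone who wants to reproduce the comparison. It is not the exact script behind the numbers above: the function names, the `burst-` naming scheme, and the result labels are all invented for illustration, and it needs bash for the arrays.

```shell
#!/usr/bin/env bash
# Sketch of a burst harness: N concurrent "jobs", each creating a bridge
# network and starting three service containers on it.

services=(
  "mysql mysql:8.0 3306"
  "redis redis 6379"
  "clickhouse clickhouse/clickhouse-server 8123"
)

# one_job ID PUBLISH: prints OK, NET_FAIL, or START_FAIL_<service>.
one_job() {
  local id="$1" publish="$2" net="burst-net-$1" svc name image port
  docker network create "$net" >/dev/null 2>&1 || { echo NET_FAIL; return 0; }
  for svc in "${services[@]}"; do
    read -r name image port <<<"$svc"
    local args=(--detach --network "$net" --name "burst-$id-$name")
    # PUBLISH=1 maps the container port to a dynamically chosen host port.
    if [ "$publish" = 1 ]; then args+=(--publish "$port"); fi
    docker run "${args[@]}" "$image" >/dev/null 2>&1 \
      || { echo "START_FAIL_$name"; return 0; }
  done
  echo OK
}

# run_burst N PUBLISH: launch N jobs at once, then tally the outcomes.
run_burst() {
  local n="$1" publish="$2" i
  { for i in $(seq 1 "$n"); do one_job "$i" "$publish" & done; wait; } | sort | uniq -c
}

# cleanup_burst N: remove the containers and networks a burst created.
cleanup_burst() {
  local n="$1" i name
  for i in $(seq 1 "$n"); do
    for name in mysql redis clickhouse; do
      docker rm -f "burst-$i-$name" >/dev/null 2>&1
    done
    docker network rm "burst-net-$i" >/dev/null 2>&1
  done
  return 0
}
```

Run `run_burst 16 1` for the host-port case and `run_burst 16 0` for the no-ports case, with `cleanup_burst 16` in between. The sketch only checks that `docker run` succeeds, which is where the port bind fails; it doesn't wait for the services to become ready.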

Disclaimer time ;-)

This isn’t free. Two honest caveats.

The job now runs inside a container, so that container needs every tool your job shells out to. For our test suite that meant baking PHP and its extensions, Composer, Node, the MySQL client, dig, whois, and a browser into the image. A job that “just worked” on the host because the box happened to have those tools will fail in a container until you put them there. That’s a one-time effort.

And some jobs genuinely need a published port. Ours has one browser-based test that starts a mock SAML identity provider with its own docker run, which means it needs Docker access and a reachable port. That one stays a host job, where the host is the Docker host and a single published port can’t collide with anything because nothing else publishes one. Everything else runs in a container. The split is fine; you don’t have to pick one model for the whole pipeline.

The lesson I’d hand to anyone running dense CI on rootless Docker: the network-create side you can config your way out of, with a bigger address pool and a lock around the creates. The host-port side you can’t, not on rootless at this concurrency. So if a job doesn’t genuinely need a published host port, don’t give it one, and most of the pain never starts.

Tested on Docker 29.6.0 (rootless, fuse-overlayfs) on Ubuntu 24.04.4, kernel 6.8, arm64.