Infrastructure and Tooling

The choices in this chapter cover the inner loop — the seconds an engineer waits between saving a file and seeing the result. Optimising that loop is the single largest productivity lever available to a Go + TypeScript service team. The outer loop (CI, registry pushes, cluster rollouts) inherits the same conventions but is addressed in the chapters on Security (06-security.md) and Air-Gap (09-airgap.md).

TL;DR

  • Go services MUST live-reload through Air with a dedicated .air.<binary>.toml per binary. Watched extensions MUST be enumerated explicitly; generated directories MUST be excluded.
  • Frontend MUST use Vite for HMR with a dev-server proxy to the API; environment values stamped at dev-server boot SHOULD surface in the running bundle.
  • A Makefile at the repository root MUST be the single orchestrator for make dev, make build, make test, make lint, make proto, and make migrate. Shell aliases or untracked scripts MUST NOT duplicate these targets.
  • Production container images MUST be multi-stage, with the runtime stage on scratch or gcr.io/distroless/static. The Go toolchain MUST NOT appear in the runtime stage.
  • The Dockerfile Go version MUST be derived from go.mod at build time; it MUST NOT be hardcoded to a value that drifts from go.mod.
  • Local Kubernetes work SHOULD use k3s on the developer host; full clusters (kind, k3d, minikube) MAY be substituted for multi-node experiments.
  • Lint MUST run golangci-lint v2 with the STANDARD preset declared in .golangci.yml. CLI flag overrides for enabled linters MUST NOT be used.

Why this choice

Inner-loop tooling is where engineering tastes are loudest and least defensible. The selections below are deliberately boring: each tool is the most widely deployed open-source option in its niche, each tool ships a configuration file format that survives version bumps, and each tool composes with the others through process management rather than through plugins or shared state.

Two principles drive the slate:

  1. Filesystem-driven configuration. Every tool in this chapter reads a checked-in file (.air.<bin>.toml, vite.config.ts, Makefile, Dockerfile.<bin>, .golangci.yml). Engineers MUST be able to read the file and reproduce the behavior; "it works on my machine because I have a global config" is a class of defect this slate forecloses.
  2. Single orchestrator. The Makefile is the canonical orchestrator for build, test, lint, code generation, and dev-stack lifecycle. IDE-specific run configurations MAY exist for editor ergonomics, but they MUST delegate to the Makefile targets so a fresh clone needs only make and the language toolchains in PATH.

Prescriptive

Air for Go live reload

  • Each Go binary that participates in the inner loop MUST have its own Air configuration file named .air.<binary>.toml at the repository root. Sharing a single Air config across multiple binaries is prohibited because Air rebuilds the entire watched root and exits with non-zero on any participant's compile failure.
  • The Air configuration file MUST set root = "." and a binary-specific tmp_dir (for example, tmp/<binary>/). The tmp_dir MUST be gitignored.
  • The [build] section MUST set cmd to the exact go build invocation used in production, including the -ldflags block that stamps the version, commit, and date. Hardcoding -ldflags="" in dev produces empty version strings in /local/infrastructure-style diagnostic endpoints and SHOULD NOT happen.
  • The [build] section MUST set entrypoint to a vector that includes any environment variables required for dev mode (for example, ["env", "<APP>_DEV_MODE=true", "tmp/<binary>/<binary>", "serve"]). Setting environment variables in the shell that launches Air leaks them into every Air-spawned child and is harder to reason about.
  • include_ext MUST enumerate the extensions Air watches (["go", "yaml"] is a sensible default). Watching all extensions causes spurious rebuilds on editor swap files.
  • exclude_dir MUST exclude every generated, vendored, and documentation directory in the repository. New top-level directories introduced after the file is authored MUST be added to exclude_dir in the same commit that introduces them.
  • exclude_regex = ["_test\\.go$"] MUST be set; rebuilding the binary on a test file change is wasted work because make test already recompiles the package under test.
  • delay = 1000 (milliseconds) SHOULD be the default; lower values cause double-builds when editors save through a temp-rename cycle.
  • stop_on_error = true MUST be set; Air MUST NOT relaunch a stale binary if the most recent rebuild failed.
  • send_interrupt = true and kill_delay (in milliseconds) MUST be set together. kill_delay MUST be at least 500 for short-lived HTTP servers and SHOULD be 3000 for binaries that manage Kubernetes controllers or long-running watches; these need a graceful shutdown window before SIGKILL.
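
The rules above can be collected into one file. The following is a minimal sketch for a hypothetical binary named server, not a definitive configuration: the binary name, the main.version / main.commit / main.date variables, the APP_DEV_MODE variable, and the exclude_dir entries are all assumptions, and the cmd line assumes Air hands the command to a shell that expands the $(...) substitutions.

```toml
# .air.server.toml: sketch for a hypothetical binary named "server".
root = "."
tmp_dir = "tmp/server"   # must be gitignored

[build]
# Mirror the production build, including the -ldflags version stamp.
# Assumes package main declares version, commit, and date string variables.
cmd = "go build -ldflags \"-X main.version=$(git describe --tags --always --dirty) -X main.commit=$(git rev-parse --short HEAD) -X main.date=$(date -u +%Y-%m-%dT%H:%M:%SZ)\" -o tmp/server/server ./cmd/server"
# APP_DEV_MODE stands in for the <APP>_DEV_MODE placeholder.
entrypoint = ["env", "APP_DEV_MODE=true", "tmp/server/server", "serve"]
include_ext = ["go", "yaml"]
# Illustrative list; extend it whenever a new top-level directory appears.
exclude_dir = ["tmp", "bin", "web", "node_modules", "docs", "deploy"]
exclude_regex = ["_test\\.go$"]
delay = 1000             # milliseconds
stop_on_error = true
send_interrupt = true
kill_delay = 500         # milliseconds; 3000 for controller-style binaries
```

A second binary would get its own copy (for example .air.operator.toml) with its own tmp_dir, cmd, entrypoint, and kill_delay.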

Vite for frontend HMR

  • The frontend MUST use Vite for dev-server HMR. The dev server MUST proxy API routes to the local Go server so the browser never sees a CORS preflight in dev.
  • The Vite config MUST stamp build-time environment values (commit, version, build date) into define so the running bundle reports a real version in its /about or /infrastructure view. The stamping MUST read from process.env at config-load time, not from a client-side fetch.
  • The dev server's --host 0.0.0.0 and --port MUST be set explicitly in the launch command. Vite otherwise defaults to port 5173 and silently moves to the next free port when that one is taken, which breaks the proxy contract and the documented developer URL.
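
A vite.config.ts along these lines would cover the proxy and stamping rules. It is a sketch under stated assumptions: the /api prefix, the ports, the APP_* environment variable names, and the __APP_*__ constants are hypothetical and should be replaced with the project's own.

```ts
import { defineConfig } from "vite";

// Values are read from process.env once, when the config is loaded.
// The APP_* variable names and __APP_*__ constants are illustrative.
export default defineConfig({
  server: {
    host: "0.0.0.0",
    port: 5173,
    strictPort: true, // fail instead of moving to another port
    proxy: {
      // Forward API calls to the local Go server (assumed to listen on 8080)
      // so the browser only ever talks to the dev-server origin.
      "/api": { target: "http://localhost:8080", changeOrigin: true },
    },
  },
  define: {
    // define performs text replacement, so values must be JSON-encoded.
    __APP_VERSION__: JSON.stringify(process.env.APP_VERSION ?? "dev"),
    __APP_COMMIT__: JSON.stringify(process.env.APP_COMMIT ?? "unknown"),
    __APP_BUILD_DATE__: JSON.stringify(process.env.APP_BUILD_DATE ?? "unknown"),
  },
});
```

The launch command can repeat the host and port so they stay visible in the Makefile or dev script, for example npx vite --host 0.0.0.0 --port 5173 --strictPort.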

Make-driven dev orchestration

  • The Makefile at the repository root MUST declare at least these phony targets:
      • dev: start the full local stack (Postgres, server, operator, web)
      • build: produce all production binaries under bin/
      • test: run the unit test suite
      • test-integration: run integration tests behind a build tag
      • lint: run golangci-lint run ./...
      • vet: run go vet ./...
      • fmt: verify gofmt -l . is empty; do NOT modify files in CI
      • proto (or generate): regenerate sqlc, buf, and type bindings
      • migrate: apply pending database migrations
      • check: composite target running lint, vet, fmt, and test
  • Every target MUST be declared in .PHONY. Targets that depend on generated artifacts (for example, build depending on proto) MUST declare those dependencies explicitly so a fresh clone produces a working binary from make build alone.
  • The Makefile MUST source build-time variables (VERSION, COMMIT, DATE) from git with ?= so CI can override them via the environment.
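  • Per-binary targets MUST follow the pattern bin/<name>: ; go build -ldflags="$(GO_LDFLAGS)" -o $@ ./cmd/$(@F) so adding a new binary is a one-line BINARIES := edit. Here $@ expands to the full target (bin/<name>) and $(@F) to its file part (<name>); writing -o bin/$@ would place the output under bin/bin/.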
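
One way to wire these rules together is sketched below for GNU Make. The binary names, the main.version / main.commit / main.date variables, and the placeholder proto recipe are assumptions; recipe lines must be indented with a tab.

```makefile
# Hypothetical binary names; each one needs a matching ./cmd/<name> package.
BINARIES    := server operator
BIN_TARGETS := $(addprefix bin/,$(BINARIES))

# ?= lets CI override any of these through the environment.
VERSION ?= $(shell git describe --tags --always --dirty)
COMMIT  ?= $(shell git rev-parse --short HEAD)
DATE    ?= $(shell date -u +%Y-%m-%dT%H:%M:%SZ)

# Assumes package main declares version, commit, and date string variables.
GO_LDFLAGS := -X main.version=$(VERSION) -X main.commit=$(COMMIT) -X main.date=$(DATE)

# The bin/ targets are phony on purpose: go build has its own cache, so
# re-running the rule is cheap and never serves a stale binary.
.PHONY: build proto $(BIN_TARGETS)

build: $(BIN_TARGETS)

# Placeholder recipe; a real project would invoke its code generators here.
proto:
	@echo "regenerate sqlc and buf outputs here"

# Static pattern rule: $@ is bin/<name>, $* is the stem <name>.
$(BIN_TARGETS): bin/%: proto
	go build -ldflags="$(GO_LDFLAGS)" -o $@ ./cmd/$*
```

A static pattern rule is used rather than a plain bin/% pattern because GNU Make skips implicit-rule search for phony targets.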

Multi-stage Docker images

  • Every production image MUST use a multi-stage Dockerfile with at least two stages: builder and runtime.
  • The builder stage MUST use the official golang:<version> image pinned to the same minor as go.mod. The image MUST be addressed by digest in production; addressing by mutable tag is acceptable only in dev compose stacks.
  • The runtime stage MUST use scratch (preferred for pure-Go static binaries) or gcr.io/distroless/static-debian12:nonroot (preferred when the binary needs /etc/ssl/certs/ca-certificates.crt and a nonroot user). The runtime stage MUST NOT contain apt, apk, bash, or any package manager.
  • USER MUST be set to a non-zero numeric UID (for example, 65532 for distroless nonroot). The container MUST NOT run as root.
  • HEALTHCHECK SHOULD be defined where the runtime image supports it. scratch does not include a shell; for scratch images, runtime health MUST be exercised by the orchestrator probe (Kubernetes liveness/readiness) rather than by HEALTHCHECK.
  • The image MUST set LABEL org.opencontainers.image.source and LABEL org.opencontainers.image.version so the registry can resolve back to the source revision.
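
A two-stage Dockerfile that follows these rules might look like the sketch below. The binary name server, the main.version / main.commit variables, and the source URL are hypothetical, and GO_VERSION is expected to arrive as a build argument. For brevity the base images are referenced by tag; a production file would append the reviewed @sha256 digest to each FROM line.

```dockerfile
# GO_VERSION has no default on purpose: the build fails unless it is supplied.
ARG GO_VERSION
FROM golang:${GO_VERSION} AS builder
WORKDIR /src
# Copy module files first so dependency downloads are cached across builds.
COPY go.mod go.sum ./
RUN go mod download
COPY . .
ARG VERSION=dev
ARG COMMIT=unknown
# CGO_ENABLED=0 produces a static binary that runs on scratch or distroless.
RUN CGO_ENABLED=0 go build -trimpath \
    -ldflags="-X main.version=${VERSION} -X main.commit=${COMMIT}" \
    -o /out/server ./cmd/server

FROM gcr.io/distroless/static-debian12:nonroot AS runtime
ARG VERSION=dev
# Placeholder URL; point it at the real repository.
LABEL org.opencontainers.image.source="https://example.com/your-org/your-repo"
LABEL org.opencontainers.image.version="${VERSION}"
COPY --from=builder /out/server /server
USER 65532:65532
ENTRYPOINT ["/server"]
```

The runtime stage contains only the binary and what the distroless base ships, so no compiler, shell, or package manager reaches production.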

Dockerfile Go version pinning

  • The Dockerfile's golang:<version> builder stage MUST be derived from go.mod at build time. The toolchain version MUST NOT be hardcoded in the Dockerfile where it could drift from go.mod.
  • The recommended pattern is a Makefile target that reads the go <version> line from go.mod and passes it as a --build-arg GO_VERSION=<value> to docker build, with the Dockerfile declaring ARG GO_VERSION and FROM golang:${GO_VERSION}-alpine. An equivalent pattern is a generated .go-version stamp file that both the Dockerfile and CI matrix read from.
  • CI MUST fail the build when the Go version declared in go.mod and the version used by the Dockerfile builder stage disagree. A single shell line in the make build target is sufficient to extract the canonical version: grep "^go " go.mod | awk '{print $2}'.
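
The extraction and the CI comparison can be sketched as two small POSIX shell functions. The function names are hypothetical; the awk expression is an equivalent of the grep/awk pipeline that stops at the first go directive.

```shell
#!/bin/sh
# go_mod_version FILE: print the version from the "go" directive of a go.mod.
go_mod_version() {
    awk '$1 == "go" { print $2; exit }' "$1"
}

# check_go_version FILE EXPECTED: succeed only when go.mod declares EXPECTED.
check_go_version() {
    actual=$(go_mod_version "$1")
    if [ "$actual" != "$2" ]; then
        echo "go.mod declares Go $actual but the builder stage uses $2" >&2
        return 1
    fi
}
```

A build target could then run docker build --build-arg GO_VERSION="$(go_mod_version go.mod)" . and a CI step could call check_go_version go.mod "$GO_VERSION" before pushing the image.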

scripts/dev.sh entrypoint

  • A single shell script (conventionally scripts/dev.sh) MUST start the full local stack as background processes and write a PID file for later teardown. Engineers MUST be able to run one command and have a working stack.
  • The script MUST verify that required tools (docker, air, go, node, npx) are on PATH before doing any work, and MUST print install hints (URLs or go install commands) when any are missing.
  • The script MUST trap SIGINT and SIGTERM and terminate every child process cleanly. Leaked Air processes from a prior session produce ping-pong restarts in the new session and MUST NOT happen.
  • The script MUST source .env.dev (tracked, canonical defaults) and then .env.dev.local (gitignored, per-developer override). A missing .env.dev.local MUST be auto-created from .env.dev so a fresh clone has an editable file without touching the tracked template.
  • The script MUST rotate log files on start: stale logs from the previous session MUST be renamed with a timestamp suffix so the current session's logs are not commingled with history.
  • The script MUST launch background processes under nohup; without it, an SSH disconnection sends SIGHUP to Air and the dev stack exits without explanation.
  • Subcommands MUST include at least up, down, status, logs, migrate, and migrate-down. Subcommand dispatch MUST validate the argument and print usage on an unknown command.
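
The helper functions such a script needs can be sketched in POSIX shell as follows. This is an illustration under assumptions, not the layout of any particular script: the function names, the LOG_DIR and PID_FILE defaults, and the hint text are hypothetical, and dispatch only echoes what a real script would run.

```shell
#!/bin/sh
# Hypothetical defaults; both paths should be gitignored.
LOG_DIR=${LOG_DIR:-tmp/logs}
PID_FILE=${PID_FILE:-tmp/dev.pids}

# require_tools TOOL...: report every tool missing from PATH, then fail.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing required tool: $tool (install it and re-run)" >&2
            missing=1
        fi
    done
    return "$missing"
}

# load_env DIR: source tracked defaults, then the per-developer override,
# creating the override from the template on first run.
load_env() {
    [ -f "$1/.env.dev.local" ] || cp "$1/.env.dev" "$1/.env.dev.local"
    set -a   # export everything the env files assign
    . "$1/.env.dev"
    . "$1/.env.dev.local"
    set +a
}

# rotate_log FILE: move a non-empty log from the previous session aside.
rotate_log() {
    if [ -s "$1" ]; then
        mv "$1" "$1.$(date +%Y%m%d-%H%M%S)"
    fi
}

# start_bg NAME CMD...: run CMD under nohup, log to NAME.log, record its PID.
start_bg() {
    name=$1
    shift
    mkdir -p "$LOG_DIR"
    rotate_log "$LOG_DIR/$name.log"
    nohup "$@" >>"$LOG_DIR/$name.log" 2>&1 &
    echo "$!" >>"$PID_FILE"
}

# stop_all: terminate every recorded child and remove the PID file.
stop_all() {
    [ -f "$PID_FILE" ] || return 0
    while read -r pid; do
        kill "$pid" 2>/dev/null || true
    done <"$PID_FILE"
    rm -f "$PID_FILE"
}

# dispatch SUBCOMMAND: validate the subcommand or print usage.
dispatch() {
    case "${1:-}" in
        up|down|status|logs|migrate|migrate-down) echo "running: $1" ;;
        *)
            echo "usage: dev.sh {up|down|status|logs|migrate|migrate-down}" >&2
            return 2
            ;;
    esac
}
```

In a real script the main body would call require_tools docker air go node npx, install trap stop_all INT TERM, and launch each binary with something like start_bg server air -c .air.server.toml.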

k3s for local clusters

  • Local Kubernetes work SHOULD use k3s installed via the upstream get.k3s.io script. Engineers SHOULD NOT install k3s through per-distro package managers because the upstream script handles systemd unit setup deterministically.
  • The dev script MUST make /etc/rancher/k3s/k3s.yaml readable by the developer's user account (chmod 644 is acceptable on a single-user workstation; production hosts MUST NOT relax this).
  • The dev script MUST uncordon the local node on startup if k3s's InvalidDiskCapacity race left it cordoned. The check is kubectl get node -o jsonpath='{.items[0].spec.unschedulable}' and the remediation is kubectl uncordon <name>.
  • For multi-node experiments, k3d MAY be used; it spins up k3s inside Docker and is friendly to laptops. Production deployments MUST NOT depend on either k3s or k3d.
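
The uncordon check described above can be sketched as a single shell function. The function name is an assumption, and the sketch assumes a single-node cluster so that .items[0] is the local node.

```shell
#!/bin/sh
# ensure_node_schedulable: uncordon the first node if it is unschedulable.
ensure_node_schedulable() {
    node=$(kubectl get node -o jsonpath='{.items[0].metadata.name}')
    unschedulable=$(kubectl get node -o jsonpath='{.items[0].spec.unschedulable}')
    # The field is absent (empty output) on a schedulable node.
    if [ "$unschedulable" = "true" ]; then
        kubectl uncordon "$node"
    fi
}
```

Calling it once after k3s reports ready keeps the remediation out of the engineer's hands; on a healthy node it issues two reads and no writes.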

golangci-lint v2 STANDARD pack

  • Lint MUST run via golangci-lint run ./... with the v2 series. v1 is end-of-life as of golangci-lint v2.0; new projects MUST start on v2.
  • The STANDARD preset (~50 linters) MUST be the baseline. Disabling a STANDARD linter MUST be done in .golangci.yml with a comment explaining the rationale and the spec ID that authorizes the exception. CLI flags MUST NOT be used to disable a STANDARD linter because flags are invisible to code review.
  • Complexity thresholds (cyclop.max-complexity, gocognit.min-complexity, funlen.lines, funlen.statements, nestif.min-complexity) MUST be set in .golangci.yml to values aligned with the community "golden config" baseline rather than the linter defaults. The defaults are stricter than every surveyed major OSS Go codebase and produce churn without correctness benefit.
  • run.timeout MUST be at least 15 minutes for repositories with more than 5,000 Go files. The default 5-minute timeout is exhausted in CI by the linter's package-loading phase before any linter runs.
  • run.modules-download-mode: readonly MUST be set so a lint run never mutates go.mod or go.sum.
  • The formatters block MUST enable gofmt and goimports. The goimports.local-prefixes setting MUST be set to the module's canonical import path so first-party imports are grouped separately from third-party imports.
  • Path-level exclusions (linters.exclusions.paths) MUST be enumerated for scratch directories (tmp/) and intentionally-out-of-scope helpers. Per-path linter relaxations (for example, allowing init() in cmd/) MUST be expressed in linters.exclusions.rules rather than via build tags.
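
A .golangci.yml in the v2 schema that reflects these rules might look like the sketch below. The thresholds are the values cited for the reference implementation in this chapter; the module path is a placeholder, and how the chapter's STANDARD pack maps onto linters.default plus linters.enable is an assumption, so the enable list here is illustrative rather than complete.

```yaml
version: "2"

run:
  timeout: 15m
  modules-download-mode: readonly

linters:
  default: standard
  # Illustrative additions; extend to match the project's STANDARD pack.
  enable:
    - cyclop
    - gocognit
    - funlen
    - nestif
    - gochecknoinits
  settings:
    cyclop:
      max-complexity: 30
    gocognit:
      min-complexity: 20
    funlen:
      lines: 100
      statements: 50
    nestif:
      min-complexity: 5
  exclusions:
    paths:
      - tmp
    rules:
      # Allow init() in command packages without resorting to build tags.
      - path: ^cmd/
        linters:
          - gochecknoinits

formatters:
  enable:
    - gofmt
    - goimports
  settings:
    goimports:
      local-prefixes:
        - example.com/your-org/your-module   # placeholder module path
```

Any linter disabled later would be listed under linters.disable with a comment naming the rationale and the authorizing spec ID, which keeps the exception visible in code review.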

Reference Implementation: Pioneer

Concrete files in the Pioneer donor codebase that implement the prescriptions above:

  • Air configs: /home/ubuntu/pioneer/.air.server.toml and /home/ubuntu/pioneer/.air.operator.toml. Each binary has its own file; both stamp buildinfo.Version / Commit / Date via -ldflags in the [build] cmd line; both set PIONEER_DEV_MODE=true in entrypoint; both enumerate the same exclude_dir list covering tmp, bin, web, node_modules, e2e, e2e-results, deploy, docs, scripts, and a long tail of generated paths. kill_delay is 500 ms for the server (HTTP) and 3000 ms for the operator (long-running controller watches).
  • golangci-lint config: /home/ubuntu/pioneer/.golangci.yml (version: "2", 15-minute timeout, STANDARD pack, complexity thresholds at the golden-config baseline: cyclop.max-complexity: 30, gocognit.min-complexity: 20, funlen.lines: 100, funlen.statements: 50, nestif.min-complexity: 5; formatters block enables gofmt and goimports with local-prefixes: github.com/AlphaBravoCompany/pioneer).
  • Makefile: /home/ubuntu/pioneer/Makefile. The BINARIES := list drives a single rule that builds every cmd; GO_LDFLAGS is computed once and reused; check is the composite quality gate (lint vet fmt test security); docker-build iterates over a fixed list of per-binary Dockerfiles with VERSION/COMMIT/DATE --build-args; fips-build is a separate target that flips GOEXPERIMENT=boringcrypto and adds a -tags fips build tag.
  • scripts/dev.sh: /home/ubuntu/pioneer/scripts/dev.sh. The script implements require_dev_tools, load_env (with .env.dev / .env.dev.local precedence), start_pg, start_server, start_operator, start_web, ensure_k3s, ensure_kubeconfig_readable, ensure_node_schedulable, migrate_db, and cmd_up / cmd_down / cmd_status / cmd_logs / cmd_migrate / cmd_migrate_down. Air is started via nohup air -c .air.<bin>.toml. Log rotation in rotate_log() preserves prior-session logs with a timestamp suffix.

Pinned versions

The table below records the versions a fresh project SHOULD adopt as of the chapter's last-reviewed date. Bumping a row in this table MUST include reviewing the corresponding upstream release notes and updating the relevant last-reviewed field.

Component | Version pinned | Rationale
Go toolchain | 1.26.1 | Latest stable; matches the donor mise baseline.
Air (live reload) | v1.62.0 or later (the air-verse/air fork) | The cosmtrek/air repository moved; air-verse is the maintained successor.
Node.js | 25.8.1 | Matches the donor mise baseline; aligns with Vite 7 system requirements.
Vite | 7.x | Current major; supports vite.config.ts define for build-time stamping.
Docker (engine) | 27.x or later | Required for multi-platform builds and --platform=$BUILDPLATFORM.
golangci-lint | v2.x | STANDARD pack; v1 EOL.
k3s | v1.32.x (LTS channel) | Matches the upstream Kubernetes LTS cadence.
gotestsum | v1.12.x | JUnit XML output (covered in 07-testing.md).

Pitfalls

The bullets below describe common failure modes and the countermeasure this chapter prescribes for each.

  • Go version hardcoded in the Dockerfile. Pin the builder stage version through a --build-arg GO_VERSION derived from go.mod so the Dockerfile cannot drift. A CI step MUST compare the two values.
  • Air rebuilding on test-file edits. Set exclude_regex = ["_test\\.go$"]. Test changes are exercised by make test, not by Air.
  • Missing nohup around background dev processes. Without nohup, SSH disconnection sends SIGHUP to Air and the dev stack exits silently. Wrap every Air launch and the Vite launch in nohup and append their output to rotated log files with >>.
  • Disabling a STANDARD linter via the CLI. A --disable flag in CI is invisible to code review and to local runs. Encode all policy in .golangci.yml; the file is the contract.
  • Single Air config shared across binaries. Air rebuilds the watched root and exits non-zero on any participant failure, so a shared config causes one binary's compile error to take down every binary. Give each binary its own .air.<bin>.toml.
  • Root user in the runtime image. Set USER to a non-zero numeric UID. Distroless nonroot uses 65532; scratch images SHOULD declare a numeric UID explicitly with USER 65532:65532.
  • Mutable Docker tag references in production. The builder stage MAY use golang:<version> by tag for developer ergonomics, but production base images MUST be addressed by digest so a registry republishing the tag cannot inject an unreviewed layer.
  • PATH-missing tools surfacing as cryptic log entries. The require_dev_tools check in the dev script SHOULD fail fast with install hints; without it, missing tools produce empty log files several minutes into a debugging session.
  • k3s node cordoned after restart. k3s has an InvalidDiskCapacity race on restart that can leave the local node unschedulable. The dev script SHOULD uncordon the node on startup rather than expect engineers to discover the symptom by hand.

See also

  • RFC 2119 keywords — every MUST/SHOULD/MAY in this chapter follows the canonical definitions.
  • Twelve-Factor App — Factor III (Config), Factor V (Build / Release / Run), Factor X (Dev/Prod Parity).
  • Air documentation — full TOML schema reference.
  • Vite Configuration Reference — the define and server.proxy blocks underpin the dev-stack contract.
  • golangci-lint v2 docs — STANDARD preset, formatters, and linters.exclusions schema.
  • k3s documentation — installation and kubeconfig handling.
  • Distroless images — the static-debian12:nonroot image documented above.
  • Chapter 05-observability.md — the dev compose stack referenced in scripts/dev.sh includes Prometheus, Tempo, Loki, and Grafana containers wired to the OTel exports prescribed there.
  • Chapter 06-security.md — image hardening, gosec/govulncheck, and FIPS-mode build tags are tied to the multi-stage builds prescribed here.
  • Chapter 07-testing.md: the make test / make test-integration split, build-tag separation, and the JUnit XML pipeline.
  • Future ADRs — toolchain version source-of-truth (go.mod vs. separate stamp file) is a candidate ADR; STANDARD-pack scope is a candidate ADR.