Testing

Tests are the executable specification of a system's behavior. They MUST be deterministic, isolatable, and runnable on a developer laptop without a privileged backend. This chapter prescribes the Go-side test idioms, the integration-test plumbing, the browser-level end-to-end runner, and the CI artifact format.

TL;DR

Go tests MUST be table-driven: one outer test function, one slice of named cases, one t.Run(name, func(t *testing.T) { ... }) per case, and t.Parallel() inside each subtest.
Integration tests MUST exercise a real PostgreSQL via testcontainers-go; per-test database isolation MUST be provided by pgtestdb using template-clone snapshots.
Mocking at the database boundary MUST NOT happen. The query layer MUST be exercised against a real Postgres in integration tests; only layers ABOVE the query interface MAY be unit-tested with fakes.
Browser-level end-to-end tests MUST use Playwright. The configuration file MUST be checked in; auth state MUST be captured via a setup project and reused via storageState.
CI output MUST be JUnit XML produced by gotestsum so the CI platform's test-report widget can render per-test results.
Unit tests and integration tests MUST live in separate build-tag partitions (//go:build integration for integration); make test runs unit, make test-integration runs integration.

Why this choice

Three forces shape the slate.

The DB boundary is the most defect-prone layer of a Go service. Mocking it hides the bugs that motivate integration testing in the first place. Real-Postgres integration tests catch type coercions, constraint violations, transaction-isolation surprises, and SQL syntax errors that mocks paper over.
Isolation cost MUST be amortized. A naive "spin up a container per test" model is too slow to run on every save. testcontainers provides the container; pgtestdb provides per-test template clones from a single warmed schema; together they amortize the container-start cost across the entire suite.
CI artifacts MUST be machine-readable. Plain go test output is opaque to CI test-report widgets and to most reporting tools. JUnit XML is the lingua franca.

External anchors:

Go Test Documentation — t.Run, t.Parallel, t.Cleanup, testing.M semantics.
testcontainers-go documentation — the canonical container-management library.
pgtestdb README — the template-clone pattern.
Playwright Documentation — fixtures, projects, dependencies, retries.
gotestsum README — JUnit XML and live-output formats.

Prescriptive

Table-driven tests

Every test function that exercises more than one input case MUST use the table-driven form:

func TestParse(t *testing.T) {
    cases := []struct {
        name    string
        input   string
        want    Value
        wantErr bool
    }{
        {name: "empty input", input: "", wantErr: true},
        {name: "well-formed", input: "ok", want: Value{...}},
    }
    for _, tc := range cases {
        t.Run(tc.name, func(t *testing.T) {
            t.Parallel() // subtests run concurrently with each other
            got, err := Parse(tc.input)
            if tc.wantErr {
                // Error cases only assert that an error was returned.
                if err == nil {
                    t.Fatal("want err, got nil")
                }
                return
            }
            if err != nil {
                t.Fatalf("unexpected err: %v", err)
            }
            if got != tc.want {
                t.Fatalf("got %v, want %v", got, tc.want)
            }
        })
    }
}

t.Parallel() MUST be the first call inside every subtest body unless the subtest specifically depends on serial state (a writable filesystem path, an environment variable). The Go 1.22 loop-variable semantics make capturing tc safe; earlier Go releases MUST capture with tc := tc and a // keep test cases independent comment.
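Each case MUST have a stable, descriptive name field. The t.Run name becomes the JUnit subtest name and the -run filter target (spaces are rewritten to underscores, so the "empty input" case is selected with -run 'TestParse/empty_input'); empty, duplicated, or auto-generated names produce failures that cannot be filtered or tracked across runs.
t.Cleanup MUST be used in preference to defer inside test helpers. A defer in a helper fires when the helper returns, which is before the test has used the resource; a t.Cleanup callback runs after the calling test and all of its subtests have finished, including when the test ended through t.FailNow.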
Helper functions that call t.Fatalf MUST call t.Helper() first so the failure line points at the caller, not the helper.
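
The helper rules above can be sketched in one small test file. This is a minimal illustration, not a prescribed implementation: countLines, countLinesInFile, and writeFixture are hypothetical names invented for the example, and os.MkdirTemp is used only so that the t.Cleanup call is visible.

```go
package main

import (
	"os"
	"path/filepath"
	"strings"
	"testing"
)

// countLines is a hypothetical function under test: it counts the
// newline characters in s.
func countLines(s string) int {
	return strings.Count(s, "\n")
}

// countLinesInFile reads path and applies countLines to its contents.
func countLinesInFile(path string) (int, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return countLines(string(data)), nil
}

// writeFixture is a hypothetical test helper. t.Helper() makes a failure
// report the caller's line; t.Cleanup postpones removal of the directory
// until the calling test has finished. A defer here would delete the file
// as soon as writeFixture returned.
func writeFixture(t testing.TB, contents string) string {
	t.Helper()
	dir, err := os.MkdirTemp("", "fixture-")
	if err != nil {
		t.Fatalf("create temp dir: %v", err)
	}
	t.Cleanup(func() { _ = os.RemoveAll(dir) })

	path := filepath.Join(dir, "input.txt")
	if err := os.WriteFile(path, []byte(contents), 0o600); err != nil {
		t.Fatalf("write fixture: %v", err)
	}
	return path
}

func TestCountLinesInFile(t *testing.T) {
	cases := []struct {
		name     string
		contents string
		want     int
	}{
		{name: "empty file", contents: "", want: 0},
		{name: "two lines", contents: "a\nb\n", want: 2},
	}
	for _, tc := range cases {
		tc := tc // keep test cases independent (redundant on Go 1.22+)
		t.Run(tc.name, func(t *testing.T) {
			t.Parallel()
			path := writeFixture(t, tc.contents)
			got, err := countLinesInFile(path)
			if err != nil {
				t.Fatalf("unexpected err: %v", err)
			}
			if got != tc.want {
				t.Fatalf("got %d, want %d", got, tc.want)
			}
		})
	}
}
```

Saved as a _test.go file, a single case can be selected with go test -run 'TestCountLinesInFile/two_lines'. In real code t.TempDir() already provides a directory that is removed automatically, so the explicit Cleanup is needed only for resources the testing package does not manage.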

testcontainers + pgtestdb for real-DB integration tests

Integration tests MUST stand up a real PostgreSQL container. The canonical wiring is testcontainers-go/modules/postgres with the WithDatabase, WithUsername, and WithPassword options.
The container MUST be started once per test process, not once per test. The recommended pattern is a TestMain that starts the container, captures the connection string, and passes it through a shared package-level variable to pgtestdb.New(t, ...).
Per-test database isolation MUST be provided by pgtestdb. pgtestdb uses Postgres's template-database mechanism (CREATE DATABASE ... TEMPLATE) to clone a freshly migrated database for each test; a clone typically completes in tens of milliseconds because Postgres copies the template's files instead of replaying the migrations.
Each test MUST call pgtestdb.New(t, conf, migrator), where conf is a pgtestdb.Config built from the container's connection details, and receive a fresh *sql.DB. Tests MUST NOT share mutable state.
The migrator passed to pgtestdb.New MUST wrap the same goose migration implementation that production uses. Forking the migration logic for tests defeats the value of integration testing.
The container image tag MUST match the production Postgres version down to the minor release (a full major.minor tag, not a floating major tag). Differences between Postgres versions in authentication defaults and in error messages are a recurring source of pass-on-laptop / fail-in-CI flake.
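
The template-clone step can be pictured as ordinary SQL. The sketch below is an illustration of the Postgres mechanism only, not pgtestdb's actual code: templateName, cloneStatement, and the naming scheme are assumptions made for the example.

```go
package main

import "strings"

// quoteIdent quotes a Postgres identifier, doubling any embedded
// double quotes.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// templateName derives a template database name from a hash of the
// migrations, so a changed migration set produces a new template.
// The prefix is a hypothetical naming scheme.
func templateName(migrationHash string) string {
	return "testdb_tpl_" + migrationHash
}

// cloneStatement returns the SQL that copies the migrated template into
// a fresh per-test database.
func cloneStatement(template, instance string) string {
	return "CREATE DATABASE " + quoteIdent(instance) +
		" TEMPLATE " + quoteIdent(template)
}
```

For a hash of abc123 and an instance named test_1, cloneStatement(templateName("abc123"), "test_1") yields CREATE DATABASE "test_1" TEMPLATE "testdb_tpl_abc123". Postgres refuses to copy a template while other sessions are connected to it, which is one reason to leave template management to the library instead of hand-rolling it.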

No mocks for the DB boundary

The query layer (the sqlc-generated Queries interface and any hand-written wrappers around pgx) MUST be exercised against a real Postgres in integration tests. A mock implementation of the query interface MUST NOT be used in integration coverage.
Layers ABOVE the query interface (handlers, business-logic services) MAY accept a Queries interface and be unit-tested with a hand-written fake. The fake MUST live in _test.go files; it MUST NOT be exported.
The reason for this rule is empirical: fake Queries implementations drift from the real query semantics (NULL handling, type coercion, error shape) and accumulate "passes in CI, fails in prod" defects. The cost of a real-Postgres test (typically tens of milliseconds per template clone) is far below the cost of debugging a mock-divergence defect.
Repository tests (the layer that wraps Queries with retry, caching, or transaction-management logic) MUST run in the integration partition with real Postgres. Stubbing the underlying Queries interface in repository tests MUST NOT happen.
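
What "a hand-written fake above the query interface" can look like is sketched below, assuming Go 1.22 or later. Every name in it (User, userQueries, Greeter, fakeUserQueries, errNotFound) is hypothetical; in a real package the interface would be the slice of the generated query methods that the service needs, and the fake and the test would sit in a _test.go file.

```go
package main

import (
	"context"
	"errors"
	"testing"
)

// User is a hypothetical row type.
type User struct {
	ID   int64
	Name string
}

// userQueries is the narrow, hypothetical slice of the query interface
// that the service depends on.
type userQueries interface {
	GetUser(ctx context.Context, id int64) (User, error)
}

// errNotFound stands in for the driver's "no rows" error.
var errNotFound = errors.New("user not found")

// Greeter is business logic that sits above the query layer.
type Greeter struct{ q userQueries }

// Greeting falls back to a guest greeting when the user does not exist.
func (g Greeter) Greeting(ctx context.Context, id int64) (string, error) {
	u, err := g.q.GetUser(ctx, id)
	if errors.Is(err, errNotFound) {
		return "Hello, guest", nil
	}
	if err != nil {
		return "", err
	}
	return "Hello, " + u.Name, nil
}

// fakeUserQueries is the unexported, hand-written fake. It is read-only,
// so parallel subtests can share one instance.
type fakeUserQueries struct{ users map[int64]User }

func (f fakeUserQueries) GetUser(_ context.Context, id int64) (User, error) {
	u, ok := f.users[id]
	if !ok {
		return User{}, errNotFound
	}
	return u, nil
}

func TestGreeting(t *testing.T) {
	g := Greeter{q: fakeUserQueries{users: map[int64]User{1: {ID: 1, Name: "Ada"}}}}
	cases := []struct {
		name string
		id   int64
		want string
	}{
		{name: "known user", id: 1, want: "Hello, Ada"},
		{name: "unknown user", id: 2, want: "Hello, guest"},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			t.Parallel()
			got, err := g.Greeting(context.Background(), tc.id)
			if err != nil {
				t.Fatalf("unexpected err: %v", err)
			}
			if got != tc.want {
				t.Fatalf("got %q, want %q", got, tc.want)
			}
		})
	}
}
```

The fake covers only the branching in Greeter. Whether GetUser really returns that error shape for a missing row is a property of the query layer, and that is what the integration partition verifies against real Postgres.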

Playwright for browser E2E

Browser-level end-to-end tests MUST use Playwright. The configuration file (web/playwright.config.ts or equivalent) MUST be checked in.
The Playwright config MUST set testDir and outputDir explicitly; defaulting to the working directory creates surprises when CI's working directory shifts.
fullyParallel SHOULD stay false until the suite is proven order-independent; enabling it on a suite that mutates shared state introduces flake. Tests MUST be written to be order-independent, and the team MUST set fullyParallel: true once the suite meets that bar.
retries MUST be zero in PR builds. Retries hide flake; flake is a defect. CI MAY set retries: 1 on the main branch as a safety net, but a test that passes only on retry MUST be triaged within the same SLA as a hard test failure.
Auth state MUST be captured by a setup project that logs in once and writes storageState to a known path (for example, e2e/.auth/admin.json). Subsequent projects MUST list the setup project under dependencies and reference the saved storageState via use.storageState.
trace: "retain-on-failure", screenshot: "only-on-failure", and video: "retain-on-failure" MUST be set. The CI artifact bundle MUST upload playwright-report/ and the outputDir so a reviewer can replay a failed run from the PR.
The baseURL MUST be configurable via an environment variable with a sensible localhost default (for example, process.env.URL ?? "http://localhost:8080"). Hardcoding the URL forecloses running the suite against a staging environment.

`gotestsum` for JUnit XML

make test and make test-integration MUST shell out to gotestsum rather than to go test directly. The canonical invocation is gotestsum --format pkgname --junitfile test-results.xml -- -race -count=1 ./... (arguments after the -- separator are passed through to go test).
The JUnit XML output MUST be uploaded as a CI artifact and MUST be fed into the CI platform's test-report widget (GitHub Actions' dorny/test-reporter, GitLab's JUnit ingestion, or equivalent).
gotestsum --format pkgname MUST be used for the live console output; the verbose go test -v output is unreadable on a multi-package suite.
-race MUST be set on every test run. Race detector overhead is acceptable on the dev laptop and required for catching the concurrency defects this slate's architecture invites.
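-count=1 MUST be set to bypass Go's test result cache. The cache is keyed on the package's sources and on the files and environment variables the test reads, so it cannot see external state such as a database schema or a container image; a cached pass against changed external state defeats the CI gate.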
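
To make "machine-readable" concrete, the sketch below decodes a JUnit-style report and lists the failing tests. It is an illustration under stated assumptions: sampleReport is hand-written in the general JUnit shape (it is not captured gotestsum output), and the struct and function names are hypothetical.

```go
package main

import "encoding/xml"

// junitSuites, junitSuite, junitCase and junitFailure model only the
// elements and attributes this sketch reads.
type junitSuites struct {
	Suites []junitSuite `xml:"testsuite"`
}

type junitSuite struct {
	Name  string      `xml:"name,attr"`
	Cases []junitCase `xml:"testcase"`
}

type junitCase struct {
	Classname string        `xml:"classname,attr"`
	Name      string        `xml:"name,attr"`
	Failure   *junitFailure `xml:"failure"` // nil when the case passed
}

type junitFailure struct {
	Message string `xml:"message,attr"`
}

// sampleReport is a hand-written example of the report shape.
const sampleReport = `<?xml version="1.0" encoding="UTF-8"?>
<testsuites>
  <testsuite name="example.com/app/parse" tests="2" failures="1">
    <testcase classname="example.com/app/parse" name="TestParse/empty_input" time="0.000"></testcase>
    <testcase classname="example.com/app/parse" name="TestParse/well-formed" time="0.000">
      <failure message="Failed" type="">got x, want y</failure>
    </testcase>
  </testsuite>
</testsuites>`

// failedTests returns "classname.name" for every test case that carries
// a <failure> element.
func failedTests(report string) ([]string, error) {
	var doc junitSuites
	if err := xml.Unmarshal([]byte(report), &doc); err != nil {
		return nil, err
	}
	var failed []string
	for _, s := range doc.Suites {
		for _, c := range s.Cases {
			if c.Failure != nil {
				failed = append(failed, c.Classname+"."+c.Name)
			}
		}
	}
	return failed, nil
}
```

Calling failedTests(sampleReport) returns a single entry, example.com/app/parse.TestParse/well-formed. A CI test-report widget performs the same kind of lookup, which is why stable t.Run names matter.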

Build-tag separation of unit and integration

Integration test files MUST carry a //go:build integration constraint at the top of the file, before the package clause and followed by a blank line:

//go:build integration

package foo_test

make test MUST run unit tests only (the gotestsum invocation passing -race -count=1 ./... to go test, with no build tags). It MUST complete in tens of seconds on a developer laptop without Docker running.
make test-integration MUST run integration tests (the same gotestsum invocation with -tags integration added to the go test flags). Untagged files still compile under -tags integration, so this run also includes the unit tests. It MUST be skippable in PR builds for branches that do not touch the query layer, and MUST be required on the merge-to-main gate.
The Docker daemon MUST NOT be a prerequisite for make test. Engineers without Docker installed MUST still be able to run the unit suite.
The same package MAY contain both unit and integration test files; the build tag is the discriminant. The team MUST NOT split integration tests into a sibling directory because import paths and helper visibility diverge across directories.

Reference Implementation: Pioneer

The Pioneer donor codebase implements the browser E2E layer above in /home/ubuntu/pioneer/web/playwright.config.ts. The config sets testDir: "./e2e", outputDir: "./e2e-results", fullyParallel: false, retries: 0, a 30-second test timeout with a 10-second expect timeout, baseURL from process.env.PIONEER_URL with a http://localhost:8080 default, trace: "retain-on-failure", screenshot: "only-on-failure", video: "retain-on-failure", and a two-project setup: a setup project matching global-setup.ts and a smoke project that depends on setup and loads storageState: "e2e/.auth/admin.json". The shape is the canonical pattern for a Playwright suite with auth-state reuse, and adopters SHOULD mirror the projects/dependencies/storageState wiring.

The donor's go.mod also pins the testcontainers-go module at a current minor (v0.42.0) and includes the postgres submodule at the same version. Adopters SHOULD treat the testcontainers version as informational — testcontainers-go is on a stable v0.4x line with frequent minor bumps; tracking it within one minor of upstream is the practical guidance.

Pinned versions

| Component | Pinned version | Rationale |
|---|---|---|
| Go toolchain | 1.26.1 | Per-iteration loop variables (Go 1.22+) simplify table-driven tests. |
| `github.com/testcontainers/testcontainers-go` | v0.42.x | Stable v0.4x line; Postgres module ships in lockstep. |
| `github.com/testcontainers/testcontainers-go/modules/postgres` | v0.42.x | Matches the testcontainers core minor. |
| `github.com/peterldowns/pgtestdb` | v0.1.x | Template-clone pattern; small surface, stable. |
| `gotestsum` | v1.12.x | JUnit XML output and live console format. |
| `@playwright/test` | ^1.59.1 | Latest stable; matches the donor `package.json` baseline. |
| Postgres container image | `postgres:16-alpine` | Matches the production major version; the tag floats across minor releases, so pin the full major.minor tag where the minor must match. Alpine for footprint. |

Pitfalls

Mocking the DB. Mocks drift from real semantics. The query layer MUST run against a real Postgres in integration tests.
Shared mutable test state. A package-level *sql.DB reused across tests guarantees flake. Each test MUST own a fresh template-cloned database from pgtestdb.
Missing t.Parallel. Serial tests artificially lengthen CI wall-clock. Every subtest body MUST call t.Parallel() unless it specifically depends on serial state.
-count=1 omitted from CI. The Go test cache can return a stale pass when only external state (a schema, a container image, a remote service) has changed. -count=1 bypasses the cache.
Setting retries: 1 in Playwright PR builds. A flake hidden by a retry stays in the codebase. PR builds MUST set retries: 0, and every failure MUST be triaged.
fullyParallel: true on an order-dependent suite. Flake appears as cross-test data leakage. The team SHOULD prove order independence first, then flip the flag.
Hardcoded baseURL in playwright.config.ts. Forecloses running against staging. Source from process.env with a localhost default.
Sharing the production migration logic only "in spirit" with pgtestdb. Two divergent migrators mean tests no longer cover production behavior. The same migrator MUST be passed to pgtestdb.
Mixing unit and integration tests without the //go:build integration tag. Developers without Docker can no longer run make test. Tag the integration files.
No JUnit upload from CI. Engineers cannot see which test failed without scrolling raw logs. CI MUST write the report with gotestsum --junitfile and upload it.
Postgres version mismatch (laptop Postgres 17, CI Postgres 16). Defaults, error wording, and a handful of query plans differ across versions. Pin the test container to the production version.