Data¶
This chapter prescribes the data layer for a Go service: the
database engine, the query toolchain, the migration tool, the
encryption-at-rest column shape, the discipline that keeps JSONB
fields tractable, soft-delete and tenant-scoping invariants, and
the LISTEN / NOTIFY pattern used for in-process eventing. The
RFC 2119 keywords used here (MUST, MUST NOT, SHOULD, SHOULD NOT,
MAY) carry the senses defined in chapter 00-introduction.md. The
prescriptions in this chapter populate the store.go layer
defined in 01-architecture.md.
TL;DR¶
- PostgreSQL is the canonical relational engine. Adopting orgs MAY swap engines but MUST account for the loss of advisory locks, LISTEN / NOTIFY, and JSONB.
- sqlc generates typed accessors from annotated
.sqlfiles; hand-written query strings SHOULD NOT appear in domain code. - goose runs migrations from an
embed.FSand MUST acquire a PostgreSQL advisory lock before applying any migration. - Encrypted columns MUST carry a
(key_id, algorithm, ciphertext)shape so that keys MAY be rotated and algorithms MAY evolve without rewriting the storage layer. - Soft-deleted rows MUST be filtered with
deleted_at IS NULLin every read path; multi-tenant tables MUST be filtered byorg_id(or the equivalent tenant key) in every read path.
Why this choice¶
- PostgreSQL offers JSONB, LISTEN / NOTIFY, advisory locks, proper transactional DDL, and a mature ecosystem of pure-Go drivers. These features simplify the rest of the stack and remove the need for several auxiliary services.
- sqlc turns annotated SQL into typed Go accessors. The generated code keeps domain logic free of string-interpolated SQL, avoids ORM overhead, and surfaces type drift at compile time rather than at runtime.
- goose is a thin migration tool with first-class support for
embed.FS, advisory locks, and timestamped migration files. It MUST be paired with the advisory-lock guard so that two replicas starting simultaneously cannot race to apply the same migration. (key_id, algorithm, ciphertext)columns make key rotation a row-by-row rewrite rather than a downtime event. A new key MAY be introduced with a newkey_idwhile the prior key is still available for decrypting historical rows.- JSONB discipline prevents the field from becoming an ungoverned dumping ground. A documented schema per JSONB column keeps the field as ergonomic as a typed column without losing flexibility.
- LISTEN / NOTIFY offers a low-latency in-process event channel without standing up a separate message broker for workloads where messages MAY be lost on restart.
Prescriptive guidance¶
Engine and driver¶
- Services MUST default to PostgreSQL 16 or newer.
- Services MUST use
pgx(v5) as the driver and connection pool. The standard librarydatabase/sqladapter SHOULD NOT be used for new code because it loses pgx's typed-parameter benefits. - Services SHOULD wrap the pool with an OpenTelemetry tracer
(such as
otelpgx) and a query-level tracer that records the call site for slow-query diagnosis. The two tracers MAY be composed via a multi-tracer adapter. - The database pool MUST be constructed in the composition root
and injected into stores; package-level
*pgxpool.Poolglobals MUST NOT be used.
sqlc workflow¶
- Queries MUST be authored as annotated SQL files under
<repo>/internal/server/database/queries/<domain>.sql. - Each query MUST carry an
-- name: <Name> :<kind>annotation (:one,:many,:exec,:execrows, etc.) so sqlc generates a typed Go function with the correct return shape. sqlc generateMUST run as part of the build pipeline. The generated*.sql.gofiles MUST be checked in so that consumers who do not runsqlcstill build the project.- Domain code MUST NOT contain hand-written query strings.
Dynamic queries (for example, optional filter combinations)
MAY use a SQL builder such as
squirrel, but the use MUST be justified in code review and SHOULD be replaced with sqlc- generated queries when the variant set is bounded.
goose migrations¶
- Migrations MUST live under
<repo>/internal/server/database/migrations/. - Migration files MUST follow goose's
NNNNNNNNNNNN_<slug>.sqlnaming with explicit-- +goose Upand-- +goose Downblocks. - The migration runner MUST embed the migrations directory via
//go:embedso the binary ships migrations without a separate artifact. - The migration runner MUST acquire a PostgreSQL advisory lock
(for example,
pg_advisory_lock(<numeric_key>)) before applying migrations. The lock MUST be released after the run whether the run succeeds or fails. - Migrations MUST be reversible when feasible. A destructive
Downstep MAY be omitted only when documented in the migration body and reviewed. - Migrations MUST run before the HTTP / RPC server starts accepting traffic; a failed migration MUST cause the binary to exit non-zero.
Encryption-at-rest column shape¶
Columns holding encrypted user data MUST be stored as
(key_id, algorithm, ciphertext) triples. The minimum shape:
CREATE TABLE secrets (
id uuid PRIMARY KEY,
org_id uuid NOT NULL,
key_id text NOT NULL,
algorithm text NOT NULL,
ciphertext bytea NOT NULL,
nonce bytea NOT NULL,
created_at timestamptz NOT NULL DEFAULT now(),
deleted_at timestamptz
);
key_idMUST identify the wrapping key version so that a new key MAY be introduced without losing the ability to decrypt historical rows.algorithmMUST identify the algorithm (for example,AES-256-GCM) so that algorithms MAY evolve over time.nonceMUST be stored when the algorithm requires it; nonces MUST be unique per(key_id, ciphertext)pair.- Encryption SHOULD use envelope encryption with a key-management service holding the key-encryption key; the data-encryption keys SHOULD be stored alongside the ciphertext or derived from the wrapping key on demand.
JSONB discipline¶
- Each JSONB column MUST have a documented schema, ideally
authored as a JSON Schema file under
<repo>/internal/server/database/jsonb-schemas/<column>.schema.json. - Domain code MUST validate JSONB writes against the schema before insert. Stale rows MAY violate the schema; readers MUST be prepared either to migrate the row in place or to reject it with a typed error.
- Indexes on JSONB fields MUST be created with the appropriate
GIN operator class (
jsonb_path_opsfor containment queries, the default operator class for?/?&/?|queries). - JSONB columns MUST NOT be used as a replacement for missing schema discipline. A field that is always read with a fixed shape MUST be promoted to a typed column.
Soft delete and tenant scoping¶
- Tables holding user data MUST include a
deleted_at timestamptz NULLcolumn. Deletes MUST setdeleted_at = now()rather than removing the row. - Every read query against a soft-deletable table MUST include
WHERE deleted_at IS NULL(or a sqlc-annotated equivalent). Reviewers MUST treat the filter as part of the query's signature; a missing filter is a defect, not a stylistic preference. - Multi-tenant tables MUST include an
org_idcolumn (or the organization's equivalent tenant key). Every read and write MUST scope byorg_id. Cross-tenant reads MUST be explicit, justified, and reviewed. - Foreign-key constraints across tenants SHOULD be denied at the schema level; cross-tenant references SHOULD use opaque identifiers that are validated at the application layer rather than enforced by referential integrity.
LISTEN / NOTIFY for in-process eventing¶
- Services that need low-latency intra-cluster eventing MAY use
PostgreSQL
LISTEN/NOTIFY. - Notifications MUST be treated as best-effort; consumers MUST reconcile state from the canonical row on receipt rather than rely on the notification payload as the source of truth.
- A dedicated listener connection MUST be held outside the main
pool so that a long-running
LISTENdoes not exhaust the application's pool. - Channel names MUST be namespaced by domain (for example,
credentials_changedrather thanchanged) to avoid collision across services that share a database.
sequenceDiagram
participant Writer as Service (writer)
participant DB as PostgreSQL
participant Listener as Service (listener)
Writer->>DB: BEGIN; UPDATE row ...; NOTIFY channel, 'row_id'
DB-->>Writer: COMMIT
DB-->>Listener: NOTIFY channel, payload
Listener->>DB: SELECT row WHERE id = payload AND deleted_at IS NULL AND org_id = ?
DB-->>Listener: canonical row
Listener->>Listener: apply side effect (idempotent)
Reference Implementation: Pioneer
The data prescriptions are exercised end to end in the canonical reference implementation:
pioneer/sqlc.yamlconfigures sqlc to generate accessors from annotated SQL withpgx/v5-compatible types, explicitint64handling, and per-package output directories that align with the three-layer domain shape.pioneer/internal/server/database/pool.gobuilds thepgxpool, layers anotelpgxtracer on top, and composes a project-localDBQueryTracerfor query-level instrumentation via a multi-tracer adapter.pioneer/internal/server/database/queries/credentials.sqlshows the-- name: ... :one/:manyannotation convention, tenant scoping viaorg_id, and soft-delete filters viadeleted_at IS NULLapplied uniformly across read paths.
Pinned versions (snapshot 2026-05-08)¶
| Component | Version | Notes |
|---|---|---|
| PostgreSQL | 16.x | Logical replication and JSONB stable |
| pgx | v5.x | Pure-Go driver and pool |
| otelpgx | latest stable | OpenTelemetry instrumentation for pgx |
| sqlc | v1.27.x or newer | pgx/v5 target |
| goose | v3.x | Embedded migrations, advisory lock |
| squirrel | v1.5.x | Dynamic filtering when sqlc variants explode |
Adopting orgs MUST validate these versions against their own
toolchain constraints at adoption time. The snapshot date above
matches last-reviewed in this chapter's frontmatter.
Pitfalls¶
- Missing advisory lock during goose run. Two replicas
starting concurrently MAY race to apply the same migration. The
runner MUST acquire
pg_advisory_lock(<key>)for the duration of the run, including the case where the runner exits exceptionally. - Encrypting without a
key_idcolumn. Storing only the ciphertext forecloses key rotation. New tables MUST carry(key_id, algorithm, ciphertext)(plusnoncewhen the algorithm requires it) from the first migration. - JSONB as a free-for-all. Without a documented schema, JSONB drifts to the point where readers cannot make assumptions. Each JSONB column MUST have a schema file and writer-side validation; readers MUST handle stale rows explicitly.
- Forgetting
deleted_at IS NULLon a single query. A read path that omits the filter quietly resurrects deleted rows. Reviewers MUST treat the filter as part of the type signature of any query against a soft-deletable table. - Cross-tenant reads without an explicit override. A query
that omits
org_idreturns rows for every tenant. Reviewers MUST require an explicit, justified override on any cross-tenant read, and MUST verify that the override is logged. - Holding LISTEN connections inside the main pool. A
LISTENpins a connection for its lifetime; left in the main pool it starves application queries. The listener MUST hold a dedicated connection outside the application pool. - Notification payloads treated as the source of truth. Notifications are best-effort; consumers MUST refetch the canonical row before acting.
See also¶
00-introduction.md— RFC 2119 keyword definitions and doc conventions used throughout this chapter.01-architecture.md— the three-layer pattern whosestore.gofiles call into the accessors prescribed here.03-surface.md— the RPC surface that calls into the domain services backed by this data layer.06-security.md— authentication and authorization that tie into theorg_idscoping prescribed in this chapter (sibling casting).- sqlc documentation: https://docs.sqlc.dev/
- goose documentation: https://github.com/pressly/goose
- pgx documentation: https://pkg.go.dev/github.com/jackc/pgx/v5