Skip to main content

Enforcing RLS with the restricted align_app role

Align isolates tenants at two layers: the application always scopes queries to a single tenant (withTenant / SET LOCAL app.current_tenant), and PostgreSQL Row-Level Security (RLS) is a hard backstop that filters every row by tenant_id.

RLS only protects you if the app connects as a non-owner role - PostgreSQL lets a table owner bypass RLS. Out of the box the app connects as the database owner, so the RLS backstop is inert. This runbook switches the running gateway (and optionally brain) to a restricted align_app role so the backstop is active.

The migrations job keeps using the owner connection, because it creates roles, tables and FORCE ROW LEVEL SECURITY policies.

If you use RDS Proxy (read this first)

When the app connects through RDS Proxy (DATABASE_APP_URL points at a *.proxy-* endpoint), two things must be true before the cutover works - both were real failures caught during the preview rollout:

  1. Register align_app with the proxy. RDS Proxy keeps its own credential registry; a role it doesn't know is rejected (This RDS proxy has no credentials for the role align_app) even though the role authenticates fine directly. In align-infra this is the additional_roles input on the rds-proxy module (creates a Secrets Manager secret + an extra proxy auth block + IAM access). Apply that before re-sealing.

  2. No libpq options startup parameter. RDS Proxy rejects command-line options (Feature not supported: RDS Proxy currently doesn't support command-line options). The gateway sets statement_timeout per-connection via a pool connect handler (not options: '-c statement_timeout=...'); make sure your gateway image includes that fix before pointing DATABASE_APP_URL at the proxy, or every query fails FATAL.

Direct-to-RDS deployments (no proxy) are unaffected by both.

One-time cutover

1. Create the role (automatic)

Migration 073_create_align_app_role.sql creates align_app as a NOLOGIN role with least-privilege grants. It runs with every migration job - nothing to do.

2. Give the role a password (in-cluster)

RDS is typically private, so this runs from inside the cluster. Exec into a pod that has psql and can reach the database (the migrations image, or any app pod), then:

DATABASE_URL="postgres://<owner-user>:<owner-pass>@<host>:5432/<db>" \
ALIGN_APP_PASSWORD="<align_app password>" \
./scripts/set-align-app-password.sh
  • On managed infra (align-infra), the align_app password is generated by Terraform and stored in AWS Secrets Manager at align/<env>/database under APP_DB_PASSWORD.
  • Self-hosters can pick any strong password; just use the same value in step 3.

The script binds the password as a parameter (it is never interpolated into SQL) and fails loudly if the role does not exist yet.

3. Re-seal the database secret with app-url

Add an app-url key (the align_app connection string) to the align-database secret. The seal targets prompt for it:

make seal-db-preview # or: make seal-db-prod
# ... enter host/port/db, the OWNER username+password (for migrations),
# then the align_app password when prompted.

Leaving the align_app password blank skips the cutover and reproduces the previous owner-only secret, so the target is safe to run either way.

Commit the regenerated sealedsecrets/<env>/db.yaml and let GitOps sync it.

4. Roll the gateway

The gateway prefers DATABASE_APP_URL (wired from the optional app-url secret key) over DATABASE_URL. Once the key exists, restart the gateway:

kubectl rollout restart deploy/align-gateway -n align-<env>

Until app-url exists, DATABASE_APP_URL is unset and the gateway falls back to the owner DATABASE_URL, so deploying the Helm change ahead of the secret is non-breaking.

PgBouncer

DATABASE_APP_URL is only wired on the direct-connection path. When the gateway's PgBouncer sidecar is enabled (gateway.pgbouncer.enabled: true), the chart does not emit app-url, because it points at the database directly and the gateway prefers it over the PgBouncer connection - so the cutover requires PgBouncer disabled until the app role can be routed through the sidecar (a separate change).

5. Verify

As the owner, confirm RLS is enabled AND forced:

SELECT relname, relrowsecurity, relforcerowsecurity
FROM pg_class WHERE relname = 'decision_snapshots';
-- relrowsecurity = t, relforcerowsecurity = t

Then connect directly as align_app (using the app-url connection string, not SET ROLE - the owner is usually not a member of align_app, and SET ROLE would fail on a typical RDS setup). Confirm both the effective user and that a tenant-scoped read is filtered:

psql "postgres://align_app:<align_app-pass>@<host>:5432/<db>"
SELECT current_user; -- align_app
SELECT set_config('app.current_tenant', '<tenant-uuid>', false);
SELECT count(*) FROM decision_snapshots; -- only this tenant's rows

Brain (optional follow-up)

The brain service connects via discrete POSTGRES_USER / POSTGRES_PASSWORD keys and also sets app.current_tenant, so it benefits from the same backstop. Pointing brain at align_app needs a small change to read app-specific credentials; it is tracked separately. Brain remains on the owner connection until then - the gateway cutover above is independent and safe on its own.

Rollback

Remove (or blank) the app-url key and re-seal; the gateway falls back to the owner DATABASE_URL on its next restart. No data migration is involved.