Enforcing RLS with the restricted align_app role
Align isolates tenants at two layers: the application always scopes queries to a
single tenant (withTenant / SET LOCAL app.current_tenant), and PostgreSQL
Row-Level Security (RLS) is a hard backstop that filters every row by tenant_id.
RLS only protects you if the app connects as a non-owner role - PostgreSQL lets
a table owner bypass RLS. Out of the box the app connects as the database owner, so
the RLS backstop is inert. This runbook switches the running gateway (and optionally
brain) to a restricted align_app role so the backstop is active.
The migrations job keeps using the owner connection, because it creates roles,
tables and FORCE ROW LEVEL SECURITY policies.
If you use RDS Proxy (read this first)
When the app connects through RDS Proxy (DATABASE_APP_URL points at a *.proxy-*
endpoint), two things must be true before the cutover works - both were real failures
caught during the preview rollout:
-
Register
align_appwith the proxy. RDS Proxy keeps its own credential registry; a role it doesn't know is rejected (This RDS proxy has no credentials for the role align_app) even though the role authenticates fine directly. In align-infra this is theadditional_rolesinput on therds-proxymodule (creates a Secrets Manager secret + an extra proxyauthblock + IAM access). Apply that before re-sealing. -
No libpq
optionsstartup parameter. RDS Proxy rejects command-line options (Feature not supported: RDS Proxy currently doesn't support command-line options). The gateway setsstatement_timeoutper-connection via a poolconnecthandler (notoptions: '-c statement_timeout=...'); make sure your gateway image includes that fix before pointingDATABASE_APP_URLat the proxy, or every query fails FATAL.
Direct-to-RDS deployments (no proxy) are unaffected by both.
One-time cutover
1. Create the role (automatic)
Migration 073_create_align_app_role.sql creates align_app as a NOLOGIN role with
least-privilege grants. It runs with every migration job - nothing to do.
2. Give the role a password (in-cluster)
RDS is typically private, so this runs from inside the cluster. Exec into a pod that
has psql and can reach the database (the migrations image, or any app pod), then:
DATABASE_URL="postgres://<owner-user>:<owner-pass>@<host>:5432/<db>" \
ALIGN_APP_PASSWORD="<align_app password>" \
./scripts/set-align-app-password.sh
- On managed infra (align-infra), the
align_apppassword is generated by Terraform and stored in AWS Secrets Manager atalign/<env>/databaseunderAPP_DB_PASSWORD. - Self-hosters can pick any strong password; just use the same value in step 3.
The script binds the password as a parameter (it is never interpolated into SQL) and fails loudly if the role does not exist yet.
3. Re-seal the database secret with app-url
Add an app-url key (the align_app connection string) to the align-database
secret. The seal targets prompt for it:
make seal-db-preview # or: make seal-db-prod
# ... enter host/port/db, the OWNER username+password (for migrations),
# then the align_app password when prompted.
Leaving the align_app password blank skips the cutover and reproduces the previous
owner-only secret, so the target is safe to run either way.
Commit the regenerated sealedsecrets/<env>/db.yaml and let GitOps sync it.
4. Roll the gateway
The gateway prefers DATABASE_APP_URL (wired from the optional app-url secret key)
over DATABASE_URL. Once the key exists, restart the gateway:
kubectl rollout restart deploy/align-gateway -n align-<env>
Until app-url exists, DATABASE_APP_URL is unset and the gateway falls back to the
owner DATABASE_URL, so deploying the Helm change ahead of the secret is non-breaking.
DATABASE_APP_URL is only wired on the direct-connection path. When the gateway's
PgBouncer sidecar is enabled (gateway.pgbouncer.enabled: true), the chart does not
emit app-url, because it points at the database directly and the gateway prefers it
over the PgBouncer connection - so the cutover requires PgBouncer disabled until the
app role can be routed through the sidecar (a separate change).
5. Verify
As the owner, confirm RLS is enabled AND forced:
SELECT relname, relrowsecurity, relforcerowsecurity
FROM pg_class WHERE relname = 'decision_snapshots';
-- relrowsecurity = t, relforcerowsecurity = t
Then connect directly as align_app (using the app-url connection string,
not SET ROLE - the owner is usually not a member of align_app, and SET ROLE
would fail on a typical RDS setup). Confirm both the effective user and that a
tenant-scoped read is filtered:
psql "postgres://align_app:<align_app-pass>@<host>:5432/<db>"
SELECT current_user; -- align_app
SELECT set_config('app.current_tenant', '<tenant-uuid>', false);
SELECT count(*) FROM decision_snapshots; -- only this tenant's rows
Brain (optional follow-up)
The brain service connects via discrete POSTGRES_USER / POSTGRES_PASSWORD keys and
also sets app.current_tenant, so it benefits from the same backstop. Pointing brain at
align_app needs a small change to read app-specific credentials; it is tracked
separately. Brain remains on the owner connection until then - the gateway cutover above
is independent and safe on its own.
Rollback
Remove (or blank) the app-url key and re-seal; the gateway falls back to the owner
DATABASE_URL on its next restart. No data migration is involved.