Kwaku Danso 59b8781659 feat: ship Tier 1 — auth, authz, rate limits, real notifications, CSV import, billing, backups/DR, privacy
Closes every block in docs/TIER1_PLAN.md from the Claude-scope side. The
homelab / cloud setup steps (SES verification, restore drill, lawyer-
drafted ToS) remain operator-owned but are unblocked.

Block A — Authentication
- Migration 0003: password_hash, email_verified, email_verification_tokens,
  password_reset_tokens, refresh_tokens (with replaced_by family chain).
- Bcrypt hasher, HS256 JWT signer, single-use refresh tokens with rotation
  + replay-detection (revokes the family on reuse).
- /auth/signup, /login, /refresh, /logout, /verify-email,
  /forgot-password, /reset-password — enumeration-safe.
- requireAuth middleware + GET /me.
- Frontend useAuth/useApi with auto-refresh-on-401, login/signup/verify/
  forgot/reset pages, route-guard middleware.
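
The rotation and replay-detection rule in Block A can be sketched in memory. This is a minimal illustration, not the shipped code: the `store` type, its field names, and the error values are hypothetical, and a map stands in for the refresh_tokens table.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
)

// refreshToken loosely mirrors a refresh_tokens row; field names are illustrative.
type refreshToken struct {
	family     string // shared by every token descended from one login
	replacedBy string // set once the token has been rotated
	revoked    bool
}

var (
	errReuse   = errors.New("refresh token reuse detected: family revoked")
	errInvalid = errors.New("refresh token invalid")
)

// store is an in-memory stand-in for the refresh_tokens table.
type store struct{ tokens map[string]*refreshToken }

func newID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// issue starts a new token family, as a login would.
func (s *store) issue() string {
	id := newID()
	s.tokens[id] = &refreshToken{family: newID()}
	return id
}

// rotate exchanges a token for its successor. Presenting a token that was
// already rotated is treated as replay and revokes the whole family.
func (s *store) rotate(id string) (string, error) {
	t, ok := s.tokens[id]
	if !ok || t.revoked {
		return "", errInvalid
	}
	if t.replacedBy != "" {
		for _, other := range s.tokens {
			if other.family == t.family {
				other.revoked = true
			}
		}
		return "", errReuse
	}
	next := newID()
	s.tokens[next] = &refreshToken{family: t.family}
	t.replacedBy = next
	return next, nil
}

func main() {
	s := &store{tokens: map[string]*refreshToken{}}
	first := s.issue()
	second, _ := s.rotate(first)
	_, err := s.rotate(first) // replay of the already-rotated token
	fmt.Println(err)
	_, err = s.rotate(second) // the legitimate successor is revoked as well
	fmt.Println(err)
}
```

Revoking the successor too is the point of the family chain: once an old token is replayed, neither party can be trusted to be the real client, so both must log in again.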

Block B — Authorisation
- EventRepo.GetForHost; Update/Delete scoped by host_id.
- All host routes behind requireAuth + ownership; cross-tenant returns
  404 (no enumeration). ?host_id removed.
- WS auth via short-lived single-use tickets (POST /auth/ws-ticket).
- Tests: TestCrossTenantIsolation — 9 probes.
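
The "cross-tenant returns 404" rule in Block B amounts to making "missing" and "not yours" indistinguishable. A minimal sketch, assuming a hypothetical in-memory `eventRepo`; the real EventRepo.GetForHost signature is not shown in this commit.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

type event struct {
	ID, HostID, Title string
}

var errNotFound = errors.New("event not found")

// eventRepo is an in-memory stand-in for a host-scoped repository.
type eventRepo struct{ byID map[string]event }

// getForHost returns the event only when it belongs to hostID. A missing
// event and an event owned by another host yield the same error, so a
// caller cannot probe for the existence of other tenants' events.
func (r eventRepo) getForHost(id, hostID string) (event, error) {
	e, ok := r.byID[id]
	if !ok || e.HostID != hostID {
		return event{}, errNotFound
	}
	return e, nil
}

// statusFor maps a lookup result to an HTTP status code.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, errNotFound):
		return http.StatusNotFound
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	repo := eventRepo{byID: map[string]event{
		"e1": {ID: "e1", HostID: "alice", Title: "Launch"},
	}}
	for _, host := range []string{"alice", "bob"} {
		_, err := repo.getForHost("e1", host)
		fmt.Println(host, statusFor(err))
	}
}
```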

Block C — Rate limiting
- Redis sliding-window via Lua (atomic ZADD+ZCARD+PEXPIRE).
- Per-route limits matching the plan (signup IP, login IP+email, RSVP/
  access by token, events/guests/tokens by user_id).
- 429 with Retry-After header and JSON body.
- Auth lockout: 5 failed logins → account locked, only password reset
  clears it.
- Frontend: useErrMessage normalises 429 + locked messaging.

Block D — Real notifications
- Migration 0004: provider_message_id, bounce_type, complained columns
  + unsubscribes (CITEXT) suppression table.
- Branded HTML + plaintext templates for verification, reset, invitation,
  confirmation, reminder. Per-page templates avoid html/template's
  contextual-escape collisions.
- Senders: SESv2, Twilio (SMS), SMTP (Mailpit-friendly), Resend HTTP.
- PickEmailSender priority Resend > SMTP > SES > Log — system boots
  cleanly in dev with Mailpit; production flips one env var.
- Webhook endpoints (Twilio status + SES SNS) — bounces add to suppression;
  signature verification stubbed pending creds.
- Auto-send: POST /tokens publishes invitation.send; notifier renders +
  delivers via the configured backend; suppression list honoured.
- Bulk + per-row invitation flow: POST /events/{id}/guests/invitations/bulk
  returns per-guest tokens so phone-only guests can be SMS'd manually.
- Unsubscribe: signed HMAC token (no TTL) + /unsubscribe/[token] page.
- WhatsApp Option A+: wa.me click-to-chat wizard with per-guest progress
  tracking, isLikelyE164 validation, edit-from-wizard.
- Token rotate (POST /tokens/rotate) invalidates the old URL — used by
  the regenerate-link flow.
- Mailpit added to docker-compose for dev inbox.
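
A signed unsubscribe token with no TTL, as described in Block D, can be as small as an HMAC over the address. The sketch below is one plausible shape under stated assumptions: the payload layout (`base64(email).base64(mac)`), the SHA-256 choice, and the function names are not taken from the commit.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

var errBadToken = errors.New("invalid unsubscribe token")

// mac computes the HMAC-SHA256 of payload under secret.
func mac(secret []byte, payload string) []byte {
	h := hmac.New(sha256.New, secret)
	h.Write([]byte(payload))
	return h.Sum(nil)
}

// sign returns "<payload>.<signature>", both URL-safe base64 without padding.
// There is no expiry field, matching the "no TTL" note above.
func sign(secret []byte, email string) string {
	payload := base64.RawURLEncoding.EncodeToString([]byte(email))
	return payload + "." + base64.RawURLEncoding.EncodeToString(mac(secret, payload))
}

// verify checks the signature in constant time and returns the email.
func verify(secret []byte, token string) (string, error) {
	payload, sig, ok := strings.Cut(token, ".")
	if !ok {
		return "", errBadToken
	}
	got, err := base64.RawURLEncoding.DecodeString(sig)
	if err != nil || !hmac.Equal(got, mac(secret, payload)) {
		return "", errBadToken
	}
	email, err := base64.RawURLEncoding.DecodeString(payload)
	if err != nil {
		return "", errBadToken
	}
	return string(email), nil
}

func main() {
	secret := []byte("dev-only-secret") // illustrative; load from config in practice
	token := sign(secret, "guest@example.com")
	email, err := verify(secret, token)
	fmt.Println(email, err)
}
```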

Block E — CSV import
- Streaming parser: tolerant header detection, UTF-8 BOM + UTF-16 LE/BE
  decoding, row-level validation, 5,000-row cap.
- Strict E.164 phone validation with helpful error message.
- POST /preview + /import + GET /template; preview UI on event page;
  atomic per-batch with dedup on existing emails.
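
The BOM handling in Block E can be illustrated with the standard library alone. This sketch decodes a whole buffer rather than streaming, so it is not the parser described above; the function names are hypothetical, input without a BOM is assumed to be UTF-8, and a trailing odd byte in UTF-16 input is dropped.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"unicode/utf16"
)

// decodeBOM converts raw CSV bytes to a UTF-8 string based on a leading
// byte-order mark. Input without a BOM is returned as-is.
func decodeBOM(raw []byte) string {
	switch {
	case bytes.HasPrefix(raw, []byte{0xEF, 0xBB, 0xBF}): // UTF-8 BOM
		return string(raw[3:])
	case bytes.HasPrefix(raw, []byte{0xFF, 0xFE}): // UTF-16 LE
		return decodeUTF16(raw[2:], binary.LittleEndian)
	case bytes.HasPrefix(raw, []byte{0xFE, 0xFF}): // UTF-16 BE
		return decodeUTF16(raw[2:], binary.BigEndian)
	default:
		return string(raw)
	}
}

// decodeUTF16 reads 16-bit code units in the given byte order and
// converts them to UTF-8, handling surrogate pairs.
func decodeUTF16(b []byte, order binary.ByteOrder) string {
	units := make([]uint16, 0, len(b)/2)
	for i := 0; i+1 < len(b); i += 2 {
		units = append(units, order.Uint16(b[i:i+2]))
	}
	return string(utf16.Decode(units))
}

func main() {
	le := []byte{0xFF, 0xFE, 'n', 0, 'a', 0, 'm', 0, 'e', 0}
	fmt.Println(decodeBOM(le))
}
```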

Phone capture across UI
- PhoneInput component: country picker (~50 ISO codes) + national input +
  live E.164 preview + inline length validation.
- Used in Add Guest and Edit Guest modals. Smart paste-handling extracts
  country code from full E.164 strings.
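
The paste-handling rule can be sketched as a shape check followed by a calling-code prefix match. The component itself is frontend code; this Go version only illustrates the logic. The regular expression is a common approximation of E.164 (a plus sign, then up to 15 digits with a non-zero first digit), and the four-entry calling-code table is a hypothetical stand-in for the ~50 codes mentioned above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// e164 accepts "+" followed by 2 to 15 digits, the first being non-zero.
var e164 = regexp.MustCompile(`^\+[1-9][0-9]{1,14}$`)

// callingCodes is a tiny illustrative subset, not the component's real table.
var callingCodes = map[string]bool{"1": true, "44": true, "233": true, "353": true}

// splitE164 separates a pasted E.164 string into calling code and national
// number. It tries the longest candidate prefix first (codes are 1-3 digits).
func splitE164(s string) (code, national string, ok bool) {
	s = strings.TrimSpace(s)
	if !e164.MatchString(s) {
		return "", "", false
	}
	digits := s[1:]
	for n := 3; n >= 1; n-- {
		if len(digits) > n && callingCodes[digits[:n]] {
			return digits[:n], digits[n:], true
		}
	}
	return "", "", false // well-formed, but the calling code is not in the table
}

func main() {
	for _, in := range []string{"+233241234567", "+442071838750", "0241234567"} {
		code, national, ok := splitE164(in)
		fmt.Println(in, code, national, ok)
	}
}
```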

Block F — Billing (Stripe)
- Migration 0005: subscriptions table (user_id → tier/status/period_end +
  Stripe customer/sub ids). Partial unique index keeps one granting sub
  per user.
- internal/billing: Tier + Limits model (Free 1/50, Pro 10/1000, Business
  ∞/5000), Stripe SDK wrapper with IgnoreAPIVersionMismatch for newer
  account API versions.
- /billing/checkout-session, /billing/portal, /billing/status,
  /webhooks/stripe (signature-verified, lifecycle events).
- Tier enforcement: 402 on POST /events, /guests, /import with
  {error, reason, tier, used, limit, upgrade_url} body.
- Frontend: useBilling composable, /dashboard/billing page (current plan,
  usage bars, tier cards), global UpgradeModal triggered by useApi's
  402 interceptor.
- Customer portal kept for self-service cancel/payment-method changes.
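
The tier limits and the 402 body in Block F fit in a small table and struct. This sketch makes several assumptions that the commit does not confirm: "1/50" is read as max events / max guests, -1 encodes the unlimited Business event cap, and the `error`, `reason`, and `upgrade_url` values are placeholders. All names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const unlimited = -1

// limits holds the two caps quoted per tier (events/guests is an assumption).
type limits struct {
	maxEvents int
	maxGuests int
}

var tierLimits = map[string]limits{
	"free":     {maxEvents: 1, maxGuests: 50},
	"pro":      {maxEvents: 10, maxGuests: 1000},
	"business": {maxEvents: unlimited, maxGuests: 5000},
}

// quotaError carries the fields listed for the 402 response body.
type quotaError struct {
	Error      string `json:"error"`
	Reason     string `json:"reason"`
	Tier       string `json:"tier"`
	Used       int    `json:"used"`
	Limit      int    `json:"limit"`
	UpgradeURL string `json:"upgrade_url"`
}

// checkEventQuota returns nil when the user may create another event.
// An unknown tier has zero limits, so it fails closed.
func checkEventQuota(tier string, used int) *quotaError {
	limit := tierLimits[tier].maxEvents
	if limit == unlimited || used < limit {
		return nil
	}
	return &quotaError{
		Error:      "payment_required", // placeholder value
		Reason:     "event_limit",      // placeholder value
		Tier:       tier,
		Used:       used,
		Limit:      limit,
		UpgradeURL: "/dashboard/billing", // assumed to point at the billing page
	}
}

func main() {
	if qe := checkEventQuota("free", 1); qe != nil {
		body, _ := json.Marshal(qe)
		fmt.Println(402, string(body))
	}
}
```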

Block G — Backups & DR (application side)
- Every migration has a tested .down.sql.
- TestMigrationRoundtrip applies all ups → all downs → all ups against a
  fresh container; catches asymmetric down migrations.
- cmd/restore-verify: 28-check post-restore invariant tool (schema
  presence, no orphans across 10 FK relationships, email uniqueness,
  single-active subscription, row-count snapshot).
- docs/RUNBOOK_RESTORE.md: 9-step restore procedure with RTO/RPO
  targets, drill instructions, rollback path.

Block H — Privacy compliance (application side)
- Migration 0006: deleted_at + terms_accepted_at + privacy_policy_accepted_at
  on users. Partial index on email for live-only uniqueness.
- GET /me/data-export — synchronous JSON dump (user, events, guests,
  tokens, rsvps, access_logs, notifications).
- DELETE /me — soft-delete with PII scrub + refresh-token revocation;
  re-signup with same email works.
- POST /me/accept-terms — idempotent consent recording.
- Frontend /privacy + /terms placeholder pages with substantive (pending
  legal review) copy; footer links; signup terms checkbox; TermsGateModal
  for accounts created before the rollout; export + delete buttons on
  /dashboard/billing.

Tests
- All migrations verified up/down/up.
- Integration suite: TestE2EHappyPath, TestAuthFlow, TestCrossTenantIsolation,
  TestRateLimitSignup, TestLoginLockout, TestUnsubscribeFlow,
  TestSESBounceWebhook, TestTwilioStatusWebhook, TestCsvImportFlow,
  TestCsvImportAtomicRollback, TestBulkIssueInvitations, TestBulkIssueExplicitSubset,
  TestTokenIssuePublishesInvitation, TestTokenIssueWithoutGuestEmailSkipsInvitation,
  TestGuestUpdate, TestGuestDelete, TestTokenRotate, TestSMTPSenderAgainstMailpit,
  TestFreeTierEventLimit, TestFreeTierGuestLimit, TestBusinessTierBypassesLimits,
  TestDataExport, TestDeleteMe, TestAcceptTerms, TestMigrationRoundtrip.
  Full suite runs in ~120s against real Postgres + NATS + Redis + Mailpit.
- Unit suite green across internal/auth, internal/csvimport,
  internal/notification, internal/ratelimit, internal/domain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 23:54:22 +01:00


// restore-verify is a post-restore sanity tool. Point it at a freshly
// restored Postgres instance and it asserts that the schema and data
// are coherent — no orphan rows, expected uniqueness invariants hold,
// every table is present. Exits non-zero on any failure so it slots
// into a "restore drill" runbook step.
//
// Usage:
//	GG_DATABASE_URL=postgres://... ./restore-verify [--verbose]
//
// The intent is "would I bet my Sunday on this restore being usable?".
// Failing fast here keeps a bad restore from being promoted to traffic.
package main

import (
	"context"
	"errors"
	"flag"
	"fmt"
	"os"
	"strings"
	"time"

	"github.com/jackc/pgx/v5/pgxpool"
)

func main() {
	// Delegate to run so deferred cleanup (cancel, pool.Close) executes
	// before the process exits; os.Exit skips deferred calls.
	os.Exit(run())
}

// run executes every check and returns the process exit code:
// 0 when all checks pass, 1 when any check fails, 2 on a configuration
// or connection error.
func run() int {
	verbose := flag.Bool("verbose", false, "print every check's result, not just failures")
	flag.Parse()

	dsn := os.Getenv("GG_DATABASE_URL")
	if dsn == "" {
		fmt.Fprintln(os.Stderr, "restore-verify: GG_DATABASE_URL is required")
		return 2
	}

	// One 30-second budget covers connecting and all checks.
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	pool, err := pgxpool.New(ctx, dsn)
	if err != nil {
		fmt.Fprintf(os.Stderr, "restore-verify: connect: %v\n", err)
		return 2
	}
	defer pool.Close()

	if err := pool.Ping(ctx); err != nil {
		fmt.Fprintf(os.Stderr, "restore-verify: ping: %v\n", err)
		return 2
	}

	fmt.Println("restore-verify: checking", maskDSN(dsn))
	fmt.Println()

	checks := allChecks()
	var failed []string
	for _, c := range checks {
		result, err := c.fn(ctx, pool)
		if err != nil {
			failed = append(failed, c.name)
			fmt.Printf(" ✗ %-50s FAIL: %v\n", c.name, err)
			continue
		}
		if *verbose {
			fmt.Printf(" ✓ %-50s %s\n", c.name, result)
		}
	}

	fmt.Println()
	if len(failed) > 0 {
		fmt.Printf("FAILED: %d check%s — %s\n",
			len(failed), pluralS(len(failed)), strings.Join(failed, ", "))
		return 1
	}
	fmt.Printf("OK: all %d checks passed\n", len(checks))
	return 0
}
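
// check is one named invariant. fn returns a short human-readable summary
// on success and a non-nil error when the invariant does not hold.
type check struct {
	name string
	fn   func(ctx context.Context, pool *pgxpool.Pool) (string, error)
}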

func allChecks() []check {
	return []check{
		// --- schema presence ---
		tableExists("users"),
		tableExists("events"),
		tableExists("guests"),
		tableExists("tokens"),
		tableExists("rsvps"),
		tableExists("access_logs"),
		tableExists("notifications"),
		tableExists("schema_migrations"),
		tableExists("email_verification_tokens"),
		tableExists("password_reset_tokens"),
		tableExists("refresh_tokens"),
		tableExists("unsubscribes"),
		tableExists("subscriptions"),

		// --- migrations applied ---
		// Counts rows, so this assumes schema_migrations holds one row per
		// applied migration.
		{
			name: "schema_migrations: 5+ migrations applied",
			fn: func(ctx context.Context, pool *pgxpool.Pool) (string, error) {
				var n int
				if err := pool.QueryRow(ctx,
					`SELECT count(*) FROM schema_migrations`).Scan(&n); err != nil {
					return "", err
				}
				if n < 5 {
					return "", fmt.Errorf("only %d migrations recorded — incomplete restore", n)
				}
				return fmt.Sprintf("%d migrations", n), nil
			},
		},

		// --- referential integrity (FK constraints catch most of this,
		// but a bad logical dump or partial restore can sneak rows in) ---
		noOrphans("events", "host_id", "users", "id"),
		noOrphans("guests", "event_id", "events", "id"),
		noOrphans("tokens", "guest_id", "guests", "id"),
		noOrphans("rsvps", "guest_id", "guests", "id"),
		noOrphans("access_logs", "guest_id", "guests", "id"),
		noOrphans("notifications", "guest_id", "guests", "id"),
		noOrphans("email_verification_tokens", "user_id", "users", "id"),
		noOrphans("password_reset_tokens", "user_id", "users", "id"),
		noOrphans("refresh_tokens", "user_id", "users", "id"),
		noOrphans("subscriptions", "user_id", "users", "id"),

		// --- domain invariants ---
		{
			name: "users.email is unique (case-insensitive)",
			fn: func(ctx context.Context, pool *pgxpool.Pool) (string, error) {
				var dupes int
				err := pool.QueryRow(ctx, `
					SELECT count(*) FROM (
						SELECT lower(email) FROM users GROUP BY lower(email) HAVING count(*) > 1
					) t
				`).Scan(&dupes)
				if err != nil {
					return "", err
				}
				if dupes > 0 {
					return "", fmt.Errorf("%d duplicate email(s) found", dupes)
				}
				return "no duplicate emails", nil
			},
		},
		{
			// The name describes what the query verifies: every rsvp points at
			// a guest that exists. This overlaps the rsvps.guest_id orphan
			// check above, including NULL guest_id rows that check skips.
			name: "rsvps reference an existing guest",
			fn: func(ctx context.Context, pool *pgxpool.Pool) (string, error) {
				var n int
				err := pool.QueryRow(ctx, `
					SELECT count(*) FROM rsvps r
					WHERE NOT EXISTS (SELECT 1 FROM guests g WHERE g.id = r.guest_id)
				`).Scan(&n)
				if err != nil {
					return "", err
				}
				if n > 0 {
					return "", fmt.Errorf("%d rsvp(s) reference a missing guest", n)
				}
				return "0 orphan rsvps", nil
			},
		},
		{
			name: "at most one granting subscription per user",
			fn: func(ctx context.Context, pool *pgxpool.Pool) (string, error) {
				var n int
				err := pool.QueryRow(ctx, `
					SELECT count(*) FROM (
						SELECT user_id FROM subscriptions
						WHERE status IN ('active','past_due','trialing')
						GROUP BY user_id HAVING count(*) > 1
					) t
				`).Scan(&n)
				if err != nil {
					return "", err
				}
				if n > 0 {
					return "", fmt.Errorf("%d user(s) have multiple granting subscriptions", n)
				}
				return "all users single-active", nil
			},
		},

		// --- informational: row counts are printed only with --verbose and
		// never fail on their values; the check fails only if a count query
		// itself errors ---
		{
			name: "row counts snapshot",
			fn: func(ctx context.Context, pool *pgxpool.Pool) (string, error) {
				parts := []string{}
				// Table names are fixed literals, so building the query with
				// Sprintf carries no injection risk here.
				for _, t := range []string{
					"users", "events", "guests", "tokens", "rsvps",
					"access_logs", "notifications", "subscriptions",
				} {
					var n int
					if err := pool.QueryRow(ctx,
						fmt.Sprintf("SELECT count(*) FROM %s", t)).Scan(&n); err != nil {
						return "", err
					}
					parts = append(parts, fmt.Sprintf("%s=%d", t, n))
				}
				return strings.Join(parts, " "), nil
			},
		},
	}
}

func tableExists(name string) check {
	return check{
		name: fmt.Sprintf("table %q exists", name),
		fn: func(ctx context.Context, pool *pgxpool.Pool) (string, error) {
			var exists bool
			err := pool.QueryRow(ctx,
				`SELECT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_schema='public' AND table_name=$1)`,
				name,
			).Scan(&exists)
			if err != nil {
				return "", err
			}
			if !exists {
				return "", errors.New("missing")
			}
			return "ok", nil
		},
	}
}

// noOrphans builds a check that every non-NULL childTable.childFK value has
// a matching parentTable.parentPK row. The identifiers are interpolated into
// the SQL text, so callers must pass fixed literals, never user input.
func noOrphans(childTable, childFK, parentTable, parentPK string) check {
	return check{
		name: fmt.Sprintf("no orphans: %s.%s -> %s.%s", childTable, childFK, parentTable, parentPK),
		fn: func(ctx context.Context, pool *pgxpool.Pool) (string, error) {
			q := fmt.Sprintf(`
				SELECT count(*) FROM %s c
				WHERE c.%s IS NOT NULL
				AND NOT EXISTS (SELECT 1 FROM %s p WHERE p.%s = c.%s)
			`, childTable, childFK, parentTable, parentPK, childFK)
			var n int
			if err := pool.QueryRow(ctx, q).Scan(&n); err != nil {
				return "", err
			}
			if n > 0 {
				return "", fmt.Errorf("%d orphan row(s)", n)
			}
			return "clean", nil
		},
	}
}
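
func maskDSN(dsn string) string {
	// Crude: redact the password between "://user:" and "@".
	scheme := strings.Index(dsn, "://")
	if scheme < 0 {
		// Keyword/value DSNs ("host=... password=...") are not parsed here;
		// hide them entirely rather than risk printing a password.
		return "(non-URL DSN hidden)"
	}
	at := strings.LastIndex(dsn, "@")
	if at <= scheme {
		return dsn // no userinfo, so nothing to redact
	}
	userInfo := dsn[scheme+3 : at]
	if colon := strings.Index(userInfo, ":"); colon >= 0 {
		userInfo = userInfo[:colon] + ":****"
	}
	return dsn[:scheme+3] + userInfo + dsn[at:]
}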

func pluralS(n int) string {
	if n == 1 {
		return ""
	}
	return "s"
}
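
maskDSN above redacts by string slicing. For URL-style DSNs the standard library offers an alternative: `(*url.URL).Redacted` in net/url (Go 1.15+) replaces the password with "xxxxx". The sketch below is a possible variant, not part of this file; `redactDSN` and its placeholder string are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// redactDSN masks the password of a URL-style DSN using net/url. Inputs that
// do not parse as a URL with a scheme (for example keyword/value DSNs) are
// replaced wholesale so that nothing sensitive is printed.
func redactDSN(dsn string) string {
	u, err := url.Parse(dsn)
	if err != nil || u.Scheme == "" {
		return "<unparseable DSN>"
	}
	return u.Redacted()
}

func main() {
	fmt.Println(redactDSN("postgres://gg:s3cret@db.internal:5432/gg?sslmode=require"))
	fmt.Println(redactDSN("host=db.internal user=gg password=s3cret"))
}
```

Parsing keeps the host, port, path, and query intact, which is usually what an operator needs to confirm that the drill is pointed at the right instance.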