From 4bf0dab853a8b08f2a9bce59310b1f719f1528c9 Mon Sep 17 00:00:00 2001
From: Kwaku Danso <72142185+cloud-dev101@users.noreply.github.com>
Date: Mon, 11 May 2026 21:09:25 +0100
Subject: [PATCH] docs: add 4-tier production roadmap and detailed Tier 1 plan
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- CLAUDE.md: 4-tier feature roadmap appended after the build-order section
  (launch blockers → moat features). Future sessions reference this to know
  which tier a new feature belongs to.
- docs/TIER1_PLAN.md: detailed sequencing for the 8 blocks of Tier 1 work
  (auth, authz, rate limiting, notifications, CSV import, billing, backups,
  privacy) with schema changes, endpoints, tests, and effort estimates per
  block.

Co-Authored-By: Claude Opus 4.7
---
 CLAUDE.md          |  69 +++++
 docs/TIER1_PLAN.md | 624 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 693 insertions(+)
 create mode 100644 docs/TIER1_PLAN.md

diff --git a/CLAUDE.md b/CLAUDE.md
index 22def97..d218f10 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -190,3 +190,72 @@ When building this project, follow this sequence:
 8. Real-time monitor page (WebSocket for live RSVPs + fraud alerts)
 9. docker-compose.yml for full local development
 10. End-to-end testing
+
+## Production roadmap
+
+Phases 1–3 above were the initial build and shipped a feature-complete demo.
+What follows is the trajectory from demo to a product that can be sold.
+Future sessions: read this section before starting new feature work so you know
+which tier you're contributing to.
+
+The detailed sequencing for Tier 1 lives in `docs/TIER1_PLAN.md`.
+
+### Tier 1 — Launch blockers
+
+Cannot ship to paying customers without these.
+
+| Feature | What "done" means |
+|---|---|
+| **Real authentication** | Email + password (bcrypt) or magic links, email verification, password reset, JWT-based sessions with refresh tokens. The `useHost()` localStorage bootstrap must be removed. |
+| **Authorisation / multi-tenancy** | Every host-facing endpoint behind session middleware. Row-level authz: a host only sees their own events, guests, tokens. No more `?host_id=...` query params. |
+| **Real notifications** | Twilio SMS and AWS SES email actually wired, branded HTML templates, retry/backoff, bounce + complaint webhook handling, unsubscribe links. |
+| **CSV guest import** | Drag-drop upload, server-side validation, dedup by email, preview/confirm step. The marketing page already promises this. |
+| **Billing** | Stripe integration, free tier limits enforced in code, paid tiers, webhook handling for payment events, customer portal. |
+| **Rate limiting + abuse controls** | Redis-backed sliding-window limits on the bootstrap, RSVP-submit, token-issue, and event-create endpoints. CAPTCHA on user-creation flow. |
+| **Backups + disaster recovery** | Daily automated Postgres backups, point-in-time recovery via WAL archiving, documented + tested restore drill. (Claude builds restore docs; infra is human-owned.) |
+| **Privacy compliance** | Data export endpoint, right-to-erasure endpoint, privacy policy + ToS pages, retention policy, signed DPAs with subprocessors (Twilio/SES/Stripe). |
+
+### Tier 2 — Customer expectations (first 3 months post-launch)
+
+Features hosts will ask for almost immediately.
+
+| Feature | Notes |
+|---|---|
+| **Smarter fraud detection** | Current heuristic scorer has false positives (same guest scoring `0` then `61` on consecutive opens). Add geolocation (MaxMind), per-event tunable thresholds, "actually legit" feedback loop, allowlists, eventual ML model. |
+| **Reminders + broadcasts** | Auto-reminders 7-day / 1-day / day-of, "last call" to non-responders, custom announcements when details change. Killer feature for wedding planners. |
+| **Editable RSVPs** | Guests can change "attending" → "declined" via the same link. |
+| **Multi-host / collaborators** | Owner / Editor / Viewer roles per event, invitation flow. |
+| **Event branding** | Custom colours, logo upload, optional custom domain for RSVP pages. |
+| **Day-of check-in** | QR codes on confirmations, PWA scanner, live arrival count, walk-in handling, plus-one verification. |
+| **Calendar integration** | "Add to Google / Outlook / Apple" on confirmation page. |
+| **Host analytics** | Response-rate over time, who hasn't opened, source attribution. |
+
+### Tier 3 — Operations & polish
+
+Required for running this at scale, not for first launch.
+
+| Area | Notes |
+|---|---|
+| **Observability** | Prometheus `/metrics`, OpenTelemetry tracing across API↔fraud-engine↔NATS, Sentry, uptime monitoring, alert routing. |
+| **CI/CD** | Gitea/GitHub Actions: tests on PR, lint, security scans (gosec, trivy), staging auto-deploy, blue/green prod, automated rollback. |
+| **Accessibility** | WCAG 2.1 AA audit + fixes. Particular attention to focus states, contrast, reduced-motion respect for the float animations. |
+| **i18n** | Vue i18n, translated email/SMS templates, date/time/currency localisation, RTL support. |
+| **Mobile** | PWA + push notifications first, native apps later. |
+| **Secrets management** | Vault or AWS Secrets Manager, rotation, no secrets in images. |
+| **Performance** | Actually use Redis (cache hot queries), read replicas, CDN, query-plan audits, load tests. |
+
+### Tier 4 — Moat & enterprise
+
+Differentiators that justify enterprise pricing.
+
+| Feature | Notes |
+|---|---|
+| **SSO (SAML, OIDC)** | For corporate hosts. |
+| **White-label** | For event planners running GuestGuard for *their* clients. |
+| **Public API + webhooks** | So customers can build on top. |
+| **Zapier integration** | Non-negotiable for SMB segment. |
+| **CRM sync** | Salesforce, HubSpot — for corporate events teams. |
+| **AI setup assistant** | Paste an invitation email, get an event auto-created with guest list extracted. |
+| **Marketplace integrations** | Caterers, photographers, venues. |
+| **Biometric / face check-in** | High-end events only, opt-in. |
+| **SLAs + regional data residency** | EU-only deployment option, signed SLA contracts. |
diff --git a/docs/TIER1_PLAN.md b/docs/TIER1_PLAN.md
new file mode 100644
index 0000000..9a3a141
--- /dev/null
+++ b/docs/TIER1_PLAN.md
@@ -0,0 +1,624 @@
+# Tier 1 Production Plan
+
+> This document sequences the work to take GuestGuard from feature-complete demo
+> to a product that can be sold to event hosts.
+>
+> Read `CLAUDE.md` for project conventions and the full 4-tier roadmap.
+> This document is purely the **what** and the **in what order** for Tier 1.
+
+---
+
+## TL;DR
+
+Eight work blocks (A–H), grouped into three waves that respect dependencies.
+Estimated effort: **~8–10 weeks for one engineer**, **~5–6 weeks for two**.
+
+```
+Wave 1 (foundation, must finish before anything else):
+  A. Authentication ──┐
+                      ├── B. Authorisation
+                      └── C. Rate limiting (parallel)
+
+Wave 2 (depends on auth being real):
+  D. Notifications ───┐
+                      ├── E. CSV import (parallel)
+                      └── F. Billing
+
+Wave 3 (ops + legal, can run alongside Wave 2):
+  G. Backups & DR
+  H. Privacy compliance
+```
+
+---
+
+## Block A — Authentication
+
+> **Why first**: every other Tier 1 item depends on knowing who's calling.
+
+### Goal
+
+Replace the `useHost()` localStorage bootstrap with real auth: email + password,
+verified emails, password reset, JWT-based sessions with refresh tokens. The
+existing `users` table is reused.
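
One way to read the refresh-token part of this goal is that the server hands the client an opaque random token and persists only a hash of it, so a leaked database row cannot be replayed. The Go sketch below illustrates that idea under stated assumptions: `newToken`, `hashToken`, and `matches` are hypothetical helper names, and SHA-256 is an assumed choice that suits high-entropy random tokens (passwords would still go through bcrypt).

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/base64"
	"encoding/hex"
)

// newToken returns a URL-safe random token (sent to the client) and the
// hex-encoded SHA-256 digest of it (the only value the server would store).
// Hypothetical helper, not part of the existing codebase.
func newToken() (raw string, hash string, err error) {
	buf := make([]byte, 32) // 256 bits of entropy
	if _, err = rand.Read(buf); err != nil {
		return "", "", err
	}
	raw = base64.RawURLEncoding.EncodeToString(buf)
	return raw, hashToken(raw), nil
}

// hashToken derives the lookup key for a presented token.
func hashToken(raw string) string {
	sum := sha256.Sum256([]byte(raw))
	return hex.EncodeToString(sum[:])
}

// matches compares a presented token against a stored hash in constant time.
func matches(presented, storedHash string) bool {
	return subtle.ConstantTimeCompare([]byte(hashToken(presented)), []byte(storedHash)) == 1
}
```

The same pair of helpers would plausibly serve verification and password-reset tokens as well, since all three only need a one-way lookup key.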
+
+### Schema changes
+
+Migration `0003_auth.up.sql`:
+
+```sql
+ALTER TABLE users
+  ADD COLUMN password_hash TEXT, -- bcrypt; nullable for OAuth-only users later
+  ADD COLUMN email_verified BOOLEAN NOT NULL DEFAULT FALSE,
+  ADD COLUMN email_verified_at TIMESTAMPTZ;
+
+CREATE TABLE email_verification_tokens (
+  token_hash TEXT PRIMARY KEY,
+  user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
+  expires_at TIMESTAMPTZ NOT NULL,
+  consumed_at TIMESTAMPTZ
+);
+
+CREATE TABLE password_reset_tokens (
+  token_hash TEXT PRIMARY KEY,
+  user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
+  expires_at TIMESTAMPTZ NOT NULL,
+  consumed_at TIMESTAMPTZ
+);
+
+CREATE TABLE refresh_tokens (
+  token_hash TEXT PRIMARY KEY,
+  user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
+  expires_at TIMESTAMPTZ NOT NULL,
+  revoked_at TIMESTAMPTZ,
+  user_agent TEXT,
+  ip_address INET,
+  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
+);
+
+CREATE INDEX idx_refresh_tokens_user ON refresh_tokens(user_id) WHERE revoked_at IS NULL;
+```
+
+### Backend (`internal/auth`, `internal/api`)
+
+- New `auth.PasswordHasher` (bcrypt via `golang.org/x/crypto/bcrypt`, cost 12)
+- New `auth.JWTSigner` issuing access tokens (15min TTL) signed with `GG_JWT_SECRET`
+- New repos for verification, reset, and refresh tokens (token *hashes* stored, never raw)
+- New handlers:
+  - `POST /auth/signup` — creates unverified user, emits verification email
+  - `POST /auth/login` — verifies password, requires verified email, returns access + refresh
+  - `POST /auth/refresh` — rotates refresh token (single-use), returns new pair
+  - `POST /auth/logout` — revokes the refresh token
+  - `POST /auth/verify-email` — consumes verification token, sets `email_verified`
+  - `POST /auth/forgot-password` — emits reset email (no-op if email unknown — don't leak existence)
+  - `POST /auth/reset-password` — consumes reset token, updates `password_hash`, revokes all refresh tokens
+- New middleware `requireAuth` that pulls `Authorization: Bearer …`, validates, attaches `userID` to request context
+- Delete `POST /users` (the demo bootstrap)
+
+### Frontend
+
+- Delete `useHost()` composable; replace with `useAuth()` (access token in memory, refresh token in httpOnly cookie set by server)
+- New pages: `/login`, `/signup`, `/verify-email`, `/forgot-password`, `/reset-password/:token`
+- `useApi()` composable adds `Authorization` header; on 401, calls `/auth/refresh`; on refresh failure, redirects to `/login`
+- Dashboard route guard: redirect to `/login` if no session
+- Sign-out button calls `/auth/logout`, clears state, redirects to `/`
+
+### Notifications dependency
+
+Verification + reset emails need real email delivery. Until Block D lands,
+**use a stub `EmailSender` that prints the link to the API server logs** so
+developers and the test environment can complete the flow without a Twilio/SES
+account. Document this in the block's README.
+
+### Tests
+
+- Unit: password hashing round-trip, JWT signing + parsing with expiry, token-hash storage
+- Integration: signup → verify-email → login → refresh → use-protected-endpoint → logout
+- Integration: forgot-password → reset-password → old refresh tokens revoked
+- Security: rate-limit signup (deferred to Block C, document the dependency)
+
+### Definition of done
+
+- [ ] Migration `0003_auth.up.sql` applied
+- [ ] All `/auth/*` endpoints return appropriate status codes (verified against `httpstatus.dev` conventions)
+- [ ] Refresh-token rotation enforced (reusing a refresh token revokes the family — token-replay defence)
+- [ ] Email verification mandatory before first login
+- [ ] Frontend has working signup → verify → login → dashboard flow end-to-end
+- [ ] `useHost()` and `POST /users` removed from the codebase
+- [ ] No localhost-only assumptions in code paths
+
+### Effort: ~2 weeks for one engineer.
+
+---
+
+## Block B — Authorisation
+
+> **Why now**: same PR cluster as Block A. Adding new endpoints without authz
+> bakes in security debt.
+
+### Goal
+
+Every host-facing endpoint enforces "this caller can only touch their own data".
+Audit the current API surface and add authz checks to each endpoint.
+
+### Schema changes
+
+None — `events.host_id` already exists. We just need to start trusting the
+session-derived `userID` instead of the query parameter.
+
+### Backend
+
+- Apply `requireAuth` middleware to every route except: `/health`, `/auth/*`,
+  the guest-facing `/access/{token}`, `/rsvp/{token}`, and the WS endpoint
+  (note: WS auth needs its own design — see open questions)
+- For each event-scoped endpoint, derive `hostID` from session and reject if
+  the event's `host_id` doesn't match:
+  - `GET /events` → list only events where `host_id = session.userID`
+  - `GET /events/{id}` → 404 (not 403, to avoid leaking existence) if owner mismatch
+  - All `PATCH/DELETE /events/{id}` → same
+  - `POST /events/{id}/guests`, `GET /events/{id}/guests`, `POST /events/{id}/guests/{guest_id}/tokens`, `GET /events/{id}/activity` → same
+- Remove the `?host_id=...` query parameter from `GET /events` — derive from session
+- Update the integration test to authenticate first
+
+### Frontend
+
+- All host-facing API calls include the access token (already handled if `useApi()` was updated in Block A)
+- Update `GET /events` calls to drop the `host_id` query param
+
+### WebSocket auth (open question)
+
+The WS endpoint `/ws/events/{id}` is currently anonymous. Options:
+
+1. **Pass JWT as query param** (`?token=...`) — browsers can't send `Authorization` headers on WS handshake
+2. **Cookie-based session** (httpOnly cookie set by `/auth/login`)
+3. **Short-lived WS ticket**: client calls `POST /auth/ws-ticket` (auth required), receives a single-use 60s ticket, passes as `?ticket=...` to the WS handshake
+
+Recommend option 3 — most secure, no token in URL beyond a single request. Document the choice.
+
+### Tests
+
+- Unit: authz middleware accepts/rejects/redirects appropriately
+- Integration: host A cannot list, read, modify host B's events (verify 404)
+- Integration: WS ticket flow works end-to-end
+
+### Definition of done
+
+- [ ] Every host route requires a valid session
+- [ ] Cross-tenant data access returns 404, not 403 (don't leak existence)
+- [ ] WS authentication implemented (option 3 recommended)
+- [ ] `?host_id=...` query parameter removed everywhere
+- [ ] Pen-test pass: try to read/modify another user's event with their event_id but your own token
+
+### Effort: ~3–4 days, assuming Block A laid the middleware groundwork.
+
+---
+
+## Block C — Rate limiting + abuse controls
+
+> **Why now**: small block, no dependency on auth other than knowing the
+> `userID` for per-user limits. Redis is already provisioned but unused —
+> this finally puts it to work.
+
+### Goal
+
+Stop trivial abuse: someone scripting `POST /auth/signup` 10k times,
+brute-forcing the RSVP page, spamming token issuance, etc.
+
+### Schema changes
+
+None — Redis only.
+
+### Backend
+
+- New `internal/ratelimit` package with a sliding-window limiter backed by Redis
+  (sorted set per key, trimmed and counted inside a Lua script or `MULTI/EXEC` for atomicity; plain `INCR` + `EXPIRE` only gives a fixed window)
+- Apply per-route, per-key limits via middleware:
+
+| Endpoint | Key | Limit |
+|---|---|---|
+| `POST /auth/signup` | IP | 5 / hour |
+| `POST /auth/login` | IP + email | 10 / 5 min (lock on consecutive failures) |
+| `POST /auth/forgot-password` | IP + email | 3 / hour |
+| `POST /rsvp/{token}` | token | 10 / hour |
+| `GET /access/{token}` | token | 60 / hour |
+| `POST /events` | userID | 20 / day |
+| `POST /events/{id}/guests` | userID | 1000 / day |
+| `POST /events/{id}/guests/{guest_id}/tokens` | userID | 500 / day |
+
+- Return `429 Too Many Requests` with `Retry-After` header on limit
+- CAPTCHA (hCaptcha or Cloudflare Turnstile) on `POST /auth/signup` and `POST /auth/forgot-password`
+- Lockout: after 5 consecutive failed logins, require password reset to unlock
+
+### Frontend
+
+- Render CAPTCHA widget on signup + forgot-password forms
+- On `429`, show "You're going too fast — please try again in a minute" instead of generic error
+
+### Tests
+
+- Unit: limiter increments correctly, expires at window boundary
+- Integration: 6th signup from the same IP within an hour returns 429
+- Integration: CAPTCHA token validated server-side before processing signup
+
+### Definition of done
+
+- [ ] Redis `MULTI/EXEC` or Lua script confirms atomicity of the limiter
+- [ ] All endpoints in the table above are limited
+- [ ] CAPTCHA wired on signup + forgot-password
+- [ ] Lockout flow tested end-to-end
+- [ ] Limiter exposes Prometheus metrics (already implicit — `ratelimit_block_total` per endpoint)
+
+### Effort: ~3–4 days.
+
+---
+
+## Block D — Real notifications
+
+> **Why now**: Block A's email verification + password reset need real
+> delivery. Don't ship auth to production with a logger stub.
+
+### Goal
+
+Replace `LogSender` in `internal/notification` with real Twilio + SES adapters.
+Branded HTML email templates. Bounce + complaint handling. Unsubscribe.
+
+### Schema changes
+
+```sql
+ALTER TABLE notifications
+  ADD COLUMN provider_message_id TEXT,
+  ADD COLUMN bounce_type TEXT, -- 'permanent' | 'transient' | NULL
+  ADD COLUMN complained BOOLEAN NOT NULL DEFAULT FALSE,
+  ADD COLUMN IF NOT EXISTS delivered_at TIMESTAMPTZ; -- believed to exist already; confirm against the current schema
+
+CREATE TABLE unsubscribes (
+  email CITEXT PRIMARY KEY,
+  reason TEXT,
+  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
+);
+```
+
+### Backend (`internal/notification`, `cmd/notifier`)
+
+- `TwilioSender` (real `github.com/twilio/twilio-go` client)
+  - Retry with exponential backoff: 1s, 5s, 30s, 5m, 30m
+  - Permanent failure codes mapped to `bounce_type = 'permanent'`
+  - Cost tracking: log message segments per send
+- `SESSender` (real `github.com/aws/aws-sdk-go-v2/service/sesv2`)
+  - HTML + plaintext multipart
+  - List-Unsubscribe header on every email
+  - Configuration set with SNS topic for bounces + complaints
+- HTML templates (`internal/notification/templates/*.tmpl`):
+  - `invitation.html` — "You're invited to {event_name}"
+  - `confirmation.html` — RSVP recorded
+  - `verification.html` — verify your email
+  - `reset.html` — reset your password
+  - `reminder.html` — 1-day-before reminder
+- Webhook endpoints (in `internal/api`, public, signed by provider):
+  - `POST /webhooks/twilio/status` — Twilio message status callbacks
+  - `POST /webhooks/ses/notifications` — SNS-delivered bounce/complaint notifications
+  - Both verify signatures before trusting the payload
+- Check `unsubscribes` table before sending any email; refuse silently if present
+
+### Frontend
+
+- Unsubscribe page at `/unsubscribe/:token` — token signed so we know who's unsubscribing
+- Host setting: from-name + reply-to email per event (Tier 2 polish, defer if rushed)
+
+### Configuration
+
+Required env vars (add to `internal/config`):
+
+```
+GG_TWILIO_ACCOUNT_SID
+GG_TWILIO_AUTH_TOKEN
+GG_TWILIO_FROM_NUMBER
+
+GG_SES_REGION
+GG_SES_FROM_EMAIL          # must be a verified identity
+GG_SES_CONFIGURATION_SET
+
+GG_PUBLIC_BASE_URL         # for unsubscribe + invitation links in templates
+```
+
+### Tests
+
+- Unit: template rendering produces expected HTML and text
+- Unit: retry logic backs off correctly, surrenders after N attempts
+- Integration (with stubs): bounce webhook marks notification, blocks future sends to that email
+- Manual: actually send to a test inbox in a staging Twilio + SES account
+
+### Definition of done
+
+- [ ] Email verification email arrives in a real inbox (Gmail, Outlook)
+- [ ] SMS arrives on a real phone
+- [ ] DKIM + SPF + DMARC verified for sender domain (this is human-owned infra setup)
+- [ ] Bounces and complaints recorded in `notifications` + `unsubscribes`
+- [ ] Unsubscribe link in every email; clicking it adds the address to the suppression list
+- [ ] Templates render correctly in Gmail web, Outlook web, iOS Mail, Apple Mail (litmus.com or equivalent)
+
+### Effort: ~1.5–2 weeks (mostly template polish + deliverability setup).
+
+---
+
+## Block E — CSV guest import
+
+> **Why now**: highest user-visible impact of any Tier 1 item, no dependency
+> on other blocks except Block B's authz. Marketing already promises it.
+
+### Goal
+
+A host can drag a `.csv` onto the dashboard and have hundreds of guests added
+in seconds. Validation surfaces problems before commit. Dedup is automatic.
+
+### Schema changes
+
+None — uses existing `guests` table.
+
+### Backend
+
+- `POST /events/{id}/guests/import` — `multipart/form-data`, single CSV file
+  - Header detection: tolerant of `name|Name|guest_name`, `email|Email`, `phone|Phone|telephone`, `plus_ones|+1|plusones`
+  - Validation: name required, email format if present, phone E.164-ish if present, plus_ones non-negative integer
+  - Dedup: skip rows whose email matches an existing guest on the same event
+  - Returns: `{ added: int, skipped: int, errors: [{ row: int, reason: string }] }`
+  - Atomic per-batch: either all valid rows commit or none (transaction)
+  - Limit: 5,000 rows per import
+- `POST /events/{id}/guests/import/preview` — same payload, but doesn't write; returns parsed rows for confirm UI
+- Sample CSV download: `GET /events/{id}/guests/import/template` — returns a `.csv` with example rows
+
+### Frontend
+
+- New section on event detail page: "Import guests from a spreadsheet"
+- Drag-drop zone (use `vue-file-pond` or native HTML5 drag-drop)
+- After upload: hit `/preview`, show a sortable table of rows with row-level errors highlighted
+- "Looks good — import" button calls `/import`
+- Show success summary: "Imported 247 guests. Skipped 3 duplicates. 2 rows had errors."
+- Help text linking to the template CSV
+
+### Tests
+
+- Unit: header detection accepts the listed variants and rejects unknown columns gracefully
+- Unit: validation rejects bad emails, accepts blank emails (phone-only guests valid)
+- Integration: dedup leaves existing guests untouched
+- Integration: rolling back on mid-batch error doesn't leave partial state
+
+### Definition of done
+
+- [ ] Sample CSV downloadable from the import UI
+- [ ] Preview always shown before commit
+- [ ] Errors are row-level, not "the whole file is invalid"
+- [ ] Encoding: handles UTF-8 with BOM (Excel exports), UTF-16 (Mac Numbers exports)
+- [ ] File-size cap: 1MB / 5,000 rows enforced server-side
+- [ ] No memory blow-up: parse rows as a stream, not into a `[]Row` of arbitrary size
+
+### Effort: ~3–5 days.
+
+---
+
+## Block F — Billing
+
+> **Why last in Wave 2**: depends on real auth (Block A), real notifications
+> (Block D, for receipts), and a stable data model. Don't build until those
+> are solid.
+
+### Goal
+
+Stripe-based subscriptions. Free tier with hard limits. Paid tiers unlock
+higher limits. Failed-payment dunning. Self-serve upgrade + downgrade.
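
As a rough illustration of "hard limits enforced in code", the Go sketch below maps a tier name to caps and decides whether one more event may be created. Everything here is an assumption for illustration: `Limits`, `Usage`, `CanCreateEvent`, and `StatusForCreateEvent` are hypothetical names, the numbers are the placeholder values from this block's pricing table, and the per-event Personal tier is left out because it does not fit a monthly counter.

```go
package main

import "net/http"

// Limits holds per-tier caps; a negative value means unlimited.
type Limits struct {
	EventsPerMonth int
	GuestsPerEvent int
	SMSPerMonth    int
}

// tierLimits uses placeholder numbers that are expected to change once
// pricing has been validated.
var tierLimits = map[string]Limits{
	"free":     {EventsPerMonth: 1, GuestsPerEvent: 50, SMSPerMonth: 0},
	"pro":      {EventsPerMonth: 10, GuestsPerEvent: 1000, SMSPerMonth: 1000},
	"business": {EventsPerMonth: -1, GuestsPerEvent: 5000, SMSPerMonth: 5000},
}

// Usage is what has already been consumed in the current billing period.
type Usage struct{ Events, SMS int }

// CanCreateEvent reports whether one more event fits the tier. Unknown tiers
// fall back to the free limits so that a bad row never grants extra capacity.
func CanCreateEvent(tier string, u Usage) bool {
	l, ok := tierLimits[tier]
	if !ok {
		l = tierLimits["free"]
	}
	return l.EventsPerMonth < 0 || u.Events < l.EventsPerMonth
}

// StatusForCreateEvent returns the status a limit-checking middleware might
// answer with: 201 when allowed, 402 when the plan limit has been reached.
func StatusForCreateEvent(tier string, u Usage) int {
	if CanCreateEvent(tier, u) {
		return http.StatusCreated
	}
	return http.StatusPaymentRequired
}
```

Keeping the limits in one table like this would make the pricing decision a data change rather than a code change, which seems useful while the tiers are still placeholders.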
+
+### Pricing model (decision required — see open questions)
+
+Recommended starter pricing (placeholder, validate with target market):
+
+| Tier | Price | Events/mo | Guests/event | SMS/mo | Branding |
+|---|---|---|---|---|---|
+| Free | $0 | 1 | 50 | 0 (email only) | No |
+| Personal | $19/event | 1 per purchase | 500 | 100 | Logo |
+| Pro | $49/mo | 10 | 1,000 | 1,000 | Full |
+| Business | $199/mo | Unlimited | 5,000 | 5,000 | + custom domain |
+
+### Schema changes
+
+```sql
+CREATE TABLE subscriptions (
+  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+  user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
+  stripe_customer_id TEXT NOT NULL,
+  stripe_subscription_id TEXT,
+  tier TEXT NOT NULL, -- 'free' | 'personal' | 'pro' | 'business'
+  status TEXT NOT NULL, -- 'active' | 'past_due' | 'canceled' | 'incomplete'
+  current_period_end TIMESTAMPTZ,
+  cancel_at_period_end BOOLEAN NOT NULL DEFAULT FALSE,
+  created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
+  updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
+);
+CREATE UNIQUE INDEX ON subscriptions(user_id) WHERE status = 'active';
+
+CREATE TABLE usage_counters (
+  user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
+  period_start DATE NOT NULL,
+  events_count INT NOT NULL DEFAULT 0,
+  sms_count INT NOT NULL DEFAULT 0,
+  PRIMARY KEY (user_id, period_start)
+);
+```
+
+### Backend
+
+- `internal/billing` package wrapping the Stripe SDK
+- `POST /billing/checkout-session` — returns a Stripe Checkout URL for the requested tier
+- `POST /billing/portal` — returns a Stripe Customer Portal URL
+- `POST /webhooks/stripe` — signature-verified, handles:
+  - `customer.subscription.created` / `.updated` / `.deleted` → upsert into `subscriptions`
+  - `invoice.payment_failed` → trigger dunning email (Block D)
+  - `invoice.payment_succeeded` → clear past-due state
+- Enforcement: middleware checks usage against tier limits before allowing
+  `POST /events`, `POST /events/{id}/guests`, SMS triggers. Returns
+  `402 Payment Required` with the upgrade URL on limit.
+
+### Frontend
+
+- `/billing` page: current plan, usage bars, upgrade/downgrade buttons
+- On `402`, show modal: "You've hit your plan limit. Upgrade?"
+- Stripe Checkout opens in a new tab; on return, poll subscription state until updated by webhook
+
+### Tests
+
+- Integration: free user can create 1 event, second fails with 402
+- Integration: webhook signature verification rejects forged payloads
+- Integration: cancellation flow keeps access until period end
+
+### Definition of done
+
+- [ ] Stripe in test mode end-to-end working
+- [ ] Webhook signatures verified
+- [ ] Usage counters reset monthly (cron or compute on-demand)
+- [ ] Receipts emailed via Stripe (default behaviour, just confirm enabled)
+- [ ] Refund policy documented (referenced from billing page)
+
+### Effort: ~2 weeks.
+
+---
+
+## Block G — Backups & disaster recovery
+
+> **Mostly infra-owned**, but the application side has documentation work.
+
+### Claude's scope
+
+- All migrations have a `*.down.sql` that's been tested locally
+- New `docs/RUNBOOK_RESTORE.md` documenting the restore procedure step-by-step
+- Confirm Postgres connection string env var supports the recovery instance (no
+  hardcoded primary-only hostnames)
+- Optional: a `cmd/restore-verify` tool that runs after a restore to assert
+  schema invariants (guest counts ≈ rsvp counts, no orphaned tokens, etc.)
+
+### Human / infra scope
+
+- `pg_basebackup` + WAL archiving to S3
+- Daily logical dump as a secondary safety net
+- Cross-region replication of the S3 bucket
+- Monthly restore drill scheduled
+- Documented RTO (e.g. 1 hour) and RPO (e.g. 5 minutes)
+
+### Definition of done
+
+- [ ] Every existing migration has a tested down migration
+- [ ] `docs/RUNBOOK_RESTORE.md` exists and a fresh engineer could follow it
+- [ ] First restore drill completed successfully
+
+### Effort: ~2 days for the application-side work.
+
+---
+
+## Block H — Privacy compliance
+
+> Legal documents are human-owned. Application-level support is Claude scope.
+
+### Claude's scope
+
+- `GET /me/data-export` — streams a JSON document with every record
+  (user, events, guests, tokens, RSVPs, access_logs, notifications) belonging
+  to the authenticated user. Long-running, so async: enqueue → email a link.
+- `DELETE /me` — cascade-deletes the user and everything tied to them.
+  Soft-delete first (set `deleted_at`), hard-delete on a cron after 30 days
+  to honour any in-flight legal holds.
+- `DELETE /events/{id}/guests/{guest_id}` (host-triggered) — already exists in
+  spirit; add a "forget this guest" action that removes RSVP/access-log rows
+  but keeps the aggregate counter for the event.
+- Data retention: automated nightly job to soft-delete events whose
+  `event_date` is older than 18 months (configurable per host once Tier 2).
+- Add `privacy_policy_accepted_at` and `terms_accepted_at` columns to `users`;
+  block first login until both are accepted.
+
+### Human / legal scope
+
+- Privacy policy + ToS drafted by a lawyer
+- DPAs signed with Twilio, SES, Stripe, MaxMind, and any other subprocessor
+- Public privacy page at `/privacy`, ToS at `/terms`
+- Cookie banner (only required if analytics are added; currently we have none)
+- GDPR Article 30 record of processing activities
+
+### Definition of done
+
+- [ ] `GET /me/data-export` produces a complete, parseable JSON dump
+- [ ] `DELETE /me` cascades correctly with no orphan rows (verified by FK constraints)
+- [ ] Privacy + ToS pages live and linked from the footer + signup form
+- [ ] Acceptance enforced on first login after the launch date
+- [ ] Retention cron job tested
+
+### Effort: ~3–4 days for the application work; legal work runs in parallel.
+
+---
+
+## Cross-cutting concerns
+
+These touch most blocks above; bake them in as you go, not as a separate pass.
+
+### Logging + auditing
+
+Every state-changing endpoint logs: `userID`, `action`, `target_id`, `result`,
+`request_id`. Use `slog` with a correlation ID middleware. Critical for
+post-incident forensics.
+
+### Observability lite (Tier 3 scope, but minimum viable for launch)
+
+- Prometheus `/metrics` endpoint on the API exposing: request rate by
+  endpoint, latency percentiles, 4xx/5xx counts, `ratelimit_block_total`
+- Sentry (or self-hosted GlitchTip) for unhandled errors, with release tagging
+
+### Feature flags
+
+Lightweight `feature_flags` table or env-var driven (no LaunchDarkly yet).
+Useful for rolling out Block F's billing without exposing it to all users at
+once.
+
+---
+
+## Open questions
+
+Resolve before starting:
+
+1. **Final pricing tiers** — the table in Block F is a placeholder. Confirm with the target market (interview 10 wedding planners, 10 corporate event managers).
+2. **Email provider** — SES vs Postmark vs SendGrid. SES is cheapest but has the harshest deliverability ramp; Postmark is best for transactional but pricier.
+3. **2FA at launch or v1.1?** — Recommend v1.1; one less moving piece on the launch path.
+4. **Custom domain for RSVP pages at launch or v1.1?** — Recommend v1.1 (Tier 2). Adds DNS + cert complexity.
+5. **WebSocket auth mechanism** — Recommend Block B option 3 (short-lived ticket).
+6. **EU data residency at launch?** — If targeting EU customers, this becomes Tier 1 (separate EU deployment). Otherwise defer to Tier 4.
+
+---
+
+## Sequencing summary table
+
+| Wave | Block | Depends on | Effort (1 eng) | Can parallelise with |
+|---|---|---|---|---|
+| 1 | A. Auth | — | 2w | — |
+| 1 | B. Authz | A | 4d | C |
+| 1 | C. Rate limiting | A (for `userID`) | 4d | B |
+| 2 | D. Notifications | A | 2w | E |
+| 2 | E. CSV import | B | 4d | D, F |
+| 2 | F. Billing | A, D | 2w | E |
+| 3 | G. Backups | — (infra) | 2d (Claude) | any |
+| 3 | H. Privacy | A | 3d | any |
+
+**One engineer, sequential**: ~9 weeks.
+**Two engineers, parallel-where-possible**: ~5.5 weeks.
+
+---
+
+## What's *not* in Tier 1 (deliberate)
+
+These are tempting but are Tier 2:
+
+- Editable RSVPs (guests can change response after submitting)
+- Multi-host collaborators
+- Event branding (logo, colours, custom domain)
+- Day-of QR check-in
+- Better fraud-engine thresholds (false-positive feedback loop)
+- Calendar integration
+- Auto-reminders (1-day before, etc.)
+- Mobile push notifications
+
+Ship Tier 1 first. The launch story is "personal invitations + live tracking +
+quiet fraud detection + works reliably + you can pay us money". Everything
+else is the second release.