Tier 1 Production Plan
This document sequences the work to take GuestGuard from feature-complete demo to a product that can be sold to event hosts.
Read CLAUDE.md for project conventions and the full 4-tier roadmap. This document is purely the what and the in-what-order for Tier 1.
TL;DR
Eight work blocks (A–H), grouped into three waves that respect dependencies. Estimated effort: ~8–10 weeks for one engineer, ~5–6 weeks for two.
Wave 1 (foundation, must finish before anything else):
  A. Authentication ──┐
                      ├── B. Authorisation
                      └── C. Rate limiting (parallel)
Wave 2 (depends on auth being real):
  D. Notifications ───┐
                      ├── E. CSV import (parallel)
                      └── F. Billing
Wave 3 (ops + legal, can run alongside Wave 2):
  G. Backups & DR
  H. Privacy compliance
Block A — Authentication
Why first: every other Tier 1 item depends on knowing who's calling.
Goal
Replace the useHost() localStorage bootstrap with real auth: email + password,
verified emails, password reset, JWT-based sessions with refresh tokens. The
existing users table is reused.
Schema changes
Migration 0003_auth.up.sql:
ALTER TABLE users
ADD COLUMN password_hash TEXT, -- bcrypt; nullable for OAuth-only users later
ADD COLUMN email_verified BOOLEAN NOT NULL DEFAULT FALSE,
ADD COLUMN email_verified_at TIMESTAMPTZ;
CREATE TABLE email_verification_tokens (
token_hash TEXT PRIMARY KEY,
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
expires_at TIMESTAMPTZ NOT NULL,
consumed_at TIMESTAMPTZ
);
CREATE TABLE password_reset_tokens (
token_hash TEXT PRIMARY KEY,
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
expires_at TIMESTAMPTZ NOT NULL,
consumed_at TIMESTAMPTZ
);
CREATE TABLE refresh_tokens (
token_hash TEXT PRIMARY KEY,
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
expires_at TIMESTAMPTZ NOT NULL,
revoked_at TIMESTAMPTZ,
user_agent TEXT,
ip_address INET,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_refresh_tokens_user ON refresh_tokens(user_id) WHERE revoked_at IS NULL;
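The token_hash columns above imply that the raw token only ever travels in the email link or cookie, and the database sees a digest. A minimal sketch of that split follows. SHA-256, the 32-byte token size, and the names NewToken and HashToken are assumptions for illustration; the plan only says that hashes are stored, never raw tokens.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
)

// NewToken returns a random URL-safe token to hand to the user and the
// digest to store in a token_hash column. Both names are hypothetical.
func NewToken() (raw, hash string, err error) {
	buf := make([]byte, 32) // 256 bits of entropy
	if _, err = rand.Read(buf); err != nil {
		return "", "", err
	}
	raw = base64.RawURLEncoding.EncodeToString(buf)
	return raw, HashToken(raw), nil
}

// HashToken derives the lookup key for a token presented by a client.
// Assumption: unsalted SHA-256, which is adequate only because the input
// is a high-entropy random value rather than a user-chosen secret.
func HashToken(raw string) string {
	sum := sha256.Sum256([]byte(raw))
	return hex.EncodeToString(sum[:])
}
```

On verification, the handler would hash the presented token with HashToken and look the result up by primary key, so a database leak does not expose usable tokens.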
Backend (internal/auth, internal/api)
- New auth.PasswordHasher (bcrypt via golang.org/x/crypto/bcrypt, cost 12)
- New auth.JWTSigner issuing access tokens (15min TTL) signed with GG_JWT_SECRET
- New repos for verification, reset, and refresh tokens (token hashes stored, never raw)
- New handlers:
  - POST /auth/signup: creates unverified user, emits verification email
  - POST /auth/login: verifies password, requires verified email, returns access + refresh
  - POST /auth/refresh: rotates refresh token (single-use), returns new pair
  - POST /auth/logout: revokes the refresh token
  - POST /auth/verify-email: consumes verification token, sets email_verified
  - POST /auth/forgot-password: emits reset email (no-op if email unknown, so existence is not leaked)
  - POST /auth/reset-password: consumes reset token, updates password_hash, revokes all refresh tokens
- New middleware requireAuth that pulls Authorization: Bearer …, validates, attaches userID to request context
- Delete POST /users (the demo bootstrap)
Frontend
- Delete useHost() composable; replace with useAuth() (access token in memory, refresh token in httpOnly cookie set by server)
- New pages: /login, /signup, /verify-email, /forgot-password, /reset-password/:token
- useApi() composable adds Authorization header; on 401, calls /auth/refresh; on refresh failure, redirects to /login
- Dashboard route guard: redirect to /login if no session
- Sign-out button calls /auth/logout, clears state, redirects to /
Notifications dependency
Verification + reset emails need real email delivery. Until Block D lands,
use a stub EmailSender that prints the link to the API server logs so
developers and the test environment can complete the flow without a Twilio/SES
account. Document this in the block's README.
Tests
- Unit: password hashing round-trip, JWT signing + parsing with expiry, token-hash storage
- Integration: signup → verify-email → login → refresh → use-protected-endpoint → logout
- Integration: forgot-password → reset-password → old refresh tokens revoked
- Security: rate-limit signup (deferred to Block C, document the dependency)
Definition of done
- Migration 0003_auth.up.sql applied
- All /auth/* endpoints return appropriate status codes (verified against httpstatus.dev conventions)
- Refresh-token rotation enforced (reusing a refresh token revokes the family, as a token-replay defence)
- Email verification mandatory before first login
- Frontend has working signup → verify → login → dashboard flow end-to-end
- useHost() and POST /users removed from the codebase
- No localhost-only assumptions in code paths
Effort: ~2 weeks for one engineer.
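The rotation rule in the definition of done (a reused refresh token revokes its whole family) can be sketched with an in-memory stand-in for the refresh_tokens table. Assumptions to flag: the familyID field is hypothetical, since the refresh_tokens schema as drafted has no family column and would need one (or an equivalent parent link) to support this; expiry checks are left out for brevity.

```go
package main

import (
	"errors"
	"sync"
)

var ErrInvalidRefresh = errors.New("refresh token invalid or reused")

type refreshRecord struct {
	familyID string // hypothetical: one family per login
	revoked  bool
}

// RefreshStore is an in-memory stand-in for refresh_tokens, keyed by token hash.
type RefreshStore struct {
	mu     sync.Mutex
	tokens map[string]*refreshRecord
}

func NewRefreshStore() *RefreshStore {
	return &RefreshStore{tokens: make(map[string]*refreshRecord)}
}

// Issue records the first token of a new family at login.
func (s *RefreshStore) Issue(hash, familyID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.tokens[hash] = &refreshRecord{familyID: familyID}
}

// Rotate consumes oldHash and registers newHash in the same family.
// Presenting an already-revoked hash is treated as replay: every token in
// that family is revoked and the call fails.
func (s *RefreshStore) Rotate(oldHash, newHash string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	rec, ok := s.tokens[oldHash]
	if !ok {
		return ErrInvalidRefresh
	}
	if rec.revoked {
		for _, other := range s.tokens {
			if other.familyID == rec.familyID {
				other.revoked = true
			}
		}
		return ErrInvalidRefresh
	}
	rec.revoked = true
	s.tokens[newHash] = &refreshRecord{familyID: rec.familyID}
	return nil
}
```

In the real repository the same logic would run inside a single transaction so that two concurrent refreshes cannot both succeed.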
Block B — Authorisation
Why now: same PR cluster as Block A. Adding new endpoints without authz bakes in security debt.
Goal
Every host-facing endpoint enforces "this caller can only touch their own data". Audit the current API surface and add authz checks to each endpoint.
Schema changes
None — events.host_id already exists. We just need to start trusting the
session-derived userID instead of the query parameter.
Backend
- Apply requireAuth middleware to every route except: /health, /auth/*, the guest-facing /access/{token}, /rsvp/{token}, and the WS endpoint (note: WS auth needs its own design; see open questions)
- For each event-scoped endpoint, derive hostID from session and reject if the event's host_id doesn't match:
  - GET /events → list only events where host_id = session.userID
  - GET /events/{id} → 404 (not 403, to avoid leaking existence) if owner mismatch
  - All PATCH/DELETE /events/{id} → same
  - POST /events/{id}/guests, GET /events/{id}/guests, POST /events/{id}/guests/{guest_id}/tokens, GET /events/{id}/activity → same
- Remove the ?host_id=... query parameter from GET /events; derive from session
- Update the integration test to authenticate first
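The ownership rule above reduces to one helper that every event-scoped handler can call. This is a sketch under assumptions: Event, EventLookup, and loadOwnedEvent are hypothetical names, and the lookup hook stands in for the real events repository.

```go
package main

import "errors"

// ErrNotFound is returned for missing events and for events owned by
// someone else, so handlers answer 404 in both cases.
var ErrNotFound = errors.New("event not found")

type Event struct {
	ID     string
	HostID string
}

// EventLookup is a hypothetical repository hook returning (event, found).
type EventLookup func(eventID string) (Event, bool)

// loadOwnedEvent returns the event only if it exists and its host_id
// matches the session-derived user ID.
func loadOwnedEvent(lookup EventLookup, sessionUserID, eventID string) (Event, error) {
	ev, ok := lookup(eventID)
	if !ok || ev.HostID != sessionUserID {
		return Event{}, ErrNotFound
	}
	return ev, nil
}
```

Collapsing "missing" and "not yours" into one error at this layer makes it hard for a handler to return 403 by accident.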
Frontend
- All host-facing API calls include the access token (already handled if useApi() was updated in Block A)
- Update GET /events calls to drop the host_id query param
WebSocket auth (open question)
The WS endpoint /ws/events/{id} is currently anonymous. Options:
1. Pass JWT as query param (?token=...), since browsers can't send Authorization headers on WS handshake
2. Cookie-based session (httpOnly cookie set by /auth/login)
3. Short-lived WS ticket: client calls POST /auth/ws-ticket (auth required), receives a single-use 60s ticket, passes as ?ticket=... to the WS handshake
Recommend option 3 — most secure, no token in URL beyond a single request. Document the choice.
Tests
- Unit: authz middleware accepts/rejects/redirects appropriately
- Integration: host A cannot list, read, modify host B's events (verify 404)
- Integration: WS ticket flow works end-to-end
Definition of done
- Every host route requires a valid session
- Cross-tenant data access returns 404, not 403 (don't leak existence)
- WS authentication implemented (option 3 recommended)
- ?host_id=... query parameter removed everywhere
- Pen-test pass: try to read/modify another user's event with their event_id but your own token
Effort: ~3–4 days, assuming Block A laid the middleware groundwork.
Block C — Rate limiting + abuse controls
Why now: small block, no dependency on auth other than knowing the userID for per-user limits. Redis is already provisioned but unused; this finally puts it to work.
Goal
Stop trivial abuse: someone scripting POST /auth/signup 10k times,
brute-forcing the RSVP page, spamming token issuance, etc.
Schema changes
None — Redis only.
Backend
- New internal/ratelimit package with a sliding-window limiter backed by Redis (use Redis INCR + EXPIRE or a Lua script for atomicity)
- Apply per-route, per-key limits via middleware:

| Endpoint | Key | Limit |
|---|---|---|
| POST /auth/signup | IP | 5 / hour |
| POST /auth/login | IP + email | 10 / 5 min (lock on consecutive failures) |
| POST /auth/forgot-password | IP + email | 3 / hour |
| POST /rsvp/{token} | token | 10 / hour |
| GET /access/{token} | token | 60 / hour |
| POST /events | userID | 20 / day |
| POST /events/{id}/guests | userID | 1000 / day |
| POST /events/{id}/guests/{guest_id}/tokens | userID | 500 / day |

- Return 429 Too Many Requests with Retry-After header on limit
- CAPTCHA (hCaptcha or Cloudflare Turnstile) on POST /auth/signup and POST /auth/forgot-password
- Lockout: after 5 consecutive failed logins, require password reset to unlock
Frontend
- Render CAPTCHA widget on signup + forgot-password forms
- On 429, show "You're going too fast — please try again in a minute" instead of generic error
Tests
- Unit: limiter increments correctly, expires at window boundary
- Integration: 6th signup from the same IP within an hour returns 429
- Integration: CAPTCHA token validated server-side before processing signup
Definition of done
- Limiter atomicity guaranteed via Redis MULTI/EXEC or a Lua script
- All endpoints in the table above are limited
- CAPTCHA wired on signup + forgot-password
- Lockout flow tested end-to-end
- Limiter exposes Prometheus metrics (ratelimit_block_total per endpoint, as listed under the observability minimum)
Effort: ~3–4 days.
Block D — Real notifications
Why now: Block A's email verification + password reset need real delivery. Don't ship auth to production with a logger stub.
Goal
Replace LogSender in internal/notification with real Twilio + SES adapters.
Branded HTML email templates. Bounce + complaint handling. Unsubscribe.
Schema changes
ALTER TABLE notifications
ADD COLUMN provider_message_id TEXT,
ADD COLUMN bounce_type TEXT, -- 'permanent' | 'transient' | NULL
ADD COLUMN complained BOOLEAN NOT NULL DEFAULT FALSE,
ADD COLUMN delivered_at TIMESTAMPTZ; -- already exists per memory, confirm
CREATE TABLE unsubscribes (
email CITEXT PRIMARY KEY,
reason TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
Backend (internal/notification, cmd/notifier)
- TwilioSender (real github.com/twilio/twilio-go client)
  - Retry with exponential backoff: 1s, 5s, 30s, 5m, 30m
  - Permanent failure codes mapped to bounce_type = 'permanent'
  - Cost tracking: log message segments per send
- SESSender (real github.com/aws/aws-sdk-go-v2/service/sesv2)
  - HTML + plaintext multipart
  - List-Unsubscribe header on every email
  - Configuration set with SNS topic for bounces + complaints
- HTML templates (internal/notification/templates/*.tmpl):
  - invitation.html: "You're invited to {event_name}"
  - confirmation.html: RSVP recorded
  - verification.html: verify your email
  - reset.html: reset your password
  - reminder.html: 1-day-before reminder
- Webhook endpoints (in internal/api, public, signed by provider):
  - POST /webhooks/twilio/status: Twilio message status callbacks
  - POST /webhooks/ses/notifications: SNS-delivered bounce/complaint notifications
  - Both verify signatures before trusting the payload
- Check unsubscribes table before sending any email; refuse silently if present
Frontend
- Unsubscribe page at /unsubscribe/:token, with the token signed so we know who's unsubscribing
- Host setting: from-name + reply-to email per event (Tier 2 polish, defer if rushed)
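One way to sign the unsubscribe token is an HMAC over the email address, so the page needs no database lookup to know who is unsubscribing. The token layout (base64url payload, a dot, base64url signature), HMAC-SHA256, and the function names are all assumptions for this sketch.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"strings"
)

// SignUnsubscribe builds a token of the form <payload>.<signature>, where
// payload is the base64url-encoded email address.
func SignUnsubscribe(secret []byte, email string) string {
	payload := base64.RawURLEncoding.EncodeToString([]byte(email))
	return payload + "." + sign(secret, payload)
}

// VerifyUnsubscribe checks the signature and returns the email address.
func VerifyUnsubscribe(secret []byte, token string) (string, bool) {
	payload, sig, ok := strings.Cut(token, ".")
	if !ok || !hmac.Equal([]byte(sig), []byte(sign(secret, payload))) {
		return "", false
	}
	email, err := base64.RawURLEncoding.DecodeString(payload)
	if err != nil {
		return "", false
	}
	return string(email), true
}

// sign returns the base64url HMAC-SHA256 of payload under secret.
func sign(secret []byte, payload string) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(payload))
	return base64.RawURLEncoding.EncodeToString(mac.Sum(nil))
}
```

These tokens deliberately carry no expiry in this sketch, since an unsubscribe link in an old email should still work.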
Configuration
Required env vars (add to internal/config):
GG_TWILIO_ACCOUNT_SID
GG_TWILIO_AUTH_TOKEN
GG_TWILIO_FROM_NUMBER
GG_SES_REGION
GG_SES_FROM_EMAIL # must be a verified identity
GG_SES_CONFIGURATION_SET
GG_PUBLIC_BASE_URL # for unsubscribe + invitation links in templates
Tests
- Unit: template rendering produces expected HTML and text
- Unit: retry logic backs off correctly, surrenders after N attempts
- Integration (with stubs): bounce webhook marks notification, blocks future sends to that email
- Manual: actually send to a test inbox in a staging Twilio + SES account
Definition of done
- Email verification email arrives in a real inbox (Gmail, Outlook)
- SMS arrives on a real phone
- DKIM + SPF + DMARC verified for sender domain (this is human-owned infra setup)
- Bounces and complaints recorded in notifications + unsubscribes
- Unsubscribe link in every email; clicking it adds the address to the suppression list
- Templates render correctly in Gmail web, Outlook web, iOS Mail, Apple Mail (litmus.com or equivalent)
Effort: ~1.5–2 weeks (mostly template polish + deliverability setup).
Block E — CSV guest import
Why now: highest user-visible impact of any Tier 1 item, no dependency on other blocks except Block B's authz. Marketing already promises it.
Goal
A host can drag a .csv onto the dashboard and have hundreds of guests added
in seconds. Validation surfaces problems before commit. Dedup is automatic.
Schema changes
None — uses existing guests table.
Backend
- POST /events/{id}/guests/import: multipart/form-data, single CSV file
  - Header detection: tolerant of name|Name|guest_name, email|Email, phone|Phone|telephone, plus_ones|+1|plusones
  - Validation: name required, email format if present, phone E.164-ish if present, plus_ones non-negative integer
  - Dedup: skip rows whose email matches an existing guest on the same event
  - Returns: { added: int, skipped: int, errors: [{ row: int, reason: string }] }
  - Atomic per-batch: either all valid rows commit or none (transaction)
  - Limit: 5,000 rows per import
- POST /events/{id}/guests/import/preview: same payload, but doesn't write; returns parsed rows for confirm UI
- Sample CSV download: GET /events/{id}/guests/import/template returns a .csv with example rows
Frontend
- New section on event detail page: "Import guests from a spreadsheet"
- Drag-drop zone (use vue-file-pond or native HTML5 drag-drop)
- After upload: hit /preview, show a sortable table of rows with row-level errors highlighted
- "Looks good — import" button calls /import
- Show success summary: "Imported 247 guests. Skipped 3 duplicates. 2 rows had errors."
- Help text linking to the template CSV
Tests
- Unit: header detection accepts the listed variants and rejects unknown columns gracefully
- Unit: validation rejects bad emails, accepts blank emails (phone-only guests valid)
- Integration: dedup leaves existing guests untouched
- Integration: rolling back on mid-batch error doesn't leave partial state
Definition of done
- Sample CSV downloadable from the import UI
- Preview always shown before commit
- Errors are row-level, not "the whole file is invalid"
- Encoding: handles UTF-8 with BOM (Excel exports), UTF-16 (Mac Numbers exports)
- File-size cap: 1MB / 5,000 rows enforced server-side
- No memory blow-up: parse rows as a stream, not into a []Row of arbitrary size
Effort: ~3–5 days.
Block F — Billing
Why last in Wave 2: depends on real auth (Block A), real notifications (Block D, for receipts), and a stable data model. Don't build until those are solid.
Goal
Stripe-based subscriptions. Free tier with hard limits. Paid tiers unlock higher limits. Failed-payment dunning. Self-serve upgrade + downgrade.
Pricing model (decision required — see open questions)
Recommended starter pricing (placeholder, validate with target market):
| Tier | Price | Events/mo | Guests/event | SMS/mo | Branding |
|---|---|---|---|---|---|
| Free | $0 | 1 | 50 | 0 (email only) | No |
| Personal | $19/event | 1 per purchase | 500 | 100 | Logo |
| Pro | $49/mo | 10 | 1,000 | 1,000 | Full |
| Business | $199/mo | Unlimited | 5,000 | 5,000 | + custom domain |
Schema changes
CREATE TABLE subscriptions (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
stripe_customer_id TEXT NOT NULL,
stripe_subscription_id TEXT,
tier TEXT NOT NULL, -- 'free' | 'personal' | 'pro' | 'business'
status TEXT NOT NULL, -- 'active' | 'past_due' | 'canceled' | 'incomplete'
current_period_end TIMESTAMPTZ,
cancel_at_period_end BOOLEAN NOT NULL DEFAULT FALSE,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE UNIQUE INDEX ON subscriptions(user_id) WHERE status = 'active';
CREATE TABLE usage_counters (
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
period_start DATE NOT NULL,
events_count INT NOT NULL DEFAULT 0,
sms_count INT NOT NULL DEFAULT 0,
PRIMARY KEY (user_id, period_start)
);
Backend
- internal/billing package wrapping the Stripe SDK
- POST /billing/checkout-session: returns a Stripe Checkout URL for the requested tier
- POST /billing/portal: returns a Stripe Customer Portal URL
- POST /webhooks/stripe: signature-verified, handles:
  - customer.subscription.created / .updated / .deleted → upsert into subscriptions
  - invoice.payment_failed → trigger dunning email (Block D)
  - invoice.payment_succeeded → clear past-due state
- Enforcement: middleware checks usage against tier limits before allowing POST /events, POST /events/{id}/guests, SMS triggers. Returns 402 Payment Required with the upgrade URL on limit.
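A sketch of the enforcement step for POST /events, using the placeholder limits from the pricing table. Assumptions: the names Limits, tierLimits, canCreateEvent, UsageLookup, and enforceEventQuota are hypothetical; unknown tiers fall back to free limits; the per-event Personal tier is left out because pay-per-event needs different accounting; and the JSON body shape is illustrative only.

```go
package main

import (
	"encoding/json"
	"net/http"
)

type Limits struct {
	EventsPerMonth int // negative means unlimited
	GuestsPerEvent int
	SMSPerMonth    int
}

// tierLimits mirrors the placeholder pricing table (Personal omitted).
var tierLimits = map[string]Limits{
	"free":     {EventsPerMonth: 1, GuestsPerEvent: 50, SMSPerMonth: 0},
	"pro":      {EventsPerMonth: 10, GuestsPerEvent: 1000, SMSPerMonth: 1000},
	"business": {EventsPerMonth: -1, GuestsPerEvent: 5000, SMSPerMonth: 5000},
}

// canCreateEvent reports whether one more event fits within the tier's quota.
func canCreateEvent(tier string, eventsThisPeriod int) bool {
	lim, ok := tierLimits[tier]
	if !ok {
		lim = tierLimits["free"] // assumption: unknown tiers get free limits
	}
	return lim.EventsPerMonth < 0 || eventsThisPeriod < lim.EventsPerMonth
}

// UsageLookup is a hypothetical hook returning the caller's tier and the
// events_count for the current period.
type UsageLookup func(r *http.Request) (tier string, eventsThisPeriod int)

// enforceEventQuota answers 402 with the upgrade URL when the quota is spent.
func enforceEventQuota(usage UsageLookup, upgradeURL string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		tier, used := usage(r)
		if !canCreateEvent(tier, used) {
			w.Header().Set("Content-Type", "application/json")
			w.WriteHeader(http.StatusPaymentRequired)
			_ = json.NewEncoder(w).Encode(map[string]string{
				"error":       "plan limit reached",
				"upgrade_url": upgradeURL,
			})
			return
		}
		next.ServeHTTP(w, r)
	})
}
```

With these limits a free user passes at zero events and is blocked at one, which matches the first integration test listed for this block.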
Frontend
- /billing page: current plan, usage bars, upgrade/downgrade buttons
- On 402, show modal: "You've hit your plan limit. Upgrade?"
- Stripe Checkout opens in a new tab; on return, poll subscription state until updated by webhook
Tests
- Integration: free user can create 1 event, second fails with 402
- Integration: webhook signature verification rejects forged payloads
- Integration: cancellation flow keeps access until period end
Definition of done
- Stripe in test mode end-to-end working
- Webhook signatures verified
- Usage counters reset monthly (cron or compute on-demand)
- Receipts emailed via Stripe (default behaviour, just confirm enabled)
- Refund policy documented (referenced from billing page)
Effort: ~2 weeks.
Block G — Backups & disaster recovery
Mostly infra-owned, but the application side has documentation work.
Claude's scope
- All migrations have a *.down.sql that's been tested locally
- New docs/RUNBOOK_RESTORE.md documenting the restore procedure step-by-step
- Confirm Postgres connection string env var supports the recovery instance (no hardcoded primary-only hostnames)
- Optional: a cmd/restore-verify tool that runs after a restore to assert schema invariants (guest counts ≈ rsvp counts, no orphaned tokens, etc.)
Human / infra scope
- pg_basebackup + WAL archiving to S3
- Cross-region replication of the S3 bucket
- Monthly restore drill scheduled
- Documented RTO (e.g. 1 hour) and RPO (e.g. 5 minutes)
Definition of done
- Every existing migration has a tested down migration
- docs/RUNBOOK_RESTORE.md exists and a fresh engineer could follow it
- First restore drill completed successfully
Effort: ~2 days for the application-side work.
Block H — Privacy compliance
Legal documents are human-owned. Application-level support is Claude scope.
Claude's scope
- GET /me/data-export: streams a JSON document with every record (user, events, guests, tokens, RSVPs, access_logs, notifications) belonging to the authenticated user. Long-running, so async: enqueue → email a link.
- DELETE /me: cascade-deletes the user and everything tied to them. Soft-delete first (set deleted_at), hard-delete on a cron after 30 days to honour any in-flight legal holds.
- DELETE /events/{id}/guests/{guest_id} (host-triggered): already exists in spirit; add a "forget this guest" action that removes RSVP/access-log rows but keeps the aggregate counter for the event.
- Data retention: automated nightly job to soft-delete events whose event_date is older than 18 months (configurable per host once Tier 2).
- Add privacy_policy_accepted_at and terms_accepted_at columns to users; block first login until both are accepted.
Human / legal scope
- Privacy policy + ToS drafted by a lawyer
- DPAs signed with Twilio, SES, Stripe, MaxMind, and any other subprocessor
- Public privacy page at /privacy, ToS at /terms
- Cookie banner (only required if analytics are added; currently we have none)
- GDPR Article 30 record of processing activities
Definition of done
- GET /me/data-export produces a complete, parseable JSON dump
- DELETE /me cascades correctly with no orphan rows (verified by FK constraints)
- Privacy + ToS pages live and linked from the footer + signup form
- Acceptance enforced on first login after the launch date
- Retention cron job tested
Effort: ~3–4 days for the application work; legal work runs in parallel.
Cross-cutting concerns
These touch most blocks above; bake them in as you go, not as a separate pass.
Logging + auditing
Every state-changing endpoint logs: userID, action, target_id, result,
request_id. Use slog with a correlation ID middleware. Critical for
post-incident forensics.
Observability lite (Tier 3 scope, but minimum viable for launch)
- Prometheus /metrics endpoint on the API exposing: request rate by endpoint, latency percentiles, 4xx/5xx counts, ratelimit_block_total
- Sentry (or self-hosted GlitchTip) for unhandled errors, with release tagging
Feature flags
Lightweight feature_flags table or env-var driven (no LaunchDarkly yet).
Useful for rolling out Block F's billing without exposing it to all users at
once.
Open questions
Resolve before starting:
- Final pricing tiers — the table in Block F is a placeholder. Confirm with the target market (interview 10 wedding planners, 10 corporate event managers).
- Email provider — SES vs Postmark vs SendGrid. SES is cheapest but has the harshest deliverability ramp; Postmark is best for transactional but pricier.
- 2FA at launch or v1.1? — Recommend v1.1; one less moving piece on the launch path.
- Custom domain for RSVP pages at launch or v1.1? — Recommend v1.1 (Tier 2). Adds DNS + cert complexity.
- WebSocket auth mechanism — Recommend Block B option 3 (short-lived ticket).
- EU data residency at launch? — If targeting EU customers, this becomes Tier 1 (separate EU deployment). Otherwise defer to Tier 4.
Sequencing summary table
| Wave | Block | Depends on | Effort (1 eng) | Can parallelise with |
|---|---|---|---|---|
| 1 | A. Auth | — | 2w | — |
| 1 | B. Authz | A | 4d | C |
| 1 | C. Rate limiting | A (for userID) | 4d | B |
| 2 | D. Notifications | A | 2w | E |
| 2 | E. CSV import | B | 4d | D, F |
| 2 | F. Billing | A, D | 2w | E |
| 3 | G. Backups | — (infra) | 2d (Claude) | any |
| 3 | H. Privacy | A | 3d | any |
One engineer, sequential: ~9 weeks. Two engineers, parallel-where-possible: ~5.5 weeks.
What's not in Tier 1 (deliberate)
These are tempting but are Tier 2:
- Editable RSVPs (guests can change response after submitting)
- Multi-host collaborators
- Event branding (logo, colours, custom domain)
- Day-of QR check-in
- Better fraud-engine thresholds (false-positive feedback loop)
- Calendar integration
- Auto-reminders (1-day before, etc.)
- Mobile push notifications
Ship Tier 1 first. The launch story is "personal invitations + live tracking + quiet fraud detection + works reliably + you can pay us money". Everything else is the second release.