feat(tier2): finish the finish line — Block H follow-ups, Block G geolocation, cross-cutting
Three threads of work land here together to close out Tier 2.
### Block H follow-ups — day-of check-in
- Scanner is now an "open on your phone" magic-link flow. Hosts on
desktop mint a scoped JWT via POST /events/{id}/scanner-ticket and
render its URL into a QR; phone scans it and lands on /scanner with
the ticket as bearer. The ticket carries Audience=scanner so it can
never substitute for a session token.
- Plus-one confirmation at the door: scan → POST /check-in/preview to
fetch guest + expected party size → confirm buttons ("Just them",
"Party of N", custom) → POST /check-in. No more silent arrival_count=1.
- Offline scan queue: failed POSTs go into an IndexedDB store and drain
on the 'online' event with poison-message protection.
- Day-of arrivals headline widget on the event overview, gated to the
host's local calendar date so it doesn't dominate the page weeks out.
- Tab nav restyled with inline heroicons + scrollable segmented control;
Check-in moves to the rightmost slot.
- PWA: manifest + service worker scoped to /scanner, generated 192/512
icons (Go scripted renderer in scripts/gen-scanner-icons.go).
- Confirmation email QR was rendering broken because html/template
rewrites data: URLs to #ZgotmplZ; mark the value as template.URL.
- Email "open your invitation" link 404'd because we had no token to
put after /rsvp/. Threaded AccessLink through the RSVPConfirmed NATS
event from the API at submit time.
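
The offline scan queue above lives in the browser (IndexedDB plus the 'online' event). As a rough illustration of what "poison-message protection" can mean, here is a minimal in-memory Python sketch. Everything in it is an assumption made for illustration, not the shipped code: the `ScanQueue` and `QueuedScan` names, the `MAX_ATTEMPTS` cap, and the rule that a 4xx response parks a scan immediately.

```python
from dataclasses import dataclass, field
from typing import Callable

MAX_ATTEMPTS = 5  # hypothetical cap; the commit does not state the real one


@dataclass
class QueuedScan:
    payload: dict
    attempts: int = 0


@dataclass
class ScanQueue:
    """In-memory stand-in for the IndexedDB store (illustrative only)."""

    pending: list[QueuedScan] = field(default_factory=list)
    dead: list[QueuedScan] = field(default_factory=list)

    def enqueue(self, payload: dict) -> None:
        self.pending.append(QueuedScan(payload))

    def drain(self, send: Callable[[dict], int]) -> None:
        """Replay queued scans in order; `send` returns an HTTP status code."""
        remaining: list[QueuedScan] = []
        for index, item in enumerate(self.pending):
            try:
                status = send(item.payload)
            except ConnectionError:
                # Still offline: keep this item and every later one untouched.
                remaining.extend(self.pending[index:])
                break
            item.attempts += 1
            if 200 <= status < 300:
                continue  # delivered
            if 400 <= status < 500 or item.attempts >= MAX_ATTEMPTS:
                # Poison message: park it so it cannot block the queue forever.
                self.dead.append(item)
            else:
                remaining.append(item)  # transient server error, retry next drain
        self.pending = remaining
```

Parked items stay in `dead` rather than being discarded, so a host could still review a scan that never made it to the server.
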
### Block G remainder — geolocation + threshold preview
- Pluggable GeoResolver in the fraud engine (NullResolver, IPApiResolver
for the free ip-api.com fallback, MaxMindResolver behind GG_GEOIP_DB_PATH).
Wrapped in a Redis cache (30d TTL). Geo flows through both gRPC and
NATS scoring paths.
- geo_jump scoring feature: >500km in <1h flags ("accessed from Lagos
and Paris within 12 minutes"); >500km in <6h is a softer signal. The
existing single-signal cap keeps a lone geo_jump in MEDIUM.
- FraudScored event carries geo_country/city/lat/lon; ApplyScore uses
COALESCE so a later re-score without geo doesn't wipe earlier data.
- Threshold-slider live preview: GET /events/{id}/security/thresholds/preview
returns band counts the host's existing access events would have
fallen into under the proposed thresholds. Debounced (250ms) widget
under the Advanced sliders so the host gets concrete feedback instead
of guessing.
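
To make the threshold preview concrete, the sketch below shows one way band counts could be derived from historical scores. It is a hedged illustration, not the endpoint's implementation: the function names, the three threshold parameters, and the "score at or above the threshold lands in that band" rule are assumptions. Only the band names follow the LOW / MEDIUM / HIGH / BLOCK constants imported from `app.scoring`.

```python
from collections import Counter
from typing import Iterable

BANDS = ("low", "medium", "high", "block")


def band_for(score: int, medium: int, high: int, block: int) -> str:
    """Map a 0-100 score to a band; each threshold is an inclusive lower bound."""
    if score >= block:
        return "block"
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


def preview_band_counts(
    scores: Iterable[int], *, medium: int, high: int, block: int
) -> dict[str, int]:
    """Count how many past access scores would fall into each band."""
    if not 0 <= medium <= high <= block <= 100:
        raise ValueError("thresholds must satisfy 0 <= medium <= high <= block <= 100")
    counts = Counter(band_for(s, medium, high, block) for s in scores)
    # Always return every band so the widget can render zeros.
    return {band: counts.get(band, 0) for band in BANDS}
```

Calling `preview_band_counts(past_scores, medium=30, high=60, block=90)` on each debounced slider change would give the host the per-band totals to display.
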
### Cross-cutting — audit, tier-gating, feature flags
- audit_log table + internal/audit.Recorder (async fire-and-forget on
detached context so an audit blip never fails the real action). Wired
into branding update, thresholds update, allowlist add/remove,
collaborator invite/role-change/remove, message create/send-now/cancel.
- Tier-gating: extended billing.Limits with MaxCollaborators,
CustomBranding, Scanner, Broadcasts. Free = none; Pro = 5 collaborators + all features;
Business = unlimited. Gates the scanner-ticket, message create,
branding put, and collaborator invite endpoints with 402 +
structured upgrade payload. Auto-reminders, fraud detection, and
analytics deliberately stay on every tier — those are safety + visibility
features, not upsell levers.
- Feature flags: feature_flags table + internal/flags.Store with 30s
in-memory refresh, stable sha256(key + user_id) percent bucketing,
unknown-key-defaults-on. Six Tier 2 flags pre-seeded. Three handlers
(branding, broadcasts, scanner) check the kill switch ahead of the
tier gate so ops can pull a feature back without a redeploy.
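
The flag evaluation described above can be sketched as follows. The real store is Go (`internal/flags.Store`); this Python version only illustrates the idea. The flag record shape (`enabled`, `percent`), the function names, and the choice of the first 8 digest bytes are assumptions.

```python
import hashlib


def bucket(key: str, user_id: str) -> int:
    """Stable 0-99 bucket derived from sha256(key + user_id)."""
    digest = hashlib.sha256((key + user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(flags: dict[str, dict], key: str, user_id: str) -> bool:
    flag = flags.get(key)
    if flag is None:
        return True  # unknown keys default to on
    if not flag.get("enabled", True):
        return False  # kill switch wins over any rollout percentage
    # A user is in the rollout when their bucket falls below the percentage.
    return bucket(key, user_id) < flag.get("percent", 100)
```

Because the bucket depends only on the flag key and the user id, a given user keeps the same answer across refreshes and processes, and raising the percentage only ever adds users to the rollout.
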
### Verified
- go test ./... + fraud-engine pytest (12/12 incl. 3 new geo_jump tests + 5
new flags tests).
- docker compose build + up across api, fraud-engine, notifier, frontend.
- /health endpoints 200; migrations 0014 + 0015 applied; 6 flags
seeded; audit_log table + partial indexes confirmed.
- Fraud-engine logs confirm geo resolver kind=CachedGeoResolver provider=auto.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -12,6 +12,16 @@ class Settings(BaseSettings):
     stream_name: str = Field(default="GUESTGUARD")
     consumer_durable: str = Field(default="fraud-engine-access")
 
+    # Tier 2 Block G — geolocation enrichment.
+    # provider: "auto" picks MaxMind when GG_GEOIP_DB_PATH points to an
+    # existing .mmdb, else falls back to the free ip-api.com endpoint.
+    # "null" turns geolocation off entirely (useful for tests).
+    geoip_provider: str = Field(default="auto")
+    geoip_db_path: str | None = Field(default=None)
+    # Redis URL for caching geo lookups (30-day TTL). Empty means
+    # uncached: every lookup hits the upstream resolver.
+    redis_url: str = Field(default="redis://redis:6379")
+
     @property
     def host(self) -> str:
         return self.http_addr.split(":", 1)[0] or "0.0.0.0"
@@ -5,6 +5,7 @@ from datetime import UTC, datetime
 
 from nats.aio.msg import Msg
 
+from app.geo import GeoResolver
 from app.nats_bus import NatsBus
 from app.schemas import AccessAttempted, FraudScored
 from app.scoring import HeuristicScorer, risk_band
@@ -16,10 +17,17 @@ SUBJECT_FRAUD_SCORED = "fraud.scored"
 
 
 class FraudConsumer:
-    def __init__(self, bus: NatsBus, durable: str, scorer: HeuristicScorer) -> None:
+    def __init__(
+        self,
+        bus: NatsBus,
+        durable: str,
+        scorer: HeuristicScorer,
+        geo: GeoResolver | None = None,
+    ) -> None:
         self._bus = bus
         self._durable = durable
         self._scorer = scorer
+        self._geo = geo
         self._subscription = None
 
     async def start(self) -> None:
@@ -45,7 +53,8 @@ class FraudConsumer:
             return
 
         try:
-            result = self._scorer.score(evt)
+            geo = await self._geo.resolve(evt.ip_address) if self._geo else None
+            result = self._scorer.score(evt, geo=geo)
             scored = FraudScored(
                 event_id=evt.event_id,
                 guest_id=evt.guest_id,
@@ -55,6 +64,10 @@ class FraudConsumer:
                 risk=risk_band(result.score),
                 reasons=result.reasons,
                 scored_at=datetime.now(UTC),
+                geo_country=(geo.country if geo else None),
+                geo_city=(geo.city if geo else None),
+                geo_lat=(geo.lat if geo else None),
+                geo_lon=(geo.lon if geo else None),
             )
             await self._bus.publish(
                 SUBJECT_FRAUD_SCORED,
@@ -0,0 +1,287 @@
+"""Geolocation resolution for the fraud engine.
+
+Tier 2 Block G wants two things here:
+
+  1. Enrich every scored access with (country, city, lat, lon) so the
+     host UI can render "Sam opened from Lagos, Nigeria" rather than a
+     raw IPv4 string nobody can read.
+
+  2. Surface large geographic jumps (>500 km in <1h) as a scoring
+     signal — the geo_jump feature in scoring.py. That feature needs
+     the *previous* access's coordinates, which we stash on the
+     per-guest baseline alongside fingerprint + IP prefix.
+
+Design choices:
+
+  * Pluggable resolvers. The spec mentions MaxMind GeoIP2 *or* a free
+    HTTP API like ipapi.com. We support both: MaxMind reads a local
+    GeoLite2 mmdb file (set `GG_GEOIP_DB_PATH`), and the default falls
+    back to ip-api.com (free, no auth, 45 req/min/IP — fine for a
+    homelab demo, sized to upgrade later).
+
+  * Redis cache wrapper. Lookups are stable for ~30 days; caching
+    avoids hammering the upstream and keeps the synchronous gRPC scoring
+    path fast on repeat opens of the same invitation.
+
+  * Private + invalid IPs short-circuit to None. Loopback, RFC1918, IPv6
+    link-local etc. would just confuse the upstream and waste a Redis
+    miss.
+
+  * Fail-open. A resolver error (network blip, malformed response) is
+    *not* a scoring signal — we score with `geo=None` and move on.
+"""
+
+from __future__ import annotations
+
+import asyncio
+import ipaddress
+import json
+import logging
+from dataclasses import dataclass
+from typing import Protocol
+
+logger = logging.getLogger(__name__)
+
+
+@dataclass
+class GeoLocation:
+    country: str | None = None  # ISO-3166 alpha-2 code, e.g. "NG"
+    city: str | None = None
+    lat: float | None = None
+    lon: float | None = None
+
+    def __bool__(self) -> bool:
+        return bool(self.country or self.city or self.lat is not None)
+
+
+class GeoResolver(Protocol):
+    async def resolve(self, ip: str | None) -> GeoLocation | None: ...
+    async def close(self) -> None: ...
+
+
+# --- helpers ---
+
+
+def _is_resolvable(ip: str | None) -> bool:
+    if not ip:
+        return False
+    try:
+        a = ipaddress.ip_address(ip)
+    except ValueError:
+        return False
+    if a.is_loopback or a.is_private or a.is_link_local or a.is_multicast:
+        return False
+    if a.is_unspecified or a.is_reserved:
+        return False
+    return True
+
+
+# --- null (test / disabled) ---
+
+
+class NullResolver:
+    """Returns None for everything. Used in tests and when geolocation
+    is explicitly disabled via `GG_GEOIP_PROVIDER=null`."""
+
+    async def resolve(self, ip: str | None) -> GeoLocation | None:
+        return None
+
+    async def close(self) -> None:
+        return None
+
+
+# --- ip-api.com (default for dev) ---
+
+
+class IPApiResolver:
+    """Resolves via http://ip-api.com — free, no auth, 45 req/min/IP.
+
+    We deliberately stay on HTTP (not HTTPS) because the free tier
+    rejects HTTPS requests with a 403. The request carries no
+    credentials, so the only thing exposed in cleartext is the IP being
+    looked up. Switch to a paid tier (ipapi.co, ipinfo, MaxMind) for
+    production load.
+    """
+
+    def __init__(self, timeout_seconds: float = 1.5) -> None:
+        # aiohttp is imported lazily so app.scoring can be unit-tested
+        # without the optional HTTP dep installed.
+        import aiohttp  # noqa: PLC0415
+
+        self._aiohttp = aiohttp
+        self._timeout = aiohttp.ClientTimeout(total=timeout_seconds)
+        self._session = None
+
+    def _session_or_create(self):
+        if self._session is None or self._session.closed:
+            self._session = self._aiohttp.ClientSession(timeout=self._timeout)
+        return self._session
+
+    async def resolve(self, ip: str | None) -> GeoLocation | None:
+        if not _is_resolvable(ip):
+            return None
+        session = self._session_or_create()
+        url = f"http://ip-api.com/json/{ip}?fields=status,country,countryCode,city,lat,lon"
+        try:
+            async with session.get(url) as resp:
+                if resp.status != 200:
+                    logger.debug("geoip lookup non-200", extra={"ip": ip, "status": resp.status})
+                    return None
+                data = await resp.json(content_type=None)
+        except (self._aiohttp.ClientError, asyncio.TimeoutError) as exc:
+            logger.debug("geoip lookup error", extra={"ip": ip, "err": str(exc)})
+            return None
+        if data.get("status") != "success":
+            return None
+        return GeoLocation(
+            country=data.get("countryCode") or data.get("country"),
+            city=data.get("city"),
+            lat=data.get("lat"),
+            lon=data.get("lon"),
+        )
+
+    async def close(self) -> None:
+        if self._session is not None and not self._session.closed:
+            await self._session.close()
+
+
+# --- MaxMind GeoLite2-City (lazy import) ---
+
+
+class MaxMindResolver:
+    """Reads a local GeoLite2-City.mmdb. Lazy-imports `geoip2` so the
+    base image doesn't carry it unless this resolver is actually
+    selected. Synchronous reader; we call it in a thread to keep the
+    asyncio loop unblocked."""
+
+    def __init__(self, db_path: str) -> None:
+        import geoip2.database  # noqa: PLC0415 — intentionally lazy
+
+        self._reader = geoip2.database.Reader(db_path)
+
+    async def resolve(self, ip: str | None) -> GeoLocation | None:
+        if not _is_resolvable(ip):
+            return None
+        try:
+            rec = await asyncio.to_thread(self._reader.city, ip)
+        except Exception as exc:  # noqa: BLE001 — generic per geoip2 raise hierarchy
+            logger.debug("maxmind lookup error", extra={"ip": ip, "err": str(exc)})
+            return None
+        return GeoLocation(
+            country=rec.country.iso_code,
+            city=rec.city.name,
+            lat=rec.location.latitude,
+            lon=rec.location.longitude,
+        )
+
+    async def close(self) -> None:
+        try:
+            self._reader.close()
+        except Exception:  # noqa: BLE001 — close is best-effort
+            pass
+
+
+# --- Redis-cached wrapper ---
+
+
+class CachedGeoResolver:
+    """Wraps any resolver in a Redis cache. 30-day TTL because public
+    IPs rarely change location, and dropping the wrong city for a few
+    days is cheaper than re-querying ip-api.com on every page load."""
+
+    KEY_PREFIX = "gg:geo:v1:"
+    TTL_SECONDS = 30 * 24 * 3600
+
+    def __init__(self, inner: GeoResolver, redis_client) -> None:
+        self._inner = inner
+        self._redis = redis_client
+
+    async def resolve(self, ip: str | None) -> GeoLocation | None:
+        if not _is_resolvable(ip):
+            return None
+        key = self.KEY_PREFIX + ip  # type: ignore[operator]
+        try:
+            cached = await self._redis.get(key)
+        except Exception as exc:  # noqa: BLE001
+            logger.debug("geo cache get failed", extra={"err": str(exc)})
+            cached = None
+        if cached:
+            try:
+                data = json.loads(cached)
+                return GeoLocation(**data)
+            except (ValueError, TypeError):
+                pass
+
+        result = await self._inner.resolve(ip)
+        if result is not None:
+            try:
+                await self._redis.set(
+                    key,
+                    json.dumps(result.__dict__),
+                    ex=self.TTL_SECONDS,
+                )
+            except Exception as exc:  # noqa: BLE001
+                logger.debug("geo cache set failed", extra={"err": str(exc)})
+        return result
+
+    async def close(self) -> None:
+        await self._inner.close()
+
+
+# --- factory ---
+
+
+async def make_resolver(
+    *,
+    provider: str,
+    db_path: str | None,
+    redis_url: str | None,
+) -> GeoResolver:
+    """Build the resolver stack from settings.
+
+    provider:
+      - "null": NullResolver (geo disabled)
+      - "ipapi": IPApiResolver
+      - "maxmind": MaxMindResolver (requires db_path)
+      - "auto": MaxMind if db_path file exists, else IPApi
+
+    Wraps in CachedGeoResolver when redis_url is set.
+    """
+    inner: GeoResolver
+    chosen = provider.lower()
+    if chosen == "auto":
+        chosen = "maxmind" if (db_path and _file_exists(db_path)) else "ipapi"
+
+    if chosen == "null":
+        inner = NullResolver()
+    elif chosen == "maxmind":
+        if not db_path or not _file_exists(db_path):
+            logger.warning("maxmind db missing — falling back to ipapi")
+            inner = IPApiResolver()
+        else:
+            try:
+                inner = MaxMindResolver(db_path)
+            except Exception as exc:  # noqa: BLE001
+                logger.warning("maxmind init failed — falling back to ipapi", extra={"err": str(exc)})
+                inner = IPApiResolver()
+    else:
+        inner = IPApiResolver()
+
+    if not redis_url:
+        return inner
+
+    try:
+        import redis.asyncio as redislib  # noqa: PLC0415
+
+        client = redislib.from_url(redis_url, decode_responses=True)
+        await client.ping()
+        logger.info("geo cache: redis connected", extra={"url": redis_url})
+        return CachedGeoResolver(inner, client)
+    except Exception as exc:  # noqa: BLE001
+        logger.warning("geo cache: redis unavailable — running uncached", extra={"err": str(exc)})
+        return inner
+
+
+def _file_exists(path: str) -> bool:
+    import os  # noqa: PLC0415
+
+    return os.path.isfile(path)
@@ -7,6 +7,7 @@ from uuid import UUID
 
 import grpc
 
+from app.geo import GeoResolver
 from app.schemas import AccessAttempted
 from app.scoring import BLOCK, HIGH, LOW, MEDIUM, HeuristicScorer, risk_band
 from fraud.v1 import fraud_pb2, fraud_pb2_grpc
@@ -22,8 +23,9 @@ _RISK_TO_PROTO = {
 
 
 class FraudServicer(fraud_pb2_grpc.FraudServiceServicer):
-    def __init__(self, scorer: HeuristicScorer) -> None:
+    def __init__(self, scorer: HeuristicScorer, geo: GeoResolver | None) -> None:
         self._scorer = scorer
+        self._geo = geo
 
     async def Score(  # noqa: N802 — gRPC method
         self,
@@ -46,7 +48,16 @@ class FraudServicer(fraud_pb2_grpc.FraudServiceServicer):
             await context.abort(grpc.StatusCode.INVALID_ARGUMENT, f"bad request: {exc}")
             raise  # unreachable, abort raises
 
-        result = self._scorer.score(evt)
+        # gRPC path is synchronous-from-the-caller's-perspective: the
+        # API blocks the RSVP submit on this score. A 1.5s timeout on
+        # the geo resolver keeps the worst case bounded even if the
+        # upstream provider is slow. Fail-open: no geo → score as if
+        # the feature weren't there.
+        geo = None
+        if self._geo is not None:
+            geo = await self._geo.resolve(evt.ip_address)
+
+        result = self._scorer.score(evt, geo=geo)
         band = risk_band(result.score)
         return fraud_pb2.ScoreResponse(
             score=result.score,
@@ -55,9 +66,13 @@ class FraudServicer(fraud_pb2_grpc.FraudServiceServicer):
         )
 
 
-async def serve_grpc(scorer: HeuristicScorer, addr: str) -> grpc.aio.Server:
+async def serve_grpc(
+    scorer: HeuristicScorer,
+    addr: str,
+    geo: GeoResolver | None = None,
+) -> grpc.aio.Server:
     server = grpc.aio.server()
-    fraud_pb2_grpc.add_FraudServiceServicer_to_server(FraudServicer(scorer), server)
+    fraud_pb2_grpc.add_FraudServiceServicer_to_server(FraudServicer(scorer, geo), server)
     server.add_insecure_port(addr)
     await server.start()
     logger.info("grpc server started", extra={"addr": addr})
@@ -9,6 +9,7 @@ from fastapi import FastAPI
 
 from app.config import load_settings
 from app.consumer import FraudConsumer
+from app.geo import make_resolver
 from app.grpc_server import serve_grpc, stop_grpc
 from app.nats_bus import NatsBus
 from app.scoring import HeuristicScorer
@@ -29,15 +30,28 @@ async def lifespan(app: FastAPI):
     bus = NatsBus(settings.nats_url, settings.stream_name)
     await bus.connect()
 
+    geo = await make_resolver(
+        provider=settings.geoip_provider,
+        db_path=settings.geoip_db_path,
+        redis_url=settings.redis_url,
+    )
+    logger.info(
+        "geo resolver",
+        provider=settings.geoip_provider,
+        cached=bool(settings.redis_url),
+        kind=type(geo).__name__,
+    )
+
     scorer = HeuristicScorer()
-    consumer = FraudConsumer(bus, settings.consumer_durable, scorer)
+    consumer = FraudConsumer(bus, settings.consumer_durable, scorer, geo=geo)
     await consumer.start()
 
-    grpc_server = await serve_grpc(scorer, settings.grpc_addr)
+    grpc_server = await serve_grpc(scorer, settings.grpc_addr, geo=geo)
 
     app.state.bus = bus
     app.state.consumer = consumer
     app.state.scorer = scorer
+    app.state.geo = geo
     app.state.grpc = grpc_server
     app.state.settings = settings
 
@@ -46,6 +60,7 @@ async def lifespan(app: FastAPI):
     finally:
         await stop_grpc(grpc_server)
         await consumer.stop()
+        await geo.close()
         await bus.close()
 
 
@@ -28,3 +28,11 @@ class FraudScored(BaseModel):
     risk: str
     reasons: list[str]
     scored_at: datetime
+
+    # Tier 2 Block G geolocation enrichment. All four optional because
+    # private IPs, lookup failures, and disabled-provider mode all
+    # leave them unset — the API consumer must handle null.
+    geo_country: str | None = None
+    geo_city: str | None = None
+    geo_lat: float | None = None
+    geo_lon: float | None = None
@@ -9,10 +9,13 @@ baseline established by the first one.
 from __future__ import annotations
 
 import hashlib
+import math
 from dataclasses import dataclass, field
+from datetime import datetime
 from typing import Any
 from uuid import UUID
 
+from app.geo import GeoLocation
 from app.schemas import AccessAttempted
 
 LOW = "low"
@@ -36,12 +39,20 @@ class GuestBaseline:
     fingerprint_digest: str | None = None
     ip_prefix: str | None = None
     accesses: int = 0
+    # Tier 2 Block G — geo_jump. Stash the most recent coordinates + the
+    # timestamp of the access they came from so the next access can be
+    # compared against them.
+    last_lat: float | None = None
+    last_lon: float | None = None
+    last_geo_at: datetime | None = None
+    last_country: str | None = None
 
 
 @dataclass
 class ScoringResult:
     score: int
     reasons: list[str]
+    geo: GeoLocation | None = None
 
 
 @dataclass
@@ -53,11 +64,17 @@ class HeuristicScorer:
             "missing_signals": 0.10,
             "repeated_access": 0.10,
             "no_user_agent": 0.15,
+            # Tier 2 Block G — geo_jump. Implausibly fast travel between
+            # two accesses (>500 km in <1h) carries the heaviest weight
+            # alongside fingerprint mismatch. Note the weights are not
+            # required to sum to 1; the final score is clamped to
+            # [0, 100] so the relative magnitudes are what matters.
+            "geo_jump": 0.40,
         }
     )
     baselines: dict[UUID, GuestBaseline] = field(default_factory=dict)
 
-    def score(self, evt: AccessAttempted) -> ScoringResult:
+    def score(self, evt: AccessAttempted, geo: GeoLocation | None = None) -> ScoringResult:
         reasons: list[str] = []
         sub: dict[str, int] = {}
 
@@ -97,7 +114,37 @@ class HeuristicScorer:
         else:
             sub["no_user_agent"] = 0
 
-        weighted = sum(sub[k] * self.weights[k] for k in self.weights)
+        # geo_jump — implausibly fast travel between this access and the
+        # previous one. 500 km in under an hour means either a private
+        # jet or, far more likely, a stolen invitation being opened by
+        # someone in a different country. Spec threshold from
+        # docs/TIER2_PLAN.md Block G.
+        sub["geo_jump"] = 0
+        if (
+            geo is not None
+            and geo.lat is not None
+            and geo.lon is not None
+            and baseline.last_lat is not None
+            and baseline.last_lon is not None
+            and baseline.last_geo_at is not None
+        ):
+            km = _haversine_km(
+                baseline.last_lat, baseline.last_lon, geo.lat, geo.lon
+            )
+            dt = (evt.occurred_at - baseline.last_geo_at).total_seconds()
+            if km > 500 and 0 < dt < 3600:
+                sub["geo_jump"] = 100
+                where_now = geo.city or geo.country or "elsewhere"
+                where_before = baseline.last_country or "another location"
+                mins = max(int(dt / 60), 1)
+                reasons.append(
+                    f"accessed from {where_before} and {where_now} within {mins} minutes"
+                )
+            elif km > 500 and 0 <= dt < 21600:  # within 6h is still suspicious-but-possible; dt >= 0 skips out-of-order events
+                sub["geo_jump"] = 50
+                reasons.append(f"large geographic jump ({int(km)} km)")
+
+        weighted = sum(sub[k] * self.weights.get(k, 0) for k in sub)
         final = int(round(min(max(weighted, 0), 100)))
 
         # Tier 2 Block G — tighten the consecutive-fingerprint false
@@ -124,10 +171,24 @@ class HeuristicScorer:
             baseline.fingerprint_digest = current_digest
         if baseline.ip_prefix is None:
             baseline.ip_prefix = current_prefix
+        if geo is not None and geo.lat is not None and geo.lon is not None:
+            baseline.last_lat = geo.lat
+            baseline.last_lon = geo.lon
+            baseline.last_geo_at = evt.occurred_at
+            baseline.last_country = geo.city or geo.country
         baseline.accesses += 1
         self.baselines[evt.guest_id] = baseline
 
-        return ScoringResult(score=final, reasons=reasons)
+        return ScoringResult(score=final, reasons=reasons, geo=geo)
 
 
+def _haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
+    """Great-circle distance in kilometres. Earth radius 6371 km."""
+    rlat1, rlat2 = math.radians(lat1), math.radians(lat2)
+    dlat = math.radians(lat2 - lat1)
+    dlon = math.radians(lon2 - lon1)
+    a = math.sin(dlat / 2) ** 2 + math.cos(rlat1) * math.cos(rlat2) * math.sin(dlon / 2) ** 2
+    return 2 * 6371.0 * math.asin(math.sqrt(a))
+
+
 def _fingerprint_digest(fp: dict[str, Any] | None) -> str | None:
@@ -12,6 +12,13 @@ dependencies = [
     "structlog>=24.4",
     "grpcio>=1.66",
     "protobuf>=5.28",
+    # Tier 2 Block G — geolocation enrichment.
+    # aiohttp drives the ip-api.com HTTP path; redis caches lookups.
+    # geoip2 is intentionally NOT pinned here — when a homelab wants
+    # to use MaxMind it can install it on top of the base image
+    # (it's ~2.5MB) without forcing it on every deployment.
+    "aiohttp>=3.10",
+    "redis>=5.0",
 ]
 
 [project.optional-dependencies]
@@ -108,4 +108,4 @@ async def test_invalid_uuid_returns_invalid_argument():
 
 def test_servicer_constructs():
     # Ensures the servicer wires up against the generated stub.
-    FraudServicer(HeuristicScorer())
+    FraudServicer(HeuristicScorer(), None)
@@ -1,6 +1,7 @@
-from datetime import UTC, datetime
+from datetime import UTC, datetime, timedelta
 from uuid import uuid4
 
+from app.geo import GeoLocation
 from app.schemas import AccessAttempted
 from app.scoring import HeuristicScorer, risk_band
 
@@ -11,6 +12,7 @@ def _evt(
     fingerprint=None,
     ip=None,
     user_agent="Mozilla/5.0",
+    occurred_at=None,
 ):
     return AccessAttempted(
         event_id=uuid4(),
@@ -20,7 +22,7 @@ def _evt(
         fingerprint=fingerprint,
         ip_address=ip,
         user_agent=user_agent,
-        occurred_at=datetime.now(UTC),
+        occurred_at=occurred_at or datetime.now(UTC),
     )
 
 
@@ -79,3 +81,70 @@ def test_score_clamped_to_0_100():
     for i in range(12):
         res = scorer.score(_evt(guest_id=guest, fingerprint=None, ip=f"10.0.{i}.1", user_agent=None))
         assert 0 <= res.score <= 100
+
+
+# --- Tier 2 Block G: geo_jump ---
+
+
+_LAGOS = GeoLocation(country="NG", city="Lagos", lat=6.5244, lon=3.3792)
+_PARIS = GeoLocation(country="FR", city="Paris", lat=48.8566, lon=2.3522)
+
+
+def test_geo_jump_fires_on_implausible_travel():
+    """Two accesses roughly 4,700 km apart within 12 minutes is the
+    textbook forwarded-link case the spec targets."""
+    scorer = HeuristicScorer()
+    guest = uuid4()
+    t0 = datetime.now(UTC)
+
+    first = _evt(
+        guest_id=guest,
+        fingerprint={"ua": "Chrome"},
+        ip="102.89.0.1",
+        occurred_at=t0,
+    )
+    scorer.score(first, geo=_LAGOS)
+
+    twelve_mins_later = _evt(
+        guest_id=guest,
+        fingerprint={"ua": "Chrome"},
+        ip="80.10.20.30",
+        occurred_at=t0 + timedelta(minutes=12),
+    )
+    res = scorer.score(twelve_mins_later, geo=_PARIS)
+
+    assert any("Lagos" in r and "Paris" in r for r in res.reasons), res.reasons
+    # geo_jump (100 × 0.40) + ip_change (80 × 0.20) = 56 weighted. Two
+    # signals are ≥ 70 (geo_jump and ip_change), so the single-signal
+    # cap at 55 does NOT apply and the score stays at or above 55.
+    assert res.score >= 55, res.score
+
+
+def test_geo_jump_does_not_fire_for_local_movement():
+    """A train-trip distance shouldn't escalate."""
+    scorer = HeuristicScorer()
+    guest = uuid4()
+    t0 = datetime.now(UTC)
+    london = GeoLocation(country="GB", city="London", lat=51.5074, lon=-0.1278)
+    brighton = GeoLocation(country="GB", city="Brighton", lat=50.8225, lon=-0.1372)
+
+    scorer.score(
+        _evt(guest_id=guest, fingerprint={"ua": "Chrome"}, ip="80.0.0.1", occurred_at=t0),
+        geo=london,
+    )
+    res = scorer.score(
+        _evt(
+            guest_id=guest,
+            fingerprint={"ua": "Chrome"},
+            ip="80.0.0.1",
+            occurred_at=t0 + timedelta(hours=2),
+        ),
+        geo=brighton,
+    )
+    assert not any("minutes" in r.lower() or "geographic jump" in r for r in res.reasons)
+
+
+def test_geo_jump_carries_geo_back_on_result():
+    scorer = HeuristicScorer()
+    res = scorer.score(_evt(fingerprint={"ua": "Chrome"}, ip="102.89.0.1"), geo=_LAGOS)
+    assert res.geo is _LAGOS