Decisions

This page captures the non-obvious architectural decisions NXT has made. Each entry records what was chosen, what was rejected, why, and the consequences.

For deferred work (decisions to NOT build something today), see Roadmap.

D1 · Deterministic matching engine, not RAG

Chosen: Deterministic computation over federal Scorecard fields. AI used only for the per-school blurb.

Rejected: RAG (V1's approach) — embeddings + vector store + retrieval-augmented generation.

Why: The signals that decide a match (GPA, SAT, admit rate, net price, completion rates) are numeric. The same input must produce the same output every time, with an explainable line. RAG retrieval is non-deterministic in practice, has weaker coverage for institutions without rich text descriptions, and carries a cost that grows with query volume. See AI and matching for the full history.

Consequences: Deterministic matching is auditable and fast. AI cost is capped at one blurb per (user, school) per 30 days. Semantic search is not available out of the box; if it is needed, build it as an addition to the deterministic engine.
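To make "same input, same output, with an explainable line" concrete, here is a minimal sketch of a deterministic scorer. It is not NXT's engine: the field names, the weights (0.5 / 0.3 / 0.2), and the 400-point SAT window are all illustrative assumptions.

```typescript
// Hypothetical shapes: field names are illustrative, not NXT's actual schema.
interface StudentProfile {
  sat: number;         // 400 to 1600
  maxNetPrice: number; // USD per year the family can afford
}

interface SchoolStats {
  name: string;
  satMidpoint: number;    // midpoint SAT at the school
  netPrice: number;       // average annual net price, USD
  completionRate: number; // 0 to 1
}

interface MatchResult {
  school: string;
  score: number;     // 0 to 100
  reasons: string[]; // one explainable line per signal
}

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// Pure function: no clock, no randomness, no network.
function matchSchool(student: StudentProfile, school: SchoolStats): MatchResult {
  // Academic fit: 1 at or above the midpoint, falling linearly to 0 at 400 points below.
  const academic = clamp01(1 + (student.sat - school.satMidpoint) / 400);
  // Affordability: 1 when within budget, otherwise the fraction of the price covered.
  const affordability =
    school.netPrice <= student.maxNetPrice
      ? 1
      : clamp01(student.maxNetPrice / school.netPrice);
  const outcome = clamp01(school.completionRate);

  // Assumed weights, chosen only for the example.
  const score = Math.round(100 * (0.5 * academic + 0.3 * affordability + 0.2 * outcome));

  const reasons = [
    `SAT ${student.sat} vs. school midpoint ${school.satMidpoint}`,
    `Net price $${school.netPrice} vs. budget $${student.maxNetPrice}`,
    `Completion rate ${Math.round(outcome * 100)}%`,
  ];
  return { school: school.name, score, reasons };
}
```

Because the function reads nothing but its arguments, two calls with the same profile and school always return identical results, which is what makes the output auditable.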

D2 · Convex as the entire backend

Chosen: Convex Cloud for database + functions + crons + real-time subscriptions.

Rejected: Separate Postgres + REST/GraphQL API server + Redis subscriptions.

Why: Types flow end-to-end via @app/data-contract. Subscriptions arrive for free. One deploy ships database, functions, and crons together. Atomic deploys eliminate half-deployed states.

Consequences: Vendor lock-in on Convex, accepted explicitly. @app/data-contract keeps the schema portable, but migrating off Convex would still be a multi-month project; the trade-off is one fewer service to operate today.

D3 · WorkOS AuthKit, not a custom auth implementation

Chosen: WorkOS AuthKit for identity. Shared tenant across mobile + web + docs.

Rejected: Build-your-own (bcrypt + JWT + session table), Auth0, Clerk.

Why: WorkOS handles password hashes, MFA, magic links, OAuth, and session JWTs in one provider. Migration to another OIDC provider is a few days of auth wiring if ever needed. AuthKit Pro tier (SSO, SCIM) is available without re-architecting if NXT signs an enterprise customer.

Consequences: Single-tenant dependency on WorkOS. NXT does not store passwords. WorkOS sees every sign-in.

D4 · Mobile-first, marketing-only on web

Chosen: Product lives in apps/mobile. apps/web is a marketing site (home, About, Contact, privacy/terms, account-deletion request).

Rejected: A web product surface (authenticated routes on wearenxt.com).

Why: Adding a web product doubles the auth surface to harden, duplicates the design system, and increases the deploy blast radius. Mobile is where the user signal lives. The marketing site's job is to drive App Store and Play Store installs, not to compete with the app.

Consequences: Students cannot use NXT on a desktop. If enterprise customers (districts, counselor admins) require a browser-based view, that's net-new work — scope estimate in Roadmap.

D5 · Live Scorecard with cache, not bulk CSV ingestion

Chosen: Read College Scorecard API live, cache results in colleges, refresh monthly via cron.

Rejected: V1's bulk CSV ingestion — download MERGED2023_24_PP.csv, parse + transform + insert + AI-enrich every school in a batch run.

Why: Ingestion runs were slow (2–3 sec/school × 6K institutions, roughly 3 to 5 hours if run sequentially), expensive (per-school OpenAI + Serper + logo.dev calls), and fragile (federal field renames broke runs). Live + cache is always current and pays only for institutions users actually see.

Consequences: The first fetch of a cold school adds a small latency cost (mitigated by the monthly warm refresh). The colleges table grows with demand rather than being pre-populated.
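The live-plus-cache read path can be sketched as a read-through cache. This is a simplified in-memory model under stated assumptions: the class name, the injected `fetchLive` function, and the injected clock are hypothetical, and a 30-day freshness check stands in for the monthly cron refresh described above.

```typescript
// Hypothetical names throughout; the live fetcher and the clock are injected.
interface CacheEntry<T> {
  value: T;
  fetchedAt: number; // epoch milliseconds
}

const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

class ReadThroughCache<T> {
  private readonly entries = new Map<string, CacheEntry<T>>();
  private readonly fetchLive: (id: string) => Promise<T>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(
    fetchLive: (id: string) => Promise<T>,
    ttlMs: number = THIRTY_DAYS_MS,
    now: () => number = () => Date.now(),
  ) {
    this.fetchLive = fetchLive;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  async get(id: string): Promise<T> {
    const hit = this.entries.get(id);
    if (hit !== undefined && this.now() - hit.fetchedAt < this.ttlMs) {
      return hit.value; // warm: no upstream call
    }
    // Cold or stale: pay the live-fetch latency once, then store the result.
    const value = await this.fetchLive(id);
    this.entries.set(id, { value, fetchedAt: this.now() });
    return value;
  }

  // Only schools that were actually requested occupy the cache.
  get size(): number {
    return this.entries.size;
  }
}
```

In this model an institution nobody views never triggers a fetch, which mirrors the "pays only for institutions users actually see" property.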

D6 · `v.any()` for the `colleges` table shape

Chosen: colleges: defineTable(v.any()) — no Convex validator on the shape.

Rejected: A typed Convex schema mirroring every Scorecard field.

Why: Scorecard's schema is wide (~100 fields, occasionally renamed) and NXT doesn't own it. Forcing a strict validator would break ingestion every federal release and duplicate types across mapper + schema.

Consequences: Writes are not Convex-validated. Mitigation: every write goes through colleges/internal.ts upsert, which uses a typed builder. All reads go through lib/collegeShape.ts helpers.
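As an illustration of the mitigation, the sketch below shows what a typed builder on the write path and a defensive helper on the read path might look like. It is not the contents of colleges/internal.ts or lib/collegeShape.ts: `CollegeDoc`, `buildCollegeDoc`, and `readAdmitRate` are hypothetical names, and the dotted Scorecard keys are assumptions to verify against the current Scorecard data dictionary.

```typescript
// Hypothetical document shape; the real table stores whatever the builder emits.
interface CollegeDoc {
  scorecardId: number;
  name: string;
  admitRate: number | null; // null when the field is missing or renamed upstream
  netPrice: number | null;
}

type RawScorecard = Record<string, unknown>;

// Returns a finite number or null, so a renamed federal field degrades to null
// instead of failing the write.
function numberOrNull(raw: RawScorecard, key: string): number | null {
  const value = raw[key];
  return typeof value === "number" && Number.isFinite(value) ? value : null;
}

// Write path: the only place a raw Scorecard record becomes a stored document.
function buildCollegeDoc(raw: RawScorecard): CollegeDoc {
  const id = raw["id"];
  const name = raw["school.name"];
  if (typeof id !== "number" || typeof name !== "string") {
    throw new Error("Scorecard record is missing id or school.name");
  }
  return {
    scorecardId: id,
    name,
    // Assumed key names; check them against the Scorecard data dictionary.
    admitRate: numberOrNull(raw, "latest.admissions.admission_rate.overall"),
    netPrice: numberOrNull(raw, "latest.cost.avg_net_price.overall"),
  };
}

// Read path: the table is v.any(), so treat stored documents as unknown.
function readAdmitRate(doc: unknown): number | null {
  if (typeof doc !== "object" || doc === null) return null;
  const value = (doc as Record<string, unknown>)["admitRate"];
  return typeof value === "number" && Number.isFinite(value) ? value : null;
}
```

The design choice is to concentrate knowledge of the Scorecard field names in one mapper, so a federal rename means editing one key rather than a schema and a mapper.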

D7 · Wrapped Convex factories, lint-enforced

Chosen: userQuery / userMutation / userAction wrappers + custom/no-raw-convex-throw ESLint rule.

Rejected: Raw query() / mutation() / action() factories.

Why: Raw factories are publicly callable. A mutation that "only the admin should use" is a parameter-spoofing vulnerability waiting to happen. Wrappers pre-resolve ctx.user from the verified WorkOS session and bake authorization into the factory.

Consequences: Lint catches violations before they merge. New engineers must learn the wrapper pattern, but the lint rule is loud enough to point them to it.
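The idea behind the wrappers can be shown with a simplified, framework-free model. This is not the Convex API and not NXT's actual `userQuery`: `RawCtx.getIdentity` is a hypothetical stand-in for the verified WorkOS session lookup, and the function is synchronous only to keep the sketch short.

```typescript
// Simplified model of a wrapped factory. All types here are assumptions.
interface Identity {
  subject: string;
  role: "student" | "admin";
}

// Stand-in for the raw function context; returns null when nobody is signed in.
interface RawCtx {
  getIdentity(): Identity | null;
}

// What handlers receive: the user is already resolved and verified.
interface UserCtx {
  user: Identity;
}

function userQuery<Args, Result>(opts: {
  requireRole?: Identity["role"];
  handler: (ctx: UserCtx, args: Args) => Result;
}): (raw: RawCtx, args: Args) => Result {
  return (raw, args) => {
    const user = raw.getIdentity();
    if (user === null) throw new Error("UNAUTHENTICATED");
    if (opts.requireRole !== undefined && user.role !== opts.requireRole) {
      throw new Error("FORBIDDEN");
    }
    // Authorization is decided before the handler runs.
    return opts.handler({ user }, args);
  };
}

// The handler reads identity from ctx.user, never from caller-supplied args,
// so passing someone else's userId has no effect.
const whoAmI = userQuery<{ userId: string }, string>({
  handler: (ctx, _args) => ctx.user.subject,
});
```

With a raw factory, the `userId` argument would be the only source of identity and any caller could spoof it; here it is ignored in favor of the session.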

D8 · One i18n catalog for mobile + web

Chosen: Single packages/i18n/locales/en.json. Mobile (i18next) and web (next-intl) read the same file. pnpm i18n:lint enforces no hardcoded strings.

Rejected: Per-app catalogs, dynamic translation services.

Why: A second language is a translation pass, not engineering work. Single catalog means a single rename PR (pnpm i18n:rename) updates both apps.

Consequences: Mobile + web copy stay in sync by default. Adding Spanish is one translation file.

D9 · No staging Convex deployment

Chosen: One production + one dev Convex deployment. Mobile staging (TestFlight) builds point at production Convex.

Rejected: A separate staging Convex deployment.

Why: A staging deployment doubles the operator burden: two sets of crons firing, two sets of credentials, two webhook endpoints. The dev deployment already serves QA + integration testing. Signed-in test accounts cover the gap.

Consequences: A bad migration on production Convex has no staging tier to catch it. Convex deploys are atomic; rollback is fast. Re-evaluate if a production-only bug ships through dev testing.

D10 · No CI deploys for backend (yet)

Chosen: Production Convex deploy runs from a local operator machine. Web auto-deploys via Vercel on push to main.

Rejected: GitHub Actions-based Convex deploy on main.

Why: Today's local deploy gives a human checkpoint before code reaches production. Convex deploys are atomic; the cost of a bad deploy (revert + redeploy) is low.

Consequences: Production deploys bottleneck on a single operator. Move to CI deploys when more than one engineer is regularly pushing; see Roadmap.
