Running Tests
Purpose
How to run tests in the HUPH monorepo. The project uses different
test strategies per app — Jest for the Node.js API, and deliberate
"no test infrastructure" for the Next.js admin (verified via
tsc --noEmit + browser smoke). This page explains each.
Prerequisites
- Setup completed: you have .venv, installed deps, and infrastructure running
- Optional: jq for parsing smoke test JSON output
Test stacks per app
| App | Stack | How to run | How to verify changes |
|---|---|---|---|
| apps/api | Jest + ts-jest | npm run test -w apps/api | Unit + integration tests |
| apps/admin | None | See "Admin verification" below | tsc + smoke + manual browser |
| apps/crawler-worker | Light integration only | docker-compose logs crawler-worker during local runs | Trigger a KB crawl from admin and watch the log |
| apps/landing | None | Open in browser | n/a |
No apps/rag — the self-hosted RAG service was replaced by Dify in
April 2026. AI-pipeline tests (prompt assembly, retrieval quality,
reranker tuning) now happen inside Dify's own evaluation tooling,
not inside the HUPH repo.
API (Jest)
Run all API tests
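A minimal sketch of this step, assuming you start from the repo root and that the workspace script listed in the table above is the intended entry point. The api_test wrapper name is hypothetical (the repo may not ship such a helper); it only forwards extra arguments to Jest after the -- separator.

```shell
# Hypothetical convenience wrapper, not part of the repo.
# Runs the API workspace's test script; any arguments are forwarded to Jest.
api_test() {
  npm run test -w apps/api -- "$@"
}
```

Called with no arguments, api_test is equivalent to typing npm run test -w apps/api, which runs the full suite.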
Expected: ~720 tests passing across 62 suites (1 suite skipped, 3 tests intentionally skipped) covering:
- intentRouter/: rules, cache, slots, orchestration, escalation rules
- intentRouter/leadCapture/: regex + LLM extractor, gate, state machine, cluster resolver, lead store
- leadScoring: scoring, clamping, label correction, hot notification, batch
- leadActivity: best-effort logging, timeline pagination, row mapping
- faqKbSync: update/create, debounce, caching, Dify errors, markdown format
- ragClient: Dify chat dispatch, multi-question, persona classification
- webhookHelpers: dedup, escalation keywords, opt-out, save message, conversation upsert
- conversationContext: LLM context formatting, null lead
- counselorAssignment: round-robin pick
- funnelEngine: auto-transitions, manual validation, data completeness
- notifications/: escalation notifier
- auth/: HMAC, JWT/JWE, auth types
- middleware/: requireInternalSecret, requireForwardedSession, audit
- realtime/: auth, rooms
- Routes: leadsV2, analyticsV2, dashboardCounselor, webhookFollowUp
Run a single test file
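One way to do this, assuming the test script in apps/api invokes Jest directly so that arguments after -- reach it. Jest treats a positional argument as a regex matched against test file paths. The api_test_file name is hypothetical.

```shell
# Hypothetical wrapper, not part of the repo.
# $1: a path or regex fragment that Jest matches against test file paths.
api_test_file() {
  npm run test -w apps/api -- "$1"
}
```

For example, api_test_file src/__tests__/realtime.integration.test.ts would run only the realtime integration file mentioned elsewhere on this page.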
Run by test name pattern
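A sketch using Jest's -t (--testNamePattern) flag, which runs only tests whose full name (describe blocks plus test title) matches the given regex. It assumes the apps/api test script passes extra arguments through to Jest; the api_test_named wrapper is hypothetical.

```shell
# Hypothetical wrapper, not part of the repo.
# $1: regex matched against the full test name.
api_test_named() {
  npm run test -w apps/api -- -t "$1"
}
```

For example, api_test_named "round-robin" would select only tests with round-robin in their name, if any are named that way.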
Known pre-existing failure
apps/api/src/__tests__/realtime.integration.test.ts has an
afterAll httpServer.close() issue predating any recent work.
Exclude it if it blocks your run:
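One way to exclude it, sketched with Jest's --testPathIgnorePatterns CLI option. A value given on the command line is expected to override the configured ignore list, so this sketch keeps /node_modules/ in the combined regex. Whether apps/api/jest.config.js sets other ignore patterns is not known here, so treat the regex as an assumption; the wrapper name is hypothetical.

```shell
# Hypothetical wrapper, not part of the repo.
# Runs the API suite but skips the realtime integration file.
# The single regex combines two alternatives with "|".
api_test_skip_realtime() {
  npm run test -w apps/api -- \
    --testPathIgnorePatterns='/node_modules/|realtime\.integration\.test'
}
```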
Critical gotcha: mocked DB tests can hide SQL bugs
During the API HTTP auth Phase 0 work (Apr 9 2026), mocked DB tests passed, but an integration test caught a real bug that only surfaced against a live Postgres. Lesson: for anything touching persistence (writes, triggers, atomic upserts, complex queries), add at least one integration test that runs against the real local Postgres, not just mocks. The 64 leadCapture tests include integration tests that hit the real DB; use them as a template.
AI pipeline tests (Dify-owned)
There is no local pytest suite for AI pipeline behavior because the RAG pipeline now lives in Dify, not in this repo. For testing chat quality and retrieval:
- Retrieval Sandbox in admin → Knowledge base → Evaluation tab lets you run a question and see which documents Dify would retrieve to answer it.
- Golden QA Dataset — 21 curated questions with expected answers, baseline ~95.2% pass rate. Click Run Eval on the Evaluation tab to execute all.
- Dify's own eval tooling lives in the Dify admin UI at
https://dify.huph.val.id.
For reproducing a specific chat bug locally, you can curl the Dify chat-messages endpoint directly — see the recipe in debugging.en.md.
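As a rough illustration of such a call (the recipe in debugging.en.md remains the reference), the sketch below follows the request shape of Dify's public chat-messages API. DIFY_BASE_URL, DIFY_API_KEY, the user value, and the dify_ask name are all assumptions you supply yourself; none of them come from this repo.

```shell
# Hypothetical helper, not part of the repo.
# Requires DIFY_BASE_URL (e.g. your Dify API host) and DIFY_API_KEY to be set.
# $1: question text. This sketch does no JSON escaping, so keep the text
#     free of double quotes and backslashes.
dify_ask() {
  curl -sS -X POST "${DIFY_BASE_URL}/v1/chat-messages" \
    -H "Authorization: Bearer ${DIFY_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "{\"inputs\": {}, \"query\": \"$1\", \"response_mode\": \"blocking\", \"user\": \"local-debug\"}"
}
```

With blocking mode the response is expected to arrive as a single JSON document, which can be piped to jq for reading.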
Admin (no test infrastructure)
The admin app intentionally has no Jest / Vitest / Playwright setup. Verification happens via:
- TypeScript check (tsc --noEmit): catches type errors
- Build check: catches runtime-missing imports and Next.js static analysis failures
- Manual browser smoke: log in, navigate the key pages (Conversations, Leads v2, Analytics v2, Knowledge base, FAQ, Follow-up, Counselor dashboard)
- curl smoke test: hit the API endpoints the admin proxies through. The scripts/smoke-e2e-synthetic.sh script has ~6 synthetic assertions for the full auth + lead capture flow (when API_AUTH_MODE=enforce)
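The first two checks in the list above can be chained, sketched here under two assumptions: apps/admin has its own tsconfig.json, and it defines a build script. The admin_verify name is hypothetical. The type check runs from inside the workspace directory, mirroring the cd-then-npx pattern this page recommends for Jest.

```shell
# Hypothetical wrapper, not part of the repo. Run it from the repo root.
admin_verify() {
  # 1. Type check from inside the admin workspace, so npx starts its
  #    lookup there. Stop early if type errors are reported.
  (cd apps/admin && npx tsc --noEmit) || return 1
  # 2. Production build; assumes apps/admin defines a "build" script.
  npm run build -w apps/admin
}
```

Running the cheaper type check first means a failing build is less likely to be a plain type error.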
Why no component tests? Admin pages are mostly thin wrappers
around shared shadcn/ui components + SWR/polling. Testing them with
React Testing Library would duplicate shadcn's own tests and add
low-value mocks. The team concluded tsc + browser smoke gives
better signal for the effort. See the feedback_admin_no_test_infra
memory record for context.
Root-level npm run test
From the repo root, run npm run test. This delegates to
npm run test -w apps/api (Jest). There is no root-level pytest to
run; AI pipeline testing is handled inside Dify.
TDD expectations
For any new feature or bugfix in apps/api:
- Write the failing test first
- Run it, verify it fails with the expected error
- Implement the minimum code to make it pass
- Run again, verify it passes
- Commit test + implementation in the same commit (or as adjacent commits within the same PR)
See first-pr.en.md for commit conventions. For admin, TDD is not required — follow the smoke workflow above.
Gotchas
- Root-level npx jest picks up the wrong jest. From the /opt/huph root, npx jest picks jest 30.2 from somewhere and can't read apps/api/jest.config.js. Always use npm run test -w apps/api or cd apps/api && npx jest. Spent ~10 minutes diagnosing this during the Apr 8 escalation routing work.
- Jest config lives in apps/api/jest.config.js, not at the root. If you copy-paste test commands from other monorepos, confirm the -w apps/api workspace filter.
- Socket.io tests need a running Postgres for the pg_notify bridge. They connect to the same DB as the app, so make sure infrastructure is up before running them.
- The docs site Python venv at docs-env/ at the repo root is for MkDocs only. There is no other Python venv in this repo; if old documentation mentions apps/rag/venv/, that is historical.
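The Postgres dependency noted in the Socket.io item can be checked up front. A minimal sketch using pg_isready (shipped with the PostgreSQL client tools); the host and port defaults and the require_postgres name are assumptions, not values taken from the repo.

```shell
# Hypothetical preflight, not part of the repo.
# Succeeds only if Postgres is accepting connections.
# The PGHOST/PGPORT fallbacks are assumptions; adjust to your local setup.
require_postgres() {
  if pg_isready -q -h "${PGHOST:-localhost}" -p "${PGPORT:-5432}"; then
    return 0
  fi
  echo "Postgres is not accepting connections; start the infrastructure first" >&2
  return 1
}
```

A typical use would be require_postgres && npm run test -w apps/api, so the suite only starts once the database is reachable.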