
API Architecture

Purpose

Deep-dive into apps/api — the Node.js + Express server that receives webhooks, routes intents, captures leads, escalates, notifies, and serves the admin dashboard's REST API. Start here if your task touches routes, middleware, persistence, intent routing, or realtime events.

Prerequisites

Entry point and layout

apps/api/
├── src/
│   ├── index.ts                    # Express bootstrap, middleware mount
│   ├── routes/
│   │   ├── webhook.ts               # POST /webhook/whatsapp
│   │   ├── agent.ts                 # /api/v1/agent/*
│   │   ├── leads.ts                 # /api/v1/leads
│   │   ├── leadsV2.ts               # /api/v1/leads/v2 (Phase 2A)
│   │   ├── kb.ts                    # /api/v1/kb/*
│   │   ├── faq.ts                   # /api/v1/faq
│   │   ├── followUp.ts              # /api/v1/follow-up/*
│   │   ├── messages.ts              # /api/v1/messages/*
│   │   ├── webchat.ts               # DELETED in Apr 8 cleanup
│   │   └── telegram.ts              # DELETED in Apr 8 cleanup
│   ├── services/
│   │   ├── intentRouter/
│   │   │   ├── types.ts              # Intent, HandlerResult
│   │   │   ├── classifier.ts         # 4-tier classifier
│   │   │   ├── handlers/             # wantRegister, sharePersonalInfo, ...
│   │   │   ├── escalationRules/      # Rule engine + actions
│   │   │   └── leadCapture/
│   │   │       ├── extractor/        # regex + llm + index
│   │   │       ├── stateMachine.ts
│   │   │       ├── clusterResolver.ts
│   │   │       └── leadStore.ts       # Atomic upsert
│   │   ├── notifications/
│   │   │   └── escalationNotifier.ts # Fan-out to cluster + globals
│   │   └── realtime/
│   │       ├── server.ts              # Socket.io server
│   │       ├── pgBridge.ts            # Postgres LISTEN → Socket.io
│   │       ├── auth.ts                # JWE decode (uses shared verifier)
│   │       └── rooms.ts               # Room resolver
│   ├── auth/
│   │   ├── types.ts                   # AuthMode, VerifiedSession
│   │   ├── hmacVerifier.ts            # Layer 1 — HMAC constant-time
│   │   ├── jwtVerifier.ts             # NextAuth JWE decode (canonical)
│   │   └── d360Signature.ts           # DELETED (360dialog tier no App Secret)
│   ├── middleware/
│   │   ├── requireInternalSecret.ts    # Layer 1 — HMAC gate
│   │   ├── requireForwardedSession.ts  # Layer 2 — X-Forwarded-Session decode
│   │   └── auditSensitiveAccess.ts     # Sensitive pattern + write op audit
│   ├── audit/
│   │   └── logAuthEvent.ts             # Fire-and-forget audit_log writer
│   ├── db/
│   │   └── schema.ts                   # Drizzle ORM schema
│   └── __tests__/                      # Jest tests
├── jest.config.js
├── package.json
└── tsconfig.json

Route map

Path                        | Handler            | Auth   | Notes
GET /                       |                    | exempt | Landing
GET /health                 |                    | exempt | Simple health check
POST /webhook/whatsapp      | routes/webhook.ts  | exempt | 360dialog webhook (no HMAC signing; tier doesn't expose App Secret)
GET /webhook/whatsapp       |                    | exempt | hub.verify_token bootstrap
GET /api/v1/health/realtime |                    | exempt | Socket.io + pgBridge status
GET /api/v1/health/full     |                    | gated  | Deep health (gated: info disclosure)
/api/v1/agent/*             | routes/agent.ts    | gated  | Agent inbox messages, reply
/api/v1/leads               | routes/leads.ts    | gated  | Legacy leads
/api/v1/leads/v2/*          | routes/leadsV2.ts  | gated  | Phase 2A leads (filter, summary, detail, patch)
/api/v1/kb/*                | routes/kb.ts       | gated  | KB documents, eval, sources, gaps
/api/v1/faq                 | routes/faq.ts      | gated  | Local FAQ + Dify sync
/api/v1/follow-up/*         | routes/followUp.ts | gated  | Queue, rules, manual trigger
/api/v1/messages/*          | routes/messages.ts | gated  | Thumbs, correction

Middleware chain

Incoming request
     ↓
Helmet (security headers)
     ↓
CORS
     ↓
express.json({ limit: '50mb', verify: (...) => req.rawBody = buf })
     ↓  (verify hook captures raw bytes BEFORE parsing — used by
         webhook HMAC when enabled)
Request ID + Pino logger
     ↓
Layer 1 — requireInternalSecret (HMAC gate, 3-mode)
     ↓
Layer 2 — requireForwardedSession (NextAuth JWE decode)
     ↓
auditSensitiveAccess (pattern + write-op audit)
     ↓
Route handler
     ↓
Zod validation (request body / params)
     ↓
Business logic
     ↓
Error handler
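
The reason the chain captures raw bytes before parsing can be shown with a small sketch that uses only Node's built-in crypto module. The helper name verifyHmac, the SHA-256 algorithm, and the hex signature encoding are assumptions for illustration; the source only states that the HMAC check is constant-time.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: verify an HMAC over the raw request bytes.
// SHA-256 and hex encoding are assumptions, not confirmed by the codebase.
function verifyHmac(rawBody: Buffer, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}

// The sender signs the exact bytes it transmits.
const secret = "example-secret";
const raw = Buffer.from('{"a": 1,  "b": 2}');
const sig = createHmac("sha256", secret).update(raw).digest("hex");

// Re-serialising the parsed body changes whitespace, so the digest differs.
const reserialised = Buffer.from(JSON.stringify(JSON.parse(raw.toString())));

console.log(verifyHmac(raw, sig, secret));          // true
console.log(verifyHmac(reserialised, sig, secret)); // false
```

The second call fails because JSON.stringify drops the original spacing, which is why a signature check has to run against req.rawBody and not against the parsed object.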

Auth layer details

  • Phase 0 (current): API_AUTH_MODE=disabled — middleware code is deployed but gates are no-ops. Zero behavior change.
  • Phase 1: API_AUTH_MODE=warn — middleware logs auth failures to audit_log but allows requests. 24-72h observation.
  • Phase 2: API_AUTH_MODE=enforce — blocks invalid requests with 401. 7+ days soak.
  • Phase 3: remove disabled/warn branches from code.

Rollback at any phase is a sub-30s env flip:

sed -i 's|^API_AUTH_MODE=.*|API_AUTH_MODE=disabled|' /opt/huph/.env
docker compose up -d --no-deps huph-api
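
The three modes in the rollout phases above reduce to a small decision table. The sketch below is a hypothetical pure function, not the actual middleware: the names GateDecision and decideGate are invented here, and writing an audit row on failure in enforce mode is an assumption (the source only confirms it for warn).

```typescript
type AuthMode = "disabled" | "warn" | "enforce";

interface GateDecision {
  allow: boolean;  // whether the request continues down the chain
  audit: boolean;  // whether an auth failure is written to audit_log
  status?: number; // HTTP status when the request is blocked
}

// Hypothetical decision function mirroring the rollout phases.
function decideGate(mode: AuthMode, credentialValid: boolean): GateDecision {
  // disabled: the gate is a no-op and writes nothing.
  if (mode === "disabled") return { allow: true, audit: false };
  // A valid credential passes in both warn and enforce.
  if (credentialValid) return { allow: true, audit: false };
  // warn: log the failure but let the request through.
  if (mode === "warn") return { allow: true, audit: true };
  // enforce: block with 401 (auditing here is an assumption).
  return { allow: false, audit: true, status: 401 };
}

console.log(decideGate("warn", false));    // { allow: true, audit: true }
console.log(decideGate("enforce", false)); // { allow: false, audit: true, status: 401 }
```

Keeping the decision separate from the Express plumbing makes each phase easy to unit-test before the env flag is changed.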

Key subsystem walkthroughs

Intent Router

4-tier classifier in services/intentRouter/classifier.ts:

  1. Deterministic regex — whitelist phrases like "mau daftar", "minta bantuan manusia"
  2. Keyword + heuristic — scoring based on language patterns
  3. Claude Haiku — LLM classification as fallback
  4. Default handler — generic information intent

Handlers live in handlers/ (wantRegister, wantVisitCampus, sharePersonalInfo, etc.). Each handler returns a HandlerResult with intent, entities, and the optional fields escalation? and notifyAdmin?.
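
The four tiers above form a fall-through cascade: each tier either returns an intent or hands over to the next one. The sketch below shows that shape only. The intent labels, the keyword list, the score threshold, and the injected llmTier function are all hypothetical; the real classifier calls Claude Haiku in tier 3, which is replaced here by a caller-supplied stub.

```typescript
// Hypothetical intent labels for illustration.
type Intent = "want_register" | "human_handoff" | "information";

interface Classification {
  intent: Intent;
  tier: 1 | 2 | 3 | 4; // which tier produced the answer
}

// Tier 1: deterministic phrases from the whitelist.
function regexTier(text: string): Intent | null {
  if (/\bmau daftar\b/i.test(text)) return "want_register";
  if (/\bminta bantuan manusia\b/i.test(text)) return "human_handoff";
  return null;
}

// Tier 2: toy keyword scoring (word list and threshold are assumptions).
function keywordTier(text: string): Intent | null {
  const keywords = new Set(["daftar", "pendaftaran", "registrasi", "formulir"]);
  const tokens = text.toLowerCase().split(/[^a-z]+/);
  const score = tokens.filter((t) => keywords.has(t)).length;
  return score >= 2 ? "want_register" : null;
}

// Tier 3 is injected so the cascade can run without a network call.
type LlmTier = (text: string) => Promise<Intent | null>;

async function classify(text: string, llmTier: LlmTier): Promise<Classification> {
  const fromRegex = regexTier(text);
  if (fromRegex !== null) return { intent: fromRegex, tier: 1 };

  const fromKeywords = keywordTier(text);
  if (fromKeywords !== null) return { intent: fromKeywords, tier: 2 };

  const fromLlm = await llmTier(text);
  if (fromLlm !== null) return { intent: fromLlm, tier: 3 };

  // Tier 4: default to the generic information intent.
  return { intent: "information", tier: 4 };
}
```

Ordering the tiers from cheapest to most expensive means the LLM is only consulted when the deterministic tiers cannot decide.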

Lead Capture Phase 2A

Pipeline in services/intentRouter/leadCapture/:

  • extractor/regex.ts — Indonesian patterns (nama saya X, panggil X, phone 62xxx)
  • extractor/llm.ts — Claude Haiku via Vercel AI SDK + Zod
  • extractor/index.ts — gate decides regex vs LLM
  • stateMachine.ts: awaiting_name → awaiting_email → captured, 6h TTL
  • leadStore.ts — atomic INSERT ... ON CONFLICT with CASE-based status recompute

See the project_lead_capture_phase2a memory for the full truth table and gotchas.
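
The state progression and TTL from the pipeline above can be sketched as a pure transition function. This is an illustration under assumptions, not the contents of stateMachine.ts: the type and function names are invented, and restarting from awaiting_name after the 6h TTL expires is a guess at the expiry behavior.

```typescript
type CaptureState = "awaiting_name" | "awaiting_email" | "captured";

interface CaptureSession {
  state: CaptureState;
  name?: string;
  email?: string;
  updatedAt: number; // epoch milliseconds of the last transition
}

// Fields the extractor found in the current message (hypothetical shape).
interface Extracted {
  name?: string;
  email?: string;
}

const TTL_MS = 6 * 60 * 60 * 1000; // 6h TTL from the source

function startSession(now: number): CaptureSession {
  return { state: "awaiting_name", updatedAt: now };
}

function advance(session: CaptureSession, extracted: Extracted, now: number): CaptureSession {
  // Assumption: an expired session restarts from the first state.
  let next: CaptureSession = now - session.updatedAt > TTL_MS ? startSession(now) : session;

  if (next.state === "awaiting_name" && extracted.name) {
    next = { ...next, state: "awaiting_email", name: extracted.name, updatedAt: now };
  }
  // Not an else-branch: a single message may carry both name and email.
  if (next.state === "awaiting_email" && extracted.email) {
    next = { ...next, state: "captured", email: extracted.email, updatedAt: now };
  }
  return next;
}

const s0 = startSession(0);
const s1 = advance(s0, { name: "Budi" }, 1_000);
const s2 = advance(s1, { email: "budi@example.com" }, 2_000);
console.log(s1.state, s2.state); // awaiting_email captured
```

Passing the clock in as a parameter keeps the function deterministic, so the TTL branch can be tested without waiting or mocking timers.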

Realtime substrate

Single pg.Client in realtime/pgBridge.ts LISTENs on huph_events and forwards events to Socket.io rooms scoped by user:, role:, cluster:, conversation:, global:.

5 DB trigger functions: notify_message_event, notify_conversation_event, notify_lead_event, notify_notification_event, notify_followup_event. All use AT TIME ZONE 'UTC' to avoid the WIB timestamp bug.

Health: GET /api/v1/health/realtime returns pgBridge.connected, socketio.namespaces, eventCount.
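
As a rough illustration of the room scoping described above, the sketch below parses a notification payload and maps it to room names using the five prefixes. The payload shape, the field names, and the "global:all" room name are assumptions; only the prefixes user:, role:, cluster:, conversation:, and global: come from the source.

```typescript
// Hypothetical payload shape for a huph_events notification.
interface RealtimeEvent {
  type: string;
  userId?: string;
  role?: string;
  clusterId?: string;
  conversationId?: string;
  broadcast?: boolean;
}

// NOTIFY payloads arrive as text; reject anything that is not a typed JSON object.
function parseNotification(payload: string): RealtimeEvent | null {
  try {
    const parsed: unknown = JSON.parse(payload);
    if (typeof parsed !== "object" || parsed === null) return null;
    const candidate = parsed as Partial<RealtimeEvent>;
    return typeof candidate.type === "string" ? (candidate as RealtimeEvent) : null;
  } catch {
    return null; // malformed JSON should not crash the bridge
  }
}

// Map an event to the rooms that should receive it.
function resolveRooms(event: RealtimeEvent): string[] {
  const rooms: string[] = [];
  if (event.userId) rooms.push(`user:${event.userId}`);
  if (event.role) rooms.push(`role:${event.role}`);
  if (event.clusterId) rooms.push(`cluster:${event.clusterId}`);
  if (event.conversationId) rooms.push(`conversation:${event.conversationId}`);
  if (event.broadcast) rooms.push("global:all"); // room name is an assumption
  return rooms;
}

const sample = parseNotification('{"type":"lead.updated","clusterId":"jkt","broadcast":true}');
console.log(sample ? resolveRooms(sample) : []); // [ 'cluster:jkt', 'global:all' ]
```

With a resolver like this, the bridge would emit each event once per returned room, and an event with no scoping fields reaches nobody by default.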

Notification dedup (recent)

A recently added composite index on notifications, (user_id, event_key, created_at), backs the 30-second dedup window that suppresses duplicate rows during rapid-fire fan-out. See commit 8ce2c28 on main.
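
The dedup semantics can be illustrated with an in-memory stand-in. The real check runs against the notifications table in Postgres; the class below is hypothetical and only models the rule that a second notification for the same (user, event_key) inside 30 seconds is dropped. Measuring the window from the last persisted row is an assumption.

```typescript
const DEDUP_WINDOW_MS = 30 * 1000;

// In-memory model of the dedup rule; not how the service stores state.
class NotificationDeduper {
  private lastPersisted = new Map<string, number>();

  // Returns true when the notification should be written, false when it is
  // a duplicate inside the window.
  shouldPersist(userId: string, eventKey: string, now: number): boolean {
    const key = `${userId}|${eventKey}`;
    const previous = this.lastPersisted.get(key);
    if (previous !== undefined && now - previous < DEDUP_WINDOW_MS) {
      return false;
    }
    this.lastPersisted.set(key, now);
    return true;
  }
}

const deduper = new NotificationDeduper();
console.log(deduper.shouldPersist("u1", "escalation:42", 0));      // true
console.log(deduper.shouldPersist("u1", "escalation:42", 10_000)); // false
console.log(deduper.shouldPersist("u1", "escalation:42", 30_000)); // true
```

The third call succeeds because exactly 30 seconds have passed since the last persisted row, so the window has closed.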

Gotchas (permanent)

  1. NextAuth uses JWE, not signed JWT. Use next-auth/jwt encode/decode, never jsonwebtoken.sign/verify. Silent failure otherwise.
  2. express.json() runs before route middleware. If you need raw bytes for a webhook HMAC, use the verify hook on the global express.json() to capture req.rawBody. Pattern from Stripe/Twilio webhook recipes.
  3. docker-compose.yml needs explicit env passthrough. Writing to .env isn't enough — the container reads from the environment: block unless env_file: is set. Add ${VAR:-default} passthroughs for every new env var.
  4. Don't duplicate JWT decode. jwtVerifier.ts is the single source of truth. Both HTTP middleware and Socket.io auth.ts delegate to it.
  5. 360dialog tier doesn't expose App Secret. Webhook HMAC signing was implemented and then removed in commit 419fe01. Don't try to add it back until a tier upgrade.
  6. Notification dedup window is 30s. If 2 escalation rules fire within 30 seconds on the same (user, event_key), only one row persists. Before the composite index, this was a fanout storm.
  7. api.auth.* audit rows require API_AUTH_MODE != disabled. In disabled mode the middleware is a no-op and writes nothing.

See also