Architecture Overview

Purpose

High-level view of the HUPH system: components, how they talk to each other, and the data flow for a single user message. Use this as a mental model before diving into per-component pages (API, RAG, Admin, Integrations).

Prerequisites

Setup — you can run the services
Repository tour — monorepo layout

System diagram

                            HUPH System Overview (April 2026)
    ┌───────────────────────────────────────────────────────────┐
    │                                                             │
    │   [WhatsApp User]                    [Operator Browser]     │
    │         │                                    │               │
    │         │ WA message                         │ https          │
    │         ↓                                    ↓               │
    │   ┌─────────────────┐              ┌──────────────────┐     │
    │   │ 360dialog       │              │ nginx            │     │
    │   │ webhook         │              │ admin.huph.val.id│     │
    │   └────────┬────────┘              └────────┬─────────┘     │
    │            │ POST /webhook/whatsapp          │ proxy          │
    │            ↓                                 ↓               │
    │   ┌─────────────────────────────────────────────────┐       │
    │   │      apps/api — Node.js + Express (3101)        │       │
    │   │  • webhook + admin REST (HMAC + JWE auth P0)     │       │
    │   │  • Intent Router (4-tier classifier)             │       │
    │   │  • Lead Capture (regex + Claude Haiku hybrid)    │       │
    │   │  • Escalation + Notification fan-out             │       │
    │   │  • Realtime Socket.io (/admin namespace)         │       │
    │   │  • Dispatch to Dify for chat + KB                │       │
    │   └──┬─────────┬────────────┬────────────┬──────────┘       │
    │      │         │            │            │                    │
    │      ↓         ↓            ↓            ↓                    │
    │  ┌────────┐ ┌────────┐ ┌──────────┐ ┌─────────────────┐      │
    │  │Postgres│ │Valkey  │ │ Dify AI  │ │ apps/crawler-   │      │
    │  │ 5433h  │ │49379h  │ │ stack    │ │ worker (bg job) │      │
    │  │ 5432c  │ │ 6379c  │ │ 5001     │ │ KB ingestion    │      │
    │  └────────┘ └────────┘ └────┬─────┘ └─────────────────┘      │
    │                              │                                │
    │                              ↓                                │
    │                    ┌──────────────────────┐                  │
    │                    │ Dify internal:        │                  │
    │                    │ • annotation reply    │                  │
    │                    │ • workflow orchestr.  │                  │
    │                    │ • LLM call (Claude)   │                  │
    │                    │ • embeddings (OpenAI) │                  │
    │                    │ • retrieval (Milvus)  │                  │
    │                    └──────────┬───────────┘                  │
    │                                │                              │
    │                                ↓                              │
    │                    ┌──────────────────────┐                  │
    │                    │ Milvus vector DB     │                  │
    │                    │ (from                │                  │
    │                    │  docker-compose      │                  │
    │                    │  .milvus.yml)        │                  │
    │                    └──────────────────────┘                  │
    │                                                                │
    │   ┌─────────────────┐        ┌──────────────────┐             │
    │   │ Phoenix (6006)  │        │ Langfuse         │             │
    │   │ OpenTel traces  │        │ LLM observability│             │
    │   └─────────────────┘        └──────────────────┘             │
    │                                                                │
    └───────────────────────────────────────────────────────────────┘

Note: this diagram contains no self-hosted RAG service. A Python FastAPI service at apps/rag/ (BGE-M3 + reranker) existed in earlier versions (Mar 2026) but was replaced by Dify in early April 2026. Older diagrams that show apps/rag on port 3102 are historical.

Component responsibilities

Component	Port (host/container)	Tech	Key deps	Responsibility
`apps/api`	3101	Node.js + Express + TS	Drizzle, ioredis, Zod, tsx, next-auth/jwt	Webhook receiver, intent routing, lead capture, escalation, REST API, realtime substrate, Phase 0 HTTP auth, Dify dispatch
`apps/admin`	47293 (prod) / 3103 (dev)	Next.js 14 + React + Tailwind	shadcn/ui, Recharts, Socket.io client, NextAuth	Enterprise dashboard
`apps/crawler-worker`	n/a (background)	Node.js worker	Dify KB API	Scheduled crawl → Dify KB ingestion jobs
`apps/landing`	static	HTML/CSS	—	Public landing page
Postgres	5433 / 5432	Postgres 16	—	Primary DB (conversations, leads, notifications, admin_users, clusters, audit_log)
Valkey	49379 / 6379	Valkey (Redis-compatible)	—	Session cache, short-lived state
Dify AI stack	5001	Self-hosted Dify (dify-api, dify-worker, dify-sandbox, dify-web, ssrf-proxy, plugin-daemon, beat)	Milvus, OpenAI (embeddings), Anthropic (chat)	Chat completion, annotation reply (~300ms FAQ match), KB management, persona-driven workflows
Milvus	from docker-compose.milvus.yml	Milvus	—	Vector DB used by Dify KB
mem0 (optional)	—	mem0ai	Neo4j backend	Per-user long-term memory (if enabled via docker-compose.mem0.yml)
Phoenix	6006	Arize Phoenix	—	OpenTelemetry trace collection from apps/api
Langfuse	behind nginx	Langfuse self-hosted + ClickHouse	—	LLM observability (prompts, latencies, cost) — historical ClickHouse OOM fix documented in ops runbook

apps/api is the Node.js web server. The repo no longer contains a separate Python service for AI generation; generation now runs inside the Dify stack, and apps/api only dispatches requests to it.
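To make the dispatch boundary concrete, here is a minimal sketch of how a request to Dify's chat-messages endpoint could be assembled. The `/v1/chat-messages` path, the Bearer header, and the `query` / `inputs` / `user` / `response_mode` fields follow Dify's public API. The helper name `buildDifyChatRequest`, the API key placeholder, and the choice to pass persona variables through `inputs` are assumptions for illustration, not the actual apps/api code.

```typescript
// Persona variables named in the data-flow section; value types are assumed.
interface PersonaInputs {
  tone: string;
  emoji_usage: string;
  answer_length: string;
  guidance_rules: string;
}

interface DifyChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds (but does not send) a chat-messages request.
function buildDifyChatRequest(
  baseUrl: string, // e.g. "http://huph-dify-api:5001" from inside Docker
  apiKey: string, // Dify app API key (placeholder in this sketch)
  query: string,
  userId: string,
  persona: PersonaInputs,
  conversationId?: string, // omit to start a new Dify conversation
): DifyChatRequest {
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat-messages`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      inputs: persona,
      query,
      user: userId,
      response_mode: "blocking",
      ...(conversationId ? { conversation_id: conversationId } : {}),
    }),
  };
}
```

Keeping request construction separate from the network call makes the payload easy to unit-test without a running Dify stack.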

Data flow: one user message

Trace through what happens when a prospective student sends "berapa biaya FK?" on WhatsApp.

1. User sends "berapa biaya FK?" via WhatsApp
2. 360dialog delivers a webhook POST to huph.val.id/webhook/whatsapp
3. apps/api:
   a. HTTP auth middleware (Phase 0: disabled by default; Phase 1+:
      logs auth warnings to audit_log; Phase 2: rejects with 401)
   b. Webhook handler persists conversation + message (Postgres)
   c. Intent Router classifies → intent='ask_fee', program='FK'
      (regex → heuristic → Claude Haiku → default fallback)
   d. Lead Capture runs:
      - Regex extractor (Layer 1) scans the message
      - State machine checks current capture state (6h TTL)
      - Claude Haiku LLM Layer 2 called if gated
      - leadStore.upsert merges contact data atomically
      - clusterResolver sets cluster_id (program_match → CBT/CHS/etc)
   e. Escalation rules evaluated (frustrated? long convo? hot lead?)
4. apps/api dispatches to Dify chat-messages API at huph-dify-api:5001
5. Dify chat-messages workflow:
   a. Annotation reply check — if FAQ match → return in ~300 ms
   b. Otherwise → Dify workflow executes:
      - OpenAI embeddings for the query
      - Milvus similarity search (Dify-managed collection)
      - Persona prompt built from Chatbot Settings variables
        (tone, emoji_usage, answer_length, guidance_rules)
      - Claude call (via Dify's Anthropic integration)
      - Response assembled with citations
6. Response back to apps/api
7. apps/api:
   a. Persists bot message in DB
   b. Postgres trigger calls pg_notify → pgBridge forwards to Socket.io
      rooms scoped by cluster, conversation, user, or role
   c. Notification fan-out if escalation rule triggered
8. apps/api sends response back to user via 360dialog send API
9. Admin clients subscribed to the conversation's cluster room receive
   the bot reply in realtime (no browser refresh)
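The tiered classification in step 3c (regex → heuristic → Claude Haiku → default fallback) can be sketched as a simple cascade in which each tier either returns a result or passes the message on. Everything below is illustrative: the regex pattern, the word-count heuristic, and the intent names other than `ask_fee` are hypothetical, and the LLM tier is injected as a plain function so the sketch runs without network access. A real Haiku tier would be asynchronous; it is kept synchronous here for brevity.

```typescript
interface IntentResult {
  intent: string;
  program?: string;
  tier: "regex" | "heuristic" | "llm" | "default";
}

// A tier returns a partial result, or null to defer to the next tier.
type Tier = (text: string) => Omit<IntentResult, "tier"> | null;

// Tier 1: cheap pattern match. The pattern is illustrative only.
const regexTier: Tier = (text) => {
  const m = /\b(biaya|fee)\b.*\b(FK)\b/i.exec(text);
  if (!m) return null;
  return { intent: "ask_fee", program: String(m[2]).toUpperCase() };
};

// Tier 2: hypothetical heuristic, treating very short non-questions as greetings.
const heuristicTier: Tier = (text) => {
  const words = text.trim().split(/\s+/).filter(Boolean);
  const isShort = words.length >= 1 && words.length <= 2;
  return isShort && !text.includes("?") ? { intent: "greeting" } : null;
};

// Runs the tiers in order; the first non-null answer wins.
// llmTier stands in for the Claude Haiku call (assumption: injected by the caller).
function classifyIntent(text: string, llmTier: Tier): IntentResult {
  const tiers: Array<[IntentResult["tier"], Tier]> = [
    ["regex", regexTier],
    ["heuristic", heuristicTier],
    ["llm", llmTier],
  ];
  for (const [name, tier] of tiers) {
    const hit = tier(text);
    if (hit) return { ...hit, tier: name };
  }
  // Tier 4: default fallback when no tier produced an answer.
  return { intent: "unknown", tier: "default" };
}
```

Ordering the tiers from cheapest to most expensive means the LLM is only consulted for messages the regex and heuristic tiers cannot resolve.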

Canonical specs to read

For deeper understanding of specific subsystems, read these specs in docs/superpowers/specs/:

Topic	Spec
Intent Router Phase 1	`2026-04-07-intent-routing-design.md`
Lead Capture Phase 2A	`2026-04-07-intent-routing-phase2a-design.md`
Realtime Socket.io	`2026-04-07-realtime-socketio-design.md`
Team Ownership	`2026-04-08-team-ownership-design.md`
Escalation Routing Phase 1	`2026-04-08-escalation-routing-phase1-design.md`
RBAC Phase 1.5	`2026-04-08-rbac-phase15-design.md`
API HTTP Auth	`2026-04-09-api-http-auth-design.md`
Counselor Dashboard	`2026-04-08-counselor-dashboard-design.md`

These aren't rendered by MkDocs (exclude_docs) but are in git at docs/superpowers/specs/.

Gotchas

Docker internal URLs differ from host URLs. Inside containers, use postgres:5432 and huph-valkey:6379. From the host, use localhost:5433 and localhost:49379. Mixing the two up produces confusing "connection refused" errors.
Dify is the AI pipeline now. Do NOT look for apps/rag/ — it was removed. Dify is hosted as a separate docker-compose stack (docker-compose.dify.yml) with its own containers (dify-api, dify-worker, dify-web, dify-sandbox, etc.).
Vector DB is Milvus, not Qdrant. Qdrant was decommissioned in Mar/Apr 2026. Dify's KB dataset lives in Milvus and is accessed via Dify's /v1/datasets/* API; apps/api never talks to Milvus directly.
Cache is Valkey, not Redis. Valkey is Redis-API-compatible but is a distinct project. Redis client libraries such as ioredis work with it unchanged.
WhatsApp is the only active channel. Telegram and Web were deleted in the Apr 8 cleanup. The apps/webchat* directories are gone. Any reference to channel='telegram' or channel='web' in old code or docs is dead.
Specs drift from code over time. They are historical design documents, so always verify them against the current source (especially file paths and function names) before acting on them.
CLAUDE.md is partially stale. Its architecture section still lists apps/rag, packages/, Qdrant, and BGE-M3. Treat its conventions section as authoritative (commit style, no-rate-limiting, enterprise shell pattern), but verify its architecture claims against docker-compose.yml and the apps/ directory listing.
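The host-versus-container mismatch from the first gotcha is easier to avoid when both address sets live in one lookup table. The sketch below uses the host names and ports listed on this page; the `ENDPOINTS` table and `endpointFor` helper are hypothetical names, and how the caller decides whether it is running on the host or in a container is left open.

```typescript
type Runtime = "host" | "container";

interface Endpoint {
  host: string;
  port: number;
}

// Addresses as documented on this page: host-mapped ports vs Docker-internal ones.
const ENDPOINTS: Record<"postgres" | "valkey", Record<Runtime, Endpoint>> = {
  postgres: {
    host: { host: "localhost", port: 5433 },
    container: { host: "postgres", port: 5432 },
  },
  valkey: {
    host: { host: "localhost", port: 49379 },
    container: { host: "huph-valkey", port: 6379 },
  },
};

// Hypothetical helper: returns "host:port" for a service in the given runtime.
function endpointFor(service: keyof typeof ENDPOINTS, runtime: Runtime): string {
  const { host, port } = ENDPOINTS[service][runtime];
  return `${host}:${port}`;
}
```

For example, a script run from a developer shell would call `endpointFor("postgres", "host")`, while code running inside the compose network would pass `"container"`.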