Repository Tour

Purpose

A guided tour of the HUPH monorepo so you know where to look for each concern. Complements the setup page — this one focuses on "what lives where" rather than "how do I run things".

Prerequisites

  • You have cloned the repo and read setup
  • Basic familiarity with npm workspaces

Monorepo layout

huph/
├── apps/
│   ├── api/                    # Node.js + Express API (port 3101)
│   │   ├── src/routes/         # webhook + REST routes (24 files incl. campaigns,
│   │   │                       # settings, monitoring, adminUsageStats, eval, crawl,
│   │   │                       # zitadelActions)
│   │   ├── src/services/       # intentRouter, leadCapture, leadScoring, leadActivity,
│   │   │                       # faqKbSync, ragClient, funnelEngine, counselorAssignment,
│   │   │                       # conversationGuard, mediaDownloader, notifications, realtime
│   │   ├── src/clients/        # langfuse.ts (judge decisions fetcher)
│   │   ├── src/jobs/           # nightlySync.ts + alertsSweep.ts (cron scheduler)
│   │   ├── src/auth/           # HMAC + NextAuth JWE + Zitadel JWKS verifiers
│   │   ├── src/middleware/     # dispatchAuth, requireInternalSecret, requireForwardedSession,
│   │   │                       # requireZitadelJwt, auditSensitiveAccess
│   │   ├── src/audit/          # logAuthEvent + logFeatureAccess (audit_log writers)
│   │   ├── src/db/             # Drizzle ORM schema
│   │   └── src/__tests__/      # Jest tests (~720 across 62 suites)
│   ├── admin/                  # Next.js 14 admin dashboard (dev 3103, prod 47293)
│   │   ├── src/app/            # app router pages
│   │   ├── src/components/     # shared + shadcn/ui components
│   │   ├── src/hooks/          # useSocketEvent, useAnalytics, etc.
│   │   └── src/lib/            # utilities (socket, time-format, serverFetch)
│   ├── crawler-worker/         # Background worker for Dify KB ingestion jobs
│   └── landing/                # Public landing page (static)
├── docker/
│   ├── nginx/                  # nginx vhost configs (in-repo sources)
│   └── clickhouse/             # ClickHouse tuning for Langfuse
├── scripts/                    # SQL migrations, bootstrap helpers, smoke tests
│   ├── migrate-*.sql           # manual migration files (applied via psql)
│   └── bootstrap-*.md / .sh    # one-off setup runbooks
├── docs/                       # This documentation site (MkDocs)
│   ├── guide/                  # Non-developer docs (ID source)
│   ├── dev/                    # Developer docs (EN source) — you are here
│   ├── reference/              # Glossary + changelog
│   ├── archive/                # Historical legacy docs + discovery
│   └── superpowers/            # Specs + plans archive (not rendered)
├── docker-compose.yml          # Primary infra (postgres + huph-api + crawler-worker)
├── docker-compose.dify.yml     # Dify AI stack (chat + KB + annotation)
├── docker-compose.langfuse.yml # Langfuse observability + ClickHouse
├── docker-compose.milvus.yml   # Milvus (vector DB used by Dify KB)
├── docker-compose.mem0.yml     # mem0 memory service
├── mkdocs.yml                  # Docs site config (at repo root)
├── docs-env/                   # Python venv for docs (gitignored)
├── site/                       # mkdocs build output (gitignored)
├── CLAUDE.md                   # Project instructions for AI assistants
├── CREDENTIALS.md               # Live credentials (gitignored — DO NOT commit)
└── uph-codebase/               # Legacy Laravel reference (DO NOT MODIFY)

Notable absences (compared to older CLAUDE.md / memory records): apps/rag/, apps/webchat/, apps/webchat-enterprise/, and packages/ do not currently exist. apps/rag was replaced by Dify in April 2026, and packages/ is planned but not yet materialized. Do not be misled by older documentation — always verify with ls /opt/huph/apps/ and cat docker-compose.yml as sources of truth.
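The `ls` check above can also be scripted. The following is a minimal sketch, not part of the repo: `checkLayout`, `LayoutReport`, and the `HUPH_ROOT` environment variable are hypothetical names introduced for illustration. It reports which of the directories mentioned on this page exist under a given root.

```typescript
import { existsSync, statSync } from "node:fs";
import { join } from "node:path";

interface LayoutReport {
  present: string[];
  missing: string[];
}

// Default check against the real filesystem.
const isDirectoryOnDisk = (fullPath: string): boolean =>
  existsSync(fullPath) && statSync(fullPath).isDirectory();

// Hypothetical helper: splits the given relative paths into present and missing.
function checkLayout(
  repoRoot: string,
  relativeDirs: string[],
  isDir: (fullPath: string) => boolean = isDirectoryOnDisk,
): LayoutReport {
  const report: LayoutReport = { present: [], missing: [] };
  for (const rel of relativeDirs) {
    (isDir(join(repoRoot, rel)) ? report.present : report.missing).push(rel);
  }
  return report;
}

// HUPH_ROOT is an assumed variable; /opt/huph is the path used on this page.
const layout = checkLayout(process.env.HUPH_ROOT ?? "/opt/huph", [
  "apps/api",
  "apps/admin",
  "apps/crawler-worker",
  "apps/landing",
  "apps/rag",
  "apps/webchat",
  "apps/webchat-enterprise",
  "packages",
]);
console.log(layout);
```

If the layout described here is still accurate, the last four entries should appear under `missing`. The `isDir` parameter exists only so the helper can be exercised without touching the disk.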

Per-app purpose (1-sentence each)

| App | Purpose |
| --- | --- |
| apps/api | Node.js + Express server: webhooks (360dialog), admin REST API, 4-tier intent routing, lead capture + scoring, counselor auto-assignment, activity timeline, funnel engine, FAQ KB sync, escalation routing, realtime Socket.io, Phase 0 HTTP auth, Dify chat dispatch |
| apps/admin | Next.js 14 + React + Tailwind enterprise dashboard: conversation inbox, KB management, leads pipeline v2 with activity timeline, analytics (13 charts), FAQ with KB sync, follow-up, escalation routing, counselor dashboard, RBAC |
| apps/crawler-worker | Background worker that picks up KB source changes and dispatches ingestion jobs to the Dify KB API; runs as its own Docker container |
| apps/landing | Public static landing page for the product (separate from the admin app) |

Data flow (single message)

User WA message
360dialog webhook POST /webhook/whatsapp  (apps/api)
HTTP auth middleware (Phase 0, disabled by default)
Intent Router (4-tier classifier: regex → heuristic → Claude Haiku → default)
Persist conversation + messages (Postgres)
Lead capture pipeline (regex Layer 1 + Claude Haiku LLM Layer 2)
leadStore.upsert (atomic, cluster resolver for ownership)
Escalation rules engine (frustrated / long conversation / hot lead)
Lead scoring at milestones (3, 5, 7, every 5) → GPT-4o-mini
Activity timeline logging (score, cluster, funnel, assignment events)
Counselor auto-assignment (round-robin within cluster)
Dispatch to Dify chat-messages API (apps/api → huph-dify-api:5001)
Dify runs:
  • Annotation reply match (~300 ms FAQ hit) — bypasses LLM
  • OR full retrieval: OpenAI embeddings → Milvus similarity search
    → rerank → persona prompt assembly → Claude (via Dify workflow)
Response back to apps/api
Persist bot message + pg_notify trigger → pgBridge → Socket.io rooms
Admin clients in the lead's cluster room see the message in realtime
Reply sent back to user via 360dialog Business API
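The 4-tier intent router in the flow above (regex → heuristic → Claude Haiku → default) follows a common fall-through pattern: try the cheapest classifier first and stop at the first one that returns an answer. The sketch below illustrates that pattern only. It is not the code in `src/services/intentRouter`; the intent names, the regex, the heuristic, and the `routeIntent` / `buildTiers` helpers are all assumptions, and the LLM tier is an injected placeholder rather than a real model call.

```typescript
type TierName = "regex" | "heuristic" | "llm" | "default";
type Classifier = (text: string) => Promise<string | null>;

interface RoutedIntent {
  intent: string;
  tier: TierName;
}

// Tier 1 (assumed pattern): keyword regex for an illustrative "pricing" intent.
const regexTier: Classifier = async (text) =>
  /\b(biaya|harga|price|fee)\b/i.test(text) ? "pricing" : null;

// Tier 2 (assumed rule): anything ending in a question mark is a "question".
const heuristicTier: Classifier = async (text) =>
  text.trim().endsWith("?") ? "question" : null;

// Tier 3 is injected so a real LLM client can be swapped in later.
function buildTiers(llmTier: Classifier): Array<[TierName, Classifier]> {
  return [
    ["regex", regexTier],
    ["heuristic", heuristicTier],
    ["llm", llmTier],
  ];
}

// Walks the tiers in order and returns the first non-null intent.
// If every tier abstains, the fallback intent is reported as tier "default".
async function routeIntent(
  text: string,
  tiers: Array<[TierName, Classifier]>,
  fallback = "general",
): Promise<RoutedIntent> {
  for (const [tier, classify] of tiers) {
    const intent = await classify(text);
    if (intent !== null) return { intent, tier };
  }
  return { intent: fallback, tier: "default" };
}

// Placeholder LLM tier that always abstains.
const tiers = buildTiers(async () => null);
routeIntent("Berapa biaya kuliah?", tiers).then((routed) => console.log(routed));
```

Ordering the tiers from cheapest to most expensive means the LLM is only consulted for messages the deterministic tiers cannot classify.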

Where tests live

  • API (Jest): apps/api/src/**/__tests__/*.test.ts. Run via npm run test -w apps/api (from repo root) or npx jest (from apps/api/).
  • Admin: NO test infrastructure. Verify via tsc + browser smoke + curl. See running-tests.en.md.
  • crawler-worker: tests live alongside the source in apps/crawler-worker/ if present; verify via docker-compose logs crawler-worker during local runs.
  • Integration / smoke: scripts/smoke-e2e-synthetic.sh is the end-to-end auth + lead capture smoke, wired to npm run smoke:e2e.

Where configs live

  • .env — root env vars for all services (local)
  • .env.example — template committed to repo
  • docker-compose.yml + docker-compose.<service>.yml — container definitions
  • mkdocs.yml — this docs site
  • apps/api/tsconfig.json, jest.config.js, package.json
  • apps/admin/next.config.mjs, tailwind.config.ts

Where plans and specs live

  • docs/superpowers/specs/ — design specs for each phase (~35 files)
  • docs/superpowers/plans/ — execution plans matching each spec (~41 files)
  • docs/archive/ — legacy root docs from the Mar 18 phase-1 snapshot + discovery material

These directories are in exclude_docs — not rendered by MkDocs — but remain in git for history and AI-assisted work.

Gotchas

  • apps/rag/ is gone — replaced by Dify in April 2026. Any reference to a local apps/rag, npm run dev:rag, or self-hosted BGE-M3 is historical. Current AI pipeline lives in Dify.
  • apps/webchat/ and apps/webchat-enterprise/ are gone — deleted in the Apr 8 channel cleanup. WhatsApp is the only active channel. Don't expect Telegram or Web references to work.
  • packages/ does not exist yet. The root package.json has db:migrate and db:generate scripts that reference packages/database, but that workspace is not materialized. Running those commands today will fail. Use the raw SQL migration flow described in setup.
  • uph-codebase/ is a legacy Laravel reference — read-only. Do not modify it, do not run it, do not import from it.
  • CLAUDE.md is authoritative. It is updated regularly and covers all current routes, services, database tables, test coverage, and development conventions. When you spot drift between this page and CLAUDE.md, update both in the same PR.
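The packages/ gotcha above can be caught before running anything. The following is a rough sketch under stated assumptions: `findBrokenScripts` is a hypothetical helper, and the script commands in the example are invented for illustration, not copied from the real root package.json. It flags npm scripts whose command mentions a directory that does not exist.

```typescript
import { existsSync } from "node:fs";

// Hypothetical helper: returns the names of scripts whose command text
// mentions one of `dirs` while that directory is absent.
function findBrokenScripts(
  scripts: Record<string, string>,
  dirs: string[],
  dirExists: (dir: string) => boolean = existsSync,
): string[] {
  return Object.entries(scripts)
    .filter(([, command]) =>
      dirs.some((dir) => command.includes(dir) && !dirExists(dir)),
    )
    .map(([scriptName]) => scriptName)
    .sort();
}

// Illustrative commands only; read the real ones from the root package.json.
const sampleScripts: Record<string, string> = {
  "db:migrate": "npm run migrate -w packages/database",
  "db:generate": "npm run generate -w packages/database",
  test: "npm run test -w apps/api",
};

// With a predicate saying only apps/api exists, both db:* scripts are flagged.
console.log(
  findBrokenScripts(
    sampleScripts,
    ["packages/database", "apps/api"],
    (dir) => dir === "apps/api",
  ),
);
```

Run from the repo root with the default `existsSync` predicate, the same check would resolve the directories relative to the current working directory.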

See also