Annotation Reply Re-indexing

Purpose

This runbook covers how to verify, diagnose, and restore Dify annotation reply when FAQ fast-path responses stop working. Annotation reply returns cached answers in about 400ms for the 101 FAQ questions, compared with 5-7 seconds through the full chatflow.

Architecture

Text Only
User query → Dify API
  → Layer 1: Annotation Reply (Milvus vector match, score >= 0.7)
     → HIT: return cached answer (~400ms, $0)
     → MISS: continue to chatflow
  → Layer 2: Chatflow (classifier → KB → LLM, 5-7s, ~$0.06)
  • Annotations stored in: message_annotations table (Dify DB)
  • Annotation vectors in: Milvus collection Vector_index_cde7e3c2_7bf2_407a_8d35_93d61425de60_Node
  • Settings in: app_annotation_settings table (Dify DB)
  • Binding in: dataset_collection_bindings table (Dify DB)
  • Hit history: app_annotation_hit_histories table (Dify DB)
  • Embedding model: langgenius/openai/openai / text-embedding-3-small
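
The two-layer routing above can be sketched as a small function. This is only an illustration of the decision rule (score >= threshold returns the cached answer, anything else falls through to the chatflow). The names `AnnotationMatch`, `answer_query`, `search_annotations`, and `run_chatflow` are hypothetical stand-ins, not Dify APIs.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

SCORE_THRESHOLD = 0.7  # the score_threshold value used in this runbook


@dataclass
class AnnotationMatch:
    answer: str   # cached answer stored with the annotation
    score: float  # similarity score returned by the vector search


def answer_query(
    query: str,
    search_annotations: Callable[[str], Optional[AnnotationMatch]],
    run_chatflow: Callable[[str], str],
    threshold: float = SCORE_THRESHOLD,
) -> Tuple[str, str]:
    """Return (layer, answer) for a query.

    search_annotations stands in for the Milvus lookup and run_chatflow
    for the classifier -> KB -> LLM path; both are hypothetical callables.
    """
    match = search_annotations(query)
    # Layer 1: a match at or above the threshold returns the cached answer.
    if match is not None and match.score >= threshold:
        return "annotation_reply", match.answer
    # Layer 2: no match, or a match below the threshold, runs the chatflow.
    return "chatflow", run_chatflow(query)
```

With a threshold of 1.0, a match scoring 0.92 no longer qualifies and the same query falls through to the chatflow.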

Verify annotation reply is working

Quick test

Bash
# Time one blocking request end to end. %3N (milliseconds) requires GNU date.
# -f makes curl exit non-zero on an HTTP error, so a failed request prints no timing.
START=$(date +%s%3N) && \
docker exec huph-dify-api curl -sf -X POST \
  "http://localhost:5001/v1/chat-messages" \
  -H "Authorization: Bearer $DIFY_APP_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query":"Ada asrama di UPH?","user":"test","conversation_id":"","inputs":{},"response_mode":"blocking"}' \
  > /dev/null 2>&1 && \
END=$(date +%s%3N) && \
echo "Response time: $((END-START))ms"
  • ~400ms = annotation reply working (Layer 1 hit)
  • ~5000ms = annotation reply not working (full chatflow)
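
A single timing can be noisy, so one option is to run the quick test a few times and classify the median. The sketch below assumes a 2000ms cutoff between the two regimes. That cutoff is a guess placed between ~400ms and ~5000ms, not a value from this runbook, and `classify_samples` is a hypothetical helper.

```python
from statistics import median
from typing import Sequence

CUTOFF_MS = 2000.0  # assumed midpoint between ~400ms and ~5000ms


def classify_samples(samples_ms: Sequence[float], cutoff_ms: float = CUTOFF_MS) -> str:
    """Map measured response times to the layer that most likely answered."""
    if not samples_ms:
        raise ValueError("need at least one timing sample")
    # The median ignores a single slow or fast outlier.
    typical = median(samples_ms)
    return "annotation_reply" if typical < cutoff_ms else "chatflow"
```

For example, `classify_samples([410, 395, 5200])` reports `annotation_reply` because the median is 410ms, even though one request was slow.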

Check settings

Bash
docker exec -i huph-postgres psql -U huph -d dify -c "
SELECT score_threshold FROM app_annotation_settings
WHERE app_id = 'cde7e3c2-7bf2-407a-8d35-93d61425de60';"

Expected: 0.7. A value of 1.0 means annotation reply is effectively disabled (it is the value the rollback step sets).

Check hit history

Bash
docker exec -i huph-postgres psql -U huph -d dify -c "
SELECT COUNT(*) as hits, MAX(created_at) as last_hit
FROM app_annotation_hit_histories
WHERE app_id = 'cde7e3c2-7bf2-407a-8d35-93d61425de60';"

Check Milvus collection

Bash
docker exec huph-dify-api python3 -c "
from pymilvus import connections, Collection
connections.connect(host='huph-milvus', port='19530')
col = Collection('Vector_index_cde7e3c2_7bf2_407a_8d35_93d61425de60_Node')
col.load()
print(f'Entities: {col.num_entities}')
"

Expected: 101 (matches FAQ count).
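
The entity count is most useful when compared against the number of rows in message_annotations for the same app. A minimal sketch of that comparison follows. `diagnose_counts` and its labels are hypothetical, and the suggested actions are a reading of this runbook rather than Dify behaviour.

```python
def diagnose_counts(db_annotations: int, milvus_entities: int) -> str:
    """Compare the annotation row count with the Milvus entity count."""
    if db_annotations < 0 or milvus_entities < 0:
        raise ValueError("counts must be non-negative")
    if milvus_entities == db_annotations:
        return "in_sync"            # nothing to do
    if milvus_entities == 0:
        return "vectors_missing"    # collection is empty: re-index
    if milvus_entities < db_annotations:
        return "partially_indexed"  # some annotations lack vectors: re-index
    # More vectors than rows; this may also happen because Milvus entity
    # counts can include deleted entities, so treat it as a prompt to look closer.
    return "stale_vectors"
```

With the numbers in this runbook, `diagnose_counts(101, 101)` is the healthy case and `diagnose_counts(101, 0)` points to the re-index step.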

Re-index annotations

If annotation vectors are lost (Milvus restart, volume corruption, migration), re-index via the Dify Celery task:

Bash
docker exec huph-dify-api python3 -c "
import os, uuid
os.environ.setdefault('EDITION', 'SELF_HOSTED')
from tasks.annotation.enable_annotation_reply_task import enable_annotation_reply_task

enable_annotation_reply_task.delay(
    str(uuid.uuid4()),                                # job_id
    'cde7e3c2-7bf2-407a-8d35-93d61425de60',           # app_id
    '0c441fd2-5c50-4a1a-964b-b823ed786ec9',           # user_id
    '1356b650-a75c-4e18-9f7b-386a5b28fcb0',           # tenant_id
    0.7,                                               # score_threshold
    'langgenius/openai/openai',                        # embedding_provider
    'text-embedding-3-small',                          # embedding_model
)
print('Task dispatched — check worker logs')
"

Verify in worker logs:

Bash
# docker logs replays the container's stderr on stderr; merge it so grep sees it
docker logs huph-dify-worker --tail 20 2>&1 | grep -i "annotation"

Expected: a log line containing "App annotations added to index" with a latency under 2s.

Fix collection binding mismatch

If dataset_collection_bindings.collection_name does not match the actual Milvus collection, check the current binding and point it at the correct collection:

Bash
# Check current binding
docker exec -i huph-postgres psql -U huph -d dify -c "
SELECT collection_name FROM dataset_collection_bindings
WHERE type = 'annotation';"

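# Update to correct Milvus collection name
# Note: this updates every row with type = 'annotation'; narrow the WHERE
# clause if the instance has more than one annotation binding.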
docker exec -i huph-postgres psql -U huph -d dify -c "
UPDATE dataset_collection_bindings
SET collection_name = 'Vector_index_cde7e3c2_7bf2_407a_8d35_93d61425de60_Node'
WHERE type = 'annotation';"
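
Before running the UPDATE, it can help to confirm what the binding should look like. In this runbook the collection name reads as the app ID with hyphens replaced by underscores, wrapped in `Vector_index_` and `_Node`. The sketch below assumes that pattern holds; it is inferred from the values shown here, not confirmed from Dify's source, and both function names are hypothetical.

```python
import uuid


def expected_collection_name(resource_id: str) -> str:
    """Build the collection name following the pattern seen in this runbook."""
    # Parse the ID first so a typo fails loudly instead of yielding a bad name.
    normalised = str(uuid.UUID(resource_id))
    return f"Vector_index_{normalised.replace('-', '_')}_Node"


def binding_matches(binding_name: str, resource_id: str) -> bool:
    """True if the stored binding already points at the expected collection."""
    return binding_name == expected_collection_name(resource_id)
```

Passing the collection_name returned by the SELECT above to `binding_matches` together with the app ID gives a quick yes/no before any row is changed.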

Rollback (disable annotation reply)

Bash
docker exec -i huph-postgres psql -U huph -d dify -c "
UPDATE app_annotation_settings
SET score_threshold = 1.0
WHERE app_id = 'cde7e3c2-7bf2-407a-8d35-93d61425de60';"

The change takes effect immediately; no Dify restart is needed.

Gotchas

  • advanced-chat mode strips annotation_reply from API response metadata. Hits are invisible in the response but recorded in app_annotation_hit_histories.
  • The Dify DB is dify (not huph). Both run on huph-postgres.
  • Annotation embeddings were originally in Qdrant. After switching to VECTOR_STORE=milvus, the collection was not auto-migrated and had to be re-indexed manually.
  • Threshold changes take effect immediately without Dify restart.
  • Re-indexing calls the OpenAI API (text-embedding-3-small) and costs ~$0.001 for 101 annotations.
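
The re-indexing cost is simple arithmetic: total tokens embedded times the per-token price. The sketch below makes the inputs explicit. Both the price (0.02 USD per million tokens) and the average of 500 tokens per annotation are assumptions for illustration, not values from this runbook, so check current pricing and the actual annotation length before relying on the result.

```python
def embedding_cost_usd(
    num_items: int,
    avg_tokens_per_item: float,
    usd_per_million_tokens: float,
) -> float:
    """Estimate the cost of embedding num_items texts once."""
    if num_items < 0 or avg_tokens_per_item < 0 or usd_per_million_tokens < 0:
        raise ValueError("inputs must be non-negative")
    total_tokens = num_items * avg_tokens_per_item
    return total_tokens * usd_per_million_tokens / 1_000_000


# Assumed inputs: 101 annotations, 500 tokens each, 0.02 USD per 1M tokens.
estimate = embedding_cost_usd(101, 500, 0.02)
```

Under those assumptions the estimate comes to about a tenth of a cent, which is in line with the ~$0.001 figure; shorter annotations would cost proportionally less.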

See also