9 Commits

Author SHA1 Message Date
viettranx 49f51da81c fix(kg): raise extraction temperature from 0.0 to 0.2
Zero temperature was too rigid, causing the LLM to miss implied entities
and relations. 0.2 allows picking up contextual connections while
staying near-deterministic for structured JSON output.
2026-03-31 19:06:38 +07:00
viettranx c49952a1e5 feat(kg): expand entity types and improve extraction prompt
Add 3 new entity types (technology, product, document) to shrink the
concept catch-all bucket. Add 4 new relation types: authored,
references, provides, requires. Improve prompt with disambiguation
guide between similar types, stricter related_to usage, and varied
confidence examples. Update graph view colors and mass for new types.
2026-03-31 19:03:43 +07:00
viettranx 21b6c454ca feat: merge pipeline, per-user credentials, unified picker, group contacts
- Enable merge UI for linking channel contacts to tenant_users
- Contact → tenant_user resolution with cached lookup (60s TTL)
- MCP per-user credentials via user-keyed connection pool
- Secure CLI per-user credentials with AES-256-GCM encryption
- Unified UserPickerCombobox searching contacts + tenant_users
- Group contact collection with chat title in all channels
- Group permission inheritance via wildcard user_id="*"
- Fix heartbeat using wrong userID in group chats
- Filter internal senders from contact collection
- Add contact_type column (user/group) to channel_contacts
- SQLite schema v2 migration for desktop edition
2026-03-29 22:33:17 +07:00
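
The AES-256-GCM credential encryption mentioned in the commit above could look roughly like the following. This is a minimal sketch using only the Go standard library, not the project's actual code: the function names `encrypt` and `decrypt` and the nonce-prepended layout are assumptions made for illustration.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-256-GCM AEAD. The key must be exactly 32 bytes.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("key must be 32 bytes for AES-256")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt (hypothetical name) seals plaintext and prepends the random nonce
// so that decrypt can recover it later.
func encrypt(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Passing nonce as dst appends the ciphertext and tag after the nonce.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// decrypt (hypothetical name) splits off the nonce and authenticates the rest.
func decrypt(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	sealed, err := encrypt(key, []byte("per-user-token"))
	if err != nil {
		panic(err)
	}
	plain, err := decrypt(key, sealed)
	fmt.Println(string(plain), err)
}
```

GCM authenticates as well as encrypts, so a modified ciphertext makes `decrypt` return an error instead of corrupted plaintext.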
viettranx 2e869f4ece feat(kg): add tsvector FTS, entity deduplication, and embedding fixes
- Replace ILIKE search with PostgreSQL tsvector/GIN full-text search
  for KG entities (migration 000031)
- Add entity deduplication system with dual-threshold strategy:
  auto-merge at 0.98+ similarity with Jaro-Winkler name match,
  flag 0.90-0.98 as candidates for manual review
- Add ScanDuplicates for bulk on-demand duplicate detection
- Add MergeEntities with advisory lock and tenant-scoped relation
  re-pointing (delete-then-update, since UPDATE has no ON CONFLICT clause)
- Wire dedup inline after KG extraction pipeline
- Fix BackfillKGEmbeddings: was failing silently due to
  context.Background() missing tenant_id; now runs cross-tenant
- Fix BackfillKGEmbeddings: continue past errors instead of breaking,
  tracking failed IDs and capping consecutive errors
- Add EmbedEntity helper; UpsertEntity now generates embeddings
  in background goroutine
- Add HTTP endpoints: POST /kg/dedup/scan, GET /kg/dedup,
  POST /kg/merge, POST /kg/dedup/dismiss
- Add web UI: Dedup dialog with Scan All button, side-by-side
  entity comparison, merge/dismiss actions
- Add Jaro-Winkler similarity algorithm with 34 unit tests
- Update IngestExtraction to return upserted entity IDs
- Bump RequiredSchemaVersion to 31
2026-03-29 12:15:21 +07:00
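
The dual-threshold dedup strategy in the commit above can be sketched as follows. The 0.98 and 0.90 cut-offs come from the commit message. Everything else is an assumption for illustration: that the similarity score is an embedding similarity, the name-match cut-off `nameThreshold`, the `classify` function, and the choice to downgrade a high-similarity pair with mismatched names to a review candidate. The Jaro-Winkler code is a textbook version (prefix scale 0.1, prefix capped at 4), not the project's implementation.

```go
package main

import "fmt"

// jaro returns the Jaro similarity of a and b in [0, 1].
func jaro(a, b string) float64 {
	r1, r2 := []rune(a), []rune(b)
	if len(r1) == 0 && len(r2) == 0 {
		return 1
	}
	if len(r1) == 0 || len(r2) == 0 {
		return 0
	}
	window := max(max(len(r1), len(r2))/2-1, 0)
	m1, m2 := make([]bool, len(r1)), make([]bool, len(r2))
	matches := 0
	for i := range r1 {
		for j := max(0, i-window); j < min(len(r2), i+window+1); j++ {
			if !m2[j] && r1[i] == r2[j] {
				m1[i], m2[j] = true, true
				matches++
				break
			}
		}
	}
	if matches == 0 {
		return 0
	}
	// Count matched runes that appear in a different order in the two strings.
	t, j := 0, 0
	for i := range r1 {
		if !m1[i] {
			continue
		}
		for !m2[j] {
			j++
		}
		if r1[i] != r2[j] {
			t++
		}
		j++
	}
	m := float64(matches)
	return (m/float64(len(r1)) + m/float64(len(r2)) + (m-float64(t)/2)/m) / 3
}

// jaroWinkler boosts the Jaro score for a shared prefix of up to 4 runes.
func jaroWinkler(a, b string) float64 {
	score := jaro(a, b)
	r1, r2 := []rune(a), []rune(b)
	prefix := 0
	for prefix < min(4, len(r1), len(r2)) && r1[prefix] == r2[prefix] {
		prefix++
	}
	return score + float64(prefix)*0.1*(1-score)
}

type decision string

const (
	autoMerge decision = "auto_merge"
	candidate decision = "candidate"
	distinct  decision = "distinct"

	nameThreshold = 0.85 // assumed value; the commit does not state one
)

// classify (hypothetical) applies the two thresholds from the commit message.
func classify(similarity float64, nameA, nameB string) decision {
	switch {
	case similarity >= 0.98 && jaroWinkler(nameA, nameB) >= nameThreshold:
		return autoMerge
	case similarity >= 0.90:
		return candidate
	default:
		return distinct
	}
}

func main() {
	fmt.Println(classify(0.99, "PostgreSQL", "PostgresQL"))
	fmt.Println(classify(0.93, "Go", "Golang"))
}
```

The comparison here is case-sensitive; a real pipeline would probably normalize names before comparing them.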
viettranx 6dcac3d7e3 fix(kg): sanitize LLM JSON output before parsing (#167)
Add sanitizeJSON() to fix malformed decimal numbers (e.g. '0. 85' → '0.85')
and trailing commas before closing brackets. Fixes extraction failures with
Gemini 2.5 Flash which occasionally produces invalid JSON.
2026-03-14 16:22:48 +07:00
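
A sanitizer along the lines described in the commit above might be written as below. This is a sketch under assumptions, not the actual `sanitizeJSON()` from the repository: the two regular expressions and the variable names are illustrative. They also rewrite matching text inside JSON string values, which a production version may want to avoid.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

var (
	// Matches a decimal split by whitespace after the point, e.g. "0. 85".
	brokenDecimal = regexp.MustCompile(`(\d)\.\s+(\d)`)
	// Matches a comma that directly precedes a closing brace or bracket.
	trailingComma = regexp.MustCompile(`,\s*([}\]])`)
)

// sanitizeJSON repairs the two malformations named in the commit message.
func sanitizeJSON(s string) string {
	s = brokenDecimal.ReplaceAllString(s, "${1}.${2}")
	return trailingComma.ReplaceAllString(s, "${1}")
}

func main() {
	raw := `{"confidence": 0. 85, "tags": ["a", "b",],}`
	var out map[string]any
	err := json.Unmarshal([]byte(sanitizeJSON(raw)), &out)
	fmt.Println(out["confidence"], err)
}
```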
viettranx 558bdd6d5c fix(memory): use per-user workspace for memory path detection and KG extraction
Two related fixes:

1. Memory interceptor now resolves workspace from request context
   (per-user workspace) instead of using the static global workspace.
   Previously, memory writes with absolute paths under per-user
   workspaces (e.g. workspace/channel/userID/memory/) bypassed the
   interceptor and went to disk instead of the database, which also
   prevented KG extraction, memory indexing, and cross-session recall.

2. KG extractor: increase max_tokens 4096→8192, add retry on
   truncation (finish_reason=length), and support chunking for
   long inputs with deduplication on merge.
2026-03-13 13:22:02 +07:00
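
The chunking and deduplication-on-merge step from the second fix above can be sketched like this. The `entity` type, the function names, the fixed-size rune split, and the (type, lowercased name) dedup key are all assumptions for illustration; the commit does not say how chunks are cut or which key is used, and the retry on `finish_reason=length` is not shown.

```go
package main

import (
	"fmt"
	"strings"
)

// entity is a hypothetical, simplified extraction result.
type entity struct {
	Name string
	Type string
}

// chunkText splits s into pieces of at most maxRunes runes,
// never cutting a multi-byte character in half.
func chunkText(s string, maxRunes int) []string {
	if maxRunes <= 0 {
		return nil
	}
	runes := []rune(s)
	var chunks []string
	for len(runes) > 0 {
		n := min(maxRunes, len(runes))
		chunks = append(chunks, string(runes[:n]))
		runes = runes[n:]
	}
	return chunks
}

// mergeEntities concatenates per-chunk results and keeps the first
// occurrence of each (type, trimmed lowercased name) pair.
func mergeEntities(perChunk ...[]entity) []entity {
	seen := make(map[string]bool)
	var merged []entity
	for _, list := range perChunk {
		for _, e := range list {
			key := e.Type + "\x00" + strings.ToLower(strings.TrimSpace(e.Name))
			if seen[key] {
				continue
			}
			seen[key] = true
			merged = append(merged, e)
		}
	}
	return merged
}

func main() {
	fmt.Println(len(chunkText(strings.Repeat("x", 25), 10)))
	merged := mergeEntities(
		[]entity{{Name: "Go", Type: "technology"}},
		[]entity{{Name: "go ", Type: "technology"}, {Name: "Gemini", Type: "product"}},
	)
	fmt.Println(merged)
}
```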
viettranx b4133282a6 refactor(kg): improve extraction prompt with few-shot example and controlled vocabulary
Add domain context, coreference rules, controlled relation types (15 predefined),
few-shot example, and dynamic entity count (3-15). Increase max input from 6000
to 12000 chars, reduce max output tokens from 8192 to 4096.
2026-03-12 18:54:14 +07:00
viettranx bdb60de7ae chore: upgrade Go 1.25 → 1.26 and apply go fix modernizations
- Update go.mod and Dockerfile to Go 1.26
- Apply `go fix ./...` stdlib modernizations across 170+ files
- Add `go fix` to post-implementation checklist in CLAUDE.md
- Fix a rewrite misapplied by `go fix` in loop_history.go
2026-03-10 00:09:15 +07:00
viettranx 63eff188ad feat(kg): add knowledge graph with LLM extraction, traversal, and graph visualization
- KnowledgeGraphStore interface + PostgreSQL implementation (recursive CTE traversal, 5s timeout)
- LLM entity extraction pipeline triggered on memory writes (background goroutine)
- knowledge_graph_search agent tool with search + traversal modes
- HTTP API: CRUD entities, traverse, extract, stats, graph endpoints
- Web UI: KG tab on memory page with table/graph toggle, entity detail, manual extraction
- Force-directed graph visualization using @xyflow/react + d3-force
- Builtin tool seed with configurable provider/model/confidence settings
2026-03-09 17:11:20 +07:00