Roll classic loldle back to 8 guesses (from 6) and emoji back to 5 (from 4). The
/<module>_setmax override command stays — chats that want tighter limits
can opt in instead of having defaults forced on them.
Drop classic loldle from 8 → 6 (7-axis grid leaks too much per guess for 8
to feel earned) and emoji from 5 → 4 (3 emojis are usually unmistakable).
Add a hidden /<module>_setmax <n> command per loldle module so a chat can
override its own round length (1-10). Override stored at config:<subject>
in each module's KV; getMaxGuesses() falls back to the default when unset.
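A minimal sketch of the fallback, assuming the KV key shape named above (helper signature illustrative):

```js
// Hypothetical sketch — override lives at config:<subject> per this commit.
async function getMaxGuesses(kv, subject, defaultMax) {
  const cfg = await kv.getJSON(`config:${subject}`);
  const n = cfg?.maxGuesses;
  // Honor a stored 1-10 override; otherwise fall back to the module default.
  return Number.isInteger(n) && n >= 1 && n <= 10 ? n : defaultMax;
}
```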
Removed reports that document already-shipped work and aren't tied to
any archived plan dir:
- docs-manager-260420-2151-documentation-audit.md (one-off doc audit)
- researcher-260421-0845-leaguepedia-api-verification.md (lolschedule research)
- researcher-260421-0909-leaguepedia-auth-token.md (lolschedule research)
The lolschedule module is in src/modules/lolschedule/ but never had a
discrete plan dir. Findings from the leaguepedia reports are reflected
in the live module code; the markdown is no longer load-bearing.
plans/reports/ now contains only the 6 Atlas migration reports for the
active plan.
Six research reports were sitting in plans/reports/ but tied to
already-archived plans. Move each under the matching archived plan
dir so the archive is self-contained and plans/reports/ only holds
reports for in-flight or unarchived work.
Moves
- researcher-260422-2329-semantle-api-alternatives.md
- researcher-260423-0025-bge-m3-cosine-calibration.md
- researcher-260423-1110-vietnamese-embeddings-semantle.md
→ plans/archive/260422-2128-semantle-module/reports/
- researcher-260424-2215-loldle-ability-splash-modes.md
- researcher-260424-2215-loldle-emoji-and-modes-overview.md
- researcher-260424-2215-loldle-quote-mode.md
→ plans/archive/260424-2215-loldle-new-modes/reports/ (new dir)
Stays in plans/reports/
- 6 Atlas migration reports (active plan)
- docs-manager-260420-2151-documentation-audit.md (general audit, no
discrete plan home)
- researcher-260421-0845-leaguepedia-api-verification.md
- researcher-260421-0909-leaguepedia-auth-token.md
(lolschedule research; no archived lolschedule plan exists)
No code changes.
Three feature plans (semantle, twentyq, loldle-new-modes) are
status:completed in their frontmatter and the corresponding modules
exist in src/modules/. Move them to plans/archive/ to keep the active
plans/ dir focused on in-flight work.
Atlas migration (260425-1945-mongodb-atlas-migration/plan.md): bump
status from `planning` to `code-complete` and annotate each phase row
with its commit SHA + whether operator action is still pending. Plan
stays in active plans/ until cutover lands or the Upstash standby
(phase-07-alt-pivot.md) executes.
No code changes. Tests, lint, register:dry unaffected (733 passing).
Operator-facing summary in the plan.md status note: 8 phases of
implementation are committed on dev (6f0b5ff..e2e3112). Outstanding
operator work: Atlas provisioning, real-cluster smoke tests,
backfill runs, soak, cutover stages, Stage 3 code cleanup.
Pre-execution prerequisites for the Phase 07 cutover. Stage 2 of the
cutover keeps DUAL_WRITE=0 for ~6 days; if anything regresses during
that window the operator MUST be able to roll back to KV/D1 with the
last N days of Mongo-only writes recovered. Pre-building these scripts
(per code-reviewer #4) eliminates "draft a backfill under outage
pressure" — the anti-pattern of writing untested code at 4am.
Reverse-backfill
- scripts/backfill-mongo-to-kv.js: full-scan Mongo collection per module,
PUT each doc back to CF KV via REST. expiresAt → expirationTtl (clamped
to CF KV's 60s minimum; see the TTL sketch below); already-expired docs
are skipped (won't resurrect dead state). 50 ops/sec throttle. --dry-run +
--module flags.
- scripts/backfill-mongo-to-d1.js: full-scan trading_trades, build INSERT
SQL preserving legacy_id where present (round-tripping the D1 autoincrement
IDs the phase-05 forward backfill preserved). Sequential int generation for
any docs without legacy_id. Pipes through wrangler d1 execute.
- scripts/lib/migration-helpers.js: cfKvPut helper added.
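A minimal sketch of the TTL mapping, assuming millisecond expiresAt values (helper name illustrative):

```js
// expiresAt (ms epoch) → CF KV expiration_ttl (seconds); null = skip the doc.
function toExpirationTtl(expiresAt, now = Date.now()) {
  if (expiresAt == null) return undefined;           // no TTL: write without expiry
  const secondsLeft = Math.ceil((expiresAt - now) / 1000);
  if (secondsLeft <= 0) return null;                 // already expired: don't resurrect
  return Math.max(secondsLeft, 60);                  // CF KV rejects TTLs under 60s
}
```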
Delete guard (debugger #12)
- scripts/wrangler-delete-guard.sh: interactive CONFIRM wrapper around
wrangler kv namespace delete + wrangler d1 delete. Exits 3 when stdin
is not a tty so it cannot run in CI. Documented: never run in CI.
package.json: backfill:mongo:kv[:dry] + backfill:mongo:d1[:dry] scripts
wired.
Tests: 697 → 733 (+36).
- 7 cfKvPut tests (REST URL, querystring, body, expiration_ttl param).
- 10 reverse-KV TTL math tests (expired sentinel, future seconds, no-TTL,
CF 60s minimum clamp).
- 9 reverse-D1 SQL construction tests (escaping, legacy_id preservation,
sequential generation).
Lint clean. No Worker code touched. Stage 1 cutover, 7-day soak,
snapshots, and Stage 3 cleanup (delete CFKVStore + simplify factories +
edit package.json deploy chain) remain operator-driven and will be
committed separately after binding deletion.
Code prerequisites for the Phase 06 cold-start soak gate. The 24-72h soak
itself is operator-run; this commit ships the instrumentation + analysis
tools needed to make the PROCEED-or-PIVOT decision.
Telemetry
- src/util/timing.js: startTiming(cmd) returns {mark, end} that emits a
structured cmd_timing log. takeColdFlag() returns {cold, isolateAgeMs}
using a module-scoped boolean — the first request in an isolate is cold,
subsequent ones are warm (sketch below). This replaces the originally-planned
isolate_age_ms < 200ms classifier (broken because Mongo cold-connect
itself is ~1500ms; cold requests would always bucket as warm —
code-reviewer #11).
- src/util/request-context.js: setLastCold/getLastCold shared state
bridges fetch-level cold detection into the dispatcher middleware
without a circular import.
- src/index.js: takeColdFlag at the top of fetch() emits a request log
and primes the request context for the dispatcher.
- src/modules/dispatcher.js: bot.use() middleware times every command.
Chosen over per-handler wrapping to preserve the existing identity
assertion in tests (handler === reg.allCommands.get(name).cmd.handler)
— single instrumentation point, no contract change.
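A hedged sketch of the cold-flag and timing shapes described above (the real src/util/timing.js may differ):

```js
// Module scope survives for the isolate's lifetime, so the flag flips once.
let seen = false;
const bornAt = Date.now();

export function takeColdFlag() {
  const cold = !seen;                  // first request in this isolate is cold
  seen = true;
  return { cold, isolateAgeMs: Date.now() - bornAt };
}

export function startTiming(cmd) {
  const t0 = Date.now();
  const emit = (extra) =>
    console.log(JSON.stringify({ evt: "cmd_timing", cmd, ms: Date.now() - t0, ...extra }));
  return { mark: (label) => emit({ label }), end: (extra = {}) => emit(extra) };
}
```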
Soak tools (operator-run)
- scripts/analyze-soak.js: parses CF Logs export (NDJSON or CSV), filters
cmd_timing events, computes p50/p95/p99 per (cmd, cold/warm) — percentile
sketch below. Counts dual-write secondary failures, mongo connection
errors, CPU-time exceeded events. Writes markdown report.
- scripts/synthetic-burst.js: fires N parallel synthetic Telegram updates
at the deployed Worker URL with cache-busting tokens. Used for the
pre-deploy connection-cap stress test (debugger #2 — 20 parallel cold
requests, abort if Atlas peak > 60% of 500-conn cap).
- package.json: analyze:soak + burst:synthetic scripts wired.
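The percentile math is presumably a standard nearest-rank computation; a sketch (the analyzer's exact method may differ):

```js
// Nearest-rank percentile over an ascending-sorted array of durations (ms).
function percentile(sortedMs, p) {
  if (sortedMs.length === 0) return null;
  const idx = Math.ceil((p / 100) * sortedMs.length) - 1;
  return sortedMs[Math.max(0, idx)];
}
// percentile(durations, 95) → p95 for one (cmd, cold/warm) bucket
```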
Tests
- tests/util/timing.test.js: 8 tests — timing semantics, cold flag flip.
- tests/scripts/analyze-soak.test.js: 22 tests — percentile math, NDJSON
+ CSV parse, aggregation, markdown formatting.
Tests: 667 → 697 (+30). Lint clean.
Operator runbook for Phase 06 (NOT executed by this commit):
1. Verify telemetry live via wrangler tail.
2. Run synthetic burst test: npm run burst:synthetic -- --url <prod>
3. Configure Atlas + CF Observability email alerts.
4. 24h soak (extend to 72h on stop-conditions per phase plan).
5. Daily npm run verify:mongo.
6. npm run analyze:soak -- --input <cf-logs.json> → soak-decision.md.
7. PROCEED to Phase 07 if cold-start P95 ≤ 2.5 × BASELINE_COLD_PING_MS;
else execute phase-07-alt-pivot.md (Upstash standby).
Implements the KVStore interface against MongoDB Atlas with full behavioral
parity vs CFKVStore (null-on-missing, swallow-corrupt-JSON, idempotent delete,
throw-on-undefined-putJSON). Not wired into the request path yet — Phase 04
adds dual-write wrappers and factory routing.
- src/db/mongo-client.js: memoized MongoClient + getDb(env). On connect()
reject, nulls both client and connectPromise so next call retries cleanly
(regression-tested). Catches MongoServerSelectionError and emits a
structured warning before rethrow so callers can map to 503.
- src/db/mongo-kv-store.js: KVStore impl. get/getJSON filter on expiresAt
at read time to close the up-to-60s TTL-sweeper stale-read window vs
CFKVStore. list() returns keys WITH prefix preserved (parity — wrapper
in create-store.js:65 strips). Cursor pagination via sorted _id +
limit(N+1), NOT skip(). Lazy ensureIndex per (collection, isolate)
tracked in a module-scope Set. See the sketch after this list.
- src/db/mongo-list-cursor.js: extracted cursor encode/decode to keep
mongo-kv-store.js under 200 LOC.
- tests/fakes/fake-mongo.js: Map-backed fake covering the surface needed
by both Phase 02 (KVStore) and Phase 03 (MongoTradesStore).
- tests/db/mongo-kv-store.test.js: 26 tests, including TTL stale-read
regression (1s TTL + time advance), 2-level prefix list regression,
cursor pagination, connect-reject retry, MongoServerSelectionError
structured log.
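A sketch of the two behaviors above, with assumed field and option names (see src/db/mongo-kv-store.js for the real implementation):

```js
// Read-time TTL filter: a doc past expiresAt reads as missing even if the
// sweeper hasn't deleted it yet (closes the up-to-60s stale-read window).
async function get(coll, key) {
  const doc = await coll.findOne({ _id: key });
  if (!doc || (doc.expiresAt != null && doc.expiresAt <= Date.now())) return null;
  return doc.value;
}

// Cursor pagination: sort by _id, fetch limit+1, and hand back the extra
// doc's _id as the next cursor — O(1) resume, unlike skip().
async function list(coll, { prefix, cursor, limit }) {
  const query = { _id: { $gte: cursor ?? prefix, $lt: `${prefix}\uffff` } };
  const docs = await coll.find(query).sort({ _id: 1 }).limit(limit + 1).toArray();
  const keys = docs.slice(0, limit).map((d) => d._id); // prefix preserved
  return { keys, cursor: docs.length > limit ? docs[limit]._id : null };
}
```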
Tests: 503 → 529 (+26). Lint clean.
Closes deferred phases 04 + 05 of the loldle-new-modes plan.
- loldle-ability: 5 guesses, DDragon ability icon as photo. State pins
slot (P/Q/W/E/R) so the same icon shows every turn. Abilities pulled
from DDragon per-champion — same source loldle.net uses at runtime.
- loldle-splash: 4 guesses, random skin splash as photo. Skin pool
scraped from loldle.net's bundle (var Ad=[…] — 1939 non-chroma skins
across 172 champs, matching their splash mode exactly). URLs from Riot's
DDragon CDN (no version segment, stable across patches).
- fetch-ddragon-data.js: extended to write all four JSONs in one run.
Shares a single DDragon per-champion fetch cycle (concurrency 10).
- Credits loldle.net + Riot Games in all loldle-family READMEs.
19 new tests (503 total). Lint clean. register:dry reports 12 loldle_*
commands with no conflicts.
Ship two new loldle-family modules mirroring loldle.net's non-classic
modes. Text-only MVP (ability/splash phases stay deferred).
- loldle-emoji: 5 guesses, emoji-sequence clue. Pool derived algorithmically
from classic's champions.json metadata (species/region/resource mapping
table — see the sketch after this list) since loldle.net's bundle has no
static emoji pool.
- loldle-quote: 6 guesses, lore-blurb clue. Pool seeded from Data Dragon
champion title + first lore sentence; champion name redacted to ___.
- scripts/fetch-ddragon-data.js: single generator for both JSONs.
- src/util/normalize-name.js: shared lookup helper; loldle/lookup.js
refactored to import it.
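Purely illustrative sketch of the derivation idea (the real mapping table and emoji choices live in the module and certainly differ):

```js
// Hypothetical mapping fragments — NOT the module's actual table.
const SPECIES = { Yordle: "🐹", Dragon: "🐉", Darkin: "🗡️" };
const REGION = { Ionia: "🌸", Noxus: "⚔️", Piltover: "⚙️" };
const RESOURCE = { Mana: "🔵", Energy: "🟡", Fury: "🔴" };

function emojiClue(champ) {
  // champions.json fields per the classic module: species/regions arrays, resource string.
  return [SPECIES[champ.species?.[0]], REGION[champ.regions?.[0]], RESOURCE[champ.resource]]
    .filter(Boolean);
}
```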
35 new tests (484 total passing). Lint clean.
Self-review of the prior cleanup commit caught one omission — src/types.js
(central JSDoc typedefs file: Env, Module, Command, Cron, …) was listed in
the top-level README but absent from docs/architecture.md's src/ tree.
Previously seeds carried hand-curated {category, target, initialHint}.
Now SEEDS is a flat string[] of keywords — at round-start, the model
generates {category, initialHint} on the fly. Benefits:
- adding a seed is trivial (just append a word)
- every round gets a fresh cryptic opener (varies across plays of the
same word)
- HINT STYLE rules apply to the opening hint too, so the initial clue
isn't a definitional giveaway
Implementation:
- prompts.buildStartRoundPrompt(target) — with good/bad examples
- ai-client.generateRoundStart(env, target) — same JSON-in-content
approach as judge(), with defensive fallbacks + redactSecret
- handlers.startFreshGame now async; surfaces roundstart errors via the
existing UPSTREAM_FAIL path
Tests: 449 pass (5 new for generateRoundStart, 1 for roundstart error path).
Production showed: Request timed out after 10000 ms / status 500.
grammY's webhookCallback defaults to 10s — fine for simple handlers but
too tight for twentyq's Workers AI call (Gemma 4 26B cold-starts can
easily exceed 10s). Raise to 25s, leaving 5s headroom under Cloudflare
Workers' 30s wall-clock cap.
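The change is presumably a one-liner on the webhookCallback options (adapter name and bot construction assumed):

```js
import { Bot, webhookCallback } from "grammy";

const bot = new Bot("<token>"); // the real bot is built from env in the Worker (assumed)

// grammY defaults timeoutMilliseconds to 10_000; 25_000 leaves ~5s headroom
// under Cloudflare Workers' 30s wall-clock cap.
const handleUpdate = webhookCallback(bot, "cloudflare-mod", {
  timeoutMilliseconds: 25_000,
});
```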
Player feedback: hints were too clear — gave away the answer in one or two
turns because the model was leaning on "it is used for X" / category-word
phrasings.
Reworked the hint-style section of the system prompt to force the model
toward indirect, riddle-style, lateral facts. Added good/bad example pairs
(secret="organ") so the model has concrete contrast to pattern-match.
No schema change — tests unaffected (444 pass).
Gemma 4 likely rejects the flat "traditional" tools schema we were sending
(the docs use OpenAI-wrapped shape for this model) — causing env.AI.run to
throw and users to see the "AI service hiccup" reply every turn.
Switch to the universal approach:
- system prompt asks the model for a one-line JSON {is_guess, answer, hint}
- ai-client.extractText handles both Workers-AI and OpenAI response shapes
- parseJudgementJson walks brace-depth to extract JSON from stray prose /
accidental code fences (sketch below)
- logs twentyq_ai_throw / twentyq_ai_unparseable with preview on failure
so future issues surface in wrangler tail immediately
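A minimal brace-depth walker in the spirit of parseJudgementJson (the real parser likely also handles braces inside strings):

```js
// Extract the first balanced {...} span from model output and parse it.
function extractFirstJson(text) {
  const start = text.indexOf("{");
  if (start === -1) return null;
  let depth = 0;
  for (let i = start; i < text.length; i++) {
    if (text[i] === "{") depth++;
    else if (text[i] === "}" && --depth === 0) {
      try {
        return JSON.parse(text.slice(start, i + 1));
      } catch {
        return null; // balanced but not valid JSON
      }
    }
  }
  return null; // never closed — truncated output
}
```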
Tests: 7 new (parser + extractText); 444 total pass.
Uses phow2sim /neighbors. Filters out capitalized foreign place names
that leak in from the corpus (e.g. al-Qantara, Nam_Afrin) and requires
that tokens look Vietnamese (diacritic or underscore compound) to dodge
pure-ASCII junk like "adiyeh". Samples 3 from the tail after skipping
the top 20% so the hint doesn't give away the answer.
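A hedged sketch of the filter, assuming simple regex heuristics (the module's exact patterns may differ):

```js
// Underscore compounds (e.g. con_chó) or diacritic chars in the Latin
// Extended ranges count as "looks Vietnamese".
const looksVietnamese = (t) => t.includes("_") || /[\u00C0-\u1EF9]/.test(t);

// Corpus leaks like al-Qantara / Nam_Afrin carry a capital letter.
const isCorpusLeak = (t) => /\p{Lu}/u.test(t);

const hintPool = (neighbors) =>
  neighbors.filter((t) => looksVietnamese(t) && !isCorpusLeak(t));
```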
Sigmoid was inherited from semantle where bge-m3's narrow cone (unrelated
pairs at 0.40-0.55) needed spreading. phow2sim cosines span 0.0-0.8
naturally, so a linear map is honest and free of magic constants. Kept
the emoji buckets — they already work well against raw percentages.
format.js was inherited from semantle (bge-m3 transformer) whose raw
cosines live in a narrow 0.4-0.55 band for unrelated words. phow2sim
runs on PhoW2V word2vec — related pairs sit at 0.3-0.5, synonyms at
0.55-0.80 — so the FLOOR=0.4 cutoff was dumping real signal (làng/đất
=0.38, làng/phố=0.38) to a displayed 0.
Retune: FLOOR=0.1, CENTER=0.4, SCALE=6. Now 0.38 → 39, 0.52 → 64, 0.80 → 93.
Pre-phow2sim games (Workers AI era) left targets in KV that phow2sim
doesn't know. The API returned in_vocab_a:false, similarity:null, which
our handler misread as a guess-OOV and blamed the player's word. Now we
detect target-OOV explicitly, wipe the stale round, and prompt the user
to start fresh.
Default /random pulled from the full Vietnamese corpus (rank 40k+ words
like "sa_mạc_hoá" showed up), which made rounds unplayable for casual
speakers. Filter targets to min_rank=100, max_rank=1000 so words stay
recognizable.
`npm run register` imports buildRegistry to derive the public command list
but ran outside the Worker runtime, so `env.AI` was undefined — semantle
(and previously doantu) tripped `createClient` type-checks. Add a no-op
AI stub alongside stubKv and wire it through the buildRegistry env.
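The stub is presumably as small as stubKv; a sketch (names per this commit, wiring assumed):

```js
// No-op AI binding so buildRegistry's createClient type-checks pass outside
// the Worker runtime.
const stubAi = { run: async () => ({}) };
const registry = buildRegistry({ ...process.env, KV: stubKv, AI: stubAi });
```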
Doantu now mirrors semantle's pre-Workers-AI shape: a thin fetch wrapper
around /random + /similarity on https://phow2sim.sg.miti99.com (overridable
via PHOW2SIM_API_URL). Drops the local Viet22K wordlist + build script —
the service owns vocabulary now. Promotes commands from protected to
public so they show up in Telegram's native / menu.
BGE embeddings occupy a narrow cone in vector space, so raw cosine of
two unrelated words already sits at ~0.40-0.55. Displaying `raw * 100`
made every random guess read as 40-70% warm, which defeated the warmth
UX.
format.js now applies a normalized sigmoid (FLOOR 0.40, CENTER 0.60,
SCALE 8) to remap raw cosine → displayed 0-100. Unrelated pairs drop
to ≤30, loose relation lands around 40-55, clear synonyms hit 85+, and
exact match stays at 100. Emoji buckets were rebased onto the calibrated
score; formatWarmth lost its sign column (calibrated output is always
non-negative).
render.js rounds once and feeds the integer to both formatWarmth and
warmthEmoji so the display value and bucket stay in sync.
Constants are empirical — retune if swapping to a non-BGE model.
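A minimal sketch of the remap, assuming a floor-normalized sigmoid and an exact-match short-circuit (format.js may differ in detail; example outputs are from this sketch, not the module):

```js
const FLOOR = 0.40, CENTER = 0.60, SCALE = 8; // empirical; retune for non-BGE models
const sigmoid = (x) => 1 / (1 + Math.exp(-x));

function calibrate(raw) {
  if (raw >= 1) return 100;                      // exact match pinned to 100
  const lo = sigmoid((FLOOR - CENTER) * SCALE);  // sigmoid value at the floor
  const s = Math.max(sigmoid((raw - CENTER) * SCALE), lo);
  return Math.round(((s - lo) / (1 - lo)) * 100); // floor → 0, top of curve → 100
}
// calibrate(0.45) ≈ 8, calibrate(0.65) ≈ 52, calibrate(0.85) ≈ 86
```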
Aligns semantle with doantu so both modules share one Workers AI model.
bge-m3 is multilingual and cheaper (1,075 vs 1,841 Neurons per M input tokens)
and produces 1024-dim vectors. Updates the api-client default, test
fake-vector dimensions, README, index.js doc comment, and the
wrangler.toml [ai] binding comment (Neurons/day budget recomputed).
Now that both modules run on Workers AI embeddings, drop the legacy
Word2SimError alias, the unused wordlist helpers (getLine, LINE_COUNT,
pickFromPool), and every comment/README section still describing the
removed ConceptNet backend. Fix the bge-small doc typo in semantle/index.js
and align the semantle api-client test fake-vector dim with the real
384-dim output.
Mirror the semantle migration but with @cf/baai/bge-m3 — BAAI's
multilingual embedding model — because the English-only BGE variants
can't produce meaningful Vietnamese vectors (their tokenizer shreds
diacritics into noisy byte-level subwords).
bge-m3 is trained across 194 languages incl. Vietnamese and is
actually cheaper in Neurons (1,075 vs 1,841 per M tokens for
bge-small-en-v1.5). Vocab check reuses the local Viet22K wordlist as
an in-memory Set — O(1) OOV detection, no upstream call.
Also add a test file for the module (mirrors semantle coverage plus
Vietnamese-specific cases: diacritics, multi-syllable compounds).
ConceptNet (api.conceptnet.io) was returning sustained 502s, breaking
every guess with an "Upstream hiccup" reply. Replace with env.AI.run
on @cf/baai/bge-small-en-v1.5 and score guesses by computing cosine
similarity locally against the target vector.
The local google-10k wordlist doubles as the in/out-of-vocabulary set,
so OOV detection is an O(1) Set.has() with no upstream call. The
similarity() response shape is unchanged, so handlers/render/state
stay as-is.
Free on the Workers Free plan: 10k Neurons/day cap, ~0.0037 Neurons
per 2-word guess → ~2.7M guesses/day headroom for this bot.
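The local scoring is standard cosine similarity over the two embedding vectors; a sketch (the module's helper name is assumed):

```js
// cos(a, b) = a·b / (|a||b|), computed in one pass over the vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```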
Near-clone of the semantle module, adapted for Vietnamese:
- Targets from duyet/vietnamese-wordlist Viet22K (~22k entries, GPL).
Regenerate via scripts/build-doantu-words.js; chained into npm run build.
- ConceptNet client uses /c/vi/<term> URIs; multi-word guesses (e.g.
"con chó") are space-to-underscore converted at URL build time so the
board keeps the natural display.
- lookup.js permits Unicode letters + combining marks + single internal
spaces; rejects digits/punctuation.
- All three commands (/doantu, /doantu_giveup, /doantu_stats) are
visibility=protected — shown in /help, hidden from Telegram's native /
autocomplete menu while the module is still experimental.
Wired into src/modules/index.js, wrangler.toml MODULES, .env.deploy(.example),
and package.json build chain.
Separate module rather than a shared base with semantle — matches the
repo's one-module-per-game convention (see loldle vs wordle); factor later
if a third language appears.
Use the full google-10000-english list verbatim (normalize only —
lowercase + dedupe, no length or alpha filtering). Pool goes from 7953
to 9894 entries; rare/short/long picks are still sieved by ConceptNet's
verify-and-fallback at round start.
Replaces TARGET_POOL/pickFromPool with a clearer line-based API:
- LINE_COUNT — how many entries
- randomLine() — uniform pick
- getLine(n) — nth entry (n = frequency rank)
pickFromPool is retained as a back-compat re-export so existing callers
don't break.
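A sketch of the API's likely shape (data export name assumed):

```js
import { WORDS } from "./words-data.js"; // assumed export name for the generated list

export const LINE_COUNT = WORDS.length;
export const randomLine = () => WORDS[Math.floor(Math.random() * LINE_COUNT)];
export const getLine = (n) => WORDS[n]; // n = frequency rank (0-based assumed)
export const pickFromPool = randomLine; // back-compat re-export
```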
The ~250-word hand-curated TARGET_POOL was too small for long-term play.
Replaces it with a build-script-generated dictionary:
- scripts/build-semantle-words.js fetches first20hours/google-10000-english
(no-swears variant), filters to 4–10 ASCII letters, drops the top-200
most frequent function words, and writes src/modules/semantle/words-data.js
as a static ES-module export.
- wordlist.js now just re-exports that data via TARGET_POOL + pickFromPool.
- package.json: new build:semantle-words script; chained into `npm run build`
alongside build:wordle-data so `npm run deploy` regenerates automatically.
Pool size: ~250 → 7953 words. Same ConceptNet verify-and-fallback flow, so
low-quality picks still cost at most one extra concept lookup.
ConceptNet provides a free public /relatedness endpoint (returns cosine-like
[-1, 1]) and /c/en/{term} for vocabulary check. No random-word endpoint, so
we ship a curated local target pool in wordlist.js (~250 words) and verify
each pick via the concept endpoint with a fallback to an unverified pick.
Each guess now makes two parallel ConceptNet calls (concept + relatedness)
instead of a single word2sim call. Slightly higher latency but zero hosting
cost and no dependency on the self-hosted word2sim instance.
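A hedged sketch of the per-guess flow against ConceptNet's public endpoints (helper name assumed):

```js
const BASE = "https://api.conceptnet.io";

async function scoreGuess(guess, target) {
  // Vocabulary check and relatedness fired in parallel to keep latency down.
  const [concept, rel] = await Promise.all([
    fetch(`${BASE}/c/en/${encodeURIComponent(guess)}`).then((r) => r.json()),
    fetch(
      `${BASE}/relatedness?node1=/c/en/${encodeURIComponent(guess)}&node2=/c/en/${encodeURIComponent(target)}`,
    ).then((r) => r.json()),
  ]);
  const inVocab = (concept.edges?.length ?? 0) > 0; // assumed vocab heuristic
  return { inVocab, similarity: rel.value };        // value ∈ [-1, 1]
}
```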
- api-client.js rewritten; UpstreamError replaces Word2SimError (aliased
for backwards compat with older imports).
- wordlist.js added (curated target pool + pickFromPool).
- handlers.js: drops RANDOM_FILTERS (no filtering needed; pool is curated).
- index.js: drops WORD2SIM_API_URL env var; ConceptNet base hardcoded.
- wrangler.toml + .dev.vars.example: drop WORD2SIM_API_URL.
- api-client tests rewritten for ConceptNet shape; total tests 336 → 341.
Giveup already auto-starts a fresh round on next /semantle, so /semantle_new
was redundant. Duplicate guesses now match loldle's behavior: reply with
"🔁 already guessed" and skip the similarity API call (fast-path dedup
against prior word or canonical, with a post-API fallback for different
inputs that canonicalize to the same token).
Telegram commands /semantle, /semantle_new, /semantle_giveup, /semantle_stats.
Round starts with /random pick from hosted word2sim; each guess scored via
/similarity. Unlimited guesses; solve on case-insensitive exact match.
New env var WORD2SIM_API_URL (wrangler.toml, .env.deploy). Includes
module README and 90 unit tests covering api-client, state, format,
render, and handlers.
Previously startFreshGame was called at the tail of every win/lose/giveup
path, stamping startedAt to that moment — so the clock accrued while the
player was away between rounds. Now:
- round-ending paths call clearGame (new helper in state.js), deleting the
KV record instead of pre-creating the next round
- getOrInitGame lazily creates the next round on the player's next /loldle
call, with startedAt: null
- the first actual guess inside handleLoldle stamps startedAt = Date.now()
Viewing an empty board gives no hints, so it shouldn't count against the
clock. handleGiveup no longer auto-creates a fresh round and now reports
"No active round" when called with nothing in progress.
Broaden `npm run format` / `npm run lint` to biome's full scan (`.`)
instead of a fixed src/tests/scripts list, so root-level files and any
new top-level directories stay formatted. Drop the stale ignore entry
for the deleted champions-data.js.
loldle.net's classic-mode bundle has two record shapes — older champions
carry _id/championId, newer ones (Bel'Veth, K'Sante, Nilah, …) don't.
The regex required those leading fields, silently dropping any champion
added since 2022.
Make _id/championId optional and non-capturing, and drop them from the
output record (the bot never read them anyway). Champion count:
169 → 172; /loldle k'sante, /loldle bel'veth, and /loldle nilah now
resolve correctly.
Column headers now match loldle.net's classic-mode grid verbatim:
Range → Range type, Region → Region(s), Lane → Position(s),
Year → Release year. The champion row header becomes Champion (was
Name). Data field names already matched; only labels diverged.
KV payload cleanup:
- drop lastResultAt from stats (never read)
- drop solved/giveup flags from game state (round is immediately
replaced after finish, making the flags transient noise)
- skip redundant saveGame on winning/giveup/out-of-guesses paths;
startFreshGame overwrites anyway
Code cleanup:
- delete daily.js + daily.test.js (pickDaily/todayUtc were speculative
"future use" — only pickRandom was wired in, inlined into handlers)
- drop the dead switch default in compare.js
- trim file preambles across the module
Docs: rewrite README around current behavior with loldle.net as the
sole data source; update scraper header to match the raw schema.
Round state now keeps `guesses` as a plain string[] (the names the player
tried) instead of caching full comparison results. The board view
rehydrates rows at display time by re-running compareChampions against
the current target.
Smaller KV payloads, and the rendered board always reflects the live
champions.json — useful if a weekly data refresh lands mid-round.
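The rehydration is presumably a map at render time (helper names assumed):

```js
// Board rows recomputed against the live champions.json on every render.
const rows = game.guesses.map((name) =>
  compareChampions(lookupChampion(name), target),
);
```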
Drop the in-scraper normalization step — champions.json now mirrors the
exact shape emitted by loldle.net's JS bundle. Records use _id,
championId, championName, arrays for positions/species/regions/
range_type, "Male"/"Female"/"Other" gender strings, and a full
YYYY-MM-DD release_date.
Comparison is schema-aware: multi-value keys accept arrays directly,
the year axis parses YYYY out of the ISO date, and exact compares stay
case-insensitive.