Pre-execution prerequisites for the Phase 07 cutover. Stage 2 of the
cutover keeps DUAL_WRITE=0 for ~6 days; if anything regresses during
that window the operator MUST be able to roll back to KV/D1 with the
last N days of Mongo-only writes recovered. Pre-building these scripts
(per code-reviewer #4) avoids the "draft a backfill under outage
pressure" anti-pattern — writing untested code at 4am.
Reverse-backfill
- scripts/backfill-mongo-to-kv.js: full-scan Mongo collection per module,
PUT each doc back to CF KV via REST. expiresAt → expirationTtl (clamped
to 60s minimum per CF KV); already-expired docs are skipped (won't
resurrect dead state). 50 ops/sec throttle. --dry-run + --module flags.
- scripts/backfill-mongo-to-d1.js: full-scan trading_trades, build INSERT
SQL preserving legacy_id where present (keeping the D1 autoincrement IDs
that the phase-05 forward backfill round-tripped). Sequential int
generation for any docs without legacy_id. Pipes through wrangler d1
execute. Core transforms of both scripts are sketched after this list.
- scripts/lib/migration-helpers.js: cfKvPut helper added.
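A sketch of the two core transforms above — doc shape, helper names, and
the D1 column list here are illustrative assumptions, not the shipped code:

    // expiresAt → expiration_ttl (CF KV rejects TTLs under 60s).
    function toExpirationTtl(expiresAt, now = Date.now()) {
      if (expiresAt == null) return undefined; // no TTL: write without expiry
      const seconds = Math.ceil((expiresAt - now) / 1000);
      if (seconds <= 0) return null;           // already expired: caller skips doc
      return Math.max(seconds, 60);            // clamp to CF KV's 60s minimum
    }

    // legacy_id-preserving INSERT construction (column names assumed).
    function buildInsert(doc, nextSequentialId) {
      const id = doc.legacy_id ?? nextSequentialId();
      const esc = (s) => String(s).replace(/'/g, "''");
      return `INSERT INTO trading_trades (id, payload) VALUES (${id}, '${esc(JSON.stringify(doc))}');`;
    }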
Delete guard (debugger #12)
- scripts/wrangler-delete-guard.sh: interactive CONFIRM wrapper around
wrangler kv namespace delete + wrangler d1 delete. Exits 3 when stdin
is not a tty so it cannot run in CI. Documented: never run in CI.
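The gating logic, sketched here in Node terms for illustration (the
shipped guard is a shell script):

    import readline from "node:readline/promises";

    if (!process.stdin.isTTY) {
      console.error("refusing to run without an interactive tty (CI?)");
      process.exit(3);                         // exit 3 when stdin is not a tty
    }
    const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
    const answer = await rl.question("Type CONFIRM to proceed with the delete: ");
    rl.close();
    if (answer !== "CONFIRM") process.exit(1);
    // ...spawn `wrangler kv namespace delete` / `wrangler d1 delete` here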
package.json: backfill:mongo:kv[:dry] + backfill:mongo:d1[:dry] scripts
wired.
Tests: 697 → 733 (+36).
- 7 cfKvPut tests (REST URL, querystring, body, expiration_ttl param).
- 10 reverse-KV TTL math tests (expired sentinel, future seconds, no-TTL,
CF 60s minimum clamp).
- 9 reverse-D1 SQL construction tests (escaping, legacy_id preservation,
sequential generation).
Lint clean. No Worker code touched. Stage 1 cutover, 7-day soak,
snapshots, and Stage 3 cleanup (delete CFKVStore + simplify factories +
edit package.json deploy chain) remain operator-driven and will be
committed separately after binding deletion.
Code prerequisites for the Phase 06 cold-start soak gate. The 24-72h soak
itself is operator-run; this commit ships the instrumentation + analysis
tools needed to make the PROCEED-or-PIVOT decision.
Telemetry
- src/util/timing.js: startTiming(cmd) returns {mark, end} that emits a
structured cmd_timing log. takeColdFlag() returns {cold, isolateAgeMs}
using a module-scoped boolean — first request in an isolate is cold,
subsequent are warm (both sketched after this list). This replaces the
originally-planned isolate_age_ms < 200ms classifier (broken because
Mongo cold-connect itself is ~1500ms, so cold requests would always
bucket as warm — code-reviewer #11).
- src/util/request-context.js: setLastCold/getLastCold shared state
bridges fetch-level cold detection into the dispatcher middleware
without a circular import.
- src/index.js: takeColdFlag at the top of fetch() emits a request log
and primes the request context for the dispatcher.
- src/modules/dispatcher.js: bot.use() middleware times every command.
Chosen over per-handler wrapping to preserve the existing identity
assertion in tests (handler === reg.allCommands.get(name).cmd.handler)
— single instrumentation point, no contract change.
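How the pieces compose, sketched — log field names and commandNameFor
are assumptions:

    // timing.js sketch: module scope survives across requests in one isolate.
    const isolateInitAt = Date.now();
    let seenRequest = false;
    export function takeColdFlag() {
      const cold = !seenRequest;               // first request in the isolate
      seenRequest = true;
      return { cold, isolateAgeMs: Date.now() - isolateInitAt };
    }
    export function startTiming(cmd) {
      const mark = Date.now();
      return {
        mark,
        end: (extra = {}) =>
          console.log(JSON.stringify({ type: "cmd_timing", cmd, ms: Date.now() - mark, ...extra })),
      };
    }

    // dispatcher sketch: one bot.use() middleware, handlers untouched.
    bot.use(async (ctx, next) => {
      const t = startTiming(commandNameFor(ctx)); // commandNameFor is hypothetical
      try { await next(); } finally { t.end({ cold: getLastCold() }); }
    });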
Soak tools (operator-run)
- scripts/analyze-soak.js: parses CF Logs export (NDJSON or CSV), filters
cmd_timing events, computes p50/p95/p99 per (cmd, cold/warm) — percentile
math sketched after this list. Counts dual-write secondary failures,
mongo connection errors, and CPU-time-exceeded events. Writes a markdown
report.
- scripts/synthetic-burst.js: fires N parallel synthetic Telegram updates
at the deployed Worker URL with cache-busting tokens. Used for the
pre-deploy connection-cap stress test (debugger #2 — 20 parallel cold
requests, abort if Atlas peak > 60% of 500-conn cap).
- package.json: analyze:soak + burst:synthetic scripts wired.
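The percentile step in sketch form (nearest-rank method assumed; the
shipped script may interpolate):

    // Nearest-rank percentile over command durations in ms.
    function percentile(values, p) {
      if (values.length === 0) return NaN;
      const sorted = [...values].sort((a, b) => a - b);
      const rank = Math.ceil((p / 100) * sorted.length);
      return sorted[Math.max(rank - 1, 0)];
    }
    // Aggregation: group cmd_timing events by `${cmd}:${cold ? "cold" : "warm"}`,
    // then report percentile(ms, 50|95|99) per bucket.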
Tests
- tests/util/timing.test.js: 8 tests — timing semantics, cold flag flip.
- tests/scripts/analyze-soak.test.js: 22 tests — percentile math, NDJSON
+ CSV parse, aggregation, markdown formatting.
Tests: 667 → 697 (+30). Lint clean.
Operator runbook for Phase 06 (NOT executed by this commit):
1. Verify telemetry live via wrangler tail.
2. Run synthetic burst test: npm run burst:synthetic -- --url <prod>
3. Configure Atlas + CF Observability email alerts.
4. 24h soak (extend to 72h on stop-conditions per phase plan).
5. Daily npm run verify:mongo.
6. npm run analyze:soak -- --input <cf-logs.json> → soak-decision.md.
7. PROCEED to Phase 07 if cold-start P95 ≤ 2.5 × BASELINE_COLD_PING_MS;
else execute phase-07-alt-pivot.md (Upstash standby).
Closes deferred phases 04 + 05 of loldle-new-modes plan.
- loldle-ability: 5 guesses, DDragon ability icon as photo. State pins
slot (P/Q/W/E/R) so the same icon shows every turn. Abilities pulled
from DDragon per-champion — same source loldle.net uses at runtime.
- loldle-splash: 4 guesses, random skin splash as photo. Skin pool
scraped from loldle.net's bundle (var Ad=[…] — 1939 non-chroma skins
across 172 champs, matching their splash mode exactly). URLs from Riot
DDragon CDN (no version segment, stable across patches).
- fetch-ddragon-data.js: extended to write all four JSONs in one run.
Shares a single DDragon per-champion fetch cycle (concurrency 10).
- Credits loldle.net + Riot Games in all loldle-family READMEs.
19 new tests (503 total). Lint clean. register:dry reports 12 loldle_*
commands with no conflicts.
Ship two new loldle-family modules mirroring loldle.net's non-classic
modes. Text-only MVP (ability/splash phases stay deferred).
- loldle-emoji: 5 guesses, emoji-sequence clue. Pool derived
algorithmically from classic's champions.json metadata (species/region/
resource mapping table, sketched after this list) since loldle.net's
bundle has no static emoji pool.
- loldle-quote: 6 guesses, lore-blurb clue. Pool seeded from Data Dragon
champion title + first lore sentence; champion name redacted to ___.
- scripts/fetch-ddragon-data.js: single generator for both JSONs.
- src/util/normalize-name.js: shared lookup helper; loldle/lookup.js
refactored to import it.
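The emoji derivation in sketch form — the mapping table and field names
here are illustrative assumptions, not the module's real table:

    // Hypothetical metadata → emoji table; real table lives in the module.
    const EMOJI = {
      species: { Human: "🧑", Darkin: "😈", Vastayan: "🦊" },
      resource: { Mana: "🔵", Energy: "🟡" },
    };
    function emojiClue(champ) {
      return [
        EMOJI.species[champ.species?.[0]],
        EMOJI.resource[champ.resource],
        // ...region, range_type, etc.
      ].filter(Boolean).join("");
    }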
35 new tests (484 total passing). Lint clean.
`npm run register` imports buildRegistry to derive the public command list
but ran outside the Worker runtime, so `env.AI` was undefined — semantle
(and previously doantu) tripped `createClient` type-checks. Add a no-op
AI stub alongside stubKv and wire it through the buildRegistry env.
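A no-op stub along these lines is enough (stub body is an assumption;
only the shape matters to createClient):

    // Workers AI stand-in for runs outside the Worker runtime.
    const stubAi = { run: async () => ({}) };  // env.AI.run(model, input) shape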
Doantu now mirrors semantle's pre-Workers-AI shape: a thin fetch wrapper
around /random + /similarity on https://phow2sim.sg.miti99.com (overridable
via PHOW2SIM_API_URL). Drops the local Viet22K wordlist + build script —
the service owns vocabulary now. Promotes commands from protected to
public so they show up in Telegram's native / menu.
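Shape of the wrapper, sketched — endpoint paths per the text above,
query params and response handling are assumptions:

    const BASE = env.PHOW2SIM_API_URL ?? "https://phow2sim.sg.miti99.com";

    async function randomWord() {
      const res = await fetch(`${BASE}/random`);
      if (!res.ok) throw new Error(`phow2sim /random: ${res.status}`);
      return res.json();
    }
    async function similarity(target, guess) {
      const qs = new URLSearchParams({ target, guess }); // param names assumed
      const res = await fetch(`${BASE}/similarity?${qs}`);
      if (!res.ok) throw new Error(`phow2sim /similarity: ${res.status}`);
      return res.json();
    }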
Now that both modules run on Workers AI embeddings, drop the legacy
Word2SimError alias, the unused wordlist helpers (getLine, LINE_COUNT,
pickFromPool), and every comment/README section still describing the
removed ConceptNet backend. Fix the bge-small doc typo in semantle/index.js
and align the semantle api-client test fake-vector dim with the real
384-dim output.
Near-clone of the semantle module, adapted for Vietnamese:
- Targets from duyet/vietnamese-wordlist Viet22K (~22k entries, GPL).
Regenerate via scripts/build-doantu-words.js; chained into npm run build.
- ConceptNet client uses /c/vi/<term> URIs; multi-word guesses (e.g.
"con chó") are space-to-underscore converted at URL build time (sketched
after this list) so the board keeps the natural display.
- lookup.js permits Unicode letters + combining marks + single internal
spaces; rejects digits/punctuation.
- All three commands (/doantu, /doantu_giveup, /doantu_stats) are
visibility=protected — shown in /help, hidden from Telegram's native /
autocomplete menu while the module is still experimental.
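The URI construction for multi-word guesses, sketched (helper name
hypothetical):

    // "con chó" → /c/vi/con_chó — display text keeps the space,
    // only the ConceptNet URI gets the underscore.
    const conceptUri = (term) =>
      `/c/vi/${encodeURIComponent(term.trim().toLowerCase().replace(/ /g, "_"))}`;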
Wired into src/modules/index.js, wrangler.toml MODULES, .env.deploy(.example),
and package.json build chain.
Separate module rather than a shared base with semantle — matches the
repo's one-module-per-game convention (see loldle vs wordle); factor later
if a third language appears.
Use the full google-10000-english list verbatim (normalize only —
lowercase + dedupe, no length or alpha filtering). Pool goes from 7953
to 9894 entries; rare/short/long picks are still sieved by ConceptNet's
verify-and-fallback at round start.
Replaces TARGET_POOL/pickFromPool with a clearer line-based API:
- LINE_COUNT — how many entries
- randomLine() — uniform pick
- getLine(n) — nth entry (n = frequency rank)
pickFromPool retained as a back-compat re-export so existing callers
don't break.
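The API in sketch form (generated-module import shape assumed):

    import WORDS from "./words-data.js";       // generated, frequency-ordered

    export const LINE_COUNT = WORDS.length;
    export const getLine = (n) => WORDS[n];    // n = frequency rank
    export const randomLine = () => WORDS[Math.floor(Math.random() * WORDS.length)];
    export const pickFromPool = randomLine;    // back-compat re-export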
The ~250-word hand-curated TARGET_POOL was too small for long-term play.
Replaces it with a build-script-generated dictionary:
- scripts/build-semantle-words.js fetches first20hours/google-10000-english
(no-swears variant), filters to 4–10 ASCII letters, drops the top-200
most frequent function words (pipeline sketched below), and writes
src/modules/semantle/words-data.js as a static ES-module export.
- wordlist.js now just re-exports that data via TARGET_POOL + pickFromPool.
- package.json: new build:semantle-words script; chained into `npm run build`
alongside build:wordle-data so `npm run deploy` regenerates automatically.
Pool size: ~250 → 7953 words. Same ConceptNet verify-and-fallback flow, so
low-quality picks still cost at most one extra concept lookup.
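The filter pipeline, sketched (order of the top-200 drop relative to the
length filter is assumed; export shape of words-data.js too):

    import fs from "node:fs/promises";

    // lines: raw entries from google-10000-english (no-swears), frequency-ordered
    const pool = [...new Set(
      lines.slice(200)                          // drop top-200 function words
        .map((w) => w.trim().toLowerCase())
        .filter((w) => /^[a-z]{4,10}$/.test(w)) // 4-10 ASCII letters
    )];
    await fs.writeFile(
      "src/modules/semantle/words-data.js",
      `export default ${JSON.stringify(pool)};\n`
    );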
loldle.net's classic-mode bundle has two record shapes — older champions
carry _id/championId, newer ones (Bel'Veth, K'Sante, Nilah, …) don't.
The regex required those leading fields, silently dropping anyone added
since 2022.
Make _id/championId optional and non-capturing, and drop them from the
output record (the bot never read them anyway). Champion count:
169 → 172; /loldle k'sante, /loldle bel'veth, and /loldle nilah now
resolve correctly.
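Illustration of the fix — the real bundle regex is longer; this shows
only the optional non-capturing prefix (record shape simplified):

    // Before (illustrative): leading fields required, so records
    // without _id/championId never matched.
    const before = /\{_id:"(\w+)",championId:(\d+),championName:"([^"]+)"/g;

    // After: same fields behind an optional non-capturing group, and no
    // longer captured (the output record drops them anyway).
    const after = /\{(?:_id:"\w+",championId:\d+,)?championName:"([^"]+)"/g;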
KV payload cleanup:
- drop lastResultAt from stats (never read)
- drop solved/giveup flags from game state (round is immediately
replaced after finish, making the flags transient noise)
- skip redundant saveGame on winning/giveup/out-of-guesses paths;
startFreshGame overwrites anyway
Code cleanup:
- delete daily.js + daily.test.js (pickDaily/todayUtc were speculative
"future use" — only pickRandom was wired in, inlined into handlers)
- drop the dead switch default in compare.js
- trim file preambles across the module
Docs: rewrite README around current behavior with loldle.net as the
sole data source; update scraper header to match the raw schema.
Drop the in-scraper normalization step — champions.json now mirrors the
exact shape emitted by loldle.net's JS bundle. Records use _id,
championId, championName, arrays for positions/species/regions/
range_type, "Male"/"Female"/"Other" gender strings, and a full
YYYY-MM-DD release_date.
Comparison is schema-aware: multi-value keys accept arrays directly,
the year axis parses YYYY out of the ISO date, and exact compares stay
case-insensitive.
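The comparison rules, sketched (result labels are assumptions):

    // Multi-value axis: exact set match → correct, any overlap → partial.
    function compareMulti(guess, answer) {
      const a = new Set(answer.map((v) => v.toLowerCase()));
      const g = new Set(guess.map((v) => v.toLowerCase()));
      if (g.size === a.size && [...g].every((v) => a.has(v))) return "correct";
      return [...g].some((v) => a.has(v)) ? "partial" : "wrong";
    }

    // Year axis: parse YYYY out of the ISO release_date.
    const year = (d) => Number(d.slice(0, 4)); // "2009-02-21" → 2009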
Node 24 + wrangler 4.x both accept `import ... with { type: "json" }`,
so the generated champions-data.js wrapper is no longer needed.
Drop scripts/build-loldle-data.js and the build:loldle-data npm script.
Scraper writes champions.json only.
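i.e. the module now loads the data directly:

    import champions from "./champions.json" with { type: "json" };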
loldle.net's JS bundle ships the complete set of classic-mode axes in
plaintext, so ddragon merging is no longer needed. Scraper now produces
the final schema directly.
Schema changes: drop title, skinCount, image, and genre (ddragon-only).
Replace genre (class tags like Fighter/Mage) with species (Human/Darkin/
Vastayan) — the axis loldle.net actually uses. Promote region to a
multi-value field so multi-region champions compare correctly.
Handlers no longer show "Name — Title" on win/giveup.
Replaces the wordle stub with a full implementation mirroring the loldle
module layout: compare/lookup/daily/render/state/handlers/index split,
per-subject KV state, standard 6 guesses, two-pass duplicate-letter
marking.
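The two-pass marking, sketched (G/Y/- result letters are an assumption):

    // Pass 1: exact positions; count the answer's unmatched letters.
    // Pass 2: mark presents only while unmatched counts remain, so
    // duplicate letters in the guess never over-mark.
    function mark(guess, answer) {
      const res = Array(guess.length).fill("-");
      const remaining = {};
      for (let i = 0; i < guess.length; i++) {
        if (guess[i] === answer[i]) res[i] = "G";
        else remaining[answer[i]] = (remaining[answer[i]] ?? 0) + 1;
      }
      for (let i = 0; i < guess.length; i++) {
        if (res[i] === "-" && remaining[guess[i]] > 0) {
          res[i] = "Y";
          remaining[guess[i]]--;
        }
      }
      return res.join("");                     // e.g. "G-YG-"
    }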
Commands: /wordle, /wordle_new, /wordle_giveup, /wordle_stats.
Word list (14,855 entries) sourced from dracos's gist
(https://gist.github.com/dracos/dd0668f281e685bad51479e5acaadb93) and
bundled via scripts/build-wordle-data.js. Credits in module README and
generated file headers.
Dispatcher test updated for the new command count (12 → 13).
Adds loldle module with classic-mode champion guessing. Ports comparison
logic from tiennm99/loldle (lib/classic-mode.js) and bundles champion
data from tiennm99/loldle-data. Adds GH Actions workflow that re-syncs
champions.json on cross-repo dispatch from loldle-data.
- Three public commands: /loldle, /loldle_giveup, /loldle_stats
- Per-user daily state + streak stats in KV (3-day TTL on games)
- champions-data.js wrapper sidesteps Node 24 / esbuild disagreement on
JSON import attributes; generator script + npm run build:loldle-data
- register script now tolerates missing .env.deploy (env-file-if-exists)
so Workers Builds can inject env vars directly
- fix(scripts): escape stray */ in migrate.js docstring that broke node
- 16 new unit tests (compare, daily, lookup); dispatcher test updated
for the new command set
- SqlStore interface + CF D1 wrapper + per-module factory (table prefix convention)
- init signature extended to ({ db, sql, env }); sql is null when DB binding absent
- custom migration runner walks src/modules/*/migrations/*.sql, tracks
applied files in the _migrations table (walk sketched below)
- npm run db:migrate with --dry-run and --local flags; chained into deploy
- fake-d1 test helper with subset of SQL semantics for retention and history tests
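The runner's walk, sketched — alreadyApplied/execute/recordApplied are
hypothetical stand-ins for the _migrations bookkeeping and the wrangler
d1 execute step:

    import fs from "node:fs";
    import path from "node:path";

    // Deterministic order: modules sorted, then .sql files sorted per module.
    const pending = [];
    for (const mod of fs.readdirSync("src/modules").sort()) {
      const dir = path.join("src/modules", mod, "migrations");
      if (!fs.existsSync(dir)) continue;
      for (const f of fs.readdirSync(dir).sort()) {
        if (f.endsWith(".sql")) pending.push(path.join(dir, f));
      }
    }
    for (const file of pending) {
      if (await alreadyApplied(file)) continue; // SELECT against _migrations
      await execute(fs.readFileSync(file, "utf8"));
      await recordApplied(file);                // INSERT INTO _migrations
    }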
grammY-based bot with a module plugin system loaded from the MODULES env
var. Three command visibility levels (public/protected/private) share a
unified command namespace with conflict detection at registry build.
- 4 initial modules (util, wordle, loldle, misc); util fully implemented,
others are stubs proving the plugin system end-to-end
- util: /info (chat/thread/sender ids) + /help (pure renderer over the
registry, HTML parse mode, escapes user-influenced strings)
- KVStore interface with CFKVStore and a per-module prefixing factory;
getJSON/putJSON convenience helpers (factory sketched after this list);
other backends drop in via one file
- Webhook at POST /webhook with secret-token validation via grammY's
webhookCallback; no admin HTTP surface
- Post-deploy register script (npm run deploy = wrangler deploy && node
--env-file=.env.deploy scripts/register.js) for setWebhook and
setMyCommands; --dry-run flag for preview
- 56 vitest unit tests across 7 suites covering registry, db wrapper,
dispatcher, help renderer, validators, and HTML escaper
- biome for lint + format; phased implementation plan under plans/
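The prefixing factory plus JSON helpers, sketched (method names per the
bullet above; key convention assumed):

    // Per-module view over one KV namespace: keys become `${module}:${key}`.
    export function moduleStore(kv, moduleName) {
      const k = (key) => `${moduleName}:${key}`;
      return {
        get: (key) => kv.get(k(key)),
        put: (key, value, opts) => kv.put(k(key), value, opts),
        async getJSON(key) {
          const raw = await kv.get(k(key));
          return raw == null ? null : JSON.parse(raw);
        },
        putJSON: (key, value, opts) => kv.put(k(key), JSON.stringify(value), opts),
      };
    }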