Commit Graph

39 Commits

Author SHA1 Message Date
tiennm99 ea6dbb5d3f fix(ci): bump node to 22 for pnpm 11 compatibility 2026-05-13 10:29:51 +07:00
tiennm99 d73b9478f0 fix(ci): migrate workflow from npm to pnpm 2026-05-13 10:28:45 +07:00
tiennm99 2229612bf2 chore: remove npm lockfile 2026-05-13 10:20:10 +07:00
tiennm99 2a49b0f2c7 Merge remote-tracking branch 'origin/main' 2026-05-13 10:19:34 +07:00
tiennm99 d5ae30974c chore: migrate from npm to pnpm 2026-05-13 10:17:03 +07:00
tiennm99 ca35648d20 docs: write substantive README 2026-05-11 20:16:54 +07:00
tiennm99 9c360cb55c refactor: match Java cron message format 2026-05-10 21:05:41 +07:00
tiennm99 c84e636b40 feat: /setappttl admin command for runtime cache TTL override
New admin-only command persists APP_CACHE_SECONDS override in admin doc
(admin.appCacheSeconds), bounds [60, 86400], 0/default clears. Fallback
to env when unset.

app-cache-repository now takes a lazy memoized TTL getter — admin doc
read at most once per request, only on cache write paths. app-builder
wires (admin.getAppCacheSeconds() ?? config.appCacheSeconds).
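
A minimal sketch of the lazy memoized getter described above. The names `memoizeOnce` and `makeTtlGetter` are hypothetical, and the sketch is synchronous for brevity; the real getter presumably awaits a Redis read.

```javascript
// Hypothetical helper: wraps a loader so it runs at most once,
// and only when the value is first requested.
function memoizeOnce(load) {
  let called = false;
  let value;
  return () => {
    if (!called) {
      value = load();
      called = true;
    }
    return value;
  };
}

// Assumed wiring: the admin override wins, the env default is the fallback.
function makeTtlGetter(readAdminOverride, envDefaultSeconds) {
  return memoizeOnce(() => readAdminOverride() ?? envDefaultSeconds);
}
```

Building one getter per request and calling it only on cache write paths keeps read-only requests from touching the admin doc at all.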

/settings now displays effective TTL with override status alongside the
existing per-group settings.

Also fix /setdayswarning + /setappttl arg parsing: gate reset on string
'0' instead of parseInt === 0, so '/cmd 0.5', '/cmd 0xff', '/cmd 0abc'
no longer silently trigger a reset.
2026-05-10 02:46:26 +07:00
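
The arg-parsing fix in this commit can be illustrated with a short sketch. `parseOverrideArg` and its return shape are hypothetical; only the rule (reset on the literal string '0', not on a parsed zero) comes from the commit message.

```javascript
// Hypothetical parser for "/cmd <n|0|default>".
function parseOverrideArg(raw, min, max) {
  const arg = String(raw ?? '').trim();
  // Reset only on the literal tokens, never on a parsed number:
  // parseInt('0.5', 10), parseInt('0xff', 10) and parseInt('0abc', 10)
  // all evaluate to 0, which is what used to trigger the silent reset.
  if (arg === '0' || arg === 'default') return { kind: 'reset' };
  // Accept plain decimal integers only.
  if (!/^\d+$/.test(arg)) return { kind: 'invalid' };
  const n = Number(arg);
  if (n < min || n > max) return { kind: 'invalid' };
  return { kind: 'set', value: n };
}
```

With the [60, 86400] bounds quoted for /setappttl, `parseOverrideArg('0.5', 60, 86400)` is rejected instead of being treated as a reset.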
tiennm99 14b6cc7965 feat: admin-scoped Telegram menu
setMyCommands now runs at default scope (user commands only) plus once
per ADMIN_ID with scope:{type:'chat'} carrying the full set, so admin
commands (/addgroup, /delgroup, /listgroup) are hidden from non-admins'
menu. requireAdminUser remains the actual auth gate; this is UI cleanup.

parseAdminIds extracted to src/util/parse-admin-ids.js — shared between
config loading and the register script. ADMIN_IDS now required by
register-webhook.js (.env.deploy.example updated). Per-admin call uses
allowFail to gracefully skip admins who haven't DMed the bot yet.
2026-05-10 02:46:04 +07:00
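
A sketch of how the two menu scopes might be assembled into setMyCommands payloads. `buildMenuRequests`, the sample catalog, and its descriptions are hypothetical; the `scope: { type: 'chat', chat_id }` shape follows the Telegram Bot API's BotCommandScopeChat.

```javascript
// Hypothetical catalog entries; the descriptions are placeholders.
const sampleCatalog = [
  { name: 'checkapp', description: 'Check tracked apps', adminOnly: false },
  { name: 'addgroup', description: 'Allow a group', adminOnly: true },
];

// Returns one payload for the default scope (user commands only)
// plus one payload per admin chat carrying the full command set.
function buildMenuRequests(catalog, adminIds) {
  const toCommand = ({ name, description }) => ({ command: name, description });
  const userCommands = catalog.filter((c) => !c.adminOnly).map(toCommand);
  const allCommands = catalog.map(toCommand);
  return [
    { commands: userCommands },
    ...adminIds.map((id) => ({
      commands: allCommands,
      scope: { type: 'chat', chat_id: id },
    })),
  ];
}
```

The menu only controls what Telegram displays; the auth gate still has to run inside each admin handler.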
tiennm99 adfa9adeda refactor: extract bot command catalog as single source of truth
src/bot/commands/index.js owns the canonical catalog (name, description,
adminOnly, build factory). bot.js builds dispatch from it; future menu
registration reads it. Drops the 14 explicit factory imports + the inline
/info handler from bot.js. Prevents the dispatch-vs-menu drift that bit
us with /setdayswarning (commit 0131206, 49726f1 backfill).
2026-05-10 02:45:50 +07:00
tiennm99 eb0f79be82 chore: drop migration leftovers, refresh env examples + secret-leak scope
- remove MONGODB_URI from .env.example (Atlas migration done; deleted from
  Vercel cloud env too)
- trim .env.deploy.example to vars actually consumed by deploy scripts
  (Upstash creds were only needed by the now-deleted migration script)
- README config table: drop ENV / SOURCE_COMMIT / SCHEDULE_CHECK_APP_TIME
  (never read by code; Java-era leftovers)
- check-secret-leaks: drop MONGODB_URI; add UPSTASH/KV/CRON tokens; widen
  scan roots to include api/
- add scripts/list-upstash-keys.js read-only ops helper
2026-05-10 00:23:16 +07:00
tiennm99 49726f14c1 chore: backlog cleanup — deps, CI, bot description, ops docs
- pin form-data/qs/tough-cookie via package.json overrides; clears 3 of 4
  Dependabot alerts (request SSRF risk-accepted, no upstream fix)
- add GitHub Actions CI (lint + syntax check) on push/PR
- add /settings and /setdayswarning to setMyCommands
- new npm run describe sets bot profile description via Bot API
- README: drop stale preview warning, add Operations section
2026-05-10 00:13:09 +07:00
tiennm99 01312065c5 feat: per-group days-to-warning override + /settings + /setdayswarning
Adds optional group.settings.numDaysWarningNotUpdated, resolved per-group
in scheduler and /checkapp with fallback to env default. New commands
/settings (read) and /setdayswarning <n|0|default> (write).
2026-05-09 23:30:12 +07:00
tiennm99 c32688f41b refactor: drop implemented plans + reports
Migration to Vercel + Upstash is live and the Java bot is being
shut down. Git history retains all phase docs and audit reports
if needed for reference.
2026-05-09 23:12:17 +07:00
tiennm99 4fe4a781e0 fix: gate /rawappleapp + /rawgoogleapp on authorizeGroup
Match the auth check used by every other command. Without it,
any chat that knows the bot username could dump arbitrary
App Store / Play Store JSON.
2026-05-09 23:01:19 +07:00
tiennm99 0f0f9b93f3 fix: /delgroup wipes group key + scheduler/checkapp guard non-finite updated
/delgroup previously only removed the chatId from the admin allowlist;
the matching group:{chatId} key (with its tracked-app state) was left
in Redis with no TTL. Re-adding the same group later resurrected the
old subscription list. Now the command calls store.group.deleteGroup
after the admin removal succeeds.

The scheduler's Google branch and /checkapp's Google rows path both
called daysBetween(updatedMs, now) without guarding non-finite
updatedMs values, unlike the parallel Apple branches. A garbage
upstream value would have produced NaN days. Skip those entries.
2026-05-09 22:37:39 +07:00
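
The non-finite guard can be sketched as follows. The body of `daysBetween` and the `staleEntries` helper are assumptions for illustration; only the `daysBetween(updatedMs, now)` call shape appears in the commit message.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Assumed implementation: whole days elapsed between two epoch-ms values.
function daysBetween(fromMs, toMs) {
  return Math.floor((toMs - fromMs) / MS_PER_DAY);
}

// Hypothetical helper: skips entries whose updatedMs is NaN, Infinity,
// undefined, or a non-number, so no row ever reports NaN days.
function staleEntries(apps, nowMs, warnAfterDays) {
  return apps
    .filter((app) => Number.isFinite(app.updatedMs))
    .map((app) => ({ appId: app.appId, days: daysBetween(app.updatedMs, nowMs) }))
    .filter((row) => row.days >= warnAfterDays);
}
```

`Number.isFinite` is used rather than the global `isFinite` because it does not coerce strings or `null` into numbers.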
tiennm99 dbb6d0e015 refactor: remove redundant inner try/catch in commands
The dispatcher already wraps every handler in try/catch and sends the
'Internal server error' fallback on failure. Each command's inner
try/catch around its Redis ops was masking that path — the dispatcher
never saw the error, so logger.error('command failed') never fired.
Removing the inner catches restores observability and shortens each
file. The semantically-different try/catch blocks (mapping upstream
API failures to a different user message) are kept.
2026-05-09 22:37:02 +07:00
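
To show why the inner catches hid failures, here is a simplified, synchronous sketch. `makeDispatcher` and both handlers are hypothetical stand-ins; the real dispatcher is presumably async.

```javascript
// Hypothetical dispatcher: the single place that logs and replies on failure.
function makeDispatcher(handlers, logger, reply) {
  return function dispatch(name, args) {
    try {
      handlers[name](args);
    } catch (err) {
      logger.error('command failed', err);
      reply('Internal server error');
    }
  };
}

// Before: the handler swallows its own error, so the dispatcher sees nothing.
function withInnerCatch() {
  try {
    throw new Error('redis down');
  } catch (err) {
    // error handled locally; logger.error in the dispatcher never fires
  }
}

// After: the error propagates and the dispatcher logs it.
function withoutInnerCatch() {
  throw new Error('redis down');
}
```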
tiennm99 b242fea2f7 refactor: dispatcher pre-parses args, drop arg helpers + info.js
The dispatcher now extracts both the command name and its arguments
once and passes the args array to handlers via a 3rd parameter. Each
command file drops its splitArgs(getCommandArguments(msg.text)) call
and the corresponding imports. command-utils.js loses the now-unused
arg helpers (only auth helpers remain). info.js — a 12-line file with
one sendMessage — folds into bot.js's commands map.
2026-05-09 22:35:38 +07:00
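
A sketch of parsing the command name and arguments once, up front. `parseCommand` is a hypothetical name and the exact splitting rules in the repository may differ.

```javascript
// Hypothetical one-shot parser: returns { name, args } or null
// when the text is not a command.
function parseCommand(text) {
  const [head, ...args] = String(text ?? '').trim().split(/\s+/);
  if (!head.startsWith('/')) return null;
  // Drop the "@botname" suffix Telegram appends in group chats.
  const name = head.slice(1).split('@')[0].toLowerCase();
  return name ? { name, args } : null;
}
```

A handler would then receive `args` as its third parameter instead of re-splitting `msg.text` itself.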
tiennm99 55cdc04c57 refactor: merge apple/google app caches into one parametrized repo
Two near-identical per-store repository files collapse into a single
createAppCacheRepository(handle, prefix, ttl) factory used twice from
app-builder. Scrapers now receive the cache directly (one less layer),
and the cache entry shape drops the obsolete _id field — the Redis key
already encodes the appId.
2026-05-09 22:32:56 +07:00
tiennm99 a2016eca18 refactor: drop dead repo + telegram-api exports, inline createStore
Delete repository/store.js (one-line aggregator) — wiring now inline in
app-builder.js. Drop unused exports: scan + UpstashUnavailable from
upstash.js, getMe + TelegramApiError class from telegram-api.js,
init/getAdmin/save from admin-repository.js, exists/saveGroup from
group-repository.js. Generic Error replaces the named error classes.
2026-05-09 22:32:12 +07:00
tiennm99 8554b72b0b refactor: drop Mongo class discriminator + delete src/models/
Inline trivial factory bodies into the repos and scrapers that used them.
The class:/_id: fields were Java-Mongo parity artifacts that nothing
in this codebase reads — Redis docs with the old fields still parse
fine; the next write drops them.
2026-05-09 22:30:48 +07:00
tiennm99 2000b85bbc chore: journal Phase 7 cleanup of vercel-upstash consolidation
Closes the 260509-1656-consolidate-vercel-upstash plan with a record
of the cleanup commit (0a395bd) and the operator follow-ups left open.
2026-05-09 21:50:58 +07:00
tiennm99 0a395bde62 chore: remove cloudflare + docker + legacy migration scripts
Phase 7 cleanup of the Vercel + Upstash consolidation plan:

- delete wrangler.toml, Dockerfile, docker-compose{,.dev}.yml,
  scripts/migrate-atlas-to-upstash.js (one-shot migration done)
- drop wrangler + mongodb devDeps and migrate* npm scripts;
  regenerate package-lock.json (-70 packages)
- prune CF/Wrangler/Atlas-export entries from .gitignore + .vercelignore
- drop MONGODB_URI from .env.deploy.example
- rewrite README for Vercel + Upstash architecture
- refresh stale Cloudflare comments in src/{logger,models,repository}
2026-05-09 21:49:48 +07:00
tiennm99 b2082c4601 fix: use Vercel classic Node runtime API instead of Web Standards Request
Vercel `nodejs` runtime passes IncomingMessage/ServerResponse with
shouldAddHelpers=true (auto-parsed JSON body, .status/.send helpers),
not the Web Standards Request/Response. Calling `req.headers.get(...)`
on the classic IncomingMessage threw `TypeError: req.headers.get is
not a function` and crashed every webhook + cron invocation with 500.

Switch both handlers to (req, res) signature, read headers as plain
object (lowercased keys), use req.body for parsed JSON, and respond
via res.status().send().

Caught during Phase 6 smoke test of the first prod deploy.
2026-05-09 20:46:29 +07:00
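
A sketch of a handler in the classic `(req, res)` shape this fix describes. `makeWebhookHandler` and `onUpdate` are hypothetical; the secret-token header name is the one mentioned elsewhere in this log, and whether the real handler rejects with 401 is an assumption.

```javascript
// Hypothetical factory for a Vercel Node-runtime handler.
function makeWebhookHandler(expectedSecret, onUpdate) {
  return function handler(req, res) {
    // Headers are a plain object with lowercased keys, not a Headers instance,
    // so req.headers.get(...) would throw here.
    const token = req.headers['x-telegram-bot-api-secret-token'];
    if (!expectedSecret || token !== expectedSecret) {
      return res.status(401).send('unauthorized');
    }
    // req.body is already parsed JSON when the Vercel helpers are enabled.
    onUpdate(req.body);
    return res.status(200).send('ok');
  };
}
```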
tiennm99 d9f23ee0c2 chore: gitignore .vercel/ (CLI artifacts from vercel build/pull) 2026-05-09 20:36:48 +07:00
tiennm99 987837c1d8 feat: accept KV_REST_API_* env vars as Upstash fallback
Vercel Marketplace Upstash integration injects KV_REST_API_URL and
KV_REST_API_TOKEN — different names from vanilla Upstash signup
(UPSTASH_REDIS_REST_URL / UPSTASH_REDIS_REST_TOKEN). Adapter and
migration script now accept either form, so the operator doesn't have
to duplicate values when sharing an Upstash DB provisioned via the
Vercel integration. UPSTASH_* takes precedence when both are set.
2026-05-09 20:17:06 +07:00
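
The precedence rule can be sketched as below. `resolveUpstashCreds` is a hypothetical name, and throwing on missing values is an assumption about how the adapter reports misconfiguration.

```javascript
// Hypothetical resolver: UPSTASH_* wins, KV_REST_API_* is the fallback.
function resolveUpstashCreds(env) {
  const url = env.UPSTASH_REDIS_REST_URL || env.KV_REST_API_URL;
  const token = env.UPSTASH_REDIS_REST_TOKEN || env.KV_REST_API_TOKEN;
  if (!url || !token) {
    throw new Error('Upstash REST credentials are not configured');
  }
  return { url, token };
}
```

In a Node process this would typically be called as `resolveUpstashCreds(process.env)`.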
tiennm99 771b5ebf8e chore: refresh .env.deploy.example for Vercel + Upstash + add session journal
- Add Upstash REST creds, KEY_PREFIX, MONGODB_URI (one-shot migration)
- Update WORKER_URL example to Vercel format (var name retained for
  register-webhook.js compatibility; rename deferred to Phase 7)
- Track docs/journals/ entry from cook session
2026-05-09 20:13:15 +07:00
tiennm99 c2dd35b75f feat: migrate to Vercel + Upstash with KEY_PREFIX namespacing
Phases 1-5 of consolidate-vercel-upstash plan. Replaces Cloudflare
Workers + KV with Vercel serverless functions + Upstash Redis. Inlines
app-store-scraper / google-play-scraper npm libs (drops the
store-scraper.vercel.app HTTP roundtrip). KEY_PREFIX (default
'store-scraper-bot:') namespaces all Redis keys so the Upstash DB can
be safely shared with other Vercel projects.

- vercel.json + .vercelignore + Vercel-aware package.json scripts
- api/webhook.js + api/cron.js Vercel functions (with shared
  src/app-builder.js); cron auth fails closed when CRON_SECRET unset
- src/repository/upstash.js replaces kv.js; all 4 repos take a handle
  bundling client + prefix
- scripts/migrate-atlas-to-upstash.js writes legacy Java Atlas state
  directly to Upstash with --dry-run + --include-cache flags
- .env.example refreshed for the new env surface

Phases 6 (Vercel deploy + webhook cutover) and 7 (Docker + wrangler
cleanup) remain operator-driven post-deploy.
2026-05-09 20:07:07 +07:00
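
Two ideas from this commit, key namespacing and fail-closed cron auth, sketched under stated assumptions. `prefixedKey` and `isCronAuthorized` are hypothetical names; the `Bearer` header format is the one Vercel documents for cron jobs, and its use in this project is assumed.

```javascript
// Hypothetical key builder: every Redis key starts with KEY_PREFIX.
function prefixedKey(prefix, ...parts) {
  return prefix + parts.join(':');
}

// Hypothetical auth check: with no secret configured, nothing is authorized.
function isCronAuthorized(headers, cronSecret) {
  if (!cronSecret) return false; // fail closed when CRON_SECRET is unset
  return headers.authorization === `Bearer ${cronSecret}`;
}
```

With the default prefix, `prefixedKey('store-scraper-bot:', 'group', 42)` yields `store-scraper-bot:group:42`. Checking for the unset secret first matters because an unset value would otherwise compare against the literal string `Bearer undefined`.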
tiennm99 134bce0826 docs: update Java reference to java-store-scraper-bot
Java repo renamed legacy-store-scraper-bot → java-store-scraper-bot for symmetry with go-store-scraper-bot. Status (deprecated/maintained) belongs in README banners, not URLs.
2026-05-09 17:38:31 +07:00
tiennm99 1ca421b3d5 refactor: rename project to store-scraper-bot
GitHub repo rename: js-store-scraper-bot becomes the canonical store-scraper-bot. The Java reference impl is renamed to legacy-store-scraper-bot in parallel.

Updates: package.json name + description, README header + reference link, wrangler.toml worker name (file slated for removal in cleanup phase but kept consistent in the interim).
2026-05-09 17:18:50 +07:00
tiennm99 fbf2e95a52 docs: lock Vercel + Upstash consolidation plan
Brainstorm + research concluded: bot moves off CF Workers + Vercel split-architecture onto a single Vercel deployment with Upstash Redis storage. 7-phase implementation plan added.

Supersedes the in-progress CF KV migration plan; code-level KV migration already shipped, deploy phases never executed.
2026-05-09 17:18:39 +07:00
tiennm99 c4c6a93e06 docs: add Phase 06 (Atlas → KV data migration) to KV plan
Adds the phase doc that drove the migration script and updates plan.md
with the new phase, dependency edges (02 → 06 → 04), and removes the
prior "Out of scope: Data migration" line.
2026-05-05 21:15:27 +07:00
tiennm99 f3a235de00 feat: add Atlas → KV one-shot migration script
Operator runs `npm run migrate` (reads admin + group docs from Atlas)
followed by `npm run migrate:bulk` (uploads via wrangler kv bulk put).
Cache collections are skipped by default since they auto-rebuild from
upstream APIs; --include-cache flag migrates them with TTL preserved.

- mongodb is added as a devDependency only — never enters the Worker
  bundle, the Worker still talks to KV exclusively.
- scripts/.atlas-export.json is gitignored (contains exported state).
- README documents the one-time runbook.
2026-05-05 21:15:18 +07:00
tiennm99 af8f1516fa docs: add KV migration plan, supersede Atlas deploy plan
Adds the 260505-1425 plan that drove the KV swap (Phase 01 already
applied in the preceding commit; Phases 02-05 are operator-driven).
Marks 260426-2327-cloudflare-deploy-and-smoke as superseded since
the KV pivot was taken pre-emptively rather than after a hard-gate trip.
2026-05-05 20:39:12 +07:00
tiennm99 067d463b6a feat: replace MongoDB driver with Cloudflare KV storage
Drops the mongodb dependency entirely; all four logical collections
(admin singleton, group, apple_app, google_app) now live in a single KV
namespace bound as STORE_KV with prefixed keys. Cache TTL is delegated
to KV via expirationTtl (clamped to the 60s minimum). Document shape,
field names, and Java parity at the doc level are preserved.

- Adds src/repository/kv.js helper (getJson/putJson/del with TTL clamp)
- Rewrites all four *-repository.js modules on top of KV
- Removes src/repository/mongodb.js and the MONGODB_URI env requirement
- Adds an early STORE_KV-binding guard in src/index.js
- Bumps to 0.3.0
2026-05-05 20:39:01 +07:00
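
A sketch of the TTL clamp mentioned for the KV helper. `clampTtl` and `putOptions` are hypothetical names; the 60-second floor and the `expirationTtl` option are Cloudflare KV's, and falling back to the floor for non-finite input is an assumption.

```javascript
const KV_MIN_TTL_SECONDS = 60;

// Hypothetical clamp: never hand KV a TTL below its minimum.
function clampTtl(seconds) {
  // Assumption: non-finite input falls back to the minimum.
  if (!Number.isFinite(seconds)) return KV_MIN_TTL_SECONDS;
  return Math.max(KV_MIN_TTL_SECONDS, Math.floor(seconds));
}

// Builds the options object for a KV put call.
function putOptions(ttlSeconds) {
  return { expirationTtl: clampTtl(ttlSeconds) };
}
```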
tiennm99 e3e375d203 docs: add Java vs JS parity reports from /ck:xia
researcher report maps 47 Java files to 35 JS files (11 PARITY, 2
MINOR, 0 GAP, 3 EXTRA). xia synthesis records source manifest, decision
matrix, and risk score. No port work needed; remaining items are
operational.
2026-04-29 16:07:17 +07:00
tiennm99 cbe08d6617 chore: archive code-port plan, add plans/todo.md index
Code port (commit bff1d32) is done; move plan to plans/archive/.
New plans/todo.md is the forward-looking index pointing at the active deploy
plan, pre-flight checks, hard gates, and open questions for the operator.
2026-04-26 23:46:22 +07:00
tiennm99 bff1d324f5 feat: Cloudflare Workers code port (deploy pending)
Refactors source to be Worker-shaped. No live deploy yet — sister deploy plan
runs Atlas provisioning + smoke later.

- wrangler.toml with nodejs_compat_v2, daily UTC 0 cron (= 7am Asia/Ho_Chi_Minh)
- package.json: drop node-telegram-bot-api, node-cron, dotenv, pino,
  pino-pretty; add wrangler devDep; bump to 0.2.0
- src/bot/telegram-api.js: raw fetch wrapper for Telegram Bot API
- src/bot/dispatch.js: per-message dispatcher extracted from polling loop
- src/repository/mongodb.js: memoized MongoClient per warm isolate, typed
  MongoUnavailable error, fast-fail timeouts
- src/repository/store.js: factory binding env once
- All 4 repositories converted to factory shape
- src/api/{apple,google}-scraper.js: take store instead of importing repos
- src/index.js: Worker entry exporting { fetch, scheduled }; webhook validates
  X-Telegram-Bot-Api-Secret-Token; ack-then-waitUntil pattern
- src/scheduler/scheduler.js: trimmed; runDailyCheck only (no node-cron)
- src/config.js, src/logger.js: env-driven, console.log JSON output
- scripts/register-webhook.js: setWebhook + setMyCommands; --dry-run supported
- scripts/check-secret-leaks.js: lint blocks console.log(env.<SECRET>)
- plans/260426-2015-cloudflare-worker-code-port: this code port plan
- plans/260426-2327-cloudflare-deploy-and-smoke: sister deploy plan

Validated via node --check on all 32 source files; lint clean. Real deploy
gates (bundle size, cold-start CPU) run in deploy plan.
2026-04-26 23:36:39 +07:00
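
The "UTC 0 = 7am Asia/Ho_Chi_Minh" note can be checked with the standard `Intl` API. `hourInZone` is a hypothetical helper, and the sketch assumes a runtime with full time zone data (the default in current Node.js releases).

```javascript
// Hypothetical helper: the wall-clock hour (0-23) of a Date in a given zone.
function hourInZone(date, timeZone) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hour: 'numeric',
    hourCycle: 'h23',
  }).formatToParts(date);
  return Number(parts.find((p) => p.type === 'hour').value);
}

// Midnight UTC on an arbitrary date; Vietnam is UTC+7 with no DST.
const midnightUtc = new Date(Date.UTC(2026, 3, 26, 0, 0, 0));
const localHour = hourInZone(midnightUtc, 'Asia/Ho_Chi_Minh');
```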
tiennm99 a656a18ce8 feat: initial JavaScript port of store-scraper-bot
Node.js 20+ ESM port mirroring Java/Go implementations.

- 13 Telegram commands matching Java identifiers
- MongoDB schema parity (common, group, apple_app, google_app collections)
- Apple/Google scrapers calling store-scraper.vercel.app with 10-min cache
- Daily 7am Vietnam-time cron with weekend-silent mode
- HTML table renderer matching Java/Go output
- Docker + Compose (prod and dev)

Untested end-to-end against live Telegram or upstream API.
2026-04-26 20:16:00 +07:00