feat: Cloudflare Workers code port (deploy pending)

Refactors source to be Worker-shaped. No live deploy yet — sister deploy plan
runs Atlas provisioning + smoke later.

- wrangler.toml with nodejs_compat_v2, daily UTC 0 cron (= 7am Asia/Ho_Chi_Minh)
- package.json: drop node-telegram-bot-api, node-cron, dotenv, pino,
  pino-pretty; add wrangler devDep; bump to 0.2.0
- src/bot/telegram-api.js: raw fetch wrapper for Telegram Bot API
- src/bot/dispatch.js: per-message dispatcher extracted from polling loop
- src/repository/mongodb.js: memoized MongoClient per warm isolate, typed
  MongoUnavailable error, fast-fail timeouts
- src/repository/store.js: factory binding env once
- All 4 repositories converted to factory shape
- src/api/{apple,google}-scraper.js: take store instead of importing repos
- src/index.js: Worker entry exporting { fetch, scheduled }; webhook validates
  X-Telegram-Bot-Api-Secret-Token; ack-then-waitUntil pattern
- src/scheduler/scheduler.js: trimmed; runDailyCheck only (no node-cron)
- src/config.js, src/logger.js: env-driven, console.log JSON output
- scripts/register-webhook.js: setWebhook + setMyCommands; --dry-run supported
- scripts/check-secret-leaks.js: lint blocks console.log(env.<SECRET>)
- plans/260426-2015-cloudflare-worker-code-port: this code port plan
- plans/260426-2327-cloudflare-deploy-and-smoke: sister deploy plan

Validated via node --check on all 32 source files; lint clean. Real deploy
gates (bundle size, cold-start CPU) run in deploy plan.
This commit is contained in:
2026-04-26 23:36:39 +07:00
parent a656a18ce8
commit bff1d324f5
44 changed files with 2132 additions and 503 deletions
+5
View File
@@ -0,0 +1,5 @@
TELEGRAM_BOT_TOKEN=
TELEGRAM_BOT_USERNAME=
TELEGRAM_WEBHOOK_SECRET=
MONGODB_URI=
ADMIN_IDS=
+3
View File
@@ -0,0 +1,3 @@
TELEGRAM_BOT_TOKEN=
TELEGRAM_WEBHOOK_SECRET=
WORKER_URL=https://js-store-scraper-bot.<account>.workers.dev
+6
View File
@@ -7,6 +7,12 @@ pnpm-debug.log*
.env
.env.local
.env.*.local
.env.deploy
.dev.vars
# Cloudflare Workers
.tmp-deploy/
.wrangler/
# Editor / IDE
# .idea/
+11 -11
View File
@@ -1,7 +1,7 @@
{
"name": "js-store-scraper-bot",
"version": "0.1.0",
"description": "JavaScript port of store-scraper-bot — Telegram bot for tracking Apple/Google Play app updates.",
"version": "0.2.0",
"description": "JavaScript port of store-scraper-bot — Telegram bot tracking Apple App Store + Google Play app updates. Deploys to Cloudflare Workers.",
"type": "module",
"private": true,
"engines": {
@@ -9,17 +9,17 @@
},
"main": "src/index.js",
"scripts": {
"start": "node src/index.js",
"dev": "node --watch src/index.js",
"lint": "node --check src/index.js"
"dev": "wrangler dev",
"deploy": "wrangler deploy && npm run register",
"register": "node --env-file=.env.deploy scripts/register-webhook.js",
"register:dry": "node --env-file=.env.deploy scripts/register-webhook.js --dry-run",
"lint": "node scripts/check-secret-leaks.js"
},
"dependencies": {
"dotenv": "^16.4.5",
"mongodb": "^6.10.0",
"node-cron": "^3.0.3",
"node-telegram-bot-api": "^0.66.0",
"pino": "^9.5.0",
"pino-pretty": "^11.3.0"
"mongodb": "^6.10.0"
},
"devDependencies": {
"wrangler": "^3.90.0"
},
"license": "Apache-2.0"
}
@@ -0,0 +1,219 @@
# Phase 01 — Wrangler Config + Deps Swap + Lint Scaffolding
## Context Links
- miti99bot wrangler.toml: `/config/workspace/tiennm99/miti99bot/wrangler.toml`
- miti99bot secret-leak lint: `/config/workspace/tiennm99/miti99bot/scripts/check-secret-leaks.js`
- Existing `package.json`: `/config/workspace/tiennm99/js-store-scraper-bot/package.json`
## Overview
- **Priority:** P0
- **Status:** pending
- **Description:** Pure config + dependency changes. No application code touched. After this, `npm install` succeeds with the new dep set; `wrangler.toml` is in place; lint runs.
## Key Insights
- Workers cron is **UTC**: `0 0 * * *` UTC = 7am Asia/Ho_Chi_Minh.
- `nodejs_compat_v2` is the lever that gives `node:net`/`node:tls` (needed by `mongodb` driver).
- Secret-leak lint added now so subsequent phases can't sneak in a `console.log(env.MONGODB_URI)`.
- Removing Node-only deps in this phase causes `src/index.js` to fail to load — that's expected and fine; Phase 04 rewrites the entry. Until then, `npm start` is broken (acceptable; we're refactoring).
## Requirements
### Functional
- `wrangler.toml` written with `nodejs_compat_v2`, cron `0 0 * * *`, observability block.
- `package.json`:
- Add: `mongodb@^6.7.0`
- Remove: `node-telegram-bot-api`, `node-cron`, `dotenv`, `pino`, `pino-pretty`
- Dev-add: `wrangler@^3`
- Scripts: `dev`, `deploy`, `register`, `register:dry`, `lint`
- `engines.node` stays `>=20`
- Drop `type: module` is **not** needed — Workers ESM is the same syntax.
- `scripts/check-secret-leaks.js` written.
- `.dev.vars.example` written.
- `.env.deploy.example` written.
- `.gitignore` adds `.dev.vars`, `.env.deploy`, `.tmp-deploy/`, `.wrangler/`.
### Non-functional
- `npm install` succeeds.
- `npm run lint` succeeds (only secret-leak check at this point; will gain `node --check` in later phases).
## Architecture
```
js-store-scraper-bot/
├── wrangler.toml (NEW)
├── package.json (MODIFIED — deps swap)
├── .gitignore (MODIFIED — Worker artifacts)
├── .dev.vars.example (NEW)
├── .env.deploy.example (NEW)
├── scripts/
│ └── check-secret-leaks.js (NEW)
└── src/ (UNTOUCHED in this phase)
```
## Related Code Files
### CREATE
- `wrangler.toml`
- `scripts/check-secret-leaks.js`
- `.dev.vars.example`
- `.env.deploy.example`
### MODIFY
- `package.json` — full rewrite of `dependencies`, `devDependencies`, `scripts`.
- `.gitignore` — append Worker artifact patterns.
### DELETE
- (none in this phase)
## Implementation Steps
1. **Write `wrangler.toml`**:
```toml
name = "js-store-scraper-bot"
main = "src/index.js"
compatibility_date = "2025-10-01"
compatibility_flags = ["nodejs_compat_v2"]
[vars]
APP_CACHE_SECONDS = "600"
NUM_DAYS_WARNING_NOT_UPDATED = "30"
# 0 UTC = 7am Asia/Ho_Chi_Minh
[triggers]
crons = ["0 0 * * *"]
[observability]
enabled = true
head_sampling_rate = 1
[observability.logs]
enabled = true
invocation_logs = true
# Secrets (set via `wrangler secret put`, NOT here):
# TELEGRAM_BOT_TOKEN
# TELEGRAM_BOT_USERNAME
# TELEGRAM_WEBHOOK_SECRET
# MONGODB_URI
# ADMIN_IDS
```
2. **Rewrite `package.json`**:
```json
{
"name": "js-store-scraper-bot",
"version": "0.2.0",
"description": "JavaScript port of store-scraper-bot, deployable to Cloudflare Workers.",
"type": "module",
"private": true,
"engines": { "node": ">=20" },
"main": "src/index.js",
"scripts": {
"dev": "wrangler dev",
"deploy": "wrangler deploy && npm run register",
"register": "node --env-file=.env.deploy scripts/register-webhook.js",
"register:dry": "node --env-file=.env.deploy scripts/register-webhook.js --dry-run",
"lint": "node scripts/check-secret-leaks.js"
},
"dependencies": { "mongodb": "^6.7.0" },
"devDependencies": { "wrangler": "^3" },
"license": "Apache-2.0"
}
```
3. **Write `scripts/check-secret-leaks.js`**:
```js
#!/usr/bin/env node
// Fails CI if any source file logs a secret.
import { readdirSync, readFileSync, statSync } from 'node:fs';
import { join } from 'node:path';
const SECRETS = ['MONGODB_URI', 'TELEGRAM_BOT_TOKEN', 'TELEGRAM_WEBHOOK_SECRET', 'ADMIN_IDS'];
const ROOTS = ['src', 'scripts'];
function* walk(dir) {
for (const entry of readdirSync(dir)) {
const p = join(dir, entry);
const s = statSync(p);
if (s.isDirectory()) yield* walk(p);
else if (/\.(js|mjs|ts)$/.test(p)) yield p;
}
}
const violations = [];
for (const root of ROOTS) {
try {
for (const file of walk(root)) {
const text = readFileSync(file, 'utf8');
for (const sec of SECRETS) {
const re = new RegExp(`console\\.(log|info|warn|error|debug)\\([^)]*\\benv\\.${sec}\\b`);
if (re.test(text)) violations.push({ file, secret: sec });
}
}
} catch { /* ignore missing root */ }
}
if (violations.length) {
console.error('Secret-leak violations:');
for (const v of violations) console.error(` ${v.file}: env.${v.secret} in console.*`);
process.exit(1);
}
console.log('check-secret-leaks: clean');
```
4. **Write `.dev.vars.example`**:
```
TELEGRAM_BOT_TOKEN=
TELEGRAM_BOT_USERNAME=
TELEGRAM_WEBHOOK_SECRET=
MONGODB_URI=
ADMIN_IDS=
```
5. **Write `.env.deploy.example`**:
```
TELEGRAM_BOT_TOKEN=
TELEGRAM_WEBHOOK_SECRET=
WORKER_URL=https://js-store-scraper-bot.<account>.workers.dev
```
6. **Append to `.gitignore`**:
```
.dev.vars
.env.deploy
.tmp-deploy/
.wrangler/
```
7. **Reinstall**:
```sh
rm -rf node_modules package-lock.json
npm install
```
8. **Smoke**: `npm run lint` — must print `check-secret-leaks: clean`.
## Todo List
- [ ] `wrangler.toml` written
- [ ] `package.json` updated (deps swap + scripts)
- [ ] `scripts/check-secret-leaks.js` written
- [ ] `.dev.vars.example` written
- [ ] `.env.deploy.example` written
- [ ] `.gitignore` updated
- [ ] `npm install` succeeds with new deps
- [ ] `npm run lint` passes
## Success Criteria
- All 4 new files exist; `package.json` reflects new dep set; `npm install` clean.
- Lint script runs even though `src/` is unchanged (no false positives).
## Risk Assessment
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| `mongodb` install fails on this Node version | L | Medium | Node 20+ supported by mongodb 6.7 |
| `wrangler` install pulls huge tree | L | Low | Dev-only; not bundled |
| `src/index.js` no longer runs after deps removed | H | None | Expected — Phase 04 rewrites entry |
## Next Steps
- **Blocks:** Phase 02 (telegram-api.js relies on Worker-style fetch; doesn't depend on this phase's files but conceptually starts here), Phase 03.
- **Unblocks:** Phase 02, Phase 03 (parallel-safe).
@@ -0,0 +1,182 @@
# Phase 02 — Worker-Native Telegram Client
## Context Links
- Existing client: `src/bot/bot.js:18-66` (`node-telegram-bot-api` wrapper)
- Telegram Bot API: https://core.telegram.org/bots/api#available-methods
- miti99bot uses grammY; we deliberately stay lighter (raw fetch) since js-store-scraper-bot has 13 commands and no grammY-specific features needed.
## Overview
- **Priority:** P0 — Phase 04 entry handler depends on this.
- **Status:** pending
- **Description:** Replace `node-telegram-bot-api` (which uses Node-only `request`/streams) with a thin `fetch`-based client. Same `sender` interface (`sendMessage`, `sendMessageSilent`, `sendDocument`) so command handlers don't change.
## Key Insights
- `node-telegram-bot-api` pulls in heavy Node-only deps (`request`, `bluebird`, `form-data`). Bundle bloat + Workers incompatibility.
- All 13 commands only need 3 methods: `sendMessage`, `sendDocument`, optional `getMe`.
- `sendMessage` is JSON; `sendDocument` needs `multipart/form-data`. Workers `fetch` + `FormData` + `Blob` handle multipart natively (no library).
- HTML parse mode + `disable_web_page_preview` + `disable_notification` are URL-encodable bool/string params.
## Requirements
### Functional
- New module `src/bot/telegram-api.js` exporting `createTelegramApi(token)` returning:
- `sendMessage(chatId, text, opts?)` → POST `/sendMessage`
- `sendDocument(chatId, filename, body, opts?)` → POST `/sendDocument` multipart
- `getMe()` → GET `/getMe` (used in Phase 05 register script)
- All methods return parsed JSON or throw `TelegramApiError` on `!ok`.
- `src/bot/bot.js` `sender` rebuilt over the new client. **Same exported interface** so commands work unchanged.
- HTML parse mode + `disable_web_page_preview: true` defaults match Java/Go parity.
- `sendMessageSilent` adds `disable_notification: true`.
### Non-functional
- No Node-specific imports (`Buffer.from(...)` is fine — `nodejs_compat_v2` provides it).
- Errors logged with `console.warn({chatId, method, error})` — JSON-style for CF observability parsing.
## Architecture
```
src/bot/telegram-api.js ← raw fetch wrapper
src/bot/bot.js ← createSender(api, logger) builds {sendMessage, sendMessageSilent, sendDocument}
src/bot/commands/*.js ← unchanged; consume sender interface
```
## Related Code Files
### CREATE
- `src/bot/telegram-api.js` (new)
### MODIFY
- `src/bot/bot.js` — strip `node-telegram-bot-api` import + polling code; expose `createSender` only. Dispatcher logic moves to `src/bot/dispatch.js` in Phase 04.
- `package.json``npm uninstall node-telegram-bot-api`.
### DELETE
- (none — the file is rewritten, not removed)
## Implementation Steps
1. **Write `src/bot/telegram-api.js`**:
```js
const TELEGRAM_BASE = 'https://api.telegram.org';
export class TelegramApiError extends Error {
constructor(method, status, body) {
super(`telegram ${method} failed: ${status} ${body}`);
this.method = method; this.status = status; this.body = body;
}
}
export function createTelegramApi(token) {
const base = `${TELEGRAM_BASE}/bot${token}`;
async function callJson(method, payload) {
const res = await fetch(`${base}/${method}`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(payload),
});
const text = await res.text();
if (!res.ok) throw new TelegramApiError(method, res.status, text);
return JSON.parse(text);
}
async function callMultipart(method, fields, file) {
const form = new FormData();
for (const [k, v] of Object.entries(fields)) form.set(k, String(v));
if (file) {
form.set(file.field, new Blob([file.body], { type: file.contentType }), file.filename);
}
const res = await fetch(`${base}/${method}`, { method: 'POST', body: form });
const text = await res.text();
if (!res.ok) throw new TelegramApiError(method, res.status, text);
return JSON.parse(text);
}
return {
getMe: () => callJson('getMe', {}),
sendMessage: (chatId, text, opts = {}) =>
callJson('sendMessage', { chat_id: chatId, text, ...opts }),
sendDocument: (chatId, filename, body, opts = {}) =>
callMultipart('sendDocument', { chat_id: chatId, ...opts },
{ field: 'document', filename, body, contentType: 'application/json' }),
};
}
```
2. **Rewrite `src/bot/bot.js`** — keep only `createSender` factory and the command map. Drop polling, drop `node-telegram-bot-api`. Pseudocode:
```js
import { createTelegramApi } from './telegram-api.js';
import { createInfoCommand } from './commands/info.js';
/* ...other commands... */
const PARSE_MODE = 'HTML';
export function createBot(config, appleScraper, googleScraper) {
const api = createTelegramApi(config.telegramBotToken);
const logger = config.logger;
const sender = {
async sendMessage(chatId, html) {
try {
await api.sendMessage(chatId, html, {
parse_mode: PARSE_MODE,
disable_web_page_preview: true,
});
} catch (err) {
logger.warn({ chatId, err: err.message }, 'send message failed');
}
},
async sendMessageSilent(chatId, html) { /* + disable_notification: true */ },
async sendDocument(chatId, filename, body) {
try { await api.sendDocument(chatId, filename, body); }
catch (err) { logger.warn({ chatId, err: err.message }, 'send document failed'); }
},
};
const commands = { info: createInfoCommand(), /* ... */ };
return { sender, commands, api };
}
```
- Remove `tg.on('message', ...)`, `tg.on('polling_error', ...)`, `tg.getMe()` calls.
3. **Uninstall `node-telegram-bot-api`**:
```sh
npm uninstall node-telegram-bot-api
```
4. **Sanity check**: `node --check src/bot/bot.js src/bot/telegram-api.js`.
## Todo List
- [ ] `src/bot/telegram-api.js` written
- [ ] `src/bot/bot.js` rewritten (no polling, no `node-telegram-bot-api`)
- [ ] `node-telegram-bot-api` uninstalled
- [ ] Syntax check passes for both files
- [ ] All 13 commands still import + reference `sender` interface (no breakage)
## Success Criteria
- `node --check` passes for `src/bot/*.js`.
- Sender interface matches existing shape; no command file needs editing.
- Bundle no longer pulls Node-only Telegram lib (verified in Phase 06 deploy).
## Risk Assessment
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| `FormData`/`Blob` semantics differ in Workers vs Node | L | Medium | Both follow WHATWG Fetch standard; `nodejs_compat_v2` covers gaps |
| `sendDocument` multipart filename encoding | L | Low | Filename uses ASCII chars only (`{appId}.json`) |
| Telegram rate limits | L | Low | Bot is low-volume; 1 req per command |
## Security Considerations
- Token is in URL path (`/bot{token}/{method}`). Workers `fetch` log doesn't capture request URLs by default. Verify in Observability after deploy.
## Rollback
1. Restore `src/bot/bot.js` from git.
2. `npm install node-telegram-bot-api@^0.66.0`.
3. Delete `src/bot/telegram-api.js`.
## Next Steps
- **Blocks:** Phase 04 (Worker entry needs `createBot` shape).
- **Unblocks:** Phase 04.
@@ -0,0 +1,185 @@
# Phase 03 — MongoDB Worker Adapter
## Context Links
- Existing: `src/repository/mongodb.js`
- miti99bot proven pattern: `/config/workspace/tiennm99/miti99bot/src/db/mongo-*.js`
- Atlas docs: https://www.mongodb.com/docs/drivers/node/current/
## Overview
- **Priority:** P0 — required by all 4 repository files.
- **Status:** pending
- **Description:** Adapt `src/repository/mongodb.js` for the Workers runtime: connection memoization per warm isolate, `MongoServerSelectionError` catch path, removal of `process.exit`-style shutdown. Repository files (`admin-repository.js`, etc.) remain **unchanged** because they consume `getCollection(name)`.
## Key Insights
- Workers isolates are short-lived but reused while warm; **memoize the `MongoClient` at module scope**, not per-request.
- Connecting on every request would cost ~1500ms cold-start + connection slot exhaustion (M0 cap = 500).
- M0 auto-pauses after 30 days idle. Driver throws `MongoServerSelectionError` after `serverSelectionTimeoutMS` (default 30s). **Lower this to 5000ms** to fail fast.
- The existing module-level `let client`/`let database` pattern works fine — just replace `await client.connect()` strategy with lazy + memoized.
- Workers don't fire process signals — `closeMongoDB()` is dead code on Workers; keep it as no-op for compatibility but remove the `process.on(...)` callers in Phase 04.
## Requirements
### Functional
- `getDatabase(env)` — lazy-init pattern, memoizes per isolate. Takes `env` (Workers-injected) instead of a global `config`.
- `getCollection(name, env)` — returns collection from memoized DB.
- Catches `MongoServerSelectionError` → re-throws a typed `MongoUnavailable` error consumable by command handlers (which return graceful "Internal server error" already).
- `serverSelectionTimeoutMS: 5000`, `socketTimeoutMS: 10000`.
- `appName: 'js-store-scraper-bot'` for Atlas observability.
- Repository files (`admin-repository.js`, `group-repository.js`, `apple-app-repository.js`, `google-app-repository.js`) updated minimally to thread `env` through the call chain. **OR** — a `getStore(env)` factory builds bound repos. Pick the second; it's cleaner.
### Non-functional
- No top-level `await` (Workers ESM doesn't allow during cold init in some runtimes).
- No global side effects on import.
## Architecture
```
src/repository/mongodb.js
├─ memoizedClient (module-scope, per isolate)
├─ getMongo(env) → memoized Promise<{ client, db }>
└─ getCollection(name, env) → db.collection(name)
src/repository/store.js (NEW)
└─ createStore(env) → {
admin: { init, get, save, addGroup, removeGroup, hasGroup, getAllGroups },
group: { exists, get, save, init, delete, addAppleApp, removeAppleApp, addGoogleApp, removeGoogleApp },
appleApp: { get, save, getCached },
googleApp: { get, save, getCached },
}
src/bot/commands/*.js
└─ accept `store` param instead of importing repos directly
```
## Related Code Files
### CREATE
- `src/repository/store.js` — factory binding `env` once, returning the four repos.
### MODIFY
- `src/repository/mongodb.js` — switch to memoized lazy init, `env`-driven, fast-fail timeouts, `MongoUnavailable` typed error.
- `src/repository/admin-repository.js` — convert from module-level functions to factory `createAdminRepository(env)` returning the same shape.
- `src/repository/group-repository.js` — same conversion.
- `src/repository/apple-app-repository.js` — same; `getCachedAppleApp(appId, appCacheSeconds, env)` accepts `env` (or comes from factory closure).
- `src/repository/google-app-repository.js` — same.
### DELETE
- (none)
## Implementation Steps
1. **Rewrite `src/repository/mongodb.js`**:
```js
import { MongoClient } from 'mongodb';
export class MongoUnavailable extends Error {
constructor(cause) { super('MongoDB unavailable: ' + cause.message); this.cause = cause; }
}
let memoized = null;
export async function getMongo(env) {
if (memoized) return memoized;
memoized = (async () => {
try {
const client = new MongoClient(env.MONGODB_URI, {
serverSelectionTimeoutMS: 5000,
socketTimeoutMS: 10000,
appName: 'js-store-scraper-bot',
});
await client.connect();
const db = client.db(); // db inferred from URI path
return { client, db };
} catch (err) {
memoized = null; // allow retry on next request
throw new MongoUnavailable(err);
}
})();
return memoized;
}
export async function getCollection(name, env) {
const { db } = await getMongo(env);
return db.collection(name);
}
```
2. **Write `src/repository/store.js`**:
```js
import { createAdminRepository } from './admin-repository.js';
import { createGroupRepository } from './group-repository.js';
import { createAppleAppRepository } from './apple-app-repository.js';
import { createGoogleAppRepository } from './google-app-repository.js';
export function createStore(env) {
return {
admin: createAdminRepository(env),
group: createGroupRepository(env),
appleApp: createAppleAppRepository(env),
googleApp: createGoogleAppRepository(env),
};
}
```
3. **Convert each repository to factory shape** (example for admin):
```js
// src/repository/admin-repository.js
import { getCollection } from './mongodb.js';
import { ADMIN_ID, /* ... */ } from '../models/admin.js';
export function createAdminRepository(env) {
async function collection() { return getCollection('common', env); }
async function getAdmin() {
const c = await collection();
const doc = await c.findOne({ _id: ADMIN_ID });
return doc ?? newAdmin();
}
/* ...same logic, but every method awaits collection() ... */
return { initAdmin, getAdmin, save, addGroup, removeGroup, hasGroup, getAllGroups };
}
```
Repeat for `group`, `appleApp`, `googleApp` repositories. Logic body unchanged; only top-level binding changes.
4. **Update each command in `src/bot/commands/*.js`** that imports repos:
- Change `import * as adminRepo from '../../repository/admin-repository.js'` → command factories receive `store` param.
- Example: `createAddGroupCommand(config, store)` instead of `createAddGroupCommand(config)`.
- The `bot.js` from Phase 02 wires `store` into each command at construction time.
5. **Audit caller chain** — every command that touches a repo gets `store` from the bot factory. Total: 8 commands touch repos (everything except `info`, `rawappleapp`, `rawgoogleapp`, `check-app-scores` — wait, that one does too; recount during implementation).
6. **Sanity check**: `node --check src/repository/*.js`.
## Todo List
- [ ] `src/repository/mongodb.js` rewritten (memoized, env-driven, `MongoUnavailable`)
- [ ] `src/repository/store.js` written
- [ ] All 4 repository files converted to factory shape
- [ ] All command factories updated to accept `store`
- [ ] `src/bot/bot.js` wires `store` into commands
- [ ] `node --check src/**/*.js` passes
- [ ] Manual test in `wrangler dev`: `/info` and one DB-touching command (e.g. `/listgroup`) return correctly
## Success Criteria
- Connection is opened once per warm isolate (verify via Atlas dashboard "Current Connections" stays low).
- Cold start with one DB roundtrip stays under measured `BASELINE_COLD_PING_MS` from Phase 01.
- `MongoUnavailable` caught at command boundary → "Internal server error" message (existing behavior).
## Risk Assessment
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Connection leaks across isolates | L | M | Memoize at module scope, never close in scheduled handler |
| `serverSelectionTimeoutMS: 5000` too aggressive | L | Low | Phase 01 measured baseline; 5s is 3× headroom |
| Repo factory refactor breaks commands | M | Medium | Each command's imports are explicit; sanity-check after each |
## Security Considerations
- `env.MONGODB_URI` only ever read inside `getMongo()`. Never logged. Lint enforces.
## Rollback
1. Restore `src/repository/*.js` from git.
2. Revert command-factory signatures in `src/bot/commands/*.js`.
## Next Steps
- **Blocks:** Phase 04 (entry handler injects `store` into `createBot`).
- **Unblocks:** Phase 04.
@@ -0,0 +1,246 @@
# Phase 04 — Worker Entry: fetch + scheduled
## Context Links
- Existing entry: `src/index.js` (Node polling + signal handlers)
- Existing scheduler: `src/scheduler/scheduler.js` (uses `node-cron`)
- CF Workers handler reference: https://developers.cloudflare.com/workers/runtime-apis/handlers/
- Telegram webhook payload: https://core.telegram.org/bots/api#update
## Overview
- **Priority:** P0 — the actual Workers deployment unit.
- **Status:** pending
- **Description:** Replace polling entry + `node-cron` + `process.on` with Worker `export default { fetch, scheduled }`. Webhook validation via `X-Telegram-Bot-Api-Secret-Token` header.
## Key Insights
- Workers entry is `export default` with two handlers; nothing else runs on cold start.
- `fetch(request, env, ctx)` MUST respond within ~30s; scheduled is generous (~5min) but Telegram retries webhook on non-200.
- Ack the webhook **immediately** then process async via `ctx.waitUntil(...)` — keeps Telegram from retrying on slow Mongo.
- `scheduled(event, env, ctx)` runs once per cron tick globally — no isolate uniqueness guarantee, but Telegram-side dedup is not needed for our daily report (idempotent enough).
- `pino` is removed; replace with `console.log({...})` — CF Observability indexes JSON output.
- `node-cron` removed entirely; `wrangler.toml [triggers]` declares cron, Workers fires `scheduled` handler.
- `dotenv` removed; `env` arg is the binding source.
## Requirements
### Functional
- `src/index.js` exports `default { fetch, scheduled }`.
- `fetch`:
- Accepts only `POST /` (or `POST /webhook`); other paths → 404.
- Validates `X-Telegram-Bot-Api-Secret-Token` header against `env.TELEGRAM_WEBHOOK_SECRET`. Mismatch → 401.
- Parses JSON update; extracts `update.message`.
- Routes to dispatcher; `ctx.waitUntil(dispatch(...))`. Returns `200 OK` immediately.
- `scheduled`:
- Builds store + scrapers + sender; calls `runDailyCheck()` (logic from existing `scheduler.js`, minus `node-cron` wrapping).
- All work inside `ctx.waitUntil(...)`.
- `src/bot/dispatch.js` (new) — extracts the per-message dispatch logic from old `bot.js`. Takes `(message, sender, commands, logger)`.
- `src/scheduler/scheduler.js` simplified — exports `runDailyCheck(config, store, sender, appleScraper, googleScraper)` only; remove `node-cron` schedule/start/stop.
- `src/config.js` rewritten — accepts `env` arg, returns config object. No `dotenv`. No `pino`. Logger = thin `console.log` wrapper returning JSON.
### Non-functional
- No `process.on`, no `process.exit`, no signal handling.
- No top-level `await`.
- All async errors logged with `console.error({err: err.message, stack: err.stack, ctx: ...})`.
## Architecture
```
src/index.js
├─ default.fetch(request, env, ctx)
│ ├─ validate secret header
│ ├─ parse update
│ └─ ctx.waitUntil(dispatch(message, sender, commands, logger))
│ ↳ returns Response("OK", 200) immediately
└─ default.scheduled(event, env, ctx)
└─ ctx.waitUntil(runDailyCheck(config, store, sender, appleScraper, googleScraper))
src/bot/dispatch.js (NEW)
└─ async dispatch(message, sender, commands, logger)
├─ parseCommandName(message.text, botUsername)
├─ commands[name](message, sender)
└─ catch → sender.sendMessage(chatId, "Internal server error")
```
## Related Code Files
### CREATE
- `src/bot/dispatch.js`
### MODIFY
- `src/index.js` — full rewrite as Worker entry.
- `src/config.js` — accept `env` arg; remove `dotenv`, `pino`.
- `src/logger.js` — replace pino with `createConsoleLogger()` (5 lines: `info/warn/error/debug``console.log(JSON.stringify({level, ...payload, msg}))`).
- `src/scheduler/scheduler.js` — strip `node-cron`; export `runDailyCheck` only.
- `src/bot/bot.js` — keep `createBot` factory (sender + commands map); dispatch logic moves to `dispatch.js`.
### DELETE
- (none — entry is rewritten)
### NPM uninstall
- `dotenv`, `pino`, `pino-pretty`, `node-cron`.
## Implementation Steps
1. **Rewrite `src/logger.js`** as `console.log` JSON wrapper:
```js
export function createLogger() {
const log = (level, payloadOrMsg, maybeMsg) => {
const payload = typeof payloadOrMsg === 'object' ? payloadOrMsg : {};
const msg = typeof payloadOrMsg === 'string' ? payloadOrMsg : maybeMsg ?? '';
console.log(JSON.stringify({ level, ts: new Date().toISOString(), msg, ...payload }));
};
return {
debug: (p, m) => log('debug', p, m),
info: (p, m) => log('info', p, m),
warn: (p, m) => log('warn', p, m),
error: (p, m) => log('error', p, m),
};
}
```
2. **Rewrite `src/config.js`** to accept `env`:
```js
import { createLogger } from './logger.js';
export function loadConfig(env) {
const required = ['TELEGRAM_BOT_TOKEN', 'TELEGRAM_BOT_USERNAME',
'TELEGRAM_WEBHOOK_SECRET', 'MONGODB_URI', 'ADMIN_IDS'];
for (const k of required) if (!env[k]) throw new Error(`${k} is required`);
const adminIds = env.ADMIN_IDS.split(',').map(s => Number(s.trim())).filter(Number.isFinite);
return {
telegramBotToken: env.TELEGRAM_BOT_TOKEN,
telegramBotUsername: env.TELEGRAM_BOT_USERNAME,
telegramWebhookSecret: env.TELEGRAM_WEBHOOK_SECRET,
adminIds,
isAdmin: (id) => adminIds.includes(Number(id)),
appCacheSeconds: Number(env.APP_CACHE_SECONDS ?? 600),
numDaysWarningNotUpdated: Number(env.NUM_DAYS_WARNING_NOT_UPDATED ?? 30),
timezone: 'Asia/Ho_Chi_Minh',
logger: createLogger(),
};
}
```
3. **Write `src/bot/dispatch.js`**:
```js
export async function dispatch(message, { sender, commands, config, logger }) {
if (!message?.text || message.text[0] !== '/') return;
const name = parseCommandName(message.text, config.telegramBotUsername);
if (!name) return;
const handler = commands[name];
if (!handler) { logger.debug({ command: name }, 'Unknown command'); return; }
logger.info({ command: name, userId: message.from?.id, chatId: message.chat.id }, 'Executing command');
try {
await handler(message, sender);
} catch (err) {
logger.error({ err: err.message, command: name }, 'command failed');
await sender.sendMessage(message.chat.id, 'Internal server error');
}
}
function parseCommandName(text, botUsername) {
const space = text.indexOf(' ');
const head = space < 0 ? text.slice(1) : text.slice(1, space);
const at = head.indexOf('@');
if (at < 0) return head;
const cmd = head.slice(0, at), target = head.slice(at + 1);
if (botUsername && target && target.toLowerCase() !== botUsername.toLowerCase()) return null;
return cmd;
}
```
4. **Trim `src/bot/bot.js`** — remove `tg.on(...)`, `tg.getMe()` startup; keep only `createBot(config, store, appleScraper, googleScraper)` returning `{ sender, commands }`.
5. **Trim `src/scheduler/scheduler.js`** — remove `import cron from 'node-cron'`, remove `start()`/`stop()`. Export `runDailyCheck(config, store, sender, appleScraper, googleScraper)` only. Repository calls thread through `store`.
6. **Rewrite `src/index.js`**:
```js
import { loadConfig } from './config.js';
import { createStore } from './repository/store.js';
import { createAppleScraper } from './api/apple-scraper.js';
import { createGoogleScraper } from './api/google-scraper.js';
import { createBot } from './bot/bot.js';
import { dispatch } from './bot/dispatch.js';
import { runDailyCheck } from './scheduler/scheduler.js';
function build(env) {
const config = loadConfig(env);
const store = createStore(env);
const appleScraper = createAppleScraper(config, store);
const googleScraper = createGoogleScraper(config, store);
const { sender, commands } = createBot(config, store, appleScraper, googleScraper);
return { config, store, appleScraper, googleScraper, sender, commands };
}
export default {
async fetch(request, env, ctx) {
if (request.method !== 'POST') return new Response('Not found', { status: 404 });
const ctx_ = build(env);
const secret = request.headers.get('X-Telegram-Bot-Api-Secret-Token');
if (secret !== ctx_.config.telegramWebhookSecret) {
return new Response('Unauthorized', { status: 401 });
}
const update = await request.json().catch(() => null);
if (!update?.message) return new Response('OK');
ctx.waitUntil(dispatch(update.message, { ...ctx_, logger: ctx_.config.logger }));
return new Response('OK');
},
async scheduled(event, env, ctx) {
const ctx_ = build(env);
ctx.waitUntil(
runDailyCheck(ctx_.config, ctx_.store, ctx_.sender, ctx_.appleScraper, ctx_.googleScraper),
);
},
};
```
7. **Audit api/scrapers** — `src/api/apple-scraper.js` + `google-scraper.js` already use `fetch`. They reference `config.logger` only. Adapt: `createAppleScraper(config, store)` so `getCachedAppleApp` is called via `store.appleApp.getCached(...)` instead of importing repo module directly. Same for google.
8. **Uninstall Node-only deps**:
```sh
npm uninstall dotenv pino pino-pretty node-cron
```
9. **Local dev test**: `npx wrangler dev` and curl POST a fake update with the secret header; verify dispatch logs.
10. **Sanity check**: `node --check src/**/*.js`.
## Todo List
- [ ] `src/logger.js` rewritten as JSON console wrapper
- [ ] `src/config.js` rewritten to accept `env`
- [ ] `src/bot/dispatch.js` written
- [ ] `src/bot/bot.js` trimmed (no polling)
- [ ] `src/scheduler/scheduler.js` trimmed (no node-cron)
- [ ] `src/index.js` rewritten as Worker entry
- [ ] `src/api/{apple,google}-scraper.js` adapted to take `store`
- [ ] Node-only deps uninstalled
- [ ] `wrangler dev` + curl test passes
- [ ] `node --check src/**/*.js` passes
## Success Criteria
- POST to local `wrangler dev` with valid secret + fake `/info` update → bot replies via real Telegram API.
- POST without secret → 401.
- GET / → 404.
- `wrangler dev --test-scheduled` invocation runs `runDailyCheck` without errors.
## Risk Assessment
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| `ctx.waitUntil` unhandled rejection silently dropped | M | M | Wrap dispatch + scheduler bodies in try/catch + `logger.error` |
| Webhook ack delayed > 30s → Telegram retries | L | M | Ack first, dispatch in `waitUntil` (already in design) |
| Multiple bot replies on Telegram retry | L | L | Telegram dedupes by `update_id` server-side; fast ack reduces retry trigger |
| Cold-start exceeds 50ms CPU on Free plan | M | High | Phase 01 hard gate caught this; this phase only adds JSON parse overhead (sub-ms) |
## Security Considerations
- `TELEGRAM_WEBHOOK_SECRET`: random ≥32 chars, set via `wrangler secret put`. Validated on every `fetch`.
- 401 returned **before** parsing body (avoids spending CPU on attacker payloads).
## Rollback
1. Restore all modified files from git (`git checkout src/`).
2. Reinstall Node-only deps.
3. Existing Node polling bot resumes if launched separately; this Worker simply isn't deployed.
## Next Steps
- **Blocks:** Phase 05 (webhook registration needs the URL of a deployed `fetch` handler).
- **Unblocks:** Phase 05.
@@ -0,0 +1,120 @@
# Phase 05 — Webhook Register Script (Write Only)
## Context Links
- Telegram `setWebhook`: https://core.telegram.org/bots/api#setwebhook
- miti99bot register pattern: `/config/workspace/tiennm99/miti99bot/scripts/register.js`
## Overview
- **Priority:** P1
- **Status:** pending
- **Description:** Write `scripts/register-webhook.js`. **Do not run it** — that's the deploy plan's job. This phase only ensures the script exists and `--dry-run` works locally.
## Key Insights
- The dry-run path doesn't call Telegram, so it's safe to invoke during code-only work.
- `--env-file=.env.deploy` requires Node ≥20.6; project already requires Node ≥20.
## Requirements
### Functional
- `scripts/register-webhook.js` written. Reads `TELEGRAM_BOT_TOKEN`, `TELEGRAM_WEBHOOK_SECRET`, `WORKER_URL` from env.
- `--dry-run` prints `setWebhook` + `setMyCommands` payloads without HTTP calls.
- 13-command list embedded with descriptions.
### Non-functional
- Script logs Telegram response; never prints token.
- Exit 1 on missing env or non-`ok` response.
## Related Code Files
### CREATE
- `scripts/register-webhook.js`
### MODIFY
- (none — `package.json` already has `register` + `register:dry` scripts from Phase 01)
## Implementation Steps
1. **Write `scripts/register-webhook.js`**:
```js
#!/usr/bin/env node
const TOKEN = process.env.TELEGRAM_BOT_TOKEN;
const SECRET = process.env.TELEGRAM_WEBHOOK_SECRET;
const URL_ = process.env.WORKER_URL;
const DRY = process.argv.includes('--dry-run');
for (const [k, v] of Object.entries({
TELEGRAM_BOT_TOKEN: TOKEN,
TELEGRAM_WEBHOOK_SECRET: SECRET,
WORKER_URL: URL_,
})) {
if (!v) {
console.error(`${k} is required`);
process.exit(1);
}
}
const COMMANDS = [
{ command: 'info', description: 'Show this group ID' },
{ command: 'addgroup', description: '[admin] Authorize a group' },
{ command: 'delgroup', description: '[admin] Deauthorize a group' },
{ command: 'listgroup', description: '[admin] List authorized groups' },
{ command: 'addapple', description: 'Track an Apple App Store app' },
{ command: 'delapple', description: 'Stop tracking an Apple app' },
{ command: 'addgoogle', description: 'Track a Google Play app' },
{ command: 'delgoogle', description: 'Stop tracking a Google app' },
{ command: 'listapp', description: 'List tracked apps in this group' },
{ command: 'checkapp', description: 'Check update status of tracked apps' },
{ command: 'checkappscore', description: 'Check scores + ratings of tracked apps' },
{ command: 'rawappleapp', description: 'Dump raw Apple API JSON for an app' },
{ command: 'rawgoogleapp', description: 'Dump raw Google API JSON for an app' },
];
async function tg(method, payload) {
if (DRY) {
console.log(`[dry-run] ${method}`, JSON.stringify(payload, null, 2));
return { ok: true, result: '(dry)' };
}
const res = await fetch(`https://api.telegram.org/bot${TOKEN}/${method}`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(payload),
});
const body = await res.json();
if (!body.ok) {
console.error(`${method} failed`, body);
process.exit(1);
}
return body;
}
await tg('setWebhook', { url: URL_, secret_token: SECRET, allowed_updates: ['message'] });
await tg('setMyCommands', { commands: COMMANDS });
const info = await tg('getWebhookInfo', {});
console.log('Webhook state:', JSON.stringify(info.result, null, 2));
```
2. **Local dry-run smoke** (using fake env):
```sh
TELEGRAM_BOT_TOKEN=fake TELEGRAM_WEBHOOK_SECRET=fake WORKER_URL=https://example.test \
node scripts/register-webhook.js --dry-run
```
Expect both payloads printed, exit 0.
## Todo List
- [ ] `scripts/register-webhook.js` written
- [ ] Local dry-run smoke passes
- [ ] `node --check scripts/register-webhook.js` passes
## Success Criteria
- Script runs in dry-run mode without env file (env injected on the command line).
- `getWebhookInfo` is the third call, after `setWebhook` and `setMyCommands`.
## Risk Assessment
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Top-level await rejected by older Node | L | Low | Engines `>=20`; top-level await fine |
| Command list drifts from `src/bot/bot.js` map | M | Medium | Both list 13 commands; cross-reference at smoke |
## Next Steps
- **Blocks:** Deploy plan Phase 03 (uses this script).
- **Unblocks:** end of code plan.
@@ -0,0 +1,92 @@
---
title: "Deploy js-store-scraper-bot to Cloudflare Workers"
description: "Port Telegram bot from Node.js polling/Mongo-driver/node-cron to Workers webhook + nodejs_compat_v2 + scheduled triggers. MongoDB Atlas M0 free tier retains schema parity with Java/Go ports."
status: pending
priority: P1
effort: 10h
branch: dev
tags: [cloudflare-workers, telegram, mongodb, atlas, deployment]
created: 2026-04-26
blockedBy: []
blocks: []
---
# Plan: Cloudflare Workers Deployment
Port the Node.js polling bot to a Workers `fetch` (Telegram webhook) + `scheduled` (daily check) handler. Free tier only. Storage stays MongoDB (Atlas M0) via `nodejs_compat_v2` — schema parity with `store-scraper-bot` (Java) and `go-store-scraper-bot` preserved.
## Reference Implementation
**[miti99bot](/config/workspace/tiennm99/miti99bot)** — sister project on the same stack (CF Workers + Atlas via `nodejs_compat_v2` + `mongodb` driver). Code-complete, operator-pending. Lift validated patterns; do not re-research.
Key files to crib:
- `wrangler.toml``compatibility_flags = ["nodejs_compat_v2"]`, cron triggers, observability block
- `scripts/check-secret-leaks.js` — lint that grep-blocks token leaks
- `docs/using-mongodb.md` — operational runbook
## Constraints (locked)
- **Free plan only.** 100K req/day, 10ms CPU/req baseline (50ms cold start), 3 MiB bundle cap.
- **MongoDB Atlas M0** (free, region `aws-ap-southeast-1`). Schema parity with Java/Go ports.
- **Telegram webhook** (NOT polling). Auth via `secret_token` header (Telegram's first-class mechanism).
- **`nodejs_compat_v2`** (gives `node:net` + `node:tls` for the official `mongodb` driver).
- **Drop these npm deps**: `node-telegram-bot-api`, `node-cron`, `dotenv`, `pino`, `pino-pretty`. Workers ships its own equivalents (`env` bindings, scheduled handler, `console.log`).
- **Keep**: `mongodb` (only addition), business logic in `src/bot/commands/*.js`, models, util.
## Hard Gates
| Gate | Where | Threshold | On fail |
|---|---|---|---|
| **Bundle size** | Phase 01 | `wrangler deploy --dry-run` ≤ 2.7 MiB (10% headroom under 3 MiB Free cap) | **Pivot to Upstash Redis fallback** (out-of-plan) |
| **Cold-start CPU** | Phase 01 | `/__mongo-ping` cold cycles well under 50ms CPU | **Escalate to Workers Paid OR pivot to Upstash** |
| **Auto-pause behavior** | Phase 01 | Paused M0 yields catchable `MongoServerSelectionError` within 5s, not a hang | Document catch path in Phase 03 |
If any gate trips, do NOT continue downstream phases — escalate to user with measured numbers.
## Phases
| # | Phase | Status | Effort | Owner files |
|---|-------|--------|--------|-------------|
| 01 | [Atlas + Wrangler + bundle gate](phase-01-atlas-wrangler-bundle-gate.md) | pending | 2h | `wrangler.toml`, `package.json`, `.dev.vars.example`, `scripts/check-secret-leaks.js`, `docs/using-mongodb.md` |
| 02 | [Worker-native Telegram client](phase-02-telegram-client.md) | pending | 1.5h | `src/bot/telegram-api.js`, `src/bot/bot.js` |
| 03 | [MongoDB Worker adapter](phase-03-mongo-worker-adapter.md) | pending | 1.5h | `src/repository/mongodb.js`, `src/repository/*-repository.js` |
| 04 | [Worker entry — fetch + scheduled](phase-04-worker-entry.md) | pending | 2h | `src/index.js`, `src/bot/dispatch.js`, `src/scheduler/scheduler.js` |
| 05 | [Webhook registration + secrets](phase-05-webhook-registration.md) | pending | 1h | `scripts/register-webhook.js`, `package.json` |
| 06 | [Deploy + smoke + docs](phase-06-deploy-smoke-docs.md) | pending | 2h | `README.md`, `docs/cloudflare-deployment.md` |
## Critical dependencies
- **01 → all** (wrangler config + Atlas creds + bundle gate are prerequisites).
- **02 + 03 → 04** (entry handler wires both into the dispatch + scheduled paths).
- **04 → 05** (need a deployable `fetch` handler before `setWebhook` makes sense).
- **05 → 06** (deploy + smoke is end-to-end; needs webhook live).
## Out of scope (explicit)
- Node-runtime variant retained alongside Workers variant. **No.** Single deployment target. The Node polling code is removed in Phase 04, not preserved with branching.
- Dual-write to KV. Not applicable — no existing CF deployment to migrate from.
- Backfill scripts. Not applicable — no existing data; greenfield collections.
- Tests. The original Node port has none and the user did not request them; revisit if user asks.
- Worker observability tuning beyond `wrangler.toml` defaults from miti99bot.
## Fallback path (if Phase 01 hard gates trip)
**Upstash Redis** (HTTP-native, ~10K req/day free tier):
- Replace `src/repository/*` with thin wrappers over `@upstash/redis`.
- Lose Mongo schema parity with Java/Go ports (acceptable — JS becomes its own deployment).
- Bundle stays tiny (no driver), cold start essentially free.
- Spec lives in `phase-07-alt-upstash-pivot.md` (write only if Phase 01 trips).
## Questions for user (before Phase 01 starts)
1. Atlas account: do you have one, or do you need to create it? (M0 free tier — no credit card required.)
2. Cloudflare account: do you have a Workers free plan attached to a domain? Workers can deploy without a custom domain (`*.workers.dev`); confirm that's acceptable for Telegram webhook URL.
3. Bot username for the webhook: same `TELEGRAM_BOT_USERNAME` as the existing `.env.example`?
4. Any existing data in MongoDB you want imported, or greenfield?
## References
- miti99bot Atlas migration plan: `/config/workspace/tiennm99/miti99bot/plans/260425-1945-mongodb-atlas-migration/`
- CF Workers `nodejs_compat_v2`: https://developers.cloudflare.com/workers/runtime-apis/nodejs/
- Telegram Bot API webhooks: https://core.telegram.org/bots/api#setwebhook (`secret_token` field)
- MongoDB Atlas M0 limits: https://www.mongodb.com/docs/atlas/reference/free-shared-limitations/
@@ -0,0 +1,83 @@
# Deploy Phase 01 — Atlas Provisioning + CF Secrets
## Context Links
- miti99bot Phase 01 (validated): `/config/workspace/tiennm99/miti99bot/plans/260425-1945-mongodb-atlas-migration/phase-01-atlas-setup.md`
## Overview
- **Priority:** P0
- **Status:** pending
- **Description:** One-time provisioning of Atlas M0 cluster + Cloudflare secrets. No code changes.
## Implementation Steps
1. **Atlas account + cluster** (~15 min):
- Create Atlas account at https://cloud.mongodb.com (no credit card).
- Create project `js-store-scraper-bot`.
- Provision M0 in `aws-ap-southeast-1`. Cluster name: `js-store-scraper-bot-prod`.
- Wait for cluster green status.
2. **DB user**:
- Atlas → Database Access → Add user.
- Username: `js-store-scraper-worker`.
- Password: ≥32 random chars (use `openssl rand -base64 32`). **Save in password manager.**
- Role: `readWrite@store-scraper-bot`.
3. **Network access**:
- Atlas → Network Access → Add IP → `0.0.0.0/0`.
- Comment: "Cloudflare Workers — no static egress IPs on free plan".
4. **Atlas alerts** (free, ~5 min):
- Project → Alerts → Add Alert.
- Cluster unavailable → email.
- Connections > 400 → email.
5. **Connection string**:
- Atlas → Database → Connect → Drivers (Node 6.7+).
- Copy SRV string: `mongodb+srv://js-store-scraper-worker:<password>@<host>/store-scraper-bot?retryWrites=true&w=majority`.
- Substitute the real password.
6. **Cloudflare account + login**:
- Sign up at cloudflare.com if not already (free).
- In project dir: `npx wrangler login`. Browser flow to authorize.
7. **Set 5 secrets**:
```sh
cd /config/workspace/tiennm99/js-store-scraper-bot
npx wrangler secret put TELEGRAM_BOT_TOKEN # paste your bot token
npx wrangler secret put TELEGRAM_BOT_USERNAME # e.g. miti99_store_bot
npx wrangler secret put TELEGRAM_WEBHOOK_SECRET # generate: openssl rand -hex 32
npx wrangler secret put MONGODB_URI # paste full SRV string
npx wrangler secret put ADMIN_IDS # e.g. 123456789,987654321
```
8. **Mirror to `.env.deploy`** (for the register script — gitignored):
```sh
cp .env.deploy.example .env.deploy
# edit: paste TELEGRAM_BOT_TOKEN, TELEGRAM_WEBHOOK_SECRET, WORKER_URL
```
`WORKER_URL` won't be known until first deploy in Phase 02; leave placeholder for now.
## Todo List
- [ ] Atlas project + M0 cluster green
- [ ] DB user + password vaulted
- [ ] Network `0.0.0.0/0` added
- [ ] Atlas alerts configured
- [ ] Connection string captured
- [ ] `wrangler login` complete
- [ ] All 5 secrets set
- [ ] `.env.deploy` populated (WORKER_URL TBD)
## Success Criteria
- `npx wrangler secret list` shows all 5 secret names.
- Atlas dashboard shows cluster green + 0 connections.
## Risk Assessment
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Atlas signup takes longer than expected | M | Low | Allocate 30 min |
| Password copied wrong | L | High | Vault first; paste from vault |
| `wrangler secret put` fails | L | Medium | Re-run; check `wrangler whoami` |
## Next Steps
- **Blocks:** Phase 02.
- **Unblocks:** Phase 02.
@@ -0,0 +1,89 @@
# Deploy Phase 02 — First Deploy + Hard Gates
## Overview
- **Priority:** P0 — gates can abort the entire plan.
- **Status:** pending
- **Description:** Bundle-size dry-run, real deploy, mongo-ping CPU gate, auto-pause behavior test.
## Implementation Steps
1. **Bundle-size hard gate**:
```sh
npx wrangler deploy --dry-run --outdir=.tmp-deploy
du -sh .tmp-deploy
```
- **PASS**: ≤ 2.7 MiB. Continue.
- **FAIL**: > 2.7 MiB. **Stop**. Pivot to Upstash (separate plan; out of scope here).
2. **First real deploy**:
```sh
npx wrangler deploy
```
Capture the URL printed (e.g. `https://js-store-scraper-bot.<account>.workers.dev`).
Update `.env.deploy` with this `WORKER_URL`.
3. **Add temporary `/__mongo-ping` route** for CPU gate (revert before commit at end of phase):
- Edit `src/index.js` `fetch` handler — top of method, before path routing:
```js
if (request.method === 'GET' && new URL(request.url).pathname === '/__mongo-ping') {
const t = Date.now();
const { db } = await getMongo(env);
await db.command({ ping: 1 });
return new Response(JSON.stringify({ wall_ms: Date.now() - t }), {
headers: { 'Content-Type': 'application/json' },
});
}
```
- `npx wrangler deploy`.
4. **Cold-start CPU gate** (operator):
- Run 5 cold cycles, 10+ min apart:
```sh
curl https://js-store-scraper-bot.<account>.workers.dev/__mongo-ping
```
- Open CF dashboard → Workers → js-store-scraper-bot → Logs / Metrics.
- Note CPU time per invocation (separate from wall_ms).
- **PASS**: cold CPU < 40ms (10ms safety under 50ms cap).
- **FAIL**: ≥ 40ms. Decide: upgrade to Workers Paid ($5/mo, 50ms baseline → 5min ceiling), or pivot to Upstash.
5. **Auto-pause behavior test**:
- Atlas UI → cluster → Pause.
- Wait ~30s for Atlas to actually pause.
- `curl .../__mongo-ping` → should return 5xx within ~5s, NOT hang.
- Atlas UI → resume cluster.
- Verify: error was catchable (logged via `wrangler tail`); not a hang.
6. **Cleanup**: revert `/__mongo-ping` route in `src/index.js`. Re-deploy:
```sh
git checkout src/index.js # revert the temp ping route
npx wrangler deploy
```
7. **Record baseline** in `docs/using-mongodb.md` (created in Phase 04): cold-ping P95 wall_ms.
## Todo List
- [ ] Bundle ≤ 2.7 MiB
- [ ] First deploy success; URL captured
- [ ] `.env.deploy` `WORKER_URL` set
- [ ] Temporary `/__mongo-ping` deployed
- [ ] Cold CPU < 40ms across 5 cold cycles
- [ ] Auto-pause yields catchable error
- [ ] Temp route reverted + redeployed
- [ ] Baseline P95 captured for docs
## Success Criteria
- `wrangler deploy` clean.
- All hard gates pass.
- Atlas dashboard "Current Connections" stays low (≤ 5) during ping smoke.
## Risk Assessment
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Bundle exceeds 2.7 MiB | M | Plan-aborting | Pivot path documented in plan.md |
| Cold CPU > 50ms | M | Plan-aborting on Free | Escalate to Paid; or pivot |
| Atlas connection refused (`0.0.0.0/0` not yet propagated) | L | Medium | Wait 1-2 min; retry |
| Forgot to revert `/__mongo-ping` | L | Low | Step 6 explicit; lint adds no help here |
## Next Steps
- **Blocks:** Phase 03.
- **Unblocks:** Phase 03.
@@ -0,0 +1,43 @@
# Deploy Phase 03 — Register Webhook
## Overview
- **Priority:** P1
- **Status:** pending
- **Description:** Run the script written in code-plan Phase 05 against the live URL.
## Implementation Steps
1. Verify `.env.deploy` has `WORKER_URL` set (from Deploy Phase 02 step 2).
2. Dry-run first:
```sh
npm run register:dry
```
Verify printed payloads look right (URL, secret, command list).
3. Real run:
```sh
npm run register
```
4. Verify `getWebhookInfo` output in script's stdout shows:
- `url`: matches `WORKER_URL`
- `pending_update_count`: 0
- `last_error_date`: absent
5. In Telegram client, type `/` in a chat with the bot — autocompletion should show the 13 commands (within ~1 min of `setMyCommands`).
## Todo List
- [ ] `npm run register:dry` clean
- [ ] `npm run register` clean
- [ ] `getWebhookInfo` shows correct URL + 0 pending
- [ ] Telegram `/` autocompletion shows 13 commands
## Success Criteria
- All checks above.
## Risk Assessment
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Wrong WORKER_URL | M | Medium | Dry-run catches this |
| Telegram returns `wrong response from the webhook` | L | High | Means Worker is reachable but returns non-200; check `wrangler tail` |
## Next Steps
- **Blocks:** Phase 04.
- **Unblocks:** Phase 04.
@@ -0,0 +1,96 @@
# Deploy Phase 04 — End-to-End Smoke + Docs
## Overview
- **Priority:** P1
- **Status:** pending
- **Description:** Real Telegram smoke of all 13 commands, scheduled handler smoke, README + docs update.
## Implementation Steps
1. **Open `wrangler tail`** in a terminal to stream logs.
2. **Smoke checklist** in a Telegram chat with the bot (operator must be in `ADMIN_IDS`):
- [ ] `/info` → bot replies with chat ID
- [ ] `/addgroup` → "Group added successfully"
- [ ] `/addgoogle com.android.chrome vn` → success
- [ ] `/addapple 284910350 vn` → success (YouTube iOS)
- [ ] `/listapp` → table with both apps
- [ ] `/checkapp` → 4-col table with green checks
- [ ] `/checkappscore` → score table
- [ ] `/rawappleapp 284910350 vn` → JSON file attachment
- [ ] `/rawgoogleapp com.android.chrome vn` → JSON file attachment
- [ ] `/listgroup` → group listed
- [ ] `/delapple 284910350` → success
- [ ] `/delgoogle com.android.chrome` → success
- [ ] `/delgroup` → success
3. **Scheduled smoke**: trigger the cron handler via curl (works on `wrangler dev --test-scheduled`):
```sh
npx wrangler dev --test-scheduled
# in another terminal:
curl "http://localhost:8787/__scheduled?cron=0+0+*+*+*"
```
Verify `runDailyCheck` log lines appear.
4. **Atlas check**: dashboard "Current Connections" ≤ 5 throughout smoke.
5. **Write `docs/cloudflare-deployment.md`** — operator runbook:
- Prerequisites
- One-time setup commands (Atlas + secrets + first deploy)
- Routine deploy: `npm run deploy`
- Reading `wrangler tail`
- Pause/restart procedure
- Rollback to Node polling
- Fallback to Upstash (sketch only — link to plan if it ever runs)
6. **Write `docs/using-mongodb.md`** — Atlas runbook:
- Cluster URL (without password)
- Region, alerts, auto-pause schedule
- Password rotation procedure
- Baseline cold-ping P95 (from Phase 02 step 7)
- `0.0.0.0/0` permanence + paid CF static-IP path
7. **Update `README.md`**:
- Replace existing "Run" section: "Deploys to Cloudflare Workers; see `docs/cloudflare-deployment.md`."
- Keep preview/untested banner.
- List new deps: `mongodb`, `wrangler`.
- Note: Docker compose retained for local-with-Mongo dev only.
8. **Mark Docker files deprecated** — header comment:
```
# DEPRECATED: this project deploys to Cloudflare Workers (see docs/cloudflare-deployment.md).
# Retained only for local-with-Mongo development.
```
9. **Commit + push**:
```sh
git add docs/ README.md Dockerfile docker-compose*.yml
git commit -m "docs: Cloudflare Workers deployment runbook + smoke complete"
git push
```
## Todo List
- [ ] All 13 commands smoke-tested
- [ ] Scheduled handler smoke OK
- [ ] Atlas connection count low
- [ ] `docs/cloudflare-deployment.md` written
- [ ] `docs/using-mongodb.md` written
- [ ] `README.md` updated
- [ ] Docker files marked deprecated
- [ ] Committed + pushed
## Success Criteria
- All commands respond correctly in real Telegram.
- Operator runbook is complete and someone else could redeploy from scratch following it.
- Daily 7am-VN cron will fire (verify next morning in Atlas op-log + `wrangler tail` history).
## Risk Assessment
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Smoke reveals refactor bug | M | Medium | Fix + redeploy; not catastrophic |
| Cron timezone confusion | L | Low | Documented `0 UTC = 7 VN`; verify next day |
## Next Steps
- **Blocks:** none.
- **Unblocks:** end of plan.
- **Follow-up (out of scope):** tests; observability dashboard; quarterly password rotation reminder.
@@ -0,0 +1,65 @@
---
title: "Cloudflare Workers deploy + smoke + docs"
description: "Operator-driven phases: provision Atlas, set CF secrets, run hard gates, deploy, register webhook, smoke-test all 13 commands, write docs. Sister to the code-port plan."
status: pending
priority: P1
effort: 4h
branch: main
tags: [cloudflare-workers, deploy, smoke, atlas]
created: 2026-04-26
blockedBy: [260426-2015-cloudflare-worker-code-port]
blocks: []
---
# Plan B — Deploy + Smoke
Run only after [the code-port plan](../260426-2015-cloudflare-worker-code-port/plan.md) is complete and committed. This plan touches **real accounts**: Atlas (free M0) and Cloudflare (free Workers).
## Reference Implementation
[miti99bot Phase 01 atlas-setup](/config/workspace/tiennm99/miti99bot/plans/260425-1945-mongodb-atlas-migration/phase-01-atlas-setup.md) — proven Atlas + nodejs_compat_v2 procedure. Lift the operator steps wholesale.
## Hard Gates (can abort the plan → pivot to Upstash Redis)
| Gate | Phase | Threshold | On fail |
|---|---|---|---|
| Bundle size | 02 | `wrangler deploy --dry-run` ≤ 2.7 MiB (10% headroom under 3 MiB Free cap) | Pivot to Upstash (out-of-scope here; plan + execute separately) |
| Cold-start CPU | 02 | `/__mongo-ping` cold cycles < 40ms CPU | Escalate to Workers Paid ($5/mo) OR pivot to Upstash |
| Auto-pause behavior | 02 | Paused M0 yields catchable error within 5s | Document catch path requirement; not abort-worthy |
## Phases
| # | Phase | Status | Effort |
|---|---|---|---|
| 01 | [Atlas M0 + CF secrets](phase-01-atlas-and-secrets.md) | pending | 1h |
| 02 | [First deploy + bundle/CPU gates](phase-02-deploy-and-gates.md) | pending | 1h |
| 03 | [Register webhook](phase-03-register-webhook.md) | pending | 0.5h |
| 04 | [End-to-end smoke + docs](phase-04-smoke-and-docs.md) | pending | 1.5h |
## Critical dependencies
- 01 → 02 (deploy needs MONGODB_URI + secrets)
- 02 → 03 (register needs live worker URL + passing gates)
- 03 → 04 (smoke needs registered webhook)
## Prerequisites (operator must have)
- MongoDB Atlas account (free, no credit card).
- Cloudflare account (free, no credit card).
- `wrangler login` completed locally.
- Telegram bot token (existing — same as Node deployment).
## Out of scope (explicit)
- Application code changes — handled by code plan; this plan deploys what's already on `main`.
- Custom domain — `*.workers.dev` URL is fine for Telegram webhooks.
- Tests — original project has none; not adding here.
- Upstash pivot — only triggered if hard gates trip; planned then, not now.
## Rollback path
- After Phase 02 (deploy live but not registered): `wrangler delete` removes Worker; no public surface.
- After Phase 03 (webhook registered): `setWebhook` with empty URL clears it; or rotate `TELEGRAM_WEBHOOK_SECRET` so future POSTs 401.
- After Phase 04 (production live): `git revert` to pre-Workers commit, redeploy via Docker compose for the existing Node polling path.
## Questions to confirm before Phase 01
1. Atlas account: do you have one, or first-time setup?
2. CF account: free Workers plan attached?
3. Bot username for `setMyCommands`: same as in `.env.example`?
4. Greenfield Mongo data, or existing data to import? (If existing, schedule a separate import phase.)
+41
View File
@@ -0,0 +1,41 @@
#!/usr/bin/env node
// Fails CI if any source file logs a secret via env.
// Pattern: console.{log,info,warn,error,debug}(... env.<SECRET> ...)
import { readdirSync, readFileSync, statSync } from 'node:fs';
import { join } from 'node:path';
const SECRETS = ['MONGODB_URI', 'TELEGRAM_BOT_TOKEN', 'TELEGRAM_WEBHOOK_SECRET', 'ADMIN_IDS'];
const ROOTS = ['src', 'scripts'];
function* walk(dir) {
let entries;
try {
entries = readdirSync(dir);
} catch {
return;
}
for (const entry of entries) {
const p = join(dir, entry);
const s = statSync(p);
if (s.isDirectory()) yield* walk(p);
else if (/\.(js|mjs|ts)$/.test(p)) yield p;
}
}
const violations = [];
for (const root of ROOTS) {
for (const file of walk(root)) {
const text = readFileSync(file, 'utf8');
for (const secret of SECRETS) {
const re = new RegExp(`console\\.(log|info|warn|error|debug)\\([^)]*\\benv\\.${secret}\\b`);
if (re.test(text)) violations.push({ file, secret });
}
}
}
if (violations.length > 0) {
console.error('Secret-leak violations:');
for (const v of violations) console.error(` ${v.file}: env.${v.secret} in console.*`);
process.exit(1);
}
console.log('check-secret-leaks: clean');
+63
View File
@@ -0,0 +1,63 @@
#!/usr/bin/env node
// Post-deploy registration: setWebhook (with secret_token) + setMyCommands.
// Run via: npm run register (reads .env.deploy)
// Dry run via: npm run register:dry
const TOKEN = process.env.TELEGRAM_BOT_TOKEN;
const SECRET = process.env.TELEGRAM_WEBHOOK_SECRET;
const URL_ = process.env.WORKER_URL;
const DRY = process.argv.includes('--dry-run');
for (const [k, v] of Object.entries({
TELEGRAM_BOT_TOKEN: TOKEN,
TELEGRAM_WEBHOOK_SECRET: SECRET,
WORKER_URL: URL_,
})) {
if (!v) {
console.error(`${k} is required`);
process.exit(1);
}
}
const COMMANDS = [
{ command: 'info', description: 'Show this group ID' },
{ command: 'addgroup', description: '[admin] Authorize a group' },
{ command: 'delgroup', description: '[admin] Deauthorize a group' },
{ command: 'listgroup', description: '[admin] List authorized groups' },
{ command: 'addapple', description: 'Track an Apple App Store app' },
{ command: 'delapple', description: 'Stop tracking an Apple app' },
{ command: 'addgoogle', description: 'Track a Google Play app' },
{ command: 'delgoogle', description: 'Stop tracking a Google app' },
{ command: 'listapp', description: 'List tracked apps in this group' },
{ command: 'checkapp', description: 'Check update status of tracked apps' },
{ command: 'checkappscore', description: 'Check scores + ratings of tracked apps' },
{ command: 'rawappleapp', description: 'Dump raw Apple API JSON for an app' },
{ command: 'rawgoogleapp', description: 'Dump raw Google API JSON for an app' },
];
async function tg(method, payload) {
if (DRY) {
console.log(`[dry-run] ${method}`, JSON.stringify(payload, null, 2));
return { ok: true, result: '(dry)' };
}
const res = await fetch(`https://api.telegram.org/bot${TOKEN}/${method}`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(payload),
});
const body = await res.json();
if (!body.ok) {
console.error(`${method} failed`, body);
process.exit(1);
}
return body;
}
await tg('setWebhook', {
url: URL_,
secret_token: SECRET,
allowed_updates: ['message'],
});
await tg('setMyCommands', { commands: COMMANDS });
const info = await tg('getWebhookInfo', {});
console.log('Webhook state:', JSON.stringify(info.result, null, 2));
+5 -5
View File
@@ -1,4 +1,3 @@
import { getCachedAppleApp, saveAppleApp } from '../repository/apple-app-repository.js';
import { newAppleApp } from '../models/apple-app.js';
// Mirrors Java AppStoreScraper (api/apple/AppStoreScraper.java).
@@ -12,8 +11,9 @@ export function buildAppleRequestByBundleId(appId, country) {
return { appId, country, ratings: true };
}
export function createAppleScraper(config) {
const { logger, appCacheSeconds } = config;
export function createAppleScraper(config, store) {
const { logger } = config;
const repo = store.appleApp;
async function rawApp(req) {
const res = await fetch(`${BASE_URL}/app`, {
@@ -33,14 +33,14 @@ export function createAppleScraper(config) {
async function cache(resp) {
if (!resp || !resp.appId) return;
try {
await saveAppleApp(newAppleApp(resp.appId, resp, Date.now()));
await repo.save(newAppleApp(resp.appId, resp, Date.now()));
} catch (err) {
logger.warn({ appId: resp.appId, err: err.message }, 'failed to cache apple app');
}
}
async function getApp(appId, country) {
const cached = await getCachedAppleApp(appId, appCacheSeconds);
const cached = await repo.getCached(appId);
if (cached) return cached.app;
const resp = await app(buildAppleRequestByBundleId(appId, country));
await cache(resp);
+5 -5
View File
@@ -1,4 +1,3 @@
import { getCachedGoogleApp, saveGoogleApp } from '../repository/google-app-repository.js';
import { newGoogleApp } from '../models/google-app.js';
// Mirrors Java GooglePlayScraper (api/google/GooglePlayScraper.java).
@@ -8,8 +7,9 @@ export function buildGoogleRequest(appId, country) {
return { appId, country: country || 'vn' };
}
export function createGoogleScraper(config) {
const { logger, appCacheSeconds } = config;
export function createGoogleScraper(config, store) {
const { logger } = config;
const repo = store.googleApp;
async function rawApp(req) {
const res = await fetch(`${BASE_URL}/app`, {
@@ -31,14 +31,14 @@ export function createGoogleScraper(config) {
const id = resp.appId || fallbackId;
if (!id) return;
try {
await saveGoogleApp(newGoogleApp(id, resp, Date.now()));
await repo.save(newGoogleApp(id, resp, Date.now()));
} catch (err) {
logger.warn({ appId: id, err: err.message }, 'failed to cache google app');
}
}
async function getApp(appId, country) {
const cached = await getCachedGoogleApp(appId, appCacheSeconds);
const cached = await repo.getCached(appId);
if (cached) return cached.app;
const resp = await app(buildGoogleRequest(appId, country));
await cache(resp, appId);
+17 -63
View File
@@ -1,4 +1,4 @@
import TelegramBot from 'node-telegram-bot-api';
import { createTelegramApi } from './telegram-api.js';
import { createInfoCommand } from './commands/info.js';
import { createAddGroupCommand } from './commands/add-group.js';
import { createDeleteGroupCommand } from './commands/delete-group.js';
@@ -16,14 +16,14 @@ import { createRawGoogleAppCommand } from './commands/raw-google-app.js';
// HTML parse mode for all messages (Java parity).
const PARSE_MODE = 'HTML';
export function createBot(config, appleScraper, googleScraper) {
const tg = new TelegramBot(config.telegramBotToken, { polling: true });
export function createBot(config, store, appleScraper, googleScraper) {
const api = createTelegramApi(config.telegramBotToken);
const logger = config.logger;
const sender = {
async sendMessage(chatId, html) {
try {
await tg.sendMessage(chatId, html, {
await api.sendMessage(chatId, html, {
parse_mode: PARSE_MODE,
disable_web_page_preview: true,
});
@@ -33,7 +33,7 @@ export function createBot(config, appleScraper, googleScraper) {
},
async sendMessageSilent(chatId, html) {
try {
await tg.sendMessage(chatId, html, {
await api.sendMessage(chatId, html, {
parse_mode: PARSE_MODE,
disable_web_page_preview: true,
disable_notification: true,
@@ -44,12 +44,7 @@ export function createBot(config, appleScraper, googleScraper) {
},
async sendDocument(chatId, filename, body) {
try {
await tg.sendDocument(
chatId,
Buffer.from(body, 'utf8'),
{},
{ filename, contentType: 'application/json' },
);
await api.sendDocument(chatId, filename, body);
} catch (err) {
logger.warn({ chatId, err: err.message }, 'send document failed');
}
@@ -59,60 +54,19 @@ export function createBot(config, appleScraper, googleScraper) {
// Java command identifiers — keep names matching exactly.
const commands = {
info: createInfoCommand(),
addgroup: createAddGroupCommand(config),
delgroup: createDeleteGroupCommand(config),
listgroup: createListGroupCommand(config),
addapple: createAddAppleAppCommand(appleScraper),
delapple: createDeleteAppleAppCommand(),
addgoogle: createAddGoogleAppCommand(googleScraper),
delgoogle: createDeleteGoogleAppCommand(),
listapp: createListAppCommand(),
checkapp: createCheckAppCommand(config, appleScraper, googleScraper),
checkappscore: createCheckAppScoresCommand(appleScraper, googleScraper),
addgroup: createAddGroupCommand(config, store),
delgroup: createDeleteGroupCommand(config, store),
listgroup: createListGroupCommand(config, store),
addapple: createAddAppleAppCommand(store, appleScraper),
delapple: createDeleteAppleAppCommand(store),
addgoogle: createAddGoogleAppCommand(store, googleScraper),
delgoogle: createDeleteGoogleAppCommand(store),
listapp: createListAppCommand(store),
checkapp: createCheckAppCommand(config, store, appleScraper, googleScraper),
checkappscore: createCheckAppScoresCommand(store, appleScraper, googleScraper),
rawappleapp: createRawAppleAppCommand(appleScraper),
rawgoogleapp: createRawGoogleAppCommand(googleScraper),
};
tg.on('message', async (msg) => {
const name = parseCommandName(msg.text, config.telegramBotUsername);
if (!name) return;
const handler = commands[name];
if (!handler) {
logger.debug({ command: name }, 'Unknown command');
return;
}
logger.info(
{ command: name, userId: msg.from?.id, chatId: msg.chat.id },
'Executing command',
);
try {
await handler(msg, sender);
} catch (err) {
logger.error({ err: err.message, command: name }, 'panic in command');
await sender.sendMessage(msg.chat.id, 'Internal server error');
}
});
tg.on('polling_error', (err) => {
logger.warn({ err: err.message }, 'polling error');
});
tg.getMe()
.then((me) => logger.info({ username: me.username }, 'Authorized on account'))
.catch((err) => logger.error({ err: err.message }, 'getMe failed'));
return { sender, telegram: tg };
}
// Extracts "info" from "/info", "/info arg", "/info@bot", "/info@bot arg".
function parseCommandName(text, botUsername) {
if (!text || text[0] !== '/') return null;
const space = text.indexOf(' ');
const head = space < 0 ? text.slice(1) : text.slice(1, space);
const at = head.indexOf('@');
if (at < 0) return head;
const cmd = head.slice(0, at);
const target = head.slice(at + 1);
if (botUsername && target && target.toLowerCase() !== botUsername.toLowerCase()) return null;
return cmd;
return { sender, commands, api };
}
+3 -5
View File
@@ -1,11 +1,10 @@
import { buildAppleRequestByBundleId, buildAppleRequestByTrackId } from '../../api/apple-scraper.js';
import * as groupRepo from '../../repository/group-repository.js';
import { authorizeGroup, getCommandArguments, splitArgs } from './command-utils.js';
// /addapple <id|appId> [country=vn] — Java AddAppleAppCommand.
export function createAddAppleAppCommand(appleScraper) {
export function createAddAppleAppCommand(store, appleScraper) {
return async (msg, sender) => {
if (!(await authorizeGroup(msg.chat.id, sender))) return;
if (!(await authorizeGroup(msg.chat.id, store, sender))) return;
const args = splitArgs(getCommandArguments(msg.text));
if (args.length < 1 || args.length > 2) {
await sender.sendMessage(msg.chat.id, 'Invalid arguments');
@@ -13,7 +12,6 @@ export function createAddAppleAppCommand(appleScraper) {
}
const country = args.length === 2 ? args[1] : 'vn';
// Java: try parsing arg[0] as Long (trackId); else treat as bundleId.
const trackId = Number.parseInt(args[0], 10);
const req =
Number.isFinite(trackId) && String(trackId) === args[0]
@@ -32,7 +30,7 @@ export function createAddAppleAppCommand(appleScraper) {
}
try {
const added = await groupRepo.addAppleApp(msg.chat.id, resp.appId, country);
const added = await store.group.addAppleApp(msg.chat.id, resp.appId, country);
if (!added) {
await sender.sendMessage(msg.chat.id, `Apple app <code>${resp.appId}</code> is already added`);
return;
+3 -4
View File
@@ -1,11 +1,10 @@
import { buildGoogleRequest } from '../../api/google-scraper.js';
import * as groupRepo from '../../repository/group-repository.js';
import { authorizeGroup, getCommandArguments, splitArgs } from './command-utils.js';
// /addgoogle <appId> [country=vn] — Java AddGoogleAppCommand.
export function createAddGoogleAppCommand(googleScraper) {
export function createAddGoogleAppCommand(store, googleScraper) {
return async (msg, sender) => {
if (!(await authorizeGroup(msg.chat.id, sender))) return;
if (!(await authorizeGroup(msg.chat.id, store, sender))) return;
const args = splitArgs(getCommandArguments(msg.text));
if (args.length < 1 || args.length > 2) {
await sender.sendMessage(msg.chat.id, 'Invalid arguments');
@@ -26,7 +25,7 @@ export function createAddGoogleAppCommand(googleScraper) {
}
try {
const added = await groupRepo.addGoogleApp(msg.chat.id, appId, country);
const added = await store.group.addGoogleApp(msg.chat.id, appId, country);
if (!added) {
await sender.sendMessage(msg.chat.id, `Google app <code>${appId}</code> is already added`);
return;
+3 -5
View File
@@ -1,9 +1,7 @@
import * as adminRepo from '../../repository/admin-repository.js';
import * as groupRepo from '../../repository/group-repository.js';
import { getCommandArguments, requireAdminUser, splitArgs } from './command-utils.js';
// /addgroup [groupId] — Java AddGroupCommand. Admin-only.
export function createAddGroupCommand(config) {
export function createAddGroupCommand(config, store) {
return async (msg, sender) => {
if (!(await requireAdminUser(msg.from.id, msg.chat.id, config, sender))) return;
const args = splitArgs(getCommandArguments(msg.text));
@@ -21,12 +19,12 @@ export function createAddGroupCommand(config) {
groupId = parsed;
}
try {
const added = await adminRepo.addGroup(groupId);
const added = await store.admin.addGroup(groupId);
if (!added) {
await sender.sendMessage(msg.chat.id, 'Group is already added');
return;
}
await groupRepo.initGroup(groupId);
await store.group.initGroup(groupId);
await sender.sendMessage(msg.chat.id, 'Group added successfully');
} catch {
await sender.sendMessage(msg.chat.id, 'Internal server error');
+3 -4
View File
@@ -1,18 +1,17 @@
import * as groupRepo from '../../repository/group-repository.js';
import { buildTable } from '../../util/table.js';
import { authorizeGroup, getCommandArguments, splitArgs } from './command-utils.js';
// /checkappscore — Java CheckAppScoreCommand. Reports score + ratings.
// Score rounded to 1 decimal (Java Precision.round(score, 1) parity).
export function createCheckAppScoresCommand(appleScraper, googleScraper) {
export function createCheckAppScoresCommand(store, appleScraper, googleScraper) {
return async (msg, sender) => {
if (!(await authorizeGroup(msg.chat.id, sender))) return;
if (!(await authorizeGroup(msg.chat.id, store, sender))) return;
if (splitArgs(getCommandArguments(msg.text)).length !== 0) {
await sender.sendMessage(msg.chat.id, 'Invalid arguments');
return;
}
try {
const group = await groupRepo.getGroup(msg.chat.id);
const group = await store.group.getGroup(msg.chat.id);
const headers = ['AppId', 'Score', 'Ratings'];
const appleRows = await scoreRowsFor(group.appleApps, appleScraper);
const googleRows = await scoreRowsFor(group.googleApps, googleScraper);
+3 -4
View File
@@ -1,18 +1,17 @@
import * as groupRepo from '../../repository/group-repository.js';
import { buildTable } from '../../util/table.js';
import { daysBetween, formatDateInTz } from '../../util/time.js';
import { authorizeGroup, getCommandArguments, splitArgs } from './command-utils.js';
// /checkapp — Java CheckAppCommand. Reports update status per app, per store.
export function createCheckAppCommand(config, appleScraper, googleScraper) {
export function createCheckAppCommand(config, store, appleScraper, googleScraper) {
return async (msg, sender) => {
if (!(await authorizeGroup(msg.chat.id, sender))) return;
if (!(await authorizeGroup(msg.chat.id, store, sender))) return;
if (splitArgs(getCommandArguments(msg.text)).length !== 0) {
await sender.sendMessage(msg.chat.id, 'Invalid arguments');
return;
}
try {
const group = await groupRepo.getGroup(msg.chat.id);
const group = await store.group.getGroup(msg.chat.id);
const nowMs = Date.now();
const threshold = config.numDaysWarningNotUpdated;
const headers = ['AppId', 'Updated', 'Days', 'OK'];
+2 -4
View File
@@ -1,5 +1,3 @@
import * as adminRepo from '../../repository/admin-repository.js';
export function splitArgs(text) {
if (!text) return [];
return text.trim().split(/\s+/).filter((s) => s.length > 0);
@@ -14,9 +12,9 @@ export function getCommandArguments(text) {
return trimmed.slice(space + 1).trim();
}
export async function authorizeGroup(chatId, sender) {
export async function authorizeGroup(chatId, store, sender) {
try {
const ok = await adminRepo.hasGroup(chatId);
const ok = await store.admin.hasGroup(chatId);
if (!ok) {
await sender.sendMessage(chatId, 'Group is not allowed to use bot');
return false;
+3 -4
View File
@@ -1,17 +1,16 @@
import * as groupRepo from '../../repository/group-repository.js';
import { authorizeGroup, getCommandArguments, splitArgs } from './command-utils.js';
// /delapple <appId> — Java DeleteAppleAppCommand.
export function createDeleteAppleAppCommand() {
export function createDeleteAppleAppCommand(store) {
return async (msg, sender) => {
if (!(await authorizeGroup(msg.chat.id, sender))) return;
if (!(await authorizeGroup(msg.chat.id, store, sender))) return;
const args = splitArgs(getCommandArguments(msg.text));
if (args.length !== 1) {
await sender.sendMessage(msg.chat.id, 'Invalid arguments');
return;
}
try {
const removed = await groupRepo.removeAppleApp(msg.chat.id, args[0]);
const removed = await store.group.removeAppleApp(msg.chat.id, args[0]);
if (!removed) {
await sender.sendMessage(msg.chat.id, 'Apple app is not added');
return;
+3 -4
View File
@@ -1,17 +1,16 @@
import * as groupRepo from '../../repository/group-repository.js';
import { authorizeGroup, getCommandArguments, splitArgs } from './command-utils.js';
// /delgoogle <appId> — Java DeleteGoogleAppCommand.
export function createDeleteGoogleAppCommand() {
export function createDeleteGoogleAppCommand(store) {
return async (msg, sender) => {
if (!(await authorizeGroup(msg.chat.id, sender))) return;
if (!(await authorizeGroup(msg.chat.id, store, sender))) return;
const args = splitArgs(getCommandArguments(msg.text));
if (args.length !== 1) {
await sender.sendMessage(msg.chat.id, 'Invalid arguments');
return;
}
try {
const removed = await groupRepo.removeGoogleApp(msg.chat.id, args[0]);
const removed = await store.group.removeGoogleApp(msg.chat.id, args[0]);
if (!removed) {
await sender.sendMessage(msg.chat.id, 'Google app is not added');
return;
+2 -3
View File
@@ -1,8 +1,7 @@
import * as adminRepo from '../../repository/admin-repository.js';
import { getCommandArguments, requireAdminUser, splitArgs } from './command-utils.js';
// /delgroup [groupId] — Java DeleteGroupCommand. Admin-only.
export function createDeleteGroupCommand(config) {
export function createDeleteGroupCommand(config, store) {
return async (msg, sender) => {
if (!(await requireAdminUser(msg.from.id, msg.chat.id, config, sender))) return;
const args = splitArgs(getCommandArguments(msg.text));
@@ -20,7 +19,7 @@ export function createDeleteGroupCommand(config) {
groupId = parsed;
}
try {
const removed = await adminRepo.removeGroup(groupId);
const removed = await store.admin.removeGroup(groupId);
if (!removed) {
await sender.sendMessage(msg.chat.id, 'Group is not added');
return;
+3 -4
View File
@@ -1,17 +1,16 @@
import * as groupRepo from '../../repository/group-repository.js';
import { buildTable } from '../../util/table.js';
import { authorizeGroup, getCommandArguments, splitArgs } from './command-utils.js';
// /listapp — Java ListAppCommand. Two tables (Apple / Google) of tracked apps.
export function createListAppCommand() {
export function createListAppCommand(store) {
return async (msg, sender) => {
if (!(await authorizeGroup(msg.chat.id, sender))) return;
if (!(await authorizeGroup(msg.chat.id, store, sender))) return;
if (splitArgs(getCommandArguments(msg.text)).length !== 0) {
await sender.sendMessage(msg.chat.id, 'Invalid arguments');
return;
}
try {
const group = await groupRepo.getGroup(msg.chat.id);
const group = await store.group.getGroup(msg.chat.id);
const out =
'<b>Apple Apps</b>\n' +
formatAppTable(group.appleApps) +
+2 -3
View File
@@ -1,8 +1,7 @@
import * as adminRepo from '../../repository/admin-repository.js';
import { getCommandArguments, requireAdminUser, splitArgs } from './command-utils.js';
// /listgroup — Java ListGroupCommand. Admin-only.
export function createListGroupCommand(config) {
export function createListGroupCommand(config, store) {
return async (msg, sender) => {
if (!(await requireAdminUser(msg.from.id, msg.chat.id, config, sender))) return;
if (splitArgs(getCommandArguments(msg.text)).length !== 0) {
@@ -10,7 +9,7 @@ export function createListGroupCommand(config) {
return;
}
try {
const groups = await adminRepo.getAllGroups();
const groups = await store.admin.getAllGroups();
if (groups.length === 0) {
await sender.sendMessage(msg.chat.id, 'No groups found');
return;
+42
View File
@@ -0,0 +1,42 @@
// Per-message dispatcher. Routes Telegram update messages to the matching
// command handler. Equivalent to the inner loop of the old polling bot.
export async function dispatch(message, deps) {
const { sender, commands, config, logger } = deps;
if (!message?.text || message.text[0] !== '/') return;
const name = parseCommandName(message.text, config.telegramBotUsername);
if (!name) return;
const handler = commands[name];
if (!handler) {
logger.debug({ command: name }, 'Unknown command');
return;
}
logger.info(
{ command: name, userId: message.from?.id, chatId: message.chat.id },
'Executing command',
);
try {
await handler(message, sender);
} catch (err) {
logger.error({ err: err.message, command: name }, 'command failed');
try {
await sender.sendMessage(message.chat.id, 'Internal server error');
} catch {
// best-effort; don't double-log
}
}
}
// Extracts "info" from "/info", "/info arg", "/info@bot", "/info@bot arg".
function parseCommandName(text, botUsername) {
const space = text.indexOf(' ');
const head = space < 0 ? text.slice(1) : text.slice(1, space);
const at = head.indexOf('@');
if (at < 0) return head;
const cmd = head.slice(0, at);
const target = head.slice(at + 1);
if (botUsername && target && target.toLowerCase() !== botUsername.toLowerCase()) return null;
return cmd;
}
+59
View File
@@ -0,0 +1,59 @@
// Raw fetch wrapper for the Telegram Bot API. Replaces node-telegram-bot-api
// (which uses Node-only request/streams and bloats the Worker bundle).
const TELEGRAM_BASE = 'https://api.telegram.org';
export class TelegramApiError extends Error {
constructor(method, status, body) {
super(`telegram ${method} failed: ${status} ${body}`);
this.name = 'TelegramApiError';
this.method = method;
this.status = status;
this.body = body;
}
}
export function createTelegramApi(token) {
const base = `${TELEGRAM_BASE}/bot${token}`;
async function callJson(method, payload) {
const res = await fetch(`${base}/${method}`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(payload),
});
const text = await res.text();
if (!res.ok) throw new TelegramApiError(method, res.status, text);
return JSON.parse(text);
}
// multipart/form-data — for sendDocument. WHATWG FormData/Blob is native to
// Workers; no `form-data` npm dep needed.
async function callMultipart(method, fields, file) {
const form = new FormData();
for (const [k, v] of Object.entries(fields)) form.set(k, String(v));
if (file) {
form.set(
file.field,
new Blob([file.body], { type: file.contentType }),
file.filename,
);
}
const res = await fetch(`${base}/${method}`, { method: 'POST', body: form });
const text = await res.text();
if (!res.ok) throw new TelegramApiError(method, res.status, text);
return JSON.parse(text);
}
return {
getMe: () => callJson('getMe', {}),
sendMessage: (chatId, text, opts = {}) =>
callJson('sendMessage', { chat_id: chatId, text, ...opts }),
sendDocument: (chatId, filename, body, opts = {}) =>
callMultipart(
'sendDocument',
{ chat_id: chatId, ...opts },
{ field: 'document', filename, body, contentType: 'application/json' },
),
};
}
+24 -70
View File
@@ -1,87 +1,41 @@
import 'dotenv/config';
import { createLogger } from './logger.js';
const DEFAULT_DATABASE_NAME = 'store-scraper-bot';
function getEnv(key, fallback = '') {
const v = process.env[key];
return v && v.length > 0 ? v : fallback;
}
function getEnvInt(key, fallback) {
const v = process.env[key];
if (!v) return fallback;
const n = Number.parseInt(v, 10);
return Number.isFinite(n) ? n : fallback;
}
function parseAdminIds(raw) {
return raw
.split(',')
.map((s) => s.trim())
.filter((s) => s.length > 0)
.map((s) => {
try {
return BigInt(s);
} catch {
return null;
}
})
.filter((v) => v !== null);
.map((s) => Number.parseInt(s, 10))
.filter((n) => Number.isFinite(n));
}
// Mirrors Java/Go databaseFromURI: extract db name from connection string.
function databaseFromUri(uri) {
const idx = uri.indexOf('://');
let rest = idx >= 0 ? uri.slice(idx + 3) : uri;
const slash = rest.indexOf('/');
if (slash < 0) return DEFAULT_DATABASE_NAME;
let tail = rest.slice(slash + 1);
const q = tail.indexOf('?');
if (q >= 0) tail = tail.slice(0, q);
tail = tail.trim();
return tail.length > 0 ? tail : DEFAULT_DATABASE_NAME;
}
// Builds config from a Workers `env` binding. Called once per fetch / scheduled
// invocation; cheap.
export function loadConfig(env) {
const required = [
'TELEGRAM_BOT_TOKEN',
'TELEGRAM_BOT_USERNAME',
'TELEGRAM_WEBHOOK_SECRET',
'MONGODB_URI',
'ADMIN_IDS',
];
for (const k of required) {
if (!env[k]) throw new Error(`${k} is required`);
}
export function loadConfig() {
const telegramBotToken = getEnv('TELEGRAM_BOT_TOKEN');
if (!telegramBotToken) throw new Error('TELEGRAM_BOT_TOKEN is required');
const telegramBotUsername = getEnv('TELEGRAM_BOT_USERNAME');
if (!telegramBotUsername) throw new Error('TELEGRAM_BOT_USERNAME is required');
// Java parity: prefer MONGODB_CONNECTION_STRING, fall back to MONGO_URI.
const mongoUri = getEnv('MONGODB_CONNECTION_STRING', getEnv('MONGO_URI', 'mongodb://localhost:27017'));
const mongoDatabase = getEnv('MONGO_DATABASE') || databaseFromUri(mongoUri);
const mongoTimeoutMs = getEnvInt('MONGO_TIMEOUT_SECONDS', 10) * 1000;
const env = getEnv('ENV', 'DEVELOPMENT') === 'PRODUCTION' ? 'PRODUCTION' : 'DEVELOPMENT';
const adminIdsRaw = getEnv('ADMIN_IDS');
if (!adminIdsRaw) throw new Error('ADMIN_IDS is required');
const adminIds = parseAdminIds(adminIdsRaw);
const adminIds = parseAdminIds(env.ADMIN_IDS);
if (adminIds.length === 0) throw new Error('at least one admin ID is required');
const config = {
telegramBotToken,
telegramBotUsername,
mongoUri,
mongoDatabase,
mongoTimeoutMs,
env,
return {
telegramBotToken: env.TELEGRAM_BOT_TOKEN,
telegramBotUsername: env.TELEGRAM_BOT_USERNAME,
telegramWebhookSecret: env.TELEGRAM_WEBHOOK_SECRET,
adminIds,
creatorId: adminIds[0],
sourceCommit: getEnv('SOURCE_COMMIT', 'unknown'),
appCacheSeconds: getEnvInt('APP_CACHE_SECONDS', 600),
numDaysWarningNotUpdated: getEnvInt('NUM_DAYS_WARNING_NOT_UPDATED', 30),
scheduleCheckAppTime: getEnv('SCHEDULE_CHECK_APP_TIME', '0 7 * * *'),
isAdmin: (userId) => adminIds.includes(Number(userId)),
appCacheSeconds: Number(env.APP_CACHE_SECONDS ?? 600),
numDaysWarningNotUpdated: Number(env.NUM_DAYS_WARNING_NOT_UPDATED ?? 30),
timezone: 'Asia/Ho_Chi_Minh',
logger: createLogger(env),
logger: createLogger(),
};
config.isAdmin = (userId) => {
const id = typeof userId === 'bigint' ? userId : BigInt(userId);
return config.adminIds.some((a) => a === id);
};
return config;
}
+66 -36
View File
@@ -1,43 +1,73 @@
import { loadConfig } from './config.js';
import { closeMongoDB, initMongoDB } from './repository/mongodb.js';
import { initAdmin } from './repository/admin-repository.js';
import { createStore } from './repository/store.js';
import { createAppleScraper } from './api/apple-scraper.js';
import { createGoogleScraper } from './api/google-scraper.js';
import { createBot } from './bot/bot.js';
import { createScheduler } from './scheduler/scheduler.js';
import { dispatch } from './bot/dispatch.js';
import { runDailyCheck } from './scheduler/scheduler.js';
async function main() {
const config = loadConfig();
const logger = config.logger;
logger.info({ env: config.env, commit: config.sourceCommit }, 'Starting Store Scraper Bot');
await initMongoDB(config);
await initAdmin(); // Java parity: ensure singleton "common/admin" doc exists.
const appleScraper = createAppleScraper(config);
const googleScraper = createGoogleScraper(config);
const bot = createBot(config, appleScraper, googleScraper);
const scheduler = createScheduler(config, bot.sender, appleScraper, googleScraper);
scheduler.start();
logger.info('Starting Telegram bot polling');
const shutdown = async (signal) => {
logger.info({ signal }, 'Received shutdown signal, stopping bot...');
try {
scheduler.stop();
await bot.telegram.stopPolling();
await closeMongoDB();
} finally {
process.exit(0);
}
};
process.on('SIGINT', () => shutdown('SIGINT'));
process.on('SIGTERM', () => shutdown('SIGTERM'));
// Builds the per-invocation context. Cheap; relies on memoized MongoClient
// inside the store factory chain.
function build(env) {
const config = loadConfig(env);
const store = createStore(env, config.appCacheSeconds);
const appleScraper = createAppleScraper(config, store);
const googleScraper = createGoogleScraper(config, store);
const { sender, commands } = createBot(config, store, appleScraper, googleScraper);
return { config, store, appleScraper, googleScraper, sender, commands };
}
main().catch((err) => {
console.error('Fatal:', err);
process.exit(1);
});
export default {
// Telegram webhook entry. Validates the `secret_token` header, acks fast,
// then dispatches in `ctx.waitUntil` so Telegram doesn't retry on slow Mongo.
async fetch(request, env, ctx) {
if (request.method !== 'POST') {
return new Response('Not found', { status: 404 });
}
let app;
try {
app = build(env);
} catch (err) {
console.log(JSON.stringify({ level: 'error', msg: 'config error', err: err.message }));
return new Response('Server misconfigured', { status: 500 });
}
const secret = request.headers.get('X-Telegram-Bot-Api-Secret-Token');
if (secret !== app.config.telegramWebhookSecret) {
return new Response('Unauthorized', { status: 401 });
}
let update;
try {
update = await request.json();
} catch {
return new Response('Bad request', { status: 400 });
}
if (!update?.message) return new Response('OK');
ctx.waitUntil(
dispatch(update.message, {
sender: app.sender,
commands: app.commands,
config: app.config,
logger: app.config.logger,
}),
);
return new Response('OK');
},
// Daily cron handler. Schedule lives in wrangler.toml.
async scheduled(event, env, ctx) {
let app;
try {
app = build(env);
} catch (err) {
console.log(JSON.stringify({ level: 'error', msg: 'config error', err: err.message }));
return;
}
ctx.waitUntil(
runDailyCheck(app.config, app.store, app.sender, app.appleScraper, app.googleScraper),
);
},
};
+15 -15
View File
@@ -1,16 +1,16 @@
import { pino } from 'pino';
export function createLogger(env) {
const isDev = env !== 'PRODUCTION';
return pino(
isDev
? {
level: 'debug',
transport: {
target: 'pino-pretty',
options: { colorize: true, translateTime: 'SYS:HH:MM:ss', ignore: 'pid,hostname' },
},
}
: { level: 'info' },
);
// Worker-friendly structured logger. Cloudflare Observability indexes JSON
// console output, so we emit one JSON record per call.
export function createLogger() {
function log(level, payloadOrMsg, maybeMsg) {
const isObj = payloadOrMsg !== null && typeof payloadOrMsg === 'object';
const payload = isObj ? payloadOrMsg : {};
const msg = isObj ? maybeMsg ?? '' : payloadOrMsg ?? '';
console.log(JSON.stringify({ level, ts: new Date().toISOString(), msg, ...payload }));
}
return {
debug: (p, m) => log('debug', p, m),
info: (p, m) => log('info', p, m),
warn: (p, m) => log('warn', p, m),
error: (p, m) => log('error', p, m),
};
}
+48 -36
View File
@@ -1,47 +1,59 @@
import { getCollection } from './mongodb.js';
import { ADMIN_ID, adminAddGroup, adminHasGroup, adminRemoveGroup, newAdmin } from '../models/admin.js';
import {
ADMIN_ID,
adminAddGroup,
adminHasGroup,
adminRemoveGroup,
newAdmin,
} from '../models/admin.js';
// Stored in "common" collection at _id="admin" (Java parity).
function collection() {
return getCollection('common');
}
export function createAdminRepository(env) {
function collection() {
return getCollection('common', env);
}
export async function initAdmin() {
const c = collection();
const count = await c.countDocuments({ _id: ADMIN_ID });
if (count > 0) return;
await save(newAdmin());
}
async function init() {
const c = await collection();
const count = await c.countDocuments({ _id: ADMIN_ID });
if (count > 0) return;
await save(newAdmin());
}
export async function getAdmin() {
const doc = await collection().findOne({ _id: ADMIN_ID });
return doc ?? newAdmin();
}
async function getAdmin() {
const c = await collection();
const doc = await c.findOne({ _id: ADMIN_ID });
return doc ?? newAdmin();
}
export async function save(admin) {
await collection().replaceOne({ _id: ADMIN_ID }, admin, { upsert: true });
}
async function save(admin) {
const c = await collection();
await c.replaceOne({ _id: ADMIN_ID }, admin, { upsert: true });
}
export async function addGroup(groupId) {
const admin = await getAdmin();
if (!adminAddGroup(admin, groupId)) return false;
await save(admin);
return true;
}
async function addGroup(groupId) {
const admin = await getAdmin();
if (!adminAddGroup(admin, groupId)) return false;
await save(admin);
return true;
}
export async function removeGroup(groupId) {
const admin = await getAdmin();
if (!adminRemoveGroup(admin, groupId)) return false;
await save(admin);
return true;
}
async function removeGroup(groupId) {
const admin = await getAdmin();
if (!adminRemoveGroup(admin, groupId)) return false;
await save(admin);
return true;
}
export async function hasGroup(groupId) {
const admin = await getAdmin();
return adminHasGroup(admin, groupId);
}
async function hasGroup(groupId) {
const admin = await getAdmin();
return adminHasGroup(admin, groupId);
}
export async function getAllGroups() {
const admin = await getAdmin();
return admin.groups;
async function getAllGroups() {
const admin = await getAdmin();
return admin.groups;
}
return { init, getAdmin, save, addGroup, removeGroup, hasGroup, getAllGroups };
}
+20 -15
View File
@@ -1,22 +1,27 @@
import { getCollection } from './mongodb.js';
import { isAppleAppExpired } from '../models/apple-app.js';
function collection() {
return getCollection('apple_app');
}
export function createAppleAppRepository(env, appCacheSeconds) {
function collection() {
return getCollection('apple_app', env);
}
export async function getAppleApp(appId) {
return collection().findOne({ _id: appId });
}
async function get(appId) {
const c = await collection();
return c.findOne({ _id: appId });
}
export async function saveAppleApp(entry) {
await collection().replaceOne({ _id: entry._id }, entry, { upsert: true });
}
async function save(entry) {
const c = await collection();
await c.replaceOne({ _id: entry._id }, entry, { upsert: true });
}
export async function getCachedAppleApp(appId, appCacheSeconds) {
const entry = await getAppleApp(appId);
if (!entry) return null;
const cacheMillis = appCacheSeconds * 1000;
if (isAppleAppExpired(entry, Date.now(), cacheMillis)) return null;
return entry;
async function getCached(appId) {
const entry = await get(appId);
if (!entry) return null;
if (isAppleAppExpired(entry, Date.now(), appCacheSeconds * 1000)) return null;
return entry;
}
return { get, save, getCached };
}
+20 -15
View File
@@ -1,22 +1,27 @@
import { getCollection } from './mongodb.js';
import { isGoogleAppExpired } from '../models/google-app.js';
function collection() {
return getCollection('google_app');
}
export function createGoogleAppRepository(env, appCacheSeconds) {
function collection() {
return getCollection('google_app', env);
}
export async function getGoogleApp(appId) {
return collection().findOne({ _id: appId });
}
async function get(appId) {
const c = await collection();
return c.findOne({ _id: appId });
}
export async function saveGoogleApp(entry) {
await collection().replaceOne({ _id: entry._id }, entry, { upsert: true });
}
async function save(entry) {
const c = await collection();
await c.replaceOne({ _id: entry._id }, entry, { upsert: true });
}
export async function getCachedGoogleApp(appId, appCacheSeconds) {
const entry = await getGoogleApp(appId);
if (!entry) return null;
const cacheMillis = appCacheSeconds * 1000;
if (isGoogleAppExpired(entry, Date.now(), cacheMillis)) return null;
return entry;
async function getCached(appId) {
const entry = await get(appId);
if (!entry) return null;
if (isGoogleAppExpired(entry, Date.now(), appCacheSeconds * 1000)) return null;
return entry;
}
return { get, save, getCached };
}
+47 -41
View File
@@ -8,52 +8,58 @@ import {
newGroup,
} from '../models/group.js';
function collection() {
return getCollection('group');
}
export function createGroupRepository(env) {
function collection() {
return getCollection('group', env);
}
export async function exists(groupId) {
const count = await collection().countDocuments({ _id: groupIdToKey(groupId) });
return count > 0;
}
async function exists(groupId) {
const c = await collection();
const count = await c.countDocuments({ _id: groupIdToKey(groupId) });
return count > 0;
}
export async function getGroup(groupId) {
const doc = await collection().findOne({ _id: groupIdToKey(groupId) });
return doc ?? newGroup(groupId);
}
async function getGroup(groupId) {
const c = await collection();
const doc = await c.findOne({ _id: groupIdToKey(groupId) });
return doc ?? newGroup(groupId);
}
export async function saveGroup(group) {
await collection().replaceOne({ _id: group._id }, group, { upsert: true });
}
async function saveGroup(group) {
const c = await collection();
await c.replaceOne({ _id: group._id }, group, { upsert: true });
}
export async function initGroup(groupId) {
if (await exists(groupId)) return;
await saveGroup(newGroup(groupId));
}
async function initGroup(groupId) {
if (await exists(groupId)) return;
await saveGroup(newGroup(groupId));
}
export async function deleteGroup(groupId) {
await collection().deleteOne({ _id: groupIdToKey(groupId) });
}
async function deleteGroup(groupId) {
const c = await collection();
await c.deleteOne({ _id: groupIdToKey(groupId) });
}
async function mutateAndSave(groupId, mutator) {
const group = await getGroup(groupId);
if (!mutator(group)) return false;
await saveGroup(group);
return true;
}
async function mutateAndSave(groupId, mutator) {
const group = await getGroup(groupId);
if (!mutator(group)) return false;
await saveGroup(group);
return true;
}
export function addAppleApp(groupId, appId, country) {
return mutateAndSave(groupId, (g) => groupAddAppleApp(g, appId, country));
}
export function removeAppleApp(groupId, appId) {
return mutateAndSave(groupId, (g) => groupRemoveAppleApp(g, appId));
}
export function addGoogleApp(groupId, appId, country) {
return mutateAndSave(groupId, (g) => groupAddGoogleApp(g, appId, country));
}
export function removeGoogleApp(groupId, appId) {
return mutateAndSave(groupId, (g) => groupRemoveGoogleApp(g, appId));
return {
exists,
getGroup,
saveGroup,
initGroup,
deleteGroup,
addAppleApp: (groupId, appId, country) =>
mutateAndSave(groupId, (g) => groupAddAppleApp(g, appId, country)),
removeAppleApp: (groupId, appId) =>
mutateAndSave(groupId, (g) => groupRemoveAppleApp(g, appId)),
addGoogleApp: (groupId, appId, country) =>
mutateAndSave(groupId, (g) => groupAddGoogleApp(g, appId, country)),
removeGoogleApp: (groupId, appId) =>
mutateAndSave(groupId, (g) => groupRemoveGoogleApp(g, appId)),
};
}
+34 -23
View File
@@ -1,30 +1,41 @@
import { MongoClient } from 'mongodb';
let client;
let database;
export async function initMongoDB(config) {
client = new MongoClient(config.mongoUri, {
serverSelectionTimeoutMS: config.mongoTimeoutMs,
});
await client.connect();
await client.db(config.mongoDatabase).command({ ping: 1 });
database = client.db(config.mongoDatabase);
config.logger.info(
{ database: config.mongoDatabase, uri: config.mongoUri },
'Connected to MongoDB',
);
// Thrown when the driver fails to reach Atlas (e.g. paused cluster, network).
// Command handlers catch this and reply with "Internal server error".
export class MongoUnavailable extends Error {
constructor(cause) {
super(`MongoDB unavailable: ${cause.message}`);
this.name = 'MongoUnavailable';
this.cause = cause;
}
}
export async function closeMongoDB() {
if (client) await client.close();
// Memoized per warm Worker isolate. Module-scope is per-isolate in Workers,
// so this caches one Promise<{client, db}> for the isolate's lifetime.
let memoized = null;
export async function getMongo(env) {
if (memoized) return memoized;
memoized = (async () => {
try {
const client = new MongoClient(env.MONGODB_URI, {
serverSelectionTimeoutMS: 5000,
socketTimeoutMS: 10000,
appName: 'js-store-scraper-bot',
});
await client.connect();
// db() with no arg uses the database from the URI path.
const db = client.db();
return { client, db };
} catch (err) {
memoized = null; // allow retry on next request
throw new MongoUnavailable(err);
}
})();
return memoized;
}
export function getDatabase() {
if (!database) throw new Error('MongoDB not initialized');
return database;
}
export function getCollection(name) {
return getDatabase().collection(name);
export async function getCollection(name, env) {
const { db } = await getMongo(env);
return db.collection(name);
}
+15
View File
@@ -0,0 +1,15 @@
import { createAdminRepository } from './admin-repository.js';
import { createGroupRepository } from './group-repository.js';
import { createAppleAppRepository } from './apple-app-repository.js';
import { createGoogleAppRepository } from './google-app-repository.js';
// Single binding point for all repositories. Threads `env` once so command
// handlers don't need to know about the Worker `env` argument.
export function createStore(env, appCacheSeconds) {
return {
admin: createAdminRepository(env),
group: createGroupRepository(env),
appleApp: createAppleAppRepository(env, appCacheSeconds),
googleApp: createGoogleAppRepository(env, appCacheSeconds),
};
}
+104 -124
View File
@@ -1,136 +1,116 @@
import cron from 'node-cron';
import * as adminRepo from '../repository/admin-repository.js';
import * as groupRepo from '../repository/group-repository.js';
import { buildTable, formatNumber, truncateString } from '../util/table.js';
import { daysBetween, formatDateInTz, formatDateTimeInTz, weekdayInTz } from '../util/time.js';
export function createScheduler(config, sender, appleScraper, googleScraper) {
// One-shot daily check, invoked from the Worker `scheduled` handler. The cron
// schedule lives in wrangler.toml ("0 0 * * *" UTC = 7am Asia/Ho_Chi_Minh).
export async function runDailyCheck(config, store, sender, appleScraper, googleScraper) {
const logger = config.logger;
let task;
const now = new Date();
const dow = weekdayInTz(now, config.timezone);
const silent = dow === 0 || dow === 6;
logger.info({ silent }, 'Running daily check job');
function start() {
task = cron.schedule(config.scheduleCheckAppTime, runDailyCheck, {
scheduled: true,
timezone: config.timezone,
});
logger.info(
{ schedule: config.scheduleCheckAppTime, timezone: config.timezone },
'Scheduler started',
);
let groups;
try {
groups = await store.admin.getAllGroups();
} catch (err) {
logger.error({ err: err.message }, 'Failed to get groups');
return;
}
function stop() {
if (task) task.stop();
logger.info('Scheduler stopped');
}
async function runDailyCheck() {
const now = new Date();
const dow = weekdayInTz(now, config.timezone);
const silent = dow === 0 || dow === 6;
logger.info({ silent }, 'Running daily check job');
let groups;
for (const gid of groups) {
try {
groups = await adminRepo.getAllGroups();
await checkGroup(gid, silent, now, config, store, sender, appleScraper, googleScraper);
} catch (err) {
logger.error({ err: err.message }, 'Failed to get groups');
return;
logger.error({ err: err.message, groupId: gid }, 'check group failed');
}
for (const gid of groups) {
try {
await checkGroup(gid, silent, now);
} catch (err) {
logger.error({ err: err.message, groupId: gid }, 'check group failed');
}
}
logger.info({ groupsChecked: groups.length }, 'Daily check job completed');
}
async function checkGroup(groupId, silent, now) {
const group = await groupRepo.getGroup(groupId);
if (group.appleApps.length === 0 && group.googleApps.length === 0) {
logger.info({ groupId }, 'Group has no apps, skipping');
return;
}
const threshold = config.numDaysWarningNotUpdated;
const stale = [];
for (const info of group.appleApps) {
try {
const app = await appleScraper.getApp(info.appId, info.country);
if (!app) continue;
const updatedMs = Date.parse(app.updated);
if (Number.isNaN(updatedMs)) continue;
const days = daysBetween(updatedMs, now.getTime());
if (days > threshold) {
stale.push({
appId: info.appId,
title: app.title,
days,
updated: formatDateInTz(new Date(updatedMs), config.timezone),
score: app.score,
reviews: Number(app.reviews ?? 0),
ratings: Number(app.ratings ?? 0),
isApple: true,
});
}
} catch (err) {
logger.error({ err: err.message, appId: info.appId }, 'Apple fetch failed');
}
}
for (const info of group.googleApps) {
try {
const app = await googleScraper.getApp(info.appId, info.country);
if (!app) continue;
const updatedMs = Number(app.updated);
const days = daysBetween(updatedMs, now.getTime());
if (days > threshold) {
stale.push({
appId: info.appId,
title: app.title,
days,
updated: formatDateInTz(new Date(updatedMs), config.timezone),
score: app.score,
reviews: Number(app.reviews ?? 0),
ratings: Number(app.ratings ?? 0),
isApple: false,
});
}
} catch (err) {
logger.error({ err: err.message, appId: info.appId }, 'Google fetch failed');
}
}
if (stale.length === 0) {
logger.info({ groupId }, 'All apps up-to-date');
return;
}
const message = buildReport(groupId, stale, now);
if (silent) await sender.sendMessageSilent(groupId, message);
else await sender.sendMessage(groupId, message);
}
function buildReport(groupId, apps, now) {
const headers = ['App', 'Store', 'Days', 'Updated', 'Score', 'Reviews', 'Ratings'];
const rows = apps.map((a) => [
truncateString(a.title || '', 30),
a.isApple ? 'Apple' : 'Google',
String(a.days),
a.updated,
Number(a.score ?? 0).toFixed(1),
String(a.reviews),
formatNumber(a.ratings),
]);
return (
`<b>Daily App Check Report</b>\n` +
`Date: ${formatDateTimeInTz(now, config.timezone)}\n` +
`Group: <code>${groupId}</code>\n` +
`Apps not updated in &gt;${config.numDaysWarningNotUpdated} days: <b>${apps.length}</b>\n\n` +
`<pre>${buildTable(headers, rows)}</pre>`
);
}
return { start, stop, runDailyCheck };
logger.info({ groupsChecked: groups.length }, 'Daily check job completed');
}
async function checkGroup(groupId, silent, now, config, store, sender, appleScraper, googleScraper) {
const logger = config.logger;
const group = await store.group.getGroup(groupId);
if (group.appleApps.length === 0 && group.googleApps.length === 0) {
logger.info({ groupId }, 'Group has no apps, skipping');
return;
}
const threshold = config.numDaysWarningNotUpdated;
const stale = [];
for (const info of group.appleApps) {
try {
const app = await appleScraper.getApp(info.appId, info.country);
if (!app) continue;
const updatedMs = Date.parse(app.updated);
if (Number.isNaN(updatedMs)) continue;
const days = daysBetween(updatedMs, now.getTime());
if (days > threshold) {
stale.push({
appId: info.appId,
title: app.title,
days,
updated: formatDateInTz(new Date(updatedMs), config.timezone),
score: app.score,
reviews: Number(app.reviews ?? 0),
ratings: Number(app.ratings ?? 0),
isApple: true,
});
}
} catch (err) {
logger.error({ err: err.message, appId: info.appId }, 'Apple fetch failed');
}
}
for (const info of group.googleApps) {
try {
const app = await googleScraper.getApp(info.appId, info.country);
if (!app) continue;
const updatedMs = Number(app.updated);
const days = daysBetween(updatedMs, now.getTime());
if (days > threshold) {
stale.push({
appId: info.appId,
title: app.title,
days,
updated: formatDateInTz(new Date(updatedMs), config.timezone),
score: app.score,
reviews: Number(app.reviews ?? 0),
ratings: Number(app.ratings ?? 0),
isApple: false,
});
}
} catch (err) {
logger.error({ err: err.message, appId: info.appId }, 'Google fetch failed');
}
}
if (stale.length === 0) {
logger.info({ groupId }, 'All apps up-to-date');
return;
}
const message = buildReport(groupId, stale, now, config);
if (silent) await sender.sendMessageSilent(groupId, message);
else await sender.sendMessage(groupId, message);
}
function buildReport(groupId, apps, now, config) {
const headers = ['App', 'Store', 'Days', 'Updated', 'Score', 'Reviews', 'Ratings'];
const rows = apps.map((a) => [
truncateString(a.title || '', 30),
a.isApple ? 'Apple' : 'Google',
String(a.days),
a.updated,
Number(a.score ?? 0).toFixed(1),
String(a.reviews),
formatNumber(a.ratings),
]);
return (
`<b>Daily App Check Report</b>\n` +
`Date: ${formatDateTimeInTz(now, config.timezone)}\n` +
`Group: <code>${groupId}</code>\n` +
`Apps not updated in &gt;${config.numDaysWarningNotUpdated} days: <b>${apps.length}</b>\n\n` +
`<pre>${buildTable(headers, rows)}</pre>`
);
}
+32
View File
@@ -0,0 +1,32 @@
name = "js-store-scraper-bot"
main = "src/index.js"
compatibility_date = "2025-10-01"
# nodejs_compat_v2 enables node:net + node:tls so the official `mongodb` driver
# can open a TCP socket to Atlas. v1 vs v2 are alternatives, not additive.
compatibility_flags = ["nodejs_compat_v2"]
[vars]
APP_CACHE_SECONDS = "600"
NUM_DAYS_WARNING_NOT_UPDATED = "30"
# Daily check job. Cloudflare cron is UTC.
# 0 UTC = 7am Asia/Ho_Chi_Minh (UTC+7).
[triggers]
crons = ["0 0 * * *"]
# Workers Observability — captures console.* logs and request metadata in the
# Cloudflare dashboard. 200k events/day on the Free plan.
[observability]
enabled = true
head_sampling_rate = 1
[observability.logs]
enabled = true
invocation_logs = true
# Secrets (set via `wrangler secret put <name>`, NOT in this file):
# TELEGRAM_BOT_TOKEN — bot token from @BotFather
# TELEGRAM_BOT_USERNAME — bot username (without @)
# TELEGRAM_WEBHOOK_SECRET — random ≥32 chars; validates the X-Telegram-Bot-Api-Secret-Token header
# MONGODB_URI — full SRV string, db inferred from URI path
# ADMIN_IDS — comma-separated Telegram user IDs