feat: scaffold plug-n-play telegram bot on cloudflare workers

grammY-based bot with a module plugin system loaded from the MODULES env
var. Three command visibility levels (public/protected/private) share a
unified command namespace with conflict detection at registry build.

- 4 initial modules (util, wordle, loldle, misc); util fully implemented,
  others are stubs proving the plugin system end-to-end
- util: /info (chat/thread/sender ids) + /help (pure renderer over the
  registry, HTML parse mode, escapes user-influenced strings)
- KVStore interface with CFKVStore and a per-module prefixing factory;
  getJSON/putJSON convenience helpers; other backends drop in via one file
- Webhook at POST /webhook with secret-token validation via grammY's
  webhookCallback; no admin HTTP surface
- Post-deploy register script (npm run deploy = wrangler deploy && node
  --env-file=.env.deploy scripts/register.js) for setWebhook and
  setMyCommands; --dry-run flag for preview
- 56 vitest unit tests across 7 suites covering registry, db wrapper,
  dispatcher, help renderer, validators, and HTML escaper
- biome for lint + format; phased implementation plan under plans/
2026-04-11 09:49:06 +07:00
parent e76ad8c0ee
commit c4314f21df
51 changed files with 6928 additions and 1 deletions


@@ -0,0 +1,56 @@
# Researcher Report: Cloudflare Workers KV basics
**Date:** 2026-04-11
**Scope:** KV API surface, wrangler binding, limits relevant to plugin framework.
## API surface (KVNamespace binding)
```js
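// `type` defaults to "text"; the other accepted values are "json", "arrayBuffer", "stream".
await env.KV.get(key, { type: "text" });
// `expirationTtl` is seconds from now; `expiration` is an absolute Unix time in seconds.
await env.KV.put(key, value, { expirationTtl, expiration, metadata });
await env.KV.delete(key);
await env.KV.list({ prefix, limit, cursor });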
```
### `list()` shape
```js
{
  keys: [{ name, expiration?, metadata? }, ...],
  list_complete: boolean,
  cursor: string, // present when list_complete === false
}
```
- Max `limit` per call: **1000** (also the default).
- Pagination via `cursor`. Loop until `list_complete === true`.
- Prefix filter is server-side — efficient for per-module namespacing (`wordle:` prefix).
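
The pagination loop described above can be sketched as follows. This is a minimal sketch rather than the project's code: `listAllKeys` is a hypothetical helper name, and `kv` stands for any object with a KVNamespace-compatible `list()` method, such as `env.KV`.

```js
// Collect every key name under a prefix by following KV's cursor pagination.
// `kv` is assumed to expose list({ prefix, limit, cursor }) like a KVNamespace.
async function listAllKeys(kv, prefix) {
  const names = [];
  let cursor; // undefined on the first call

  while (true) {
    const page = await kv.list({ prefix, cursor, limit: 1000 });
    for (const key of page.keys) names.push(key.name);
    if (page.list_complete) break; // no further pages
    cursor = page.cursor; // only present while list_complete === false
  }
  return names;
}
```

Collecting everything in memory is fine for small per-module key sets; a module with many keys would likely process each page as it arrives instead.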
## Limits that shape the module API
| Limit | Value | Impact on design |
|---|---|---|
| Write/sec **per key** | 1 | Counters / leaderboards must avoid hot keys. Plugin authors must know this. Document in phase-03. |
| Value size | 25 MiB | Non-issue for bot state. |
| Key size | 512 bytes | Prefixing adds ~10 bytes — no issue. |
| Consistency | Eventual (up to ~60s globally) | Read-after-write may not see update immediately from a different edge. OK for game state, NOT OK for auth sessions. |
| `list()` | Eventually consistent, max 1000/call | Paginate. |
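
The key-size limit is expressed in bytes, not characters, so a guard has to measure the prefixed key after UTF-8 encoding. A minimal sketch, assuming a hypothetical `assertKeyFits` helper and the `<module>:` prefix convention used in this report:

```js
const MAX_KEY_BYTES = 512; // KV key size limit from the table above

// Hypothetical guard: measures the final key (module prefix included) in
// UTF-8 bytes and returns it if it fits.
function assertKeyFits(moduleName, key) {
  const fullKey = `${moduleName}:${key}`;
  const bytes = new TextEncoder().encode(fullKey).length;
  if (bytes > MAX_KEY_BYTES) {
    throw new Error(`KV key is ${bytes} bytes; the limit is ${MAX_KEY_BYTES}`);
  }
  return fullKey;
}
```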
## wrangler.toml binding
```toml
[[kv_namespaces]]
binding = "KV"
id = "<namespace-id-from-dashboard-or-wrangler-kv-create>"
preview_id = "<separate-id-for-wrangler-dev>"
```
- Access in code: `env.KV`.
- `preview_id` lets `wrangler dev` use a separate namespace — recommended.
- Create namespace: `wrangler kv namespace create miti99bot-kv` (prints IDs to paste).
## Design implications for the DB abstraction
- Interface must support `get / put / delete / list({ prefix })` — all four map 1:1 to KV.
- Namespaced factory auto-prefixes with `<module>:`. `list()` from a module only sees its own keys because the module prefix is applied on top of the requested prefix (e.g. module `wordle` calls `list({ prefix: "games:" })` → final KV prefix becomes `wordle:games:`).
- Return shape normalization: wrap KV's `list()` output in a simpler `{ keys: string[], cursor?: string, done: boolean }` to hide KV-specific metadata fields. Modules that need metadata are not served in v1; the interface can be extended later.
- `get` default type: return a string. Modules either do their own JSON parsing or use a `getJSON/putJSON` helper exposed by the store.
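
The implications above could be combined into a single wrapper along these lines. This is a sketch under assumptions, not the actual implementation: `createModuleStore` is a hypothetical name, and stripping the module prefix from returned key names is a design choice the report does not specify.

```js
// Hypothetical factory: wraps a KVNamespace-like object so that every key a
// module touches is transparently prefixed with "<module>:".
function createModuleStore(kv, moduleName) {
  const ns = `${moduleName}:`;

  return {
    get: (key) => kv.get(ns + key, { type: "text" }),
    put: (key, value, options) => kv.put(ns + key, value, options),
    delete: (key) => kv.delete(ns + key),

    // The module prefix is applied on top of the requested prefix, and KV's
    // result is normalized to { keys, cursor?, done }.
    async list({ prefix = "", limit, cursor } = {}) {
      const page = await kv.list({ prefix: ns + prefix, limit, cursor });
      const result = {
        // Assumption: key names are handed back without the module prefix.
        keys: page.keys.map((k) => k.name.slice(ns.length)),
        done: page.list_complete,
      };
      if (!page.list_complete) result.cursor = page.cursor;
      return result;
    },

    // JSON helpers layered on the string-based get/put.
    async getJSON(key) {
      const raw = await kv.get(ns + key, { type: "text" });
      return raw === null ? null : JSON.parse(raw);
    },
    putJSON: (key, value, options) =>
      kv.put(ns + key, JSON.stringify(value), options),
  };
}
```

With this shape, `createModuleStore(env.KV, "wordle").list({ prefix: "games:" })` queries KV with the prefix `wordle:games:` and the module never sees another module's keys.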
## Unresolved questions
- Do we need `metadata` and `expirationTtl` passthrough in v1? **Recommendation: yes for `expirationTtl`** (useful for easter-egg cooldowns), **no for metadata** (YAGNI).
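
To illustrate why `expirationTtl` passthrough is useful, here is a minimal cooldown sketch. `tryStartCooldown` is a hypothetical helper, and `store` is assumed to be any wrapper whose `put()` forwards its options to KV. KV does not accept an `expirationTtl` below 60 seconds, so shorter cooldowns are rounded up here.

```js
// KV's minimum accepted expirationTtl, in seconds.
const MIN_TTL_SECONDS = 60;

// Hypothetical helper: returns true if the cooldown was started, false if one
// is already active. The key disappears on its own once the TTL elapses.
async function tryStartCooldown(store, key, seconds) {
  if ((await store.get(key)) !== null) return false; // still cooling down
  const ttl = Math.max(seconds, MIN_TTL_SECONDS);
  await store.put(key, "1", { expirationTtl: ttl });
  return true;
}
```

Because KV is eventually consistent, this check is best-effort: two requests served from different edges shortly after each other may both start the cooldown.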


@@ -0,0 +1,57 @@
# Researcher Report: grammY on Cloudflare Workers
**Date:** 2026-04-11
**Scope:** grammY entry point, webhook adapter, secret-token verification, setMyCommands usage.
## Key findings
### Adapter
- Use **`"cloudflare-mod"`** adapter for ES module (fetch handler) Workers. Source: grammY `src/convenience/frameworks.ts`.
- The legacy `"cloudflare"` adapter targets service-worker style Workers. Do NOT use — CF has moved on to module workers.
- Import path (npm, not Deno): `import { Bot, webhookCallback } from "grammy";`
### Minimal fetch handler
```js
import { Bot, webhookCallback } from "grammy";

export default {
  async fetch(request, env, ctx) {
    const bot = new Bot(env.TELEGRAM_BOT_TOKEN);
    // ... register handlers
    const handle = webhookCallback(bot, "cloudflare-mod", {
      secretToken: env.TELEGRAM_WEBHOOK_SECRET,
    });
    return handle(request);
  },
};
```
### Secret-token verification
- `webhookCallback` accepts `secretToken` in its `WebhookOptions`. When set, grammY validates the incoming `X-Telegram-Bot-Api-Secret-Token` header and rejects mismatches with 401.
- **No need** to manually read the header — delegate to grammY.
- The same secret must be passed to Telegram when calling `setWebhook` (`secret_token` field).
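
The registration side of the secret can be sketched as below. `buildSetWebhookRequest` is a hypothetical helper that only assembles the arguments for `fetch()`; the endpoint and the `url` / `secret_token` fields are Telegram's `setWebhook` parameters, and the character check reflects Telegram's rule that the secret may only contain `A-Z`, `a-z`, `0-9`, `_` and `-` (1 to 256 characters).

```js
// Characters and length Telegram accepts for secret_token.
const SECRET_TOKEN_PATTERN = /^[A-Za-z0-9_-]{1,256}$/;

// Hypothetical helper: builds the fetch() arguments that register the webhook
// with the same secret the Worker passes to webhookCallback.
function buildSetWebhookRequest(botToken, webhookUrl, secretToken) {
  if (!SECRET_TOKEN_PATTERN.test(secretToken)) {
    throw new Error("secret_token must be 1-256 chars of A-Z, a-z, 0-9, _ or -");
  }
  return {
    url: `https://api.telegram.org/bot${botToken}/setWebhook`,
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ url: webhookUrl, secret_token: secretToken }),
    },
  };
}

// Usage (performs a real network call, so it is left commented out):
// const { url, init } = buildSetWebhookRequest(token, "https://example.workers.dev/webhook", secret);
// const res = await fetch(url, init);
```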
### Bot instantiation cost
- Calling `new Bot()` on every request is acceptable on Workers (no persistent state between requests anyway), but a module-scope instance is reused across warm invocations and is preferable. Env bindings are not available at module load, so the instance cannot be created at the top level; it must be created lazily inside `fetch`. Recommended pattern: memoize `Bot` in a module-scope variable that is initialized on the first request.
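
The memoization pattern can be sketched as follows. To keep the example runnable without grammY, the constructor call is injected as a hypothetical `createBot` parameter; in the Worker it would be something like `(token) => new Bot(token)` followed by handler registration.

```js
// Module-scope cache: survives across requests while the isolate stays warm.
let cachedBot = null;

// Hypothetical accessor: builds the bot on the first request, when `env` is
// finally available, and returns the same instance afterwards.
function getBot(env, createBot) {
  if (cachedBot === null) {
    cachedBot = createBot(env.TELEGRAM_BOT_TOKEN);
  }
  return cachedBot;
}
```

Inside `fetch`, the handler would call `getBot(env, ...)` instead of `new Bot(...)`, so handler registration runs once per isolate rather than once per update.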
### setMyCommands
- Call via `bot.api.setMyCommands([{ command, description }, ...])`.
- Should be called **on demand**, not on every webhook request (rate-limit risk, latency). Two options:
1. Dedicated admin HTTP route (e.g. `POST /admin/setup`) guarded by a second secret. Runs on demand.
2. One-shot `wrangler` script. Adds tooling complexity.
- **Recommendation:** admin route. Keeps deploy flow in one place (`wrangler deploy` + `curl`). No extra script.
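
Whichever trigger is chosen, the array passed to `bot.api.setMyCommands` has to be assembled from the bot's commands. A minimal sketch, assuming a hypothetical entry shape of `{ name, description, visibility }` and assuming that only `public` commands should be advertised; the length and character checks follow Telegram's `BotCommand` rules (command: 1 to 32 lowercase letters, digits or underscores; description: 1 to 256 characters).

```js
// Telegram's constraint on BotCommand.command.
const COMMAND_PATTERN = /^[a-z0-9_]{1,32}$/;

// Hypothetical converter from registry-style entries to setMyCommands input.
function toBotCommands(entries) {
  return entries
    .filter((entry) => entry.visibility === "public") // assumption: hide the rest
    .map((entry) => {
      if (!COMMAND_PATTERN.test(entry.name)) {
        throw new Error(`invalid command name: ${entry.name}`);
      }
      if (entry.description.length < 1 || entry.description.length > 256) {
        throw new Error(`description for ${entry.name} must be 1-256 chars`);
      }
      return { command: entry.name, description: entry.description };
    });
}
```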
### Init flow
- `bot.init()` is NOT required if you only use `webhookCallback`; grammY handles lazy init.
- For `/admin/setup` that directly calls `bot.api.*`, call `await bot.init()` once to populate `bot.botInfo`.
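
The admin route from option 1 might look roughly like this. It is a sketch under assumptions: `handleAdminSetup` and the `x-admin-secret` header name are hypothetical, and `bot` is passed in so the example runs without grammY (any object exposing `init()` and `api.setMyCommands()` works).

```js
// Hypothetical handler for POST /admin/setup, guarded by ADMIN_SECRET.
async function handleAdminSetup(request, env, bot, commands) {
  if (request.method !== "POST") {
    return new Response("method not allowed", { status: 405 });
  }
  // Plain string comparison, not constant-time; the header name is an assumption.
  const provided = request.headers.get("x-admin-secret");
  if (!env.ADMIN_SECRET || provided !== env.ADMIN_SECRET) {
    return new Response("unauthorized", { status: 401 });
  }
  await bot.init(); // populates bot.botInfo, as noted above
  await bot.api.setMyCommands(commands);
  return new Response("ok");
}
```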
## Resolved technical answers
| Question | Answer |
|---|---|
| Adapter string | `"cloudflare-mod"` |
| Import | `import { Bot, webhookCallback } from "grammy"` |
| Secret verify | pass `secretToken` in `webhookCallback` options |
| setMyCommands trigger | admin HTTP route guarded by separate secret |
## Unresolved questions
- None blocking. grammY version pin: recommend `^1.30.0` or the latest stable release at implementation time; phase-01 should run `npm view grammy version` to confirm.


@@ -0,0 +1,68 @@
# Researcher Report: wrangler.toml, secrets, and MODULES env var
**Date:** 2026-04-11
**Scope:** how to declare secrets vs vars, local dev via `.dev.vars`, and the list-env-var question.
## Secrets vs vars
| Kind | Where declared | Deployed via | Local dev | Use for |
|---|---|---|---|---|
| **Secret** | NOT in wrangler.toml | `wrangler secret put NAME` | `.dev.vars` file (gitignored) | `TELEGRAM_BOT_TOKEN`, `TELEGRAM_WEBHOOK_SECRET`, `ADMIN_SECRET` |
| **Var** | `[vars]` in wrangler.toml | `wrangler deploy` | `.dev.vars` overrides | `MODULES`, non-sensitive config |
- Both appear on `env.NAME` at runtime — indistinguishable in code.
- `.dev.vars` is a dotenv file (`KEY=value` lines, no quotes required). Gitignore it.
- `wrangler secret put` encrypts the value into CF's secret store; it cannot be read back after it is set.
## `[vars]` value types
- Per wrangler docs, `[vars]` accepts **strings and JSON objects**, not top-level arrays.
- Therefore `MODULES` must be a **comma-separated string**:
```toml
[vars]
MODULES = "util,wordle,loldle,misc"
```
- Code parses with `env.MODULES.split(",").map(s => s.trim()).filter(Boolean)`.
- **Rejected alternative:** JSON-string `MODULES = '["util","wordle"]'` + `JSON.parse`. More ceremony, no benefit, looks ugly in TOML. Stick with CSV.
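
The CSV parsing above, wrapped in a function for illustration. `parseModules` is a hypothetical name; unlike the one-liner, this sketch also tolerates a missing value by treating it as an empty list.

```js
// Split a comma-separated MODULES value into clean module names.
// Whitespace around names and empty segments (e.g. trailing commas) are dropped.
function parseModules(raw) {
  return String(raw ?? "")
    .split(",")
    .map((s) => s.trim())
    .filter(Boolean);
}
```

For example, `parseModules(" util, wordle ,,loldle ")` yields `["util", "wordle", "loldle"]`.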
## Full wrangler.toml template (proposed)
```toml
name = "miti99bot"
main = "src/index.js"
compatibility_date = "2026-04-01"
# No nodejs_compat — grammY + our code is pure Web APIs. Smaller bundle.
[vars]
MODULES = "util,wordle,loldle,misc"
[[kv_namespaces]]
binding = "KV"
id = "REPLACE_ME"
preview_id = "REPLACE_ME"
# Secrets (set via `wrangler secret put`):
# TELEGRAM_BOT_TOKEN
# TELEGRAM_WEBHOOK_SECRET
# ADMIN_SECRET
```
## Local dev flow
1. `.dev.vars` contains:
```
TELEGRAM_BOT_TOKEN=xxx
TELEGRAM_WEBHOOK_SECRET=yyy
ADMIN_SECRET=zzz
```
2. `wrangler dev` picks up `.dev.vars` + `[vars]` + `preview_id` KV.
3. For local Telegram testing, expose via `cloudflared tunnel` or ngrok, then `setWebhook` to the public URL.
## Secrets setup commands (for README/phase-09)
```bash
wrangler secret put TELEGRAM_BOT_TOKEN
wrangler secret put TELEGRAM_WEBHOOK_SECRET
wrangler secret put ADMIN_SECRET
wrangler kv namespace create miti99bot-kv
wrangler kv namespace create miti99bot-kv --preview
```
## Unresolved questions
- Should `MODULES` default to a hard-coded list in code if the env var is empty? **Recommendation:** no — fail loudly on empty/missing MODULES so misconfiguration is obvious. `/info` still works since `util` is always in the list.
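
The fail-loud recommendation could be enforced with a small guard that runs before any module is loaded. A minimal sketch, assuming a hypothetical `assertModulesConfigured` helper that receives the already-parsed list of names:

```js
// Hypothetical guard: refuses to continue with an empty module list so that a
// missing or blank MODULES var surfaces immediately instead of silently
// producing a bot with no commands.
function assertModulesConfigured(names) {
  if (!Array.isArray(names) || names.length === 0) {
    throw new Error(
      'MODULES is empty or missing; set it in wrangler.toml, e.g. MODULES = "util,wordle"',
    );
  }
  return names;
}
```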