mirror of
https://github.com/tiennm99/miti99bot-js.git
synced 2026-05-16 13:53:40 +00:00
6f0b5ff0a8
Code/config slice of plan phase 01 (operator-only steps for cluster
provisioning, secrets, and runtime smoke tests deferred to user).
- wrangler.toml: add `compatibility_flags = ["nodejs_compat_v2"]`
(compatibility_date `2025-10-01` already satisfies ≥ 2025-03-20)
- .env.deploy.example: add `MONGODB_URI` placeholder with mirror-protocol note
- scripts/check-secret-leaks.js: lint that fails build on `console.log(env.<SECRET>)`
for MONGODB_URI / TELEGRAM_BOT_TOKEN / TELEGRAM_WEBHOOK_SECRET / ADMIN_TOKEN
- package.json: install mongodb@^6.7.0 (resolved 6.21.0); wire secret-leak
check into `npm run lint`
- docs/using-mongodb.md: operational runbook (cluster spec, free-tier ceiling,
auto-pause behavior, network access permanence, rollback, rotation)
Bundle-size HARD GATE: PASS. Probe with `import { MongoClient }` measures
226 KiB gzipped (3 MiB Free cap, 92% headroom) — nodejs_compat_v2 provides
node:net/tls/crypto from runtime so transitive deps stay unbundled.
CPU-time gate and auto-pause behavior gate require real Atlas access;
deferred to operator (see docs/using-mongodb.md for procedure).
503/503 vitest tests still pass.
143 lines
5.1 KiB
Markdown
# Using MongoDB Atlas

Operational runbook for the MongoDB Atlas backend introduced by `plans/260425-1945-mongodb-atlas-migration/`.

## Cluster

| Field | Value |
|---|---|
| Provider | MongoDB Atlas |
| Tier | M0 Free |
| Region | `aws-ap-southeast-1` (Singapore) |
| Cluster name | `miti99bot-prod` (operator confirms) |
| Database | `miti99bot` |
| DB user | `miti99bot-worker` (`readWrite@miti99bot`) |

Connection string format:

```
mongodb+srv://miti99bot-worker:<pass>@<host>/miti99bot?retryWrites=true&w=majority
```

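As an illustration of the format above, the URI could be assembled as in the sketch below. `buildMongoUri` is a hypothetical helper, not something that exists in the repo, and the host and password are placeholders. The one detail worth showing is percent-encoding the credentials, which matters if a generated password ever contains URI delimiters such as `@`, `:` or `/`.

```javascript
// Hypothetical helper (not part of the repo): assembles the URI in the
// format documented above.
function buildMongoUri({ user, pass, host, db }) {
  // Percent-encode credentials so characters such as "@", ":" or "/"
  // cannot be mistaken for URI delimiters.
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(pass)}`;
  return `mongodb+srv://${credentials}@${host}/${db}?retryWrites=true&w=majority`;
}

const uri = buildMongoUri({
  user: "miti99bot-worker",
  pass: "p@ss/word", // placeholder, not a real secret
  host: "cluster0.example.mongodb.net", // placeholder host
  db: "miti99bot",
});
console.log(uri);
```
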
Stored in two places (must match):

1. CF Worker secret: `wrangler secret put MONGODB_URI`
2. `.env.deploy` (gitignored, used by local backfill / verify scripts)

Same secret-mirror protocol as `TELEGRAM_BOT_TOKEN`.

## Free-tier ceiling

- 512 MB storage (data + indexes)
- 500 max concurrent connections
- ~100 ops/sec sustained (no daily cap)
- No backups, single region, no PITR
- Auto-pauses after 30 days of zero ops

Upgrade path: **Flex Tier $8–$30/month** (M2/M5 deprecated as of 2026).

## Auto-pause

After 30 days idle the cluster pauses. On the first request after a pause:

- Driver throws `MongoServerSelectionError` after `serverSelectionTimeoutMS` (5s).
- Worker code (see `src/db/mongo-client.js`, lands Phase 02) catches the error and returns 503 with `Retry-After: 30`.
- Cluster auto-wakes within 30–60s on attempted connection.

The bot has 6+ daily crons; any cron that writes to Mongo prevents the pause. Phase 08 confirms.

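The first-request-after-pause handling described in this section might look roughly like the sketch below. It is not the Phase 02 implementation: `pausedClusterResponse` and the response body text are hypothetical, and the error is matched by its `name` property so the example runs without the `mongodb` driver installed.

```javascript
// Hypothetical helper; the real handling is planned for src/db/mongo-client.js.
const RETRY_AFTER_SECONDS = 30;

function pausedClusterResponse(err) {
  // The mongodb driver sets `name` on its error classes, so a name check
  // avoids importing the driver just for an instanceof test.
  if (err && err.name === "MongoServerSelectionError") {
    return new Response("Database temporarily unavailable", {
      status: 503,
      headers: { "Retry-After": String(RETRY_AFTER_SECONDS) },
    });
  }
  return null; // not a server-selection failure: the caller should rethrow
}

// Simulated driver error, for demonstration only.
const simulated = Object.assign(new Error("server selection timed out"), {
  name: "MongoServerSelectionError",
});
const res = pausedClusterResponse(simulated);
console.log(res.status, res.headers.get("Retry-After"));
```

Returning `null` for unrelated errors keeps the 503 path narrow, so genuine bugs still surface as failures rather than as "retry later" responses.
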
## Network access

The Atlas IP access list is `0.0.0.0/0` because Cloudflare Workers do NOT have static egress IPs on the Free or basic Paid plans. Only auth (SCRAM-SHA-256) + TLS gate connections.

**Permanent risk** unless upgrading to the CF Workers paid static-egress IP add-on (~$10/mo).

Mitigations:

- DB user has `readWrite` on one db only (NOT `dbAdmin` / `clusterAdmin`).
- Password ≥32 chars random.
- Rotate quarterly.
- Atlas free-tier email alerts configured for cluster unavailability + connections > 400.

## Bundle gate (Phase 01 result)

Measured with `npx wrangler deploy --dry-run` on a minimal probe importing `MongoClient`:

| Metric | Value | Cap (Free) | Cap (Paid) |
|---|---|---|---|
| Compressed (gzip) | **226 KiB** | 3 MiB | 10 MiB |
| Raw (minified) | 1.74 MiB | n/a | n/a |
| On-disk (uncompressed) | 3.9 MiB | n/a | n/a |

Pass on both plans with **>92% headroom**. `nodejs_compat_v2` provides `node:net`/`node:tls`/`node:crypto` from the runtime, so the driver's transitive deps are not bundled.

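The headroom figure follows directly from the measured size and the plan caps. The sketch below just reproduces that arithmetic; `headroomPercent` is a hypothetical helper name.

```javascript
const KIB_PER_MIB = 1024;

// Percentage of the cap left unused by a bundle of `usedKiB`.
function headroomPercent(usedKiB, capMiB) {
  const capKiB = capMiB * KIB_PER_MIB;
  return (1 - usedKiB / capKiB) * 100;
}

const free = headroomPercent(226, 3); // Free plan: 3 MiB compressed cap
const paid = headroomPercent(226, 10); // Paid plan: 10 MiB compressed cap
console.log(free.toFixed(1), paid.toFixed(1)); // prints "92.6 97.8"
```
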
## CPU-time gate (Phase 01, operator-run)

Requires real Atlas + `wrangler dev`. Procedure:

1. Add a temporary `/__mongo-ping` route that connects + runs `db.runCommand({ping:1})` + returns `{wall_ms}`.
2. Run 5+ cold cycles (10-min spaced).
3. Inspect CF dashboard CPU column for each invocation.
4. **Hard gate**: if any cold-start CPU time approaches 10 ms (the Workers Free plan per-invocation CPU limit), abort migration. Escalate to paid plan or pivot via `phase-07-alt-pivot.md`.
5. Record cold-ping P95 wall-clock as `BASELINE_COLD_PING_MS` here:

```
BASELINE_COLD_PING_MS = <fill after measurement>
```

Phase 06 derives the abort threshold from this value: `2.5 × BASELINE_COLD_PING_MS`.

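A minimal sketch of how the baseline and the Phase 06 threshold could be computed from the recorded cold pings follows. The runbook does not say which percentile method to use, so nearest-rank is an assumption here; the function names are hypothetical and the sample values are illustrative, not measurements.

```javascript
// Nearest-rank P95: the smallest sample such that at least 95% of all
// samples are less than or equal to it.
function p95(samples) {
  if (samples.length === 0) throw new RangeError("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil(0.95 * sorted.length);
  return sorted[rank - 1];
}

// Phase 06 rule from this runbook: 2.5 × BASELINE_COLD_PING_MS.
function abortThresholdMs(baselineColdPingMs) {
  return 2.5 * baselineColdPingMs;
}

// Illustrative wall-clock values in ms, NOT real measurements.
const coldPingsMs = [410, 380, 455, 520, 395];
const baseline = p95(coldPingsMs);
console.log(baseline, abortThresholdMs(baseline)); // prints "520 1300"
```

With only five cold cycles, nearest-rank P95 is simply the slowest sample, so more cycles give a more meaningful baseline.
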
## Auto-pause behavior gate (Phase 01, operator-run)

In Atlas UI, manually pause the cluster, then hit `/__mongo-ping`. Confirm:

- Driver throws within 5s (does NOT hang indefinitely).
- Error class is `MongoServerSelectionError` (or driver subclass).
- Phase 02 `getDb()` catches this and surfaces a 503.

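The first two checks above could be automated with a small probe like the sketch below. Everything here is hypothetical: `probePausedCluster` is not in the repo, `ping` stands in for whatever function connects and runs the ping command, and the default limit of 6000 ms assumes a 1 s grace margin over the 5 s `serverSelectionTimeoutMS`.

```javascript
// Runs `ping` and reports whether it connected, threw, or hung past the limit.
async function probePausedCluster(ping, limitMs = 6000) {
  const started = Date.now();
  let timer;
  const deadline = new Promise((resolve) => {
    timer = setTimeout(() => resolve({ outcome: "hung" }), limitMs);
  });
  const attempt = ping().then(
    () => ({ outcome: "connected" }),
    (err) => ({ outcome: "threw", errorName: err.name }),
  );
  // Whichever settles first wins; a hang is reported instead of awaited forever.
  const result = await Promise.race([attempt, deadline]);
  clearTimeout(timer);
  return { ...result, elapsedMs: Date.now() - started };
}

// Stand-in for a ping against a paused cluster (simulated, not the driver).
const fakePing = async () => {
  throw Object.assign(new Error("paused"), { name: "MongoServerSelectionError" });
};

probePausedCluster(fakePing).then((r) => console.log(r.outcome, r.errorName));
```

The gate passes when the outcome is `threw` and `errorName` is `MongoServerSelectionError`; `hung` or `connected` both mean the check failed.
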
## Node API surface

`src/` (the Worker) imports zero `node:*` modules today. `nodejs_compat_v2` is enabled solely for the `mongodb` driver:

| Module | Used by Worker? | Used by scripts/? |
|---|---|---|
| `node:fs` | no | yes (build/scrape/migrate) |
| `node:path` | no | yes |
| `node:child_process` | no | yes (migrate.js) |
| `node:net` | indirectly (via mongodb) | no |
| `node:tls` | indirectly (via mongodb) | no |
| `node:crypto` | indirectly (via mongodb) | no |
| `process.env` | no | yes (register.js) |
| `Buffer` | no | no |

Risk: minimal. No existing module relies on the absence of these globals.

## Rollback

If migration is abandoned at any phase before cutover:

1. `wrangler secret delete MONGODB_URI`
2. Revert `wrangler.toml`: remove `compatibility_flags = ["nodejs_compat_v2"]`.
3. `npm uninstall mongodb`.
4. `npm run deploy`; the bot continues on KV/D1 unchanged.
5. (Optional) Delete Atlas cluster from UI.

`scripts/check-secret-leaks.js` should stay because it covers other secrets too.

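For readers unfamiliar with the secret-leak check, the idea can be sketched as below. This is not the contents of `scripts/check-secret-leaks.js`: the pattern, `findLeaks`, and the sample source are illustrative assumptions, built only from the four secret names the check is described as covering.

```javascript
const SECRET_NAMES = [
  "MONGODB_URI",
  "TELEGRAM_BOT_TOKEN",
  "TELEGRAM_WEBHOOK_SECRET",
  "ADMIN_TOKEN",
];

// Matches a console.log(...) call whose arguments mention env.<SECRET>.
const LEAK_PATTERN = new RegExp(
  `console\\.log\\([^)]*\\benv\\.(${SECRET_NAMES.join("|")})\\b`,
);

// Returns the 1-based line numbers and text of every offending line.
function findLeaks(source) {
  return source
    .split("\n")
    .map((text, i) => ({ line: i + 1, text }))
    .filter(({ text }) => LEAK_PATTERN.test(text));
}

// Illustrative input: only line 2 logs a listed secret.
const sample = [
  "const x = 1;",
  'console.log("uri", env.MONGODB_URI);',
  "console.log(env.OTHER_SETTING);",
].join("\n");

const leaks = findLeaks(sample);
console.log(leaks.map((l) => l.line));
```

A line-based regex like this is deliberately simple; it would miss calls split across lines or secrets passed through an intermediate variable.
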
## Rotation

`MONGODB_URI` rotation cadence: every 90 days, owner = repo maintainer.

Procedure:

1. In Atlas UI → Database Access → edit `miti99bot-worker` → reset password.
2. Update `.env.deploy` with new URI.
3. `wrangler secret put MONGODB_URI` (paste new URI).
4. `npm run deploy` (re-runs register; no Worker restart needed because the secret is read at request time via `env.MONGODB_URI`).

A mismatch between `.env.deploy` and the CF secret causes a register-script failure on the next deploy, the same fail-loud pattern as `TELEGRAM_WEBHOOK_SECRET`.

## Alerts

Configured in Atlas free-tier UI:

- **Cluster unavailable** → email maintainer.
- **Current connections > 400** (80% of cap) → email maintainer.

Plus CF Observability rule (Phase 06): >10 errors per 1 min window → email.