feat: structured logging, JSDoc types, and WEBHOOK_SECRET guard

- Replace string-interpolated console.log/error with JSON.stringify
  for searchable/filterable CF Workers dashboard logs
- Add shared JSDoc typedefs (Subscriber, QueueMessage, ChatTarget)
  with @param/@returns annotations on key functions
- Guard against undefined WEBHOOK_SECRET env var (auth bypass on
  misconfigured deploy)
- Add 3 declined features to feature-decisions.md (fan-out decoupling,
  idempotency keys, /ping)
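
For illustration only, the logging and guard changes might look roughly like the sketch below. The helper names `logEvent` and `isAuthorized` are hypothetical and the actual handler is not shown in this view. The guard assumes the bypass arises when both the configured secret and the supplied secret are `undefined`, so a plain equality check passes.

```javascript
/**
 * Emit one JSON object per log line so fields can be filtered in the dashboard.
 * Hypothetical helper; the field names are assumptions.
 * @param {"info"|"error"} level
 * @param {string} event
 * @param {Record<string, unknown>} [fields]
 * @returns {string} the serialized line (returned to make testing easy)
 */
function logEvent(level, event, fields = {}) {
  const line = JSON.stringify({ level, event, ...fields });
  if (level === "error") {
    console.error(line);
  } else {
    console.log(line);
  }
  return line;
}

/**
 * Reject every request when WEBHOOK_SECRET is missing or empty,
 * instead of comparing against undefined.
 * @param {string | undefined} providedSecret
 * @param {{ WEBHOOK_SECRET?: string }} env
 * @returns {boolean}
 */
function isAuthorized(providedSecret, env) {
  const expected = env.WEBHOOK_SECRET;
  if (typeof expected !== "string" || expected.length === 0) {
    logEvent("error", "webhook_secret_missing");
    return false;
  }
  return providedSecret === expected;
}
```

Without the `typeof` check, `isAuthorized(undefined, {})` would compare `undefined === undefined` and let the request through.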
2026-04-09 11:30:00 +07:00
parent 39afb0fd68
commit 7949da3734
7 changed files with 104 additions and 31 deletions

@@ -6,7 +6,32 @@ Ordered by likelihood of future implementation (top = most likely to revisit).
## Declined Features
### 1. Admin Commands (/stats)
### 1. Fan-Out Decoupling (Two-Phase Queue)
**Idea**: Webhook handler enqueues a single "dispatch" message; queue consumer lists subscribers and re-enqueues individual "deliver" messages. Converts O(N) webhook handler to O(1).
**Decision**: Skip. Current subscriber count is small. The webhook handler completing in one pass is simpler to reason about and debug. Adding a two-phase queue introduces message type routing, a new queue message schema, and makes the data flow harder to follow — all for a scaling problem that doesn't exist yet.
**Why this rank**: Clear trigger: webhook response times or CPU usage climbing in CF dashboard. Straightforward to implement when needed.
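
A minimal sketch of the message routing this design would introduce, should it be revisited. The message shapes and function names are hypothetical and may not match the repository's `QueueMessage` typedef. Queue I/O is left out so the routing logic can be read on its own.

```javascript
/** Phase 1: the webhook handler would enqueue exactly one of these. */
function buildDispatchMessage(event) {
  return { type: "dispatch", event };
}

/** Phase 2: expand one dispatch message into one deliver message per subscriber. */
function expandDispatch(message, subscriberChatIds) {
  if (message.type !== "dispatch") {
    throw new Error(`unexpected message type: ${message.type}`);
  }
  return subscriberChatIds.map((chatId) => ({
    type: "deliver",
    chatId,
    event: message.event,
  }));
}

/**
 * Consumer-side routing: returns which messages to re-enqueue
 * and which to send to Telegram.
 */
function route(message, subscriberChatIds) {
  switch (message.type) {
    case "dispatch":
      return { enqueue: expandDispatch(message, subscriberChatIds), send: [] };
    case "deliver":
      return { enqueue: [], send: [message] };
    default:
      throw new Error(`unknown message type: ${message.type}`);
  }
}
```

The `switch` is the extra message type routing that the decision above cites as added complexity.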
### 2. Queue Message Idempotency Keys
**Idea**: Include `{ incidentId, chatId }` hash as dedup key. Check short-TTL KV key before sending to prevent duplicate delivery on queue retries.
**Decision**: Skip. Duplicate notifications are a minor UX annoyance, not a correctness issue. Adding a KV read+write per message doubles KV operations in the queue consumer for a rare edge case (crash between successful Telegram send and `msg.ack()`). CF Queues retries are already bounded to 3 attempts.
**Why this rank**: Only worth it if users report duplicate notifications as a real problem.
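
A rough sketch of the check-then-mark flow, under stated assumptions: an in-memory `Map` stands in for the short-TTL KV namespace (real KV calls are asynchronous), a plain composite string replaces the hash mentioned in the idea, and all function names are hypothetical.

```javascript
/** Composite dedup key for one (incident, chat) pair. */
function dedupKey(incidentId, chatId) {
  return `dedup:${incidentId}:${chatId}`;
}

/**
 * True if the key was marked and has not expired yet.
 * `seen` maps key -> expiry timestamp in milliseconds.
 */
function alreadyDelivered(seen, key, nowMs) {
  const expiresAt = seen.get(key);
  return expiresAt !== undefined && expiresAt > nowMs;
}

/** Mark a key as delivered; call only after the Telegram send succeeds. */
function markDelivered(seen, key, nowMs, ttlMs) {
  seen.set(key, nowMs + ttlMs);
}
```

The consumer would call `alreadyDelivered` before sending and `markDelivered` after a successful send, which is the extra read and write per message that the decision weighs against the benefit.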
### 3. /ping Command
**Idea**: Bot replies with worker region + timestamp for liveness check.
**Decision**: Skip. `/status` already proves the bot is alive (it fetches from external API and replies). A dedicated `/ping` adds another command for marginal value. The web health check endpoint (`GET /`) serves the same purpose for monitoring.
**Why this rank**: Trivial to add but not useful enough to justify another command.
### 4. Admin Commands (/stats)
**Idea**: `/stats` to show subscriber count, recent webhook events (useful for bot operator).
@@ -14,7 +39,7 @@ Ordered by likelihood of future implementation (top = most likely to revisit).
**Why highest**: Low effort, no architectural changes. Just a new command + `kv.list()` count. First thing to add if the bot grows.
### 2. Webhook HMAC Signature Verification
### 5. Webhook HMAC Signature Verification
**Idea**: Verify Statuspage webhook payloads using HMAC signatures as a second auth layer beyond URL secret.
@@ -22,7 +47,7 @@ Ordered by likelihood of future implementation (top = most likely to revisit).
**Why this rank**: Not blocked by effort — blocked by platform. Would be implemented immediately if Atlassian ships HMAC support.
### 3. Proactive Rate Limit Tracking
### 6. Proactive Rate Limit Tracking
**Idea**: Track per-chat message counts to stay within Telegram's rate limits proactively.
@@ -30,7 +55,7 @@ Ordered by likelihood of future implementation (top = most likely to revisit).
**Why this rank**: Becomes necessary at scale. Clear trigger: frequent 429 errors in logs.
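
If this is revisited, per-chat tracking could start from a sliding-window counter like the sketch below. The class name, the limit, and the window are placeholders chosen by the caller, not Telegram's documented values. State is held in memory here, so a real Worker would likely need shared storage instead.

```javascript
/** Per-chat sliding-window counter (hypothetical helper). */
class ChatRateTracker {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.sentAt = new Map(); // chatId -> array of send timestamps (ms)
  }

  /**
   * Returns true and records the send if the chat is under its limit;
   * returns false without recording otherwise.
   */
  tryAcquire(chatId, nowMs) {
    const cutoff = nowMs - this.windowMs;
    // Keep only timestamps that are still inside the window.
    const recent = (this.sentAt.get(chatId) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.sentAt.set(chatId, recent);
      return false;
    }
    recent.push(nowMs);
    this.sentAt.set(chatId, recent);
    return true;
  }
}
```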
### 4. Status Change Deduplication
### 7. Status Change Deduplication
**Idea**: If a component flaps (operational → degraded → operational in 2 minutes), debounce into one message.
@@ -38,7 +63,7 @@ Ordered by likelihood of future implementation (top = most likely to revisit).
**Why this rank**: Useful if flapping becomes noisy. Moderate effort with clear user-facing benefit.
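
One possible shape for the debounce step, as a sketch only. It assumes status changes arrive as `{ componentId, from, to, at }` objects sorted by time, which is a hypothetical format, and it collapses each component's changes into a single summary when they all fall within the window.

```javascript
/**
 * Collapse each component's status changes into one summary entry
 * when they all occurred within windowMs; otherwise leave them as-is.
 */
function collapseFlaps(changes, windowMs) {
  const byComponent = new Map();
  for (const change of changes) {
    const group = byComponent.get(change.componentId) ?? [];
    group.push(change);
    byComponent.set(change.componentId, group);
  }

  const result = [];
  for (const group of byComponent.values()) {
    const first = group[0];
    const last = group[group.length - 1];
    if (last.at - first.at > windowMs) {
      result.push(...group); // too spread out to count as flapping
      continue;
    }
    result.push({
      componentId: first.componentId,
      from: first.from,
      to: last.to,
      at: last.at,
      transitions: group.length,
    });
  }
  return result;
}
```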
### 5. Inline Keyboard for /subscribe
### 8. Inline Keyboard for /subscribe
**Idea**: Replace text commands with clickable buttons using grammY's inline keyboard support.
@@ -46,7 +71,7 @@ Ordered by likelihood of future implementation (top = most likely to revisit).
**Why this rank**: Nice UX polish but not a functional gap. grammY supports it well, so the effort is moderate.
### 6. Scheduled Status Digest
### 9. Scheduled Status Digest
**Idea**: CF Workers `scheduled` cron trigger sends a daily "all clear" or summary to subscribers.
@@ -54,7 +79,7 @@ Ordered by likelihood of future implementation (top = most likely to revisit).
**Why this rank**: Low user value. Only useful if users explicitly request daily summaries.
### 7. Mute Command (/mute \<duration>)
### 10. Mute Command (/mute \<duration>)
**Idea**: Temporarily pause notifications without unsubscribing (e.g., `/mute 2h`).
@@ -62,7 +87,7 @@ Ordered by likelihood of future implementation (top = most likely to revisit).
**Why this rank**: Contradicts real-time purpose. `/stop` + `/start` is sufficient.
### 8. Multi-Language Support
### 11. Multi-Language Support
**Idea**: At minimum English/Vietnamese support.
@@ -70,7 +95,7 @@ Ordered by likelihood of future implementation (top = most likely to revisit).
**Why this rank**: Source data is English-only. Translating bot chrome while incidents stay English creates a mixed-language experience.
### 9. Web Dashboard
### 12. Web Dashboard
**Idea**: Replace the `/` health check with a status page showing subscriber count and recent webhook events.
@@ -78,7 +103,7 @@ Ordered by likelihood of future implementation (top = most likely to revisit).
**Why this rank**: Out of scope. The bot is the product — adding a web frontend changes the project's nature.
### 10. Dead Letter Queue for Failed Messages
### 13. Dead Letter Queue for Failed Messages
**Idea**: After CF Queues exhausts 3 retries, persist failed messages to KV or a dedicated DLQ for debugging.
@@ -86,7 +111,7 @@ Ordered by likelihood of future implementation (top = most likely to revisit).
**Why this rank**: Logging is sufficient for current scale. Revisit only if log retention (3-day free tier) is too short for debugging patterns.
### 11. KV List Scalability (Subscriber Sharding)
### 14. KV List Scalability (Subscriber Sharding)
**Idea**: Shard subscriber keys by event type (e.g., `sub:incident:{chatId}`, `sub:component:{chatId}`) to avoid listing all subscribers on every webhook.
@@ -94,7 +119,7 @@ Ordered by likelihood of future implementation (top = most likely to revisit).
**Why this rank**: Clear trigger: slow webhook response times at high subscriber counts. Migration path is straightforward when needed.
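
As a sketch of the key scheme, the helpers below build and parse sharded subscriber keys. The function names are hypothetical, and the sketch assumes event types contain no `:` character. The prefix returned by `shardPrefix` is the kind of value that would be passed as the `prefix` option when listing KV keys.

```javascript
/** Build a sharded subscriber key such as "sub:incident:12345". */
function subscriberKey(eventType, chatId) {
  return `${shardPrefix(eventType)}${chatId}`;
}

/** Prefix that selects only the subscribers of one event type. */
function shardPrefix(eventType) {
  return `sub:${eventType}:`;
}

/** Parse a sharded key back into its parts, or return null if it does not match. */
function parseSubscriberKey(key) {
  const match = /^sub:([^:]+):(.+)$/.exec(key);
  return match ? { eventType: match[1], chatId: match[2] } : null;
}
```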
### 12. Digest / Quiet Mode
### 15. Digest / Quiet Mode
**Idea**: Batch notifications into a daily summary instead of instant alerts.