miti99bot/internal/telegram/webhook.go
tiennm99 a8ed67a0a3 refactor: audit-driven hygiene pass across modules and infra
Concurrency
- lolschedule: serialize subscriber Get→mutate→Put via state.subscribersMu;
  the single-slot list was previously losing writes under concurrent
  /lolschedule_subscribe.
- trading: PriceClient memoises its default *http.Client so /trade_stats
  reuses TLS connections across held tickers.
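
A minimal sketch of the lost-write fix described for lolschedule, assuming the subscriber list lives in one KV slot that is read, modified, and written back. The kvStore type and the subscribe and subscribeMany helpers are hypothetical stand-ins; only the subscribersMu name comes from the message above.

```go
package main

import (
	"fmt"
	"sync"
)

// kvStore is a hypothetical stand-in for the KV layer: a single slot that
// holds the whole subscriber list.
type kvStore struct {
	mu   sync.Mutex
	list []int64
}

// Get returns a copy of the stored list.
func (s *kvStore) Get() []int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return append([]int64(nil), s.list...)
}

// Put overwrites the stored list with a copy of l.
func (s *kvStore) Put(l []int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.list = append([]int64(nil), l...)
}

// state holds the mutex that serializes the whole Get, mutate, Put cycle.
type state struct {
	subscribersMu sync.Mutex
	store         *kvStore
}

// subscribe adds chatID unless it is already present. Without the lock, two
// callers could both Get the same list and the later Put would drop the
// other caller's entry.
func (st *state) subscribe(chatID int64) {
	st.subscribersMu.Lock()
	defer st.subscribersMu.Unlock()

	list := st.store.Get()
	for _, id := range list {
		if id == chatID {
			return
		}
	}
	st.store.Put(append(list, chatID))
}

// subscribeMany fires n concurrent subscribes with distinct chat IDs and
// returns the final list length.
func subscribeMany(n int) int {
	st := &state{store: &kvStore{}}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int64) {
			defer wg.Done()
			st.subscribe(id)
		}(int64(i))
	}
	wg.Wait()
	return len(st.store.Get())
}

func main() {
	fmt.Println(subscribeMany(100)) // prints 100: no subscribe is lost
}
```

The store's own mutex only makes each Get and Put atomic; it is the outer lock around the full cycle that prevents one writer from overwriting another.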

Observability
- server/log_middleware: defer the req log line and recover panics so a
  panicking cron handler still emits the structured req entry CloudWatch
  filters on for 5xx alerting.
- server/router (cron): inner recover with cron-name context captures the
  panicking job before the middleware's safety net does.
- telegram/webhook: rune-safe truncation in dispatch logs — Vietnamese,
  Korean, and emoji previews no longer ship as garbled bytes.
- lolschedule/api_client: same rune-safe fix for error-body log truncation.
- telegram/webhook: gate the post-recover WriteHeader(200) so a panicking
  handler that already touched w doesn't trigger superfluous-WriteHeader.

Correctness
- twentyq: clearGame error during solved-relaunch is logged instead of
  silently swallowed (was a permanent deadlock vector on KV failure).
- misc /mstats: KV read failure replies "Could not load stats. Try again
  later." to the user instead of returning into the dispatcher; matches the
  pattern other modules use.
- migrate_cf_data trading-audit-dump: surface f.Close error so a truncated
  JSONL never passes silently as a complete audit dump.

Operator ergonomics
- migrate_cf_data (all 4 subcommands): signal.NotifyContext for SIGINT /
  SIGTERM. Ctrl-C mid-Scan now propagates cleanly instead of leaving a
  half-converted DynamoDB table.
- ai/ratelimit: doc the Lambda-recycle memory bound to match keylock.Map
  so a future reviewer doesn't re-flag the unbounded map.

I/O-changing (user-approved)
- lolschedule daily push auto-prunes subscribers whose Telegram error
  matches a terminal marker (blocked / deactivated / chat gone). Transient
  errors keep the chat on the list. Subscribe message updated to mention
  the auto-cleanup.
- twentyq seed pool grown 50 → 178; repeat-collision threshold moves from
  ~9 plays to ~17 (birthday paradox).
- util /info flipped Public → Protected — chat/thread/sender IDs are no
  longer enumerable by every group member.
- cmd/server WriteTimeout 6min → 75s (cron 60s + 15s slack). No-op on
  Lambda; matters only for local non-Lambda runs.
- webhook + cron rejection paths drop response bodies (no fingerprintable
  text for internet scanners hitting the public Function URL). Status
  codes preserved for CloudWatch metrics; structured log lines carry the
  rejection reason for operator triage.
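
One way to read the twentyq figures above is as the expected number of plays up to and including the first repeated seed, assuming seeds are drawn uniformly at random with replacement. That assumption and the expectedPlaysToRepeat helper are illustrative only; the message itself says no more than "birthday paradox". Under this reading the values come out near 9.5 for a pool of 50 and near 17.4 for a pool of 178.

```go
package main

import "fmt"

// expectedPlaysToRepeat returns the expected number of uniform random draws
// from a pool of n seeds up to and including the first repeated seed:
// E = sum over k = 0..n of P(first k draws are all distinct).
func expectedPlaysToRepeat(n int) float64 {
	sum := 0.0
	distinct := 1.0 // P(first k draws are all distinct), starting at k = 0
	for k := 0; k <= n; k++ {
		sum += distinct
		distinct *= 1 - float64(k)/float64(n)
	}
	return sum
}

func main() {
	for _, pool := range []int{50, 178} {
		fmt.Printf("pool %d: about %.1f plays to first repeat\n", pool, expectedPlaysToRepeat(pool))
	}
}
```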

Tests added: TestTruncateRunes, TestRunDailyPush_PrunesDeadSubscribers,
TestIsTerminalSendError, TestInfo_DeniedToNonOwner,
TestInfo_DeniedToChannelMessageNoFrom, plus owner-allowed counterparts.
2026-05-16 13:35:00 +07:00

150 lines
5.2 KiB
Go

package telegram

import (
	"context"
	"crypto/subtle"
	"encoding/json"
	"errors"
	"net/http"
	"runtime/debug"
	"time"
	"unicode/utf8"

	"github.com/go-telegram/bot"
	"github.com/go-telegram/bot/models"
	"github.com/tiennm99/miti99bot/internal/log"
)

// secretTokenHeader is the case-insensitive HTTP header Telegram sets when it
// POSTs an update to the webhook. It must equal the value passed to setWebhook.
// See: https://core.telegram.org/bots/api#setwebhook
// #nosec G101 -- header name, not credential value
const secretTokenHeader = "X-Telegram-Bot-Api-Secret-Token"

// maxWebhookBody bounds inbound JSON. Telegram updates are well under 100 KiB
// even with media; 1 MiB is a defensive ceiling against malformed clients.
const maxWebhookBody = 1 << 20

// handlerTimeout caps a single Telegram update handler. Telegram retries after
// 60s of no 2xx; 10s leaves headroom for outbound API calls inside handlers
// without holding a Lambda instance long enough to block other updates.
const handlerTimeout = 10 * time.Second

// WebhookHandler returns an http.HandlerFunc that validates Telegram's secret
// token (constant-time) and dispatches the update synchronously to the bot.
//
// Dispatch is synchronous because the bot is constructed with
// bot.WithNotAsyncHandlers: handlers run inside this goroutine, so r.Context()
// stays live and bounded by handlerTimeout.
//
// secret must be non-empty; main is responsible for failing-fast at startup.
func WebhookHandler(b *bot.Bot, secret string) http.HandlerFunc {
	secretBytes := []byte(secret)
	return func(w http.ResponseWriter, r *http.Request) {
		// Rejection paths use bare status codes (no response body) so internet
		// scanners hitting the public Function URL can't fingerprint this as a
		// Telegram webhook from the response text. CloudWatch metric filters
		// still see the distinct status codes (401 / 405 / 413 / 400), and the
		// structured log lines below carry the *reason* for operator triage.
		if r.Method != http.MethodPost {
			log.Warn("webhook rejected", "reason", "method", "method", r.Method)
			w.WriteHeader(http.StatusMethodNotAllowed)
			return
		}

		got := []byte(r.Header.Get(secretTokenHeader))
		if subtle.ConstantTimeCompare(got, secretBytes) != 1 {
			log.Warn("webhook rejected", "reason", "secret_mismatch")
			w.WriteHeader(http.StatusUnauthorized)
			return
		}

		r.Body = http.MaxBytesReader(w, r.Body, maxWebhookBody)
		var update models.Update
		if err := json.NewDecoder(r.Body).Decode(&update); err != nil {
			// MaxBytesReader returns *http.MaxBytesError when the cap is hit;
			// surface 413 distinctly so Telegram (and ops dashboards) can
			// distinguish "body too big" from generic malformed JSON.
			var maxBytesErr *http.MaxBytesError
			if errors.As(err, &maxBytesErr) {
				log.Warn("webhook rejected", "reason", "body_too_large")
				w.WriteHeader(http.StatusRequestEntityTooLarge)
				return
			}
			log.Warn("webhook rejected", "reason", "bad_json", "err", err)
			w.WriteHeader(http.StatusBadRequest)
			return
		}

		logDispatch(&update)

		ctx, cancel := context.WithTimeout(r.Context(), handlerTimeout)
		defer cancel()

		// Recover panics so a buggy handler does not propagate up to the
		// http.Server (which would close the response mid-write and trigger
		// Telegram's 24-hour retry loop on the same poisoned update).
		panicked := false
		func() {
			defer func() {
				if rec := recover(); rec != nil {
					panicked = true
					log.Error("webhook handler panic",
						"panic", rec,
						"stack", string(debug.Stack()))
				}
			}()
			b.ProcessUpdate(ctx, &update)
		}()

		// Suppress the trailing 200 if a panic occurred: a poisoned handler
		// may have already written headers/body, and a second WriteHeader
		// here emits `superfluous response.WriteHeader` noise. The
		// LogRequests middleware will mark this as 500 from its own recover
		// path; we just stay quiet here.
		if !panicked {
			w.WriteHeader(http.StatusOK)
		}
	}
}

// dispatchTextPreview caps message text in dispatch logs so chatty media
// captions or long DM threads don't bloat CloudWatch / drive up cost.
const dispatchTextPreview = 64

// truncateRunes returns the longest prefix of s whose UTF-8 byte length is
// <= maxBytes AND that ends on a rune boundary. Byte-slicing alone would
// split a multi-byte rune (Vietnamese, emoji, CJK), producing invalid UTF-8
// in the log line that downstream JSON encoders replace with U+FFFD.
func truncateRunes(s string, maxBytes int) string {
	if len(s) <= maxBytes {
		return s
	}
	cut := maxBytes
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return s[:cut]
}

// logDispatch emits a single structured line per inbound update so the
// CloudWatch trail has chat type + command text without resorting to
// the library's pointer-printing debug mode. Cheap (no allocation when
// the message is short) and fires once per webhook hit.
func logDispatch(u *models.Update) {
	if u == nil || u.Message == nil {
		return
	}
	text := u.Message.Text
	if text == "" {
		text = u.Message.Caption
	}
	if len(text) > dispatchTextPreview {
		text = truncateRunes(text, dispatchTextPreview) + "…"
	}
	log.Info("dispatch",
		"update_id", u.ID,
		"chat_id", u.Message.Chat.ID,
		"chat_type", string(u.Message.Chat.Type),
		"text", text,
	)
}