diff --git a/plans/reports/researcher-260422-2329-semantle-api-alternatives.md b/plans/reports/researcher-260422-2329-semantle-api-alternatives.md
new file mode 100644
index 0000000..733302b
--- /dev/null
+++ b/plans/reports/researcher-260422-2329-semantle-api-alternatives.md
@@ -0,0 +1,395 @@
# Semantle API Alternatives Research Report
**Date:** 2026-04-22 | **Project:** miti99bot

---

## Executive Summary

**Recommendation: Cloudflare Workers AI (BGE-base-en-v1.5) + Vectorize for production. Runner-up: self-hosted precomputed embeddings (GloVe/R2).**

ConceptNet's unreliability (intermittent 502 errors) requires immediate replacement. The consensus winner is **Cloudflare Workers AI embeddings** because:
- Native to CF Workers (binding-based, so no fetch round-trip overhead)
- Proven at scale with edge inference (<100ms cold start, ~50-200ms per embedding)
- Free tier of 10M input tokens/month (comfortably covers hobby-scale traffic)
- 768-dim BGE vectors scored with a cheap client-side cosine give semantic quality comparable to ConceptNet relatedness
- OOV detection is handled with a vocabulary membership check before embedding

For strictly free-tier projects, **precomputed GloVe vectors in R2/KV** are feasible if you accept a one-time ~10MB upload and manual vocab checking. 
+ +--- + +## Comparison Table + +| Provider | Auth | Free Tier | Latency | Similarity API | OOV Support | CF Workers Fit | Verdict | +|----------|------|-----------|---------|----------------|-------------|----------------|---------| +| **CF Workers AI (BGE)** | Binding | 10M tokens/mo | ~50-200ms | Cosine (768d) | Via vocab list | Native ⭐⭐⭐ | **RECOMMENDED** | +| **CF Vectorize** | Binding | 30M dimensions/mo | ~30ms | Cosine query | Via storage | Native ⭐⭐⭐ | Best for scale | +| **HuggingFace Inference** | API key | ~100 req/hr free | 500ms-2s cold | Cosine (384d) | Yes | Fetch OK ⭐⭐ | Viable but slow | +| **OpenAI text-embedding-3-small** | API key | $0.02/1M tokens | ~200-500ms | Cosine | Yes | Fetch OK ⭐⭐ | Overkill, cost adds up | +| **Replicate** | API key | Free w/ credits | 500ms+ | Cosine | Yes | Fetch OK ⭐⭐ | Slower than CF AI | +| **Datamuse API** | None (free) | ∞ | ~100-300ms | No (ranking only) | Yes | Fetch OK ⭐⭐⭐ | **No similarity score** ❌ | +| **GloVe (self-hosted KV/R2)** | None | ✓ (one-time) | ~10-50ms | Cosine (300d) | Manual | Fastest ⭐⭐⭐ | Great for small vocab | +| **Word2Vec REST APIs** | Varies | Mostly down | Variable | Cosine | Yes | Fetch | Dead/unreliable | + +--- + +## Detailed Option Analysis + +### 1. ⭐ **Cloudflare Workers AI (BGE-base-en-v1.5) — RECOMMENDED** + +**What it does:** Generates 768-dimensional sentence/word embeddings via BAAI's BGE model, runs on CF edge infrastructure. 
**Call pattern:**
```javascript
const embedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", {
  text: "word_to_embed"
});
// Returns { shape: [768], data: [float...], pooling: "mean" }
```

**Pros:**
- No fetch overhead — native CF Workers binding, direct to inference layer
- Cold start <100ms, typical latency 50-200ms for single words (ideal for game speed)
- 768 dimensions match semantic similarity expectations (ConceptNet-like quality)
- Free tier: 10M input tokens/month; at 1-2 tokens per word, that is millions of single-word embeddings (orders of magnitude more than 100-200 games/day needs)
- Cosine similarity is a trivial client-side computation (`dot(a, b) / (||a|| * ||b||)`)
- Paid tier: $0.067/M input tokens (cheap relative to OpenAI)

**Cons:**
- Sustained usage beyond the free allocation needs the Workers paid plan (a few dollars/month base, then token-metered on top)
- OOV detection not native to embeddings — must maintain a separate vocab list or batch-verify words
- 768 dimensions = ~3KB per cached embedding; not huge, but it adds up across 10k words

**OOV handling:** Pre-load google-10000-english into KV (~300KB) and check membership before calling similarity. (Heuristics on the embeddings themselves are unreliable: the model returns a vector for any string, valid word or not.)

**Real-world latency:** One production report: "global latency under 80ms p50" for embeddings via CF Workers.

**Cost at scale:**
- 100 games/day × 5 guesses/game × 2 words/guess = 1000 embeddings/day
- 1000 × 30 days = 30k embeddings ≈ 60k tokens/month → free tier is ample (10M tokens)
- At paid tier: negligible (well under a cent/month)

**Recommendation:** Ship with this. It's the path of least resistance and best latency.

---

### 2. ⭐⭐ **Cloudflare Vectorize — BEST FOR SCALE**

**What it does:** Managed vector database; store embeddings by word key, query by cosine similarity.

**Architecture:**
1. Pre-compute all 10k words' embeddings via Workers AI in a setup task
2. Store in Vectorize index (768 dimensions, cosine metric)
3. 
At game start, query Vectorize: `index.query(targetWordEmbedding, topK=1)` to verify it's in vocab
4. On each guess, fetch both embeddings from Vectorize (cached) + compute similarity client-side OR use Vectorize search with custom scoring

**Pros:**
- Median query latency 30-31ms (faster than Workers AI)
- Free tier: 30M queried vector dimensions/month ≈ 39k queries at 768 dims (plenty for hobby traffic)
- Pre-computed vectors cached globally (no re-embedding per game)
- Deterministic: the same two words always return the same score

**Cons:**
- Setup overhead: must pre-compute and upload all 10k embeddings once
- Requires Vectorize binding + Workers AI for initial embedding generation
- Query cost if not in free tier ($0.01/1M queried dimensions)
- Overkill for a simple 10k-word game; adds operational complexity

**Cost at scale:** Negligible if free tier applies. At paid tier (unlikely): 768 dims × $0.01/1M dimensions ≈ $0.00001 per query.

**Recommendation:** Consider after MVP ships. For initial launch, Workers AI direct is simpler. Migrate to Vectorize if you scale beyond 100k guesses/month or want <50ms guarantees.

---

### 3. **HuggingFace Inference API — VIABLE BUT SLOW**

**What it does:** API access to sentence-transformers models (all-MiniLM-L6-v2, all-mpnet-base-v2, etc.). Free tier has rate limits. 
**Call pattern:**
```javascript
const response = await fetch("https://api-inference.huggingface.co/models/sentence-transformers/all-MiniLM-L6-v2", {
  headers: { Authorization: `Bearer ${HF_TOKEN}` },
  method: "POST",
  body: JSON.stringify({ inputs: ["word"] })
});
```

**Pros:**
- Free tier available (though rate-limited; see cons)
- Many sentence-transformer models to choose from
- OOV is implicit (any word gets an embedding)
- No account setup beyond a free HF profile

**Cons:**
- Cold-start latency: 30-60 seconds on the free tier (models are unloaded on demand)
- Even after warmup, 500ms-2s per request
- Strict free-tier rate limits (~100 requests/hour)
- 384 dimensions (smaller than BGE; may affect quality)
- Fetch round-trip from CF Workers adds another 50-100ms

**Adoption risk:** The free tier is unreliable for production games (rate limits + cold starts violate the 5s timeout). Requires the paid tier ($9/month for 2M credits) to be usable.

**Recommendation:** Skip. Workers AI is superior at the same cost (or free).

---

### 4. **OpenAI text-embedding-3-small — OVERKILL**

**What it does:** Industry-standard embeddings via the OpenAI API.

**Pros:**
- Best-in-class embedding quality
- Simple, well-documented API
- No cold starts

**Cons:**
- $0.02/1M tokens; a single-word request is only a few tokens, so 1000 embeddings/day is well under $0.01/month (cheap in absolute terms, but the only metered external bill in this lineup)
- Overkill for single-word similarity (trained on sentences; wasted capacity)
- Slower than CF Workers AI (200-500ms typical, plus the fetch round-trip)
- Requires an API key in the Worker env (secret-management overhead)
- Rate limits apply

**Recommendation:** Skip. The token cost is negligible at this volume; the real drawbacks are latency, key management, and an extra external dependency that Workers AI avoids.

---

### 5. **Replicate — SLIGHTLY WORSE CF OPTION**

**What it does:** Cloud inference platform supporting embedding models (multilingual-e5-large, all-mpnet-base-v2, etc.). 
+ +**Pros:** +- Competitive pricing (~$0.11 per run vs OpenAI's $0.51) +- Wide model choice +- Cloudflare acquired Replicate in Nov 2025 → may improve integration + +**Cons:** +- Latency 500ms+ (slower than Workers AI + fetch round-trip) +- Requires separate account + API key +- Not a binding, so full fetch overhead from CF Workers +- Pricing less clear (charged by time, not tokens) + +**Recommendation:** Skip in favor of Workers AI. Close competitor if CF AI binding becomes unavailable. + +--- + +### 6. **Datamuse API — INSUFFICIENT** + +**What it does:** Free word relationship API (rhyming, meaning, spelling, sound-alike). + +**Why it fails:** +- **No numeric similarity score between two words.** Returns ranked lists of related words, not pairwise scores. +- Scores have "no interpretable meaning" (per official API docs); used for ranking results only. +- Cannot compute "target vs guess" similarity in the Semantle game format. + +**Recommendation:** Rejected. Core requirement not met. + +--- + +### 7. ⭐⭐⭐ **Self-Hosted Precomputed Embeddings (GloVe/R2) — COST-OPTIMAL** + +**What it does:** Pre-download GloVe vectors (300d, 6B tokens, 822MB), extract vectors for google-10k words, store in R2, load into KV for fast lookup. + +**Architecture:** +``` +1. Download glove.6B.300d.txt (free, public domain) +2. Extract 10k words + vectors → ~30MB JSON +3. Gzip → ~8-10MB, store in R2 (free tier includes 10GB) +4. On Worker startup: fetch from R2 (or lazy-load per region), cache in KV +5. similarity(a, b) = cosine(glove[a], glove[b]) +6. 
OOV: word not in glove dict → return null
```

**Pros:**
- Truly free (no API calls, no Workers AI quota)
- Fastest option: cosine similarity is a ~5-10ms JS computation
- Completely deterministic; no API reliability concerns
- Could even pre-compute all 10k × (10k-1) / 2 ≈ 50M similarity pairs into KV (expensive and unnecessary, but possible)
- Offline-first: no upstream dependency

**Cons:**
- One-time setup: download, parse, compress, upload to R2 (~30 min of work)
- GloVe (2014) is over a decade old; quality lags behind BGE (but is still good for word similarity)
- 300 dimensions vs 768 in BGE (may affect semantic quality, but acceptable for Semantle)
- ~3KB per cached embedding × 10k words = ~30MB if fully loaded in memory (within CF Worker limits, but tight)
- Manual OOV check: needs a separate google-10000-english list in KV

**Adoption risk:** Low. GloVe is stable, with no API changes. Vectors can be cached indefinitely; the only upgrade path is a re-download if you want newer embeddings.

**Real cost:** $0 per month.

**Recommendation:** Highly viable for hobby projects. Better than any API if you're cost-optimizing and accept a slight quality trade-off. Hybrid path: use GloVe now, upgrade to Workers AI if you need better semantics later.

---

### 8. **Word2Vec REST APIs — OBSOLETE**

Search found several GitHub repos (quhfus/DoSeR, 3Top/word2vec-api, bmzhao/word2vec-rest-api) but:
- None are maintained (last commits 2020-2022)
- No public instances are available
- Self-hosting them defeats the purpose (you'd run a server alongside CF Workers)

**Recommendation:** Skip. GloVe is the better-maintained route if you want self-hosted embeddings. 
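The self-hosted path (option 7) reduces to a few lines of plain JS. Below is a minimal, illustrative sketch of the GloVe flow: parsing `word v1 v2 ...` vector lines, then scoring a guess with cosine similarity. The three 3-dim sample vectors are made up for illustration; real code would load the extracted 10k-word, 300-dim subset from R2/KV.

```javascript
// Illustrative sketch of option 7: parse GloVe-style "word v1 v2 ..." lines
// and score guesses locally. The tiny sample vectors stand in for the real
// glove.6B.300d.txt subset that a Worker would fetch from R2 at cold start.
const gloveText = [
  "king 0.5 0.7 0.1",
  "queen 0.52 0.68 0.12",
  "banana 0.9 0.1 0.8",
].join("\n");

function parseGlove(text) {
  const vectors = new Map();
  for (const line of text.split("\n")) {
    const [word, ...nums] = line.trim().split(/\s+/);
    if (!word || nums.length === 0) continue;
    vectors.set(word, Float32Array.from(nums, Number));
  }
  return vectors;
}

function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  const denom = Math.sqrt(na) * Math.sqrt(nb);
  return denom === 0 ? null : dot / denom;
}

// Words missing from the dict score null, which maps to the game's
// "not in the vocabulary" reply.
function similarity(vectors, target, guess) {
  const va = vectors.get(target);
  const vb = vectors.get(guess);
  if (!va || !vb) return null;
  return cosine(va, vb);
}

const vectors = parseGlove(gloveText);
console.log(similarity(vectors, "king", "queen"));  // high: related pair
console.log(similarity(vectors, "king", "banana")); // lower: unrelated pair
console.log(similarity(vectors, "king", "zzzfoo")); // null: OOV
```

The scoring path has no upstream dependency at all, which is the core appeal of this option: once the vectors are cached per isolate, every guess is a local map lookup plus a short loop.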
+ +--- + +## Migration Sketch for Recommended Option + +### Implementation Plan: Cloudflare Workers AI (BGE) + Vectorize Fallback + +**Phase 1: Workers AI (MVP)** + +Modify `api-client.js`: + +```javascript +export function createClient(options = {}) { + const { env, useVectorize = false } = options; + + // Vocab check: load google-10000-english from KV + async function isInVocab(word) { + const vocab = await env.KV.get("semantle:vocab-10000"); + if (!vocab) return true; // pessimistic: assume yes if missing + return vocab.includes(word.toLowerCase()); + } + + return { + async randomWord() { + // Pick from pool; verify it's "in vocab" by checking if we can embed it + for (let i = 0; i < MAX_RANDOM_ATTEMPTS; i++) { + const candidate = pickFromPool(); + try { + const inVocab = await isInVocab(candidate); + if (inVocab) return { word: candidate, verified: true }; + } catch { + // continue + } + } + return { word: pickFromPool(), verified: false }; + }, + + async similarity(a, b) { + const inVocabB = await isInVocab(b); + if (!inVocabB) { + return { + a, b, canonical_a: a, canonical_b: b, + in_vocab_a: true, in_vocab_b: false, similarity: null + }; + } + + try { + // Call Workers AI to get embeddings + const [embA, embB] = await Promise.all([ + env.AI.run("@cf/baai/bge-base-en-v1.5", { text: a }), + env.AI.run("@cf/baai/bge-base-en-v1.5", { text: b }) + ]); + + // Compute cosine similarity + const sim = cosineSimilarity(embA.data, embB.data); + return { + a, b, canonical_a: a, canonical_b: b, + in_vocab_a: true, in_vocab_b: true, + similarity: sim + }; + } catch (err) { + throw new UpstreamError("workers-ai embedding failed", { cause: err }); + } + } + }; +} + +function cosineSimilarity(vecA, vecB) { + let dotProduct = 0, normA = 0, normB = 0; + for (let i = 0; i < vecA.length; i++) { + dotProduct += vecA[i] * vecB[i]; + normA += vecA[i] * vecA[i]; + normB += vecB[i] * vecB[i]; + } + return dotProduct / (Math.sqrt(normA) * Math.sqrt(normB)); +} +``` + +**Phase 2: Setup 
Task** + +Add to `wrangler.toml`: +```toml +[env.production] +ai = true +kv_namespaces = [{ binding = "KV", id = "..." }] +``` + +Pre-populate KV with vocab list: +```bash +# Download google-10000-english, store in KV +curl -s https://raw.githubusercontent.com/first20hours/google-10000-english/master/google-10000-english.txt | \ + jq -Rs 'split("\n") | map(select(length > 0))' | \ + npx wrangler kv:key put --binding=KV "semantle:vocab-10000" - +``` + +**Phase 3: Vectorize (Optional, Future)** + +Once MVP is stable: +1. Pre-compute all 10k word embeddings +2. Store in Vectorize index +3. Replace `similarity()` with cached lookups +4. ~30ms per query vs ~200ms (7x speedup, negligible for gameplay) + +--- + +## Cost Breakdown (Monthly) + +| Option | Setup | Per-Game (avg) | 100 games/mo | 1000 games/mo | 10k games/mo | +|--------|-------|----------------|--------------|---------------|--------------| +| Workers AI (BGE) | $0 | $0 (free tier) | Free | Free | $0.07 | +| Vectorize | 1h | $0 (cached) | Free | Free | $0.001 | +| HuggingFace (paid) | $0 | $0.0002 | $0.02 | $0.20 | $2.00 | +| OpenAI | $0 | $0.0002 | $0.02 | $0.20 | $2.00 | +| Replicate | $0 | $0.0002 | $0.02 | $0.20 | $2.00 | +| GloVe (self-host) | 1h | $0 | Free | Free | Free | + +--- + +## Recommendation Summary + +| Use Case | Recommendation | +|----------|---| +| **MVP / Immediate fix** | Cloudflare Workers AI (BGE-base-en-v1.5) + Google-10k vocab in KV | +| **Ultra cost-conscious** | GloVe vectors in R2 + KV (one-time setup, zero ongoing cost) | +| **Production scale (>1k games/mo)** | Workers AI → migrate to Vectorize for caching | +| **Maximum semantic quality** | Workers AI (BGE is excellent, no need for OpenAI overkill) | + +**Ship recommendation:** Go with Workers AI. 50-200ms latency is acceptable for game UX (faster than ConceptNet ever was), free tier covers hobby traffic, and if you grow, Vectorize is a drop-in upgrade. + +--- + +## Unresolved Questions + +1. 
**GloVe quality for single-word semantics?** GloVe trained on document context; single-word embeddings may be noisier than BGE (which uses contrastive learning for dense retrieval). Needs A/B testing if semantics matter (probably doesn't for game). + +2. **BGE "mean" vs "cls" pooling?** Current approach uses default "mean" pooling. Does "cls" (CLS token) pooling improve single-word similarity? Requires testing on Semantle target words. + +3. **OOV detection robustness?** Relying on google-10000-english for vocab checking; what if player guesses a valid English word outside this list (e.g., "cryptocurrency")? Current approach: fallback to "not in vocabulary" (conservative). Could call embeddings on all words and use confidence/variance heuristics, but adds latency. + +4. **Vectorize v2 latency in practice?** Cited 30-31ms median; but is that from CF Workers client or external? If external, add 50-100ms fetch. Need real-world benchmark from within a Worker. + +5. **Workers AI quota enforcement?** 10M tokens/month free tier — is this enforced? What happens on overage? (Assumed: immediate billing, no auto-overage blocking, similar to other CF quotas.) + +6. **Cloudflare API stability for Workers AI?** ConceptNet is failing; is Workers AI more reliable? (Assumption: yes, it's Cloudflare's own service, not external upstream. Still risk, but lower.) 
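On question 3, one low-latency mitigation that avoids embedding-confidence heuristics entirely is to widen the membership check: union the curated target pool with the google-10k list (and any future additions), normalized to lowercase. A hypothetical sketch, with small inline stand-ins for the real lists in the repo:

```javascript
// Hypothetical sketch: widen OOV checking by unioning wordlists.
// GOOGLE_10K and TARGET_POOL are stand-ins for the real lists in the repo.
const GOOGLE_10K = ["apple", "orange", "cloud"];
const TARGET_POOL = ["apple", "cryptocurrency"];

// Build the union once per isolate; Set membership stays O(1) per guess.
const VOCAB = new Set(
  [...GOOGLE_10K, ...TARGET_POOL].map((w) => w.toLowerCase()),
);

function isInVocab(word) {
  return VOCAB.has(word.trim().toLowerCase());
}

console.log(isInVocab("Cryptocurrency")); // true: covered by the pool despite missing from google-10k
console.log(isInVocab("zzzfoo"));         // false
```

This guarantees every pickable target is accepted as a guess, and any further false "not in vocabulary" replies can be fixed by growing the union rather than by adding inference-time heuristics.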
+ +--- + +## Sources + +- [Cloudflare BGE-base-en-v1.5 embeddings docs](https://developers.cloudflare.com/workers-ai/models/bge-base-en-v1.5/) +- [Cloudflare Workers AI models](https://developers.cloudflare.com/workers-ai/models/) +- [Cloudflare Vectorize pricing](https://developers.cloudflare.com/vectorize/platform/pricing/) +- [Cloudflare Vectorize get-started](https://developers.cloudflare.com/vectorize/get-started/embeddings/) +- [HuggingFace Inference API](https://huggingface.co/docs/api-inference/en/index) +- [HuggingFace pricing](https://huggingface.co/docs/inference-providers/pricing) +- [OpenAI text-embedding-3-small pricing](https://developers.openai.com/api/docs/pricing) +- [Datamuse API docs](https://www.datamuse.com/api/) +- [GloVe word embeddings Stanford NLP](https://nlp.stanford.edu/projects/glove/) +- [google-10000-english corpus](https://github.com/first20hours/google-10000-english) +- [Replicate embeddings models](https://replicate.com/collections/embedding-models) +- [Cloudflare Workers AI latency benchmarks](https://www.kalviumlabs.ai/blog/production-ai-on-cloudflare-workers/) +- [Embeddings API comparison 2026](https://supermemory.ai/blog/best-open-source-embedding-models-benchmarked-and-ranked/) +- [Cloudflare KV + Vectorize integration](https://dev.to/andyjessop/building-ai-powered-second-brain-in-a-cloudflare-worker-with-cloudflare-vectorize-and-openai-23di) diff --git a/src/modules/semantle/api-client.js b/src/modules/semantle/api-client.js index 32c6daf..44ecf94 100644 --- a/src/modules/semantle/api-client.js +++ b/src/modules/semantle/api-client.js @@ -1,25 +1,25 @@ /** - * @file ConceptNet API client for the semantle module. + * @file Cloudflare Workers AI client for the semantle module. * - * ConceptNet endpoints: - * GET /relatedness?node1=/c/en/X&node2=/c/en/Y → { value: number ∈ [-1, 1] } - * GET /c/en/{term} → { edges: [...] 
} (empty ⇒ OOV) + * Runs the `@cf/baai/bge-small-en-v1.5` text-embedding model via the `env.AI` + * binding, then scores guesses by computing cosine similarity between the + * target and guess vectors locally (no extra round-trip). * - * There is no official random-word endpoint, so the client picks a candidate - * from our local `TARGET_POOL` and verifies it has a ConceptNet entry with - * edges. After a few failed attempts it falls back to an unverified pick — - * the curated pool is trusted enough that this should be rare. + * Vocabulary: the curated `words-data.js` list (google-10k) doubles as our + * in/out-of-vocabulary set — anything outside it is treated as OOV so players + * get the "not in the vocabulary" reply instead of a noisy embedding score. * - * The returned `similarity(a, b)` shape mirrors the earlier word2sim contract - * so handlers/render/state don't have to change. + * The returned `similarity(a, b)` shape is kept identical to the prior + * ConceptNet/word2sim contract so handlers/render/state stay untouched. */ import { pickFromPool } from "./wordlist.js"; +import WORDS from "./words-data.js"; -const DEFAULT_API_BASE = "https://api.conceptnet.io"; -const DEFAULT_TIMEOUT_MS = 5000; -const USER_AGENT = "miti99bot/semantle"; -const MAX_RANDOM_ATTEMPTS = 5; +const DEFAULT_MODEL = "@cf/baai/bge-small-en-v1.5"; + +// O(1) membership lookup for OOV detection. Built once per isolate. 
+const VOCAB = new Set(WORDS); export class UpstreamError extends Error { /** @param {string} message @param {{status?: number, body?: string, cause?: unknown}} [meta] */ @@ -32,98 +32,60 @@ export class UpstreamError extends Error { } } -function buildUrl(base, path, params = {}) { - const normalized = String(base).replace(/\/+$/, ""); - const url = new URL(`${normalized}${path}`); - for (const [k, v] of Object.entries(params)) { - if (v === undefined || v === null) continue; - url.searchParams.set(k, String(v)); +function cosineSimilarity(a, b) { + if (!a || !b || a.length !== b.length) return null; + let dot = 0; + let normA = 0; + let normB = 0; + for (let i = 0; i < a.length; i++) { + dot += a[i] * b[i]; + normA += a[i] * a[i]; + normB += b[i] * b[i]; } - return url.toString(); -} - -async function fetchJson(url, timeoutMs) { - const controller = new AbortController(); - const timer = setTimeout(() => controller.abort(), timeoutMs); - let res; - try { - res = await fetch(url, { - headers: { "User-Agent": USER_AGENT, Accept: "application/json" }, - signal: controller.signal, - }); - } catch (err) { - clearTimeout(timer); - throw new UpstreamError("conceptnet fetch failed", { cause: err }); - } - clearTimeout(timer); - const text = await res.text(); - if (!res.ok) { - throw new UpstreamError(`conceptnet HTTP ${res.status}`, { - status: res.status, - body: text.slice(0, 500), - }); - } - try { - return JSON.parse(text); - } catch (err) { - throw new UpstreamError("conceptnet non-JSON response", { cause: err }); - } -} - -function hasEdges(concept) { - return Array.isArray(concept?.edges) && concept.edges.length > 0; + const denom = Math.sqrt(normA) * Math.sqrt(normB); + return denom === 0 ? 
null : dot / denom; } /** - * @param {string} [apiBase] — override for mirrors/tests (default api.conceptnet.io) - * @param {{ timeoutMs?: number }} [opts] + * @param {{ run: (model: string, input: { text: string[] }) => Promise<{ data: number[][] }> }} ai + * — Workers AI binding (`env.AI`). Tests pass a fake with the same `.run()` shape. + * @param {{ model?: string }} [opts] */ -export function createClient(apiBase = DEFAULT_API_BASE, { timeoutMs = DEFAULT_TIMEOUT_MS } = {}) { - /** @param {string} term */ - function concept(term) { - return fetchJson(buildUrl(apiBase, `/c/en/${encodeURIComponent(term)}`), timeoutMs); +export function createClient(ai, { model = DEFAULT_MODEL } = {}) { + if (!ai || typeof ai.run !== "function") { + throw new TypeError("createClient: ai binding with .run(model, input) is required"); } - /** @param {string} a @param {string} b */ - function relatedness(a, b) { - return fetchJson( - buildUrl(apiBase, "/relatedness", { - node1: `/c/en/${a}`, - node2: `/c/en/${b}`, - }), - timeoutMs, - ); + async function embedPair(a, b) { + let resp; + try { + resp = await ai.run(model, { text: [a, b] }); + } catch (err) { + throw new UpstreamError("workers-ai embedding failed", { cause: err }); + } + const data = resp?.data; + if (!Array.isArray(data) || data.length < 2) { + throw new UpstreamError("workers-ai returned malformed embedding payload"); + } + return [data[0], data[1]]; } return { - concept, - relatedness, - /** - * Pick a target word from the local pool. Verifies each candidate has a - * ConceptNet entry; falls back to an unverified pick after a few tries. + * Pick a target word from the local pool. The pool IS our vocabulary, + * so every pick is trivially verified — no upstream check needed. * Shape matches the old word2sim `/random` response for handler reuse. 
* @returns {Promise<{ word: string, verified: boolean }>} */ async randomWord() { - for (let i = 0; i < MAX_RANDOM_ATTEMPTS; i++) { - const candidate = pickFromPool(); - try { - const c = await concept(candidate); - if (hasEdges(c)) return { word: candidate, verified: true }; - } catch { - // swallow — try the next candidate - } - } - return { word: pickFromPool(), verified: false }; + return { word: pickFromPool(), verified: true }; }, /** - * Cosine-like similarity between `a` (target) and `b` (guess). Runs the - * edge-check for `b` in parallel with the relatedness call so OOV guesses - * are identified on the same round-trip. + * Cosine similarity between `a` (target) and `b` (guess). Uses the local + * wordlist as the vocabulary — unknown words return `in_vocab_b: false` + * with `similarity: null` and skip the inference call entirely. * - * Shape deliberately mirrors the old word2sim response. * @param {string} a * @param {string} b * @returns {Promise<{ @@ -134,18 +96,13 @@ export function createClient(apiBase = DEFAULT_API_BASE, { timeoutMs = DEFAULT_T * }>} */ async similarity(a, b) { - const [conceptB, rel] = await Promise.all([concept(b), relatedness(a, b)]); - const inVocabB = hasEdges(conceptB); - const value = typeof rel?.value === "number" ? rel.value : null; - return { - a, - b, - canonical_a: a, - canonical_b: b, - in_vocab_a: true, // target was verified at round start - in_vocab_b: inVocabB, - similarity: inVocabB ? 
value : null,
-      };
+      const base = { a, b, canonical_a: a, canonical_b: b, in_vocab_a: true };
+      if (!VOCAB.has(b)) {
+        return { ...base, in_vocab_b: false, similarity: null };
+      }
+      const [vecA, vecB] = await embedPair(a, b);
+      const sim = cosineSimilarity(vecA, vecB);
+      return { ...base, in_vocab_b: true, similarity: sim };
     },
   };
 }
diff --git a/src/modules/semantle/index.js b/src/modules/semantle/index.js
index 731c71d..bebc7be 100644
--- a/src/modules/semantle/index.js
+++ b/src/modules/semantle/index.js
@@ -1,10 +1,10 @@
 /**
- * @file Semantle module — similarity guessing game backed by ConceptNet.
+ * @file Semantle module — similarity guessing game backed by Cloudflare Workers AI.
  *
- * Targets come from a curated local wordlist (ConceptNet has no /random).
- * Similarity scores come from `api.conceptnet.io/relatedness`. The ConceptNet
- * base URL is hardcoded in the client; tests can still override via
- * `createClient(url)` if needed.
+ * Targets come from a curated local wordlist (same list doubles as the
+ * vocabulary for OOV detection, so no upstream check is needed to pick or
+ * validate a word). Similarity scores come from cosine distance between
+ * `@cf/baai/bge-small-en-v1.5` embeddings produced by the `env.AI` binding. 
*/ import { createClient } from "./api-client.js"; @@ -18,9 +18,9 @@ let client = null; /** @type {import("../registry.js").BotModule} */ const semantleModule = { name: "semantle", - init: async ({ db: store }) => { + init: async ({ db: store, env }) => { db = store; - client = createClient(); + client = createClient(env.AI); }, commands: [ { diff --git a/tests/modules/semantle/api-client.test.js b/tests/modules/semantle/api-client.test.js index 69a3ed0..e8940a1 100644 --- a/tests/modules/semantle/api-client.test.js +++ b/tests/modules/semantle/api-client.test.js @@ -1,4 +1,4 @@ -import { afterEach, describe, expect, it, vi } from "vitest"; +import { describe, expect, it, vi } from "vitest"; import { UpstreamError, Word2SimError, @@ -6,32 +6,24 @@ import { } from "../../../src/modules/semantle/api-client.js"; /** - * ConceptNet stubs — minimal shape the client cares about. + * Build a deterministic 768-dim vector from a seed so cosine scores are + * reproducible in tests without hardcoding 768 floats. */ -function conceptResp(edgeCount = 5) { - return { - ok: true, - text: () => - Promise.resolve( - JSON.stringify({ - edges: Array.from({ length: edgeCount }, (_, i) => ({ id: `e${i}` })), - }), - ), - }; +function fakeVector(seed, dim = 768) { + const out = new Array(dim); + for (let i = 0; i < dim; i++) out[i] = Math.sin(seed * (i + 1)); + return out; } -function relatednessResp(value) { - return { - ok: true, - text: () => Promise.resolve(JSON.stringify({ value })), - }; +/** + * Minimal Workers AI binding fake. `impl(model, input)` returns the payload + * `env.AI.run()` would normally resolve to. 
+ */ +function fakeAi(impl) { + return { run: vi.fn(impl) }; } describe("semantle/api-client", () => { - afterEach(() => { - vi.restoreAllMocks(); - }); - describe("UpstreamError", () => { it("stores status and body metadata", () => { const err = new UpstreamError("test", { status: 404, body: "not found" }); @@ -53,167 +45,99 @@ describe("semantle/api-client", () => { }); describe("createClient", () => { - it("similarity runs concept + relatedness in parallel", async () => { - const client = createClient("https://api.test", { timeoutMs: 100 }); - const calls = []; - global.fetch = vi.fn((url) => { - calls.push(String(url)); - if (url.includes("/relatedness")) return Promise.resolve(relatednessResp(0.45)); - return Promise.resolve(conceptResp(3)); - }); - const res = await client.similarity("apple", "orange"); - expect(res.similarity).toBe(0.45); - expect(res.in_vocab_b).toBe(true); - expect(res.canonical_b).toBe("orange"); - expect(global.fetch).toHaveBeenCalledTimes(2); - expect(calls.some((u) => u.includes("/c/en/orange"))).toBe(true); - expect(calls.some((u) => u.includes("node1=%2Fc%2Fen%2Fapple"))).toBe(true); - expect(calls.some((u) => u.includes("node2=%2Fc%2Fen%2Forange"))).toBe(true); + it("throws without a valid AI binding", () => { + expect(() => createClient(null)).toThrow(TypeError); + expect(() => createClient({})).toThrow(TypeError); + expect(() => createClient({ run: "not a function" })).toThrow(TypeError); }); - it("similarity flags OOV when the concept endpoint returns no edges", async () => { - const client = createClient("https://api.test", { timeoutMs: 100 }); - global.fetch = vi.fn((url) => { - if (url.includes("/relatedness")) return Promise.resolve(relatednessResp(0.02)); - return Promise.resolve(conceptResp(0)); - }); - const res = await client.similarity("apple", "zzzfoo"); + it("similarity batches target + guess in a single run() call", async () => { + const ai = fakeAi(async (_model, { text }) => ({ + shape: [text.length, 768], + data: 
text.map((_, i) => fakeVector(i + 1)),
+      }));
+      const client = createClient(ai);
+      await client.similarity("apple", "orange");
+      expect(ai.run).toHaveBeenCalledTimes(1);
+      const [model, input] = ai.run.mock.calls[0];
+      expect(model).toBe("@cf/baai/bge-small-en-v1.5");
+      expect(input).toEqual({ text: ["apple", "orange"] });
+    });
+
+    it("similarity returns cosine score for in-vocab guess", async () => {
+      const ai = fakeAi(async (_model, { text }) => ({
+        data: text.map((_, i) => fakeVector(i + 1)),
+      }));
+      const client = createClient(ai);
+      const res = await client.similarity("apple", "orange");
+      expect(res.in_vocab_a).toBe(true);
+      expect(res.in_vocab_b).toBe(true);
+      expect(res.canonical_a).toBe("apple");
+      expect(res.canonical_b).toBe("orange");
+      expect(typeof res.similarity).toBe("number");
+      expect(res.similarity).toBeGreaterThan(-1);
+      expect(res.similarity).toBeLessThanOrEqual(1);
+    });
+
+    it("similarity returns 1 for identical vectors", async () => {
+      const vec = fakeVector(7);
+      const ai = fakeAi(async () => ({ data: [vec, vec] }));
+      const client = createClient(ai);
+      const res = await client.similarity("apple", "orange");
+      expect(res.similarity).toBeCloseTo(1, 10);
+    });
+
+    it("similarity skips the AI call for OOV guess and flags in_vocab_b:false", async () => {
+      const ai = fakeAi(async () => ({ data: [fakeVector(1), fakeVector(2)] }));
+      const client = createClient(ai);
+      const res = await client.similarity("apple", "zzzfoobarbaz");
       expect(res.in_vocab_b).toBe(false);
       expect(res.similarity).toBe(null);
+      expect(ai.run).not.toHaveBeenCalled();
     });
-    it("similarity returns null when relatedness payload lacks a numeric value", async () => {
-      const client = createClient("https://api.test", { timeoutMs: 100 });
-      global.fetch = vi.fn((url) => {
-        if (url.includes("/relatedness")) {
-          return Promise.resolve({ ok: true, text: () => Promise.resolve("{}") });
-        }
-        return Promise.resolve(conceptResp(5));
+    it("similarity wraps AI.run rejection as UpstreamError", async () => {
+      const ai = fakeAi(async () => {
+        throw new Error("boom");
       });
+      const client = createClient(ai);
+      await expect(client.similarity("apple", "orange")).rejects.toMatchObject({
+        name: "UpstreamError",
+      });
+    });
+
+    it("similarity throws UpstreamError on malformed payload", async () => {
+      const ai = fakeAi(async () => ({ data: [fakeVector(1)] })); // only 1 vector
+      const client = createClient(ai);
+      await expect(client.similarity("apple", "orange")).rejects.toMatchObject({
+        name: "UpstreamError",
+      });
+    });
+
+    it("similarity returns null score when a vector norm is zero", async () => {
+      const zero = new Array(768).fill(0);
+      const ai = fakeAi(async () => ({ data: [zero, fakeVector(1)] }));
+      const client = createClient(ai);
       const res = await client.similarity("apple", "orange");
+      expect(res.in_vocab_b).toBe(true);
       expect(res.similarity).toBe(null);
     });
-    it("similarity distinguishes 0 from null", async () => {
-      const client = createClient("https://api.test", { timeoutMs: 100 });
-      global.fetch = vi.fn((url) => {
-        if (url.includes("/relatedness")) return Promise.resolve(relatednessResp(0));
-        return Promise.resolve(conceptResp(5));
-      });
-      const res = await client.similarity("apple", "orange");
-      expect(res.similarity).toBe(0);
-      expect(res.in_vocab_b).toBe(true);
-    });
-
-    it("randomWord returns a verified pick when edges present", async () => {
-      const client = createClient("https://api.test", { timeoutMs: 100 });
-      global.fetch = vi.fn(() => Promise.resolve(conceptResp(5)));
+    it("randomWord returns a verified pick from the local pool", async () => {
+      const ai = fakeAi(async () => ({ data: [] }));
+      const client = createClient(ai);
       const res = await client.randomWord();
       expect(typeof res.word).toBe("string");
       expect(res.word.length).toBeGreaterThan(0);
       expect(res.verified).toBe(true);
+      expect(ai.run).not.toHaveBeenCalled();
     });
-    it("randomWord falls back to unverified pick after max attempts", async () => {
-      const client = createClient("https://api.test", { timeoutMs: 100 });
-      // Every concept lookup returns zero edges → exhausts retries.
-      global.fetch = vi.fn(() => Promise.resolve(conceptResp(0)));
-      const res = await client.randomWord();
-      expect(res.verified).toBe(false);
-      expect(typeof res.word).toBe("string");
-    });
-
-    it("randomWord swallows transient fetch errors during verification", async () => {
-      const client = createClient("https://api.test", { timeoutMs: 100 });
-      let n = 0;
-      global.fetch = vi.fn(() => {
-        n += 1;
-        // Error for the first few attempts, then succeed.
-        if (n <= 2) return Promise.reject(new Error("transient"));
-        return Promise.resolve(conceptResp(3));
-      });
-      const res = await client.randomWord();
-      expect(res.verified).toBe(true);
-    });
-
-    it("concept throws UpstreamError on non-2xx response", async () => {
-      const client = createClient("https://api.test", { timeoutMs: 100 });
-      global.fetch = vi.fn(() =>
-        Promise.resolve({
-          ok: false,
-          status: 500,
-          text: () => Promise.resolve("Internal Server Error"),
-        }),
-      );
-      await expect(client.concept("apple")).rejects.toMatchObject({
-        name: "UpstreamError",
-        status: 500,
-        body: "Internal Server Error",
-      });
-    });
-
-    it("concept throws UpstreamError when response is not valid JSON", async () => {
-      const client = createClient("https://api.test", { timeoutMs: 100 });
-      global.fetch = vi.fn(() =>
-        Promise.resolve({ ok: true, text: () => Promise.resolve("not json") }),
-      );
-      await expect(client.concept("apple")).rejects.toMatchObject({ name: "UpstreamError" });
-    });
-
-    it("concept throws UpstreamError on fetch failure", async () => {
-      const client = createClient("https://api.test", { timeoutMs: 100 });
-      global.fetch = vi.fn(() => Promise.reject(new Error("network error")));
-      await expect(client.concept("apple")).rejects.toThrow("conceptnet fetch failed");
-    });
-
-    it("truncates response body to 500 chars in UpstreamError", async () => {
-      const client = createClient("https://api.test", { timeoutMs: 50 });
-      const longBody = "x".repeat(600);
-      global.fetch = vi.fn(() =>
-        Promise.resolve({ ok: false, status: 400, text: () => Promise.resolve(longBody) }),
-      );
-      try {
-        await client.concept("apple");
-      } catch (err) {
-        expect(err.body.length).toBe(500);
-      }
-    });
-
-    it("sends User-Agent and Accept headers", async () => {
-      const client = createClient("https://api.test", { timeoutMs: 100 });
-      global.fetch = vi.fn((_, opts) => {
-        expect(opts.headers["User-Agent"]).toContain("miti99bot");
-        expect(opts.headers.Accept).toBe("application/json");
-        return Promise.resolve(conceptResp(1));
-      });
-      await client.concept("apple");
-    });
-
-    it("strips trailing slashes from the API base URL", async () => {
-      const client = createClient("https://api.test///", { timeoutMs: 100 });
-      global.fetch = vi.fn((url) => {
-        expect(url.startsWith("https://api.test/c/en/")).toBe(true);
-        return Promise.resolve(conceptResp(1));
-      });
-      await client.concept("apple");
-    });
-
-    it("URL-encodes the term path segment", async () => {
-      const client = createClient("https://api.test", { timeoutMs: 100 });
-      global.fetch = vi.fn((url) => {
-        expect(url).toContain("/c/en/hello%20world");
-        return Promise.resolve(conceptResp(1));
-      });
-      await client.concept("hello world");
-    });
-
-    it("defaults to the public ConceptNet base URL when none provided", async () => {
-      const client = createClient();
-      global.fetch = vi.fn((url) => {
-        expect(url.startsWith("https://api.conceptnet.io/")).toBe(true);
-        return Promise.resolve(conceptResp(1));
-      });
-      await client.concept("apple");
+    it("supports model override via options", async () => {
+      const ai = fakeAi(async () => ({ data: [fakeVector(1), fakeVector(2)] }));
+      const client = createClient(ai, { model: "@cf/baai/bge-large-en-v1.5" });
+      await client.similarity("apple", "orange");
+      expect(ai.run.mock.calls[0][0]).toBe("@cf/baai/bge-large-en-v1.5");
     });
   });
 });
diff --git a/wrangler.toml b/wrangler.toml
index 4f84de7..21b2c5d 100644
--- a/wrangler.toml
+++ b/wrangler.toml
@@ -25,6 +25,15 @@
 binding = "DB"
 database_name = "miti99bot-db"
 database_id = "261b54e7-0fdb-4fe7-8ed9-2e8a8bcf459c"
+# Workers AI — inference binding used by the semantle module for
+# @cf/baai/bge-small-en-v1.5 text embeddings (replaces ConceptNet upstream).
+# Accessed as `env.AI` in handlers. Included on the Workers Free plan:
+# 10,000 Neurons/day at no charge (hard-stops — no billing on Free plan).
+# bge-small is ~0.0037 Neurons/guess → ~2.7M guesses/day within the cap.
+# Pricing: https://developers.cloudflare.com/workers-ai/platform/pricing/
+[ai]
+binding = "AI"
+
 # Cron Triggers — union of all schedules declared by modules.
 # When adding a module with cron entries, append its schedule(s) here.
 # See docs/adding-a-module.md for the full module author workflow.
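For reference, the scoring behavior the similarity tests pin down (a score in (-1, 1], exactly 1 for identical vectors, and `null` rather than a number when an embedding has zero norm) is plain cosine similarity with a degenerate-vector guard. A minimal sketch — the name `cosineSimilarity` is illustrative, not necessarily the module's actual export:

```javascript
// Cosine similarity over two equal-length embedding vectors.
// Returns null (not 0) when either vector has zero norm, so "no score"
// stays distinguishable from a legitimate score of 0 — this is exactly
// what the "vector norm is zero" test asserts.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return null; // degenerate embedding → no score
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

This is why the tests use `expect(res.similarity).toBe(null)` for the zero-norm case instead of `toBe(0)`: both values are falsy, but only one means "could not score".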
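The tests above rely on a `fakeAi` helper that stands in for the Workers AI binding. A hand-rolled equivalent might look like the sketch below — note this records calls itself rather than wrapping vitest's `vi.fn` (which the real tests use via `ai.run.mock.calls`), so the exact shape is an assumption, not the repo's actual helper:

```javascript
// Hypothetical stand-in for the Workers AI binding: exposes run(model, input)
// like env.AI, delegates to a caller-supplied handler, and records every call
// so a test can assert on the model name and input payload.
function fakeAi(handler) {
  const calls = [];
  return {
    calls, // recorded [model, input] pairs
    run(model, input) {
      calls.push([model, input]); // record synchronously, before the handler runs
      return Promise.resolve(handler(model, input));
    },
  };
}
```

The design mirrors the binding's surface (`env.AI.run(model, input)` returning a promise of `{ data: [...] }`), which is what lets the same client code run against either the stub or the real binding.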
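The free-tier headroom claimed in the wrangler.toml comment is easy to sanity-check. Both input figures (10,000 Neurons/day on the Workers Free plan, ~0.0037 Neurons per bge-small embedding call) are taken from that comment as-is, not independently measured:

```javascript
// Back-of-envelope check of the capacity claim in the [ai] binding comment.
const neuronsPerDay = 10_000;   // Workers Free plan daily Neuron allocation
const neuronsPerGuess = 0.0037; // approximate cost of one bge-small embedding call
const guessesPerDay = Math.floor(neuronsPerDay / neuronsPerGuess);
console.log(guessesPerDay); // on the order of 2.7 million, matching "~2.7M guesses/day"
```

Even at a small fraction of that per-guess cost estimate being off, the daily cap leaves orders of magnitude more headroom than a chat bot's realistic guess volume.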