Now that both modules run on Workers AI embeddings, drop the legacy
Word2SimError alias, the unused wordlist helpers (getLine, LINE_COUNT,
pickFromPool), and every comment/README section still describing the
removed ConceptNet backend. Fix the bge-small doc typo in semantle/index.js
and align the semantle api-client test's fake-vector dimension with the
model's real 384-dim output.
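The dimension fix amounts to sizing the test fake to the model output; a sketch, assuming a helper like this (names are hypothetical, not from the real test file):

```javascript
// bge-small-en-v1.5 embeddings are 384-dimensional, so the test fake
// must be too (the old fake used a different length).
const EMBEDDING_DIM = 384;

// Deterministic fake vector: the values are arbitrary, only the length
// has to match what the api-client expects from Workers AI.
function fakeVector(seed = 1) {
  return Array.from({ length: EMBEDDING_DIM }, (_, i) =>
    Math.sin(seed * (i + 1))
  );
}
```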
Use the full google-10000-english list verbatim (normalize only —
lowercase + dedupe, no length or alpha filtering). Pool goes from 7953
to 9894 entries; rare/short/long picks are still sieved by ConceptNet's
verify-and-fallback at round start.
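The normalize-only step is small enough to sketch; the function name here is an assumption, not the script's actual helper:

```javascript
// Sketch: normalize raw google-10000-english lines verbatim.
// Only lowercasing + de-duplication; deliberately no length/alpha filter.
function normalizeWordlist(lines) {
  const seen = new Set();
  const out = [];
  for (const raw of lines) {
    const word = raw.trim().toLowerCase();
    if (word && !seen.has(word)) {
      seen.add(word);
      out.push(word); // preserves original (frequency) order
    }
  }
  return out;
}
```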
Replaces TARGET_POOL/pickFromPool with a clearer line-based API:
LINE_COUNT — how many entries
randomLine() — uniform pick
getLine(n) — nth entry (n = frequency rank)
pickFromPool is retained as a back-compat re-export, so existing callers
don't break.
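A minimal wordlist.js shaped like that API might look as follows; in the real module these names would be `export`ed, and WORDS would come from the generated words-data module (the three-entry list is a stand-in):

```javascript
// Sketch of the line-based wordlist API. WORDS stands in for the full
// frequency-ranked list generated at build time.
const WORDS = ["time", "year", "people"];

const LINE_COUNT = WORDS.length;

// getLine(n): nth entry, n = frequency rank (0-based here, an assumption)
function getLine(n) {
  return WORDS[n];
}

// randomLine(): uniform pick over the whole pool
function randomLine() {
  return getLine(Math.floor(Math.random() * LINE_COUNT));
}

// back-compat alias so existing pickFromPool callers keep working
const pickFromPool = randomLine;
```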
The ~250-word hand-curated TARGET_POOL was too small for long-term play.
Replaces it with a build-script-generated dictionary:
- scripts/build-semantle-words.js fetches first20hours/google-10000-english
(no-swears variant), filters to 4–10 ASCII letters, drops the top-200
most frequent function words, and writes src/modules/semantle/words-data.js
as a static ES-module export.
- wordlist.js now just re-exports that data via TARGET_POOL + pickFromPool.
- package.json: new build:semantle-words script, chained into `npm run build`
  alongside build:wordle-data so `npm run deploy` regenerates the word list
  automatically.
Pool size: ~250 → 7953 words. Same ConceptNet verify-and-fallback flow, so
low-quality picks still cost at most one extra concept lookup.
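The filter stage described above can be sketched like this; the helper name and cutoff constant are illustrative, not the script's actual identifiers:

```javascript
// Sketch of the build script's filter stage.
// Input: raw lines of google-10000-english (no-swears), most frequent first.
function buildTargetPool(rawLines) {
  const FUNCTION_WORD_CUTOFF = 200; // drop the 200 most frequent entries
  return rawLines
    .slice(FUNCTION_WORD_CUTOFF)             // skip top-200 function words
    .map((w) => w.trim().toLowerCase())
    .filter((w) => /^[a-z]{4,10}$/.test(w)); // keep 4-10 ASCII letters only
}

// The result would then be serialized into words-data.js as a static
// ES-module export, roughly: `export const TARGET_POOL = [...];`
```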