mirror of
https://github.com/tiennm99/phow2sim.git
synced 2026-05-28 20:21:09 +00:00
8dd17acd4f
Tiny FastAPI service over PhoW2V Vietnamese word vectors. Mirrors word2sim's endpoint shapes (/similarity, /neighbors, /vocab, /random) so clients can swap URLs without code changes.

- Auto-downloads VinAI's PhoW2V on first boot and caches a binary .bin for ~5x faster restarts
- Viet-aware canonicalizer: exact -> lowercase -> space-to-underscore
- Supports both word (compound) and syllable variants via env
- Unicode-aware random-word filter accepts diacritics, rejects digits/punct
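
The canonicalizer fallback chain and the random-word filter could look roughly like the sketch below. This is not the repository's actual code: the function names `canonicalize` and `is_clean_word` are hypothetical, the vocabulary is assumed to be any container supporting `in`, and the sketch assumes each fallback step builds on the previous one (the underscore form is derived from the lowercased input). It also assumes underscores are the only non-letter character allowed in a "clean" word, since PhoW2V's word-level variant joins the syllables of a compound with underscores.

```python
import unicodedata
from typing import Container, Optional


def canonicalize(word: str, vocab: Container[str]) -> Optional[str]:
    """Try exact, then lowercase, then lowercase with spaces as underscores.

    Returns the first form present in vocab, or None if none match.
    """
    lowered = word.lower()
    candidates = (word, lowered, lowered.replace(" ", "_"))
    for candidate in candidates:
        if candidate in vocab:
            return candidate
    return None


def is_clean_word(token: str) -> bool:
    """Accept tokens made only of letters, optionally joined by underscores.

    str.isalpha() is Unicode-aware, so Vietnamese letters with diacritics
    pass, while digits and punctuation fail.
    """
    # NFC composes base letters and combining marks into single code points,
    # so decomposed input is judged the same way as precomposed input.
    token = unicodedata.normalize("NFC", token)
    # An empty part (empty token, leading/trailing/double underscore) fails,
    # because "".isalpha() is False.
    return all(part.isalpha() for part in token.split("_"))


if __name__ == "__main__":
    demo_vocab = {"học_sinh", "mèo"}  # toy vocabulary for illustration only
    print(canonicalize("Học sinh", demo_vocab))
    print(is_clean_word("học_sinh"), is_clean_word("covid19"))
```

Trying the exact form first keeps case-sensitive entries (for example proper nouns) reachable before any normalization is applied.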