* feat(xlsxread): vendor Rust binary cloned from thptqg2017
Copies the xlsxread Rust crate from thptqg2017@8b4a755 (chore/xlsxread-rust).
Adds format_detect_2016 module with per-file column-layout auto-detection
mirroring detectFormat() in scripts/build-database.js (lines 63-87):
- separate-scores: SBD/HOTEN/TOAN... fixed columns (dhhanghai files)
- mapped: header-derived SOBAODANH|SBD + DIEM_THI dynamic indices
- default: positional 6-col layout (no header)
Extends ParsedRow with ten_cum_thi and gioi_tinh fields.
Adds SCORE_FIELDS_2016 (12 columns; includes tieng_duc and tieng_nhat, omits khtn, khxh, and tieng_nga).
Adds thptqg2016-data.toml config with 18-column schema and format_detection flag.
58 tests pass (50 unit + 8 integration), 0 failures.
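For reviewers, the three-way layout detection listed above can be sketched roughly as follows. This is an illustration only, not the code in format_detect_2016: the `Layout2016` enum, the `detect_layout` function, and the exact matching rules (case-insensitive header match, separate-scores checked first) are assumptions inferred from the summary, and the real module may also take the file name into account.

```rust
/// Hypothetical layout tag; the real module may name these differently.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Layout2016 {
    /// Fixed SBD/HOTEN/TOAN... columns.
    SeparateScores,
    /// Column indices derived from the header row.
    Mapped { id_col: usize, score_col: usize },
    /// Positional 6-column layout with no header row.
    Default,
}

/// Classifies a sheet from its first row (assumed rule order, see lead-in).
pub fn detect_layout(first_row: &[&str]) -> Layout2016 {
    // Normalize cells so "sbd " and "SBD" compare equal.
    let cells: Vec<String> = first_row
        .iter()
        .map(|c| c.trim().to_uppercase())
        .collect();
    let find = |name: &str| cells.iter().position(|c| c == name);

    // Separate-scores files carry one column per subject.
    if find("SBD").is_some() && find("HOTEN").is_some() && find("TOAN").is_some() {
        return Layout2016::SeparateScores;
    }

    // Mapped files need an ID column (SOBAODANH or SBD) plus DIEM_THI.
    let id_col = find("SOBAODANH").or_else(|| find("SBD"));
    if let (Some(id_col), Some(score_col)) = (id_col, find("DIEM_THI")) {
        return Layout2016::Mapped { id_col, score_col };
    }

    // No recognizable header: fall back to the positional layout.
    Layout2016::Default
}
```

Under these assumptions, a first row that contains data rather than header names falls through to `Layout2016::Default`.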
* feat(build): wire build:db to xlsxread CLI
Replaces the Node.js build:db script with the xlsxread Rust binary.
Adds build:rust script for the cargo compile step in isolation.
* chore: remove deprecated build-database.js
Superseded by the xlsxread Rust binary. All 119 source files (4 .xls +
115 .xlsx) are now processed by xlsxread with per-file format detection.
* ci: build xlsxread before running database build job
Adds dtolnay/rust-toolchain@stable and Swatinem/rust-cache@v2 steps
before the xlsxread build and database generation steps. Node/pnpm
steps now follow the Rust build rather than preceding it.
* chore: add Rust build artifacts to .gitignore
* docs: update README build instructions for xlsxread pipeline
* chore(deps): drop xlsx and better-sqlite3 from package.json and lockfile
Adds a ho_ten_ascii column holding normalized names (lowercase, diacritics
removed) so that searching "nguyen van a" finds "NGUYỄN VĂN A".
- ASCII input searches against ho_ten_ascii column
- Vietnamese input searches both ho_ten and ho_ten_ascii
- ho_ten_ascii is indexed for fast lookups
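The normalization and column choice described above might look like the sketch below. It is std-only and illustrative: `fold_char`, `to_ascii_name`, and `search_columns` are hypothetical names, and the commit does not say where or how the actual normalization is implemented.

```rust
/// Maps one lowercase precomposed Vietnamese letter to its ASCII base letter.
fn fold_char(c: char) -> char {
    match c {
        'à' | 'á' | 'ả' | 'ã' | 'ạ' | 'ă' | 'ằ' | 'ắ' | 'ẳ' | 'ẵ' | 'ặ' => 'a',
        'â' | 'ầ' | 'ấ' | 'ẩ' | 'ẫ' | 'ậ' => 'a',
        'è' | 'é' | 'ẻ' | 'ẽ' | 'ẹ' | 'ê' | 'ề' | 'ế' | 'ể' | 'ễ' | 'ệ' => 'e',
        'ì' | 'í' | 'ỉ' | 'ĩ' | 'ị' => 'i',
        'ò' | 'ó' | 'ỏ' | 'õ' | 'ọ' | 'ô' | 'ồ' | 'ố' | 'ổ' | 'ỗ' | 'ộ' => 'o',
        'ơ' | 'ờ' | 'ớ' | 'ở' | 'ỡ' | 'ợ' => 'o',
        'ù' | 'ú' | 'ủ' | 'ũ' | 'ụ' | 'ư' | 'ừ' | 'ứ' | 'ử' | 'ữ' | 'ự' => 'u',
        'ỳ' | 'ý' | 'ỷ' | 'ỹ' | 'ỵ' => 'y',
        'đ' => 'd',
        other => other,
    }
}

/// Lowercases a name and removes Vietnamese diacritics.
pub fn to_ascii_name(name: &str) -> String {
    name.to_lowercase()
        .chars()
        // Drop combining marks so decomposed (NFD) input folds as well.
        .filter(|c| !('\u{0300}'..='\u{036F}').contains(c))
        .map(fold_char)
        .collect()
}

const ASCII_ONLY: &[&str] = &["ho_ten_ascii"];
const BOTH: &[&str] = &["ho_ten", "ho_ten_ascii"];

/// Picks the columns to search: ASCII queries hit only the normalized column,
/// queries containing Vietnamese letters hit both.
pub fn search_columns(query: &str) -> &'static [&'static str] {
    if query.is_ascii() {
        ASCII_ONLY
    } else {
        BOTH
    }
}
```

For a query with diacritics, the caller would presumably match the raw text against ho_ten and `to_ascii_name(query)` against ho_ten_ascii.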