tiennm99 eead216338 feat: replace xlsx (SheetJS) build pipeline with Rust xlsxread CLI (#1)
* feat(xlsxread): vendor Rust binary cloned from thptqg2017

Copies the xlsxread Rust crate from thptqg2017@8b4a755 (chore/xlsxread-rust).
Adds format_detect_2016 module with per-file column-layout auto-detection
mirroring detectFormat() in scripts/build-database.js (lines 63-87):
  - separate-scores: SBD/HOTEN/TOAN... fixed columns (dhhanghai files)
  - mapped: header-derived SOBAODANH|SBD + DIEM_THI dynamic indices
  - default: positional 6-col layout (no header)
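
The three cases above could be sketched roughly as follows. This is an
illustration only, not the code in format_detect_2016: the names Format2016,
normalize, and detect_format are hypothetical, and the order of the checks,
the exact-match comparison of header cells, and the HOTEN/TOAN test for the
separate-scores case are all assumptions.

```rust
/// Column layout of one 2016 source sheet (hypothetical type, for illustration).
#[derive(Debug, PartialEq)]
enum Format2016 {
    /// One fixed column per subject: SBD, HOTEN, TOAN, ...
    SeparateScores,
    /// Header names an ID column and a DIEM_THI column; indices come from the header.
    Mapped { id_col: usize, score_col: usize },
    /// No recognizable header: positional 6-column layout.
    Default,
}

/// Trim and uppercase a header cell so comparisons ignore spacing and case.
fn normalize(cell: &str) -> String {
    cell.trim().to_uppercase()
}

/// Classify a sheet from its first row (assumed check order: mapped,
/// then separate-scores, then the positional default).
fn detect_format(first_row: &[&str]) -> Format2016 {
    let cells: Vec<String> = first_row.iter().map(|c| normalize(c)).collect();
    let position = |name: &str| cells.iter().position(|c| c == name);

    // Mapped: an ID column (SOBAODANH, else SBD) plus a DIEM_THI column.
    let id_col = position("SOBAODANH").or_else(|| position("SBD"));
    if let (Some(id_col), Some(score_col)) = (id_col, position("DIEM_THI")) {
        return Format2016::Mapped { id_col, score_col };
    }

    // Separate-scores: SBD, HOTEN and at least the TOAN subject column.
    if position("SBD").is_some() && position("HOTEN").is_some() && position("TOAN").is_some() {
        return Format2016::SeparateScores;
    }

    // Anything else is treated as header-less positional data.
    Format2016::Default
}
```

Returning the column indices inside the Mapped variant keeps the
header-derived positions next to the detected layout, so a row parser
would not need to scan the header again.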

Extends ParsedRow with ten_cum_thi and gioi_tinh fields.
Adds SCORE_FIELDS_2016 (12 cols: tieng_duc/tieng_nhat; no khtn/khxh/tieng_nga).
Adds thptqg2016-data.toml config with 18-column schema and format_detection flag.
58 tests pass (50 unit + 8 integration), 0 failures.

* feat(build): wire build:db to xlsxread CLI

Replaces the Node.js build:db script with the xlsxread Rust binary.
Adds build:rust script for the cargo compile step in isolation.

* chore: remove deprecated build-database.js

Superseded by the xlsxread Rust binary. All 119 source files (4 .xls +
115 .xlsx) are now processed by xlsxread with per-file format detection.

* ci: build xlsxread before running database build job

Adds dtolnay/rust-toolchain@stable and Swatinem/rust-cache@v2 steps
before the xlsxread build and database generation steps. Node/pnpm
steps now follow the Rust build rather than preceding it.

* chore: add Rust build artifacts to .gitignore

* docs: update README build instructions for xlsxread pipeline

* chore(deps): drop xlsx and better-sqlite3 from package.json and lockfile
2026-05-19 16:33:20 +07:00