4 Commits

Author SHA1 Message Date
tiennm99 eead216338 feat: replace xlsx (SheetJS) build pipeline with Rust xlsxread CLI (#1)
* feat(xlsxread): vendor Rust binary cloned from thptqg2017

Copies the xlsxread Rust crate from thptqg2017@8b4a755 (chore/xlsxread-rust).
Adds format_detect_2016 module with per-file column-layout auto-detection
mirroring detectFormat() in scripts/build-database.js (lines 63-87):
  - separate-scores: SBD/HOTEN/TOAN... fixed columns (dhhanghai files)
  - mapped: header-derived SOBAODANH|SBD + DIEM_THI dynamic indices
  - default: positional 6-col layout (no header)

Extends ParsedRow with ten_cum_thi and gioi_tinh fields.
Adds SCORE_FIELDS_2016 (12 cols: tieng_duc/tieng_nhat; no khtn/khxh/tieng_nga).
Adds thptqg2016-data.toml config with 18-column schema and format_detection flag.
58 tests pass (50 unit + 8 integration), 0 failures.
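
The three layouts listed above could be told apart roughly as in the sketch below. This is an illustration only, not the code from `format_detect_2016` or `detectFormat()`: the function shape, the returned object, and the exact header checks (requiring both `HOTEN` and `TOAN` for the fixed-column layout) are assumptions. Only the header names themselves come from the commit message.

```javascript
// Hypothetical sketch: classify a sheet by inspecting its first row.
// Header names (SBD, SOBAODANH, HOTEN, TOAN, DIEM_THI) come from the commit
// message; the matching rules and return shape are assumed for illustration.
function detectFormat(firstRow) {
  const cells = (firstRow || []).map((c) => String(c ?? "").trim().toUpperCase());

  const idCol = cells.findIndex((c) => c === "SOBAODANH" || c === "SBD");
  const scoreCol = cells.indexOf("DIEM_THI");

  // mapped: column positions are derived from the header row.
  if (idCol !== -1 && scoreCol !== -1) {
    return { kind: "mapped", idCol, scoreCol };
  }

  // separate-scores: one fixed column per subject (SBD/HOTEN/TOAN...).
  if (idCol !== -1 && cells.includes("HOTEN") && cells.includes("TOAN")) {
    return { kind: "separate-scores" };
  }

  // default: no recognizable header, so fall back to the positional layout.
  return { kind: "default" };
}
```

Checking the mapped layout first keeps a file that carries both an ID header and `DIEM_THI` from being misread as the fixed-column layout.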

* feat(build): wire build:db to xlsxread CLI

Replaces the Node.js build:db script with the xlsxread Rust binary.
Adds build:rust script for the cargo compile step in isolation.

* chore: remove deprecated build-database.js

Superseded by the xlsxread Rust binary. All 119 source files (4 .xls +
115 .xlsx) are now processed by xlsxread with per-file format detection.

* ci: build xlsxread before running database build job

Adds dtolnay/rust-toolchain@stable and Swatinem/rust-cache@v2 steps
before the xlsxread build and database generation steps. Node/pnpm
steps now follow the Rust build rather than preceding it.

* chore: add Rust build artifacts to .gitignore

* docs: update README build instructions for xlsxread pipeline

* chore(deps): drop xlsx and better-sqlite3 from package.json and lockfile
2026-05-19 16:33:20 +07:00
tiennm99 518796adab refactor: rename assets/ to data/ for raw Excel inputs 2026-04-14 21:20:10 +07:00
tiennm99 f63bf0f12b feat: add diacritics-insensitive name search
Add ho_ten_ascii column with normalized names (no diacritics, lowercase)
so users can search "nguyen van a" to find "NGUYỄN VĂN A".

- ASCII input searches against ho_ten_ascii column
- Vietnamese input searches both ho_ten and ho_ten_ascii
- Indexed for fast lookups
2026-04-14 20:15:20 +07:00
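
The name normalization and column selection described in f63bf0f12b can be sketched as follows. This is a minimal illustration, not the repository's code: the function names, the `students` table name, and the substring `LIKE` patterns are assumptions, and uppercasing the Vietnamese term assumes names are stored in uppercase as in the commit's example.

```javascript
// Strip diacritics and lowercase, e.g. for filling a ho_ten_ascii column.
// NFD splits accented letters into base letter + combining marks, which the
// regex removes. "đ"/"Đ" do not decompose, so they are mapped explicitly.
function toAsciiName(name) {
  return name
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .replace(/đ/g, "d")
    .replace(/Đ/g, "D")
    .toLowerCase();
}

// Pick which columns to search, following the rules in the commit message.
// The table name "students" is hypothetical.
function buildNameQuery(input) {
  const term = input.trim();
  const asciiPattern = `%${toAsciiName(term)}%`;
  const isAscii = /^[\x00-\x7f]*$/.test(term);

  if (isAscii) {
    // ASCII input: search only the normalized column.
    return {
      sql: "SELECT * FROM students WHERE ho_ten_ascii LIKE ?",
      params: [asciiPattern],
    };
  }
  // Vietnamese input: search both the original and the normalized column.
  return {
    sql: "SELECT * FROM students WHERE ho_ten LIKE ? OR ho_ten_ascii LIKE ?",
    params: [`%${term.toUpperCase()}%`, asciiPattern],
  };
}
```

Note that a pattern with a leading `%` generally cannot use an ordinary index in SQLite, so the real implementation may use a different match strategy to benefit from the index mentioned above.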
tiennm99 6a0b6cd9dd feat: add GitHub Pages app with custom SQL query support
- React + Vite + sql.js frontend with two-tab UI:
  - Quick search by exam ID (alphanumeric) or student name
  - Custom SQL query editor with 7 preset queries
- Build script handles all 5 Excel data formats (varied column
  orders, separate score columns, no-header, .xls/.xlsx)
- Database: 877,461 students with exam center, gender, 12 subjects
  (including 4 foreign languages: French, German, Japanese, Chinese)
- GitHub Actions CI/CD: build DB, gzip compress, deploy to Pages
- Safety: read-only queries, auto LIMIT 1000; Ctrl+Enter shortcut
2026-04-14 20:05:38 +07:00
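
The safety rules listed in 6a0b6cd9dd (read-only queries, automatic LIMIT 1000) could be enforced with a guard along these lines before a query reaches sql.js. This is a deliberately naive sketch under stated assumptions: `guardQuery` is a hypothetical name, the checks are simple text tests rather than real SQL parsing, and the app's actual validation may differ.

```javascript
// Hypothetical guard: accept a single SELECT statement and cap its row count.
// Naive by design: a semicolon inside a string literal is rejected, and a
// LIMIT inside a subquery counts as an existing limit.
function guardQuery(sql, maxRows = 1000) {
  // Drop surrounding whitespace and any trailing semicolons.
  const trimmed = sql.trim().replace(/;+\s*$/, "");

  if (!/^select\b/i.test(trimmed)) {
    throw new Error("Only SELECT queries are allowed");
  }
  if (trimmed.includes(";")) {
    throw new Error("Multiple statements are not allowed");
  }
  // Keep a LIMIT the user already wrote; otherwise append the default cap.
  if (/\blimit\s+\d+/i.test(trimmed)) {
    return trimmed;
  }
  return `${trimmed} LIMIT ${maxRows}`;
}
```

For example, `guardQuery("SELECT * FROM t")` returns `SELECT * FROM t LIMIT 1000`, while `guardQuery("DROP TABLE t")` throws.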