Files
thptqg2016/docs/codebase-summary.md

3.0 KiB

Codebase Summary

Directory layout

thptqg2016/
├── data/                       # Source Excel files (~100, mixed formats)
├── scripts/
│   └── build-database.js       # Parse Excel → SQLite (build-time, Node + better-sqlite3)
├── public/
│   └── thptqg2016.db           # Generated DB, gzipped during CI
├── src/
│   ├── main.jsx                # React entry
│   ├── App.jsx                 # Root: tabs, lookup logic, useSqlite wiring
│   ├── App.css / index.css     # Design tokens, dark mode, a11y styles
│   ├── hooks/
│   │   └── use-sqlite.js       # Fetch .db.gz + decompress + init sql.js
│   └── components/
│       ├── search-form.jsx     # Input for exam ID / full name
│       ├── score-table.jsx     # Result table for lookups
│       └── custom-query.jsx    # SQL editor + presets + result grid
├── .github/workflows/deploy.yml   # CI: build db → gzip → vite build → Pages
├── vite.config.js              # base: "/thptqg2016/"
└── eslint.config.js

Key modules

scripts/build-database.js

Build-time only. Reads every .xlsx/.xls in data/, detects the header format (three variants), parses the DIEM_THI string via regex or separate score columns, normalizes gender, derives a diacritics-stripped ho_ten_ascii column for accent-insensitive search, and inserts into SQLite with three indexes (ho_ten, ho_ten_ascii, ten_cum_thi).

src/hooks/use-sqlite.js

Streams .db.gz with download progress, decompresses via DecompressionStream("gzip"), loads sql.js (WASM served from the sql.js.org CDN), and returns { db, loading, error, progress }.

src/App.jsx

Two tabs: Lookup and Custom SQL. Lookup auto-detects exam IDs (regex ^[A-Z]{2,4}\d+$) vs names and picks one of three query paths: exact exam ID / ASCII LIKE / original + ASCII LIKE. Capped at 100 rows.

src/components/custom-query.jsx

Whitelists leading keywords (SELECT, PRAGMA, EXPLAIN, WITH), auto-appends LIMIT 1000 when missing, measures performance.now() execution time, and ships 7 preset analytics queries.

student table schema

so_bao_danh  TEXT PRIMARY KEY         -- exam ID
ho_ten       TEXT NOT NULL            -- full name
ho_ten_ascii TEXT NOT NULL            -- diacritics stripped, lowercased
ngay_sinh    TEXT                     -- date of birth
ten_cum_thi  TEXT                     -- exam cluster name
gioi_tinh    TEXT                     -- "Nam" | "Nữ" | NULL
toan, ngu_van, vat_ly, hoa_hoc,       -- REAL (nullable) subject scores
sinh_hoc, lich_su, dia_ly,
tieng_anh, tieng_phap, tieng_duc,
tieng_nhat, tieng_trung

Indexes: idx_ho_ten, idx_ho_ten_ascii, idx_ten_cum_thi.

Conventions

  • JS/JSX filenames: kebab-case (e.g., search-form.jsx, use-sqlite.js)
  • React components: PascalCase named exports
  • UI strings: Vietnamese (target audience)
  • Code comments: English; explain why, not what