tiennm99 7140069d99 feat: replace xlsx (SheetJS) build pipeline with Rust xlsxread CLI (#1)
* feat(xlsxread): stage 0 scaffold with pinned deps and clap CLI skeleton

* feat(xlsxread): stage 1 reader — calamine sheet enumeration and header skip

* feat(xlsxread): stage 2 transform — to_ascii, score regex, validation with 38 unit tests

* feat(xlsxread): stage 3 writer — SQLite DDL, INSERT OR REPLACE, VACUUM, stats output

* feat(xlsxread): stage 4 audit — distinct SBD scan vs DB count, mirrors audit-row-counts.js output

* feat(xlsxread): stage 5 golden tests — in-process xlsx fixtures, 8 integration tests pass

* chore(xlsxread): commit Cargo.lock for reproducible Rust builds

* feat(build): wire root build:db scripts to xlsxread CLI

Replace node scripts/build-database*.js invocations with the Rust
xlsxread binary. Each build:db* script now calls `pnpm build:rust`
(cargo build --release) before invoking the xlsxread build subcommand
with the matching per-dataset config.

Drop xlsx and better-sqlite3 from devDependencies — no Node script
consumes them anymore. sql.js (runtime DB reader in the SPA) is
unaffected and remains in dependencies.

* ci: build xlsxread before running database build jobs

Add dtolnay/rust-toolchain@stable and Swatinem/rust-cache@v2
(workspaces: tools/xlsxread) for warm incremental Rust builds.

Replace the single `pnpm build:db:all` step with explicit xlsxread
invocations so CI doesn't call pnpm build:rust redundantly three times.
The binary is built once, then each of the three datasets is processed
in sequence.

* chore: remove deprecated xlsx-based build scripts

Delete scripts/build-database.js, build-database-old.js,
build-database-old2.js, build-lib.js, and audit-row-counts.js.
Functionality replaced by the xlsxread Rust CLI configured via
tools/xlsxread/configs/*.toml. History preserved in git; one-click
revert available via the chore/migration-backup-260519 branch.

* docs: update README build instructions for xlsxread pipeline

Replace Node.js + xlsx references with Rust + xlsxread workflow.
Update requirements (Node 24+, pnpm, Rust stable), quickstart, scripts
table, and project layout tree to reflect the current state after the
xlsx-based build scripts were removed.

* chore(deps): drop xlsx and better-sqlite3 from package.json and lockfile

Remove xlsx (SheetJS, vulnerable: GHSA-4r6h-8v6p-xvw6, GHSA-5pgg-2g8v-p4x9)
and better-sqlite3 from devDependencies. Both were only used by the now-deleted
Node build scripts. The Rust xlsxread CLI vendors SQLite via rusqlite-bundled;
no Node-side SQLite dependency is needed. `pnpm audit` returns clean.
2026-05-19 16:33:16 +07:00


name: Deploy to GitHub Pages

on:
  push:
    branches: [main]
  workflow_dispatch:

permissions:
  contents: read
  pages: write
  id-token: write

concurrency:
  group: pages
  cancel-in-progress: true

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - uses: Swatinem/rust-cache@v2
        with:
          workspaces: tools/xlsxread
      - uses: pnpm/action-setup@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '24'
          cache: 'pnpm'
      - name: Build xlsxread binary
        run: cargo build --release --manifest-path tools/xlsxread/Cargo.toml
      - name: Install dependencies
        run: pnpm install --frozen-lockfile
      - name: Build all databases
        run: |
          ./tools/xlsxread/target/release/xlsxread build --schema tools/xlsxread/configs/thptqg2017-data.toml --input data --output public/thptqg2017.db
          ./tools/xlsxread/target/release/xlsxread build --schema tools/xlsxread/configs/thptqg2017-data-old.toml --input data-old --output public-old/thptqg2017.db
          ./tools/xlsxread/target/release/xlsxread build --schema tools/xlsxread/configs/thptqg2017-data-old2.toml --input data-old2 --output public-old2/thptqg2017.db
      - name: Compress databases
        run: |
          gzip -kf -9 public/thptqg2017.db
          gzip -kf -9 public-old/thptqg2017.db
          gzip -kf -9 public-old2/thptqg2017.db
      - name: Build all site variants
        run: pnpm build:all
      - name: Drop uncompressed DB files from dist
        run: |
          rm -f dist/thptqg2017.db
          rm -f dist/old/thptqg2017.db
          rm -f dist/old2/thptqg2017.db
      - uses: actions/upload-pages-artifact@v3
        with:
          path: dist

  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - id: deployment
        uses: actions/deploy-pages@v4
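
The "Build all databases" and "Compress databases" steps repeat the same command for three datasets, so a local reproduction can be sketched as a small POSIX shell helper. This is only a sketch, not part of the repository: the names `XLSXREAD`, `datasets`, `build_all`, and `compress_all` are hypothetical, and it assumes it is run from the repository root after the release binary has been built. The config names, directories, and the `--schema`, `--input`, and `--output` flags are copied from the workflow.

```shell
#!/bin/sh
# Hypothetical local helper mirroring the workflow's build and compress steps.
# Assumption: run from the repository root after `cargo build --release`.
set -eu

# Path to the release binary; override XLSXREAD to point elsewhere.
XLSXREAD="${XLSXREAD:-./tools/xlsxread/target/release/xlsxread}"

# One dataset per line: config name, input directory, output directory.
datasets() {
  cat <<'EOF'
thptqg2017-data data public
thptqg2017-data-old data-old public-old
thptqg2017-data-old2 data-old2 public-old2
EOF
}

# Build the three SQLite databases in sequence, as the CI step does.
build_all() {
  datasets | while read -r config input output; do
    # stdin is redirected so the command cannot consume the dataset list.
    "$XLSXREAD" build \
      --schema "tools/xlsxread/configs/${config}.toml" \
      --input "$input" \
      --output "${output}/thptqg2017.db" </dev/null
  done
}

# Compress each database next to the original.
compress_all() {
  datasets | while read -r config input output; do
    # -k keeps the .db file, -f overwrites a stale .gz, -9 is maximum compression.
    gzip -kf -9 "${output}/thptqg2017.db"
  done
}
```

After sourcing the file, `build_all && compress_all` would run both phases. Setting `XLSXREAD=echo` first prints the three build commands instead of running them, which is a cheap way to check the paths before a real build.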