Files
goclaw/skills/docx/scripts/accept_changes.py
T
Viet Tran ace07509b7 feat(skills): system skills integration — toggle, dep checking, per-item install (#161)
* feat(infra): add runtime package support for skills

Install nodejs, npm, pandoc, github-cli + pre-install Python packages
(openpyxl, pandas, python-pptx, markitdown) and Node packages
(docx, pptxgenjs). Configure runtime dirs for agent pip/npm installs
with PIP_TARGET, NPM_CONFIG_PREFIX, NODE_PATH to enable dynamic
package installation in read-only container environment.

* feat(infra): add bundled skills with runtime package support

- Add 5 bundled skills: docx, pdf, pptx, xlsx, skill-creator from container skills-store
- Wire GOCLAW_BUILTIN_SKILLS_DIR env var in gateway and CLI
- Support optional runtime packages alongside dynamic skill loading
- Update Dockerfile to COPY bundled-skills at /app/bundled-skills/
- Add PIP_CACHE_DIR in docker-entrypoint.sh for clean pip installs
- Document bundled skills in 14-skills-runtime.md section 6

* feat(infra): remove ai-multimodal skill directory from bundled skills

Remove the ai-multimodal skill package as part of consolidating runtime
package support for bundled skills. This directory is no longer needed
in the bundled skills structure.

* feat(ci): add semantic release and Docker Hub publishing

Add go-semantic-release workflow to auto-create semver tags on merge to
main. Extend docker-publish to push all variants to both GHCR and
Docker Hub (digitop/goclaw).

* feat(skills): add system skills infrastructure with is_system column, dep scanning, and seeder

- Migration 000017: add is_system boolean column with partial index
- Store layer: UpsertSystemSkill, delete protection, IsSystemSkill
- ListAccessible auto-includes system skills (no grants needed)
- ListWithGrantStatus returns is_system field
- Dependency scanner: auto-detect deps from scripts/ or skill-manifest.json
- Dependency checker: verify system binaries, Python/Node packages
- Seeder: seed bundled skills into DB on startup (idempotent via hash)
- Gateway wiring: GOCLAW_BUNDLED_SKILLS_DIR env for bundled skills
- HTTP: delete guard (403), slug conflict check (409), rescan-deps endpoint
- UI: System badge, hide delete for system skills, rescan deps button
- Agent skills tab: "Always available" for system skills
- i18n: en/vi/zh keys for system skills, deps scanning

* feat(skills): conditional system prompt, skill manifests, and Zip Slip fix

- System prompt: only show package list when python3/node are available
- Add skill-manifest.json for pdf, docx, xlsx, pptx bundled skills
- Fix Zip Slip vulnerability in office/unpack.py (all 3 copies)

* refactor(skills): extract shared office code to _shared/ and deduplicate

Move office scripts (pack, unpack, validate, schemas, validators) from
duplicated copies in docx/xlsx/pptx to skills/_shared/office/ with
symlinks. Remove soffice.py (non-functional in containers) and update
SKILL.md references to use soffice binary directly. Update seeder
copyDir to follow symlinks.

Removes ~45K lines of duplicate code across 3 skills.

* fix(skills): address code review findings for system skills integration

- H1: Remove dead symlink branch in copyDir (filepath.Walk follows symlinks)
- H3: Fix rescan-deps to query ALL skills (including archived) and re-activate
  when deps become available; add ListAllSkills() + Status field to SkillInfo
- H4: Add Status field to SkillCreateParams, stop overloading Visibility
- M1: Batch Python/Node dep checks into single subprocess per runtime
- M4: Add rows.Err() check in ListSkills to prevent caching partial results

* feat(skills): async dep checking with realtime WS events

Split Seed() into sync DB upsert + async CheckDepsAsync() goroutine.
Gateway startup no longer blocks on Python/Node subprocess dep checks.

- Seed() returns seeded skills list, all initially status="active"
- CheckDepsAsync() runs in background, emits skill.deps.checked per-skill
- skill.deps.complete event emitted when all checks finish
- Each failed dep check: archives skill + BumpVersion() for immediate
  cache invalidation so next agent turn picks up the change
- UI: use-query-invalidation listens to skill.deps.* events → auto-refresh
  skills list in realtime

* feat(skills): system skills integration with toggle, dep checking, and per-item install

- Add is_system, deps, enabled columns to skills table (migration 017)
- Seed bundled core skills (pdf, docx, pptx, xlsx, skill-creator) on startup
- PYTHONPATH-based dep detection — eliminates false positives from local modules
- Per-item dep install UI with individual status (installing/success/error)
- Enable/disable toggle for core and custom skills (independent of dep status)
- Re-run dep check when skill is toggled back on
- Inline skill thresholds: 40 skills / 5000 tokens before switching to search mode
- Fix UpsertSystemSkill: backfill null file_hash without bumping DB version
- Remove redundant skill-manifest.json files (replaced by deps JSONB column)
- Show author from frontmatter in custom skills tab
- Runtime checker for python3/pip3/node/npm availability
- WS events for dep checking/installing progress
- docs: add 15-core-skills-system.md, 16-skill-publishing.md

---------

Co-authored-by: Goon <duy@wearetopgroup.com>
2026-03-12 09:20:41 +07:00

136 lines
4.0 KiB
Python

"""Accept all tracked changes in a DOCX file using LibreOffice.
Requires LibreOffice (soffice) to be installed.
"""
import argparse
import logging
import shutil
import subprocess
from pathlib import Path
from office.soffice import get_soffice_env
logger = logging.getLogger(__name__)
LIBREOFFICE_PROFILE = "/tmp/libreoffice_docx_profile"
MACRO_DIR = f"{LIBREOFFICE_PROFILE}/user/basic/Standard"
ACCEPT_CHANGES_MACRO = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE script:module PUBLIC "-//OpenOffice.org//DTD OfficeDocument 1.0//EN" "module.dtd">
<script:module xmlns:script="http://openoffice.org/2000/script" script:name="Module1" script:language="StarBasic">
Sub AcceptAllTrackedChanges()
Dim document As Object
Dim dispatcher As Object
document = ThisComponent.CurrentController.Frame
dispatcher = createUnoService("com.sun.star.frame.DispatchHelper")
dispatcher.executeDispatch(document, ".uno:AcceptAllTrackedChanges", "", 0, Array())
ThisComponent.store()
ThisComponent.close(True)
End Sub
</script:module>"""
def accept_changes(
input_file: str,
output_file: str,
) -> tuple[None, str]:
input_path = Path(input_file)
output_path = Path(output_file)
if not input_path.exists():
return None, f"Error: Input file not found: {input_file}"
if not input_path.suffix.lower() == ".docx":
return None, f"Error: Input file is not a DOCX file: {input_file}"
try:
output_path.parent.mkdir(parents=True, exist_ok=True)
shutil.copy2(input_path, output_path)
except Exception as e:
return None, f"Error: Failed to copy input file to output location: {e}"
if not _setup_libreoffice_macro():
return None, "Error: Failed to setup LibreOffice macro"
cmd = [
"soffice",
"--headless",
f"-env:UserInstallation=file://{LIBREOFFICE_PROFILE}",
"--norestore",
"vnd.sun.star.script:Standard.Module1.AcceptAllTrackedChanges?language=Basic&location=application",
str(output_path.absolute()),
]
try:
result = subprocess.run(
cmd,
capture_output=True,
text=True,
timeout=30,
check=False,
env=get_soffice_env(),
)
except subprocess.TimeoutExpired:
return (
None,
f"Successfully accepted all tracked changes: {input_file} -> {output_file}",
)
if result.returncode != 0:
return None, f"Error: LibreOffice failed: {result.stderr}"
return (
None,
f"Successfully accepted all tracked changes: {input_file} -> {output_file}",
)
def _setup_libreoffice_macro() -> bool:
macro_dir = Path(MACRO_DIR)
macro_file = macro_dir / "Module1.xba"
if macro_file.exists() and "AcceptAllTrackedChanges" in macro_file.read_text():
return True
if not macro_dir.exists():
subprocess.run(
[
"soffice",
"--headless",
f"-env:UserInstallation=file://{LIBREOFFICE_PROFILE}",
"--terminate_after_init",
],
capture_output=True,
timeout=10,
check=False,
env=get_soffice_env(),
)
macro_dir.mkdir(parents=True, exist_ok=True)
try:
macro_file.write_text(ACCEPT_CHANGES_MACRO)
return True
except Exception as e:
logger.warning(f"Failed to setup LibreOffice macro: {e}")
return False
if __name__ == "__main__":
parser = argparse.ArgumentParser(
description="Accept all tracked changes in a DOCX file"
)
parser.add_argument("input_file", help="Input DOCX file with tracked changes")
parser.add_argument(
"output_file", help="Output DOCX file (clean, no tracked changes)"
)
args = parser.parse_args()
_, message = accept_changes(args.input_file, args.output_file)
print(message)
if "Error" in message:
raise SystemExit(1)