refactor(security): make the DEVA11Y-484 extraction guard directly unit-testable by maunilm · Pull Request #26 · browserstack/AccessibilityDevTools

maunilm · 2026-06-02T14:35:52Z

Stacked on #25. Targets the fix branch so this shows only the architecture delta; review/merge #25 first (or merge this in its place).

Why

#25 tested the guard via a hand-maintained mirror of the guard block plus a drift check, because SwiftPM command plugins can't be imported by a test target. That left the plugin's call-site wiring typecheck-only. This PR eliminates that gap by making the shipped code directly unit-testable.

How

SwiftPM command plugins can't link a library target (verified), so:

Sources/BrowserStackCLIKit — all download/extract/guard/run logic, Foundation-only. Diagnostics.remark → injected logger; forwardExit → thrown CLIExit (the library never calls exit()).
Sources/browserstack-accessibility-runner — thin executable that calls the library.
Plugins/BrowserStackAccessibilityLint — now a ~30-line shim: resolves the runner tool, forwards --working-directory + args, propagates the exit code.
Sources/cli-kit-tests — a plain executable test harness (no XCTest, so it runs under Command Line Tools and CI) that exercises the real library against live bsdtar and crafted bombs.
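
The "injected logger" and "thrown CLIExit" points can be sketched as follows. This is a minimal illustration, not the PR's code: only the name `CLIExit` appears in the description, while its `code` field, the `Logger` alias, `runTool`, and `exitCode(for:)` are hypothetical.

```swift
import Foundation

// Hypothetical shape: the PR names CLIExit but does not show its definition.
struct CLIExit: Error, Equatable {
    let code: Int32
}

// Assumed logger type, injected by the caller so the library has no
// dependency on the plugin-only Diagnostics API.
typealias Logger = (String) -> Void

// Library-side function (hypothetical): reports through the injected logger
// and throws instead of calling exit().
func runTool(status: Int32, log: Logger) throws {
    log("tool finished with status \(status)")
    if status != 0 {
        throw CLIExit(code: status)
    }
}

// Runner-side helper (hypothetical): the one place that maps a thrown
// CLIExit onto a process exit code.
func exitCode(for body: () throws -> Void) -> Int32 {
    do {
        try body()
        return 0
    } catch let cliExit as CLIExit {
        return cliExit.code
    } catch {
        return 1
    }
}
```

Because the library only throws, a test can assert on the resulting code without the test process itself being terminated; the runner executable would pass the value to `exit()` at its top level.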

The previous mirror harness + check_drift.sh are deleted (no longer needed). The shell-wrapper integration tests are kept.

Verification

swift run cli-kit-tests → 28/28 green: extractLocalArchive legit/bomb, watchdog size + entry caps + bounded peak + SIGTERM mid-stream, locateExecutable entry cap, parseOverride/parseArguments/sanitizeArguments/extractVersion/footprint.
Shell suite still 36/36 green.
End-to-end from a consumer package: swift package scan built the runner under the SPM sandbox, downloaded the real CLI (1.34.5) over the network, extracted it through the guard, and ran it — proving the full plugin → runner → library chain works live, including sandbox network + cache-write.
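
For readers unfamiliar with an XCTest-free harness, a check counter of roughly this shape is enough to produce an "N/N green" summary from a plain executable. The `Harness` type and its members are assumptions for illustration; the PR does not show how `cli-kit-tests` is written.

```swift
import Foundation

// Hypothetical minimal harness: counts named checks and prints one line each.
struct Harness {
    private(set) var passed = 0
    private(set) var failed = 0

    // Records one named check; the condition is evaluated exactly once.
    mutating func check(_ name: String, _ condition: @autoclosure () -> Bool) {
        let ok = condition()
        if ok {
            passed += 1
        } else {
            failed += 1
        }
        print("\(ok ? "PASS" : "FAIL") \(name)")
    }

    // Summary in the "passed/total green" style used in the PR description.
    var summary: String {
        "\(passed)/\(passed + failed) green"
    }
}
```

A harness like this would typically end by exiting non-zero when `failed` is greater than zero, so that CI treats any failed check as a failed run.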

Tradeoff (please weigh)

The plugin now triggers a one-time build of the runner executable on first invocation (cached afterward; pure-Foundation, no deps). The previous in-process plugin had no such cost. This is the price of making the shipped code importable and directly testable. If the team prefers to avoid any per-user build cost, #25's mirror-based approach remains the alternative.

Jira

DEVA11Y-484

🤖 Generated with Claude Code

…mb DoS [DEVA11Y-484]

CWE-400 / OWASP A05. bsdtar was invoked with no decompressed-size or entry-count limit in both the Swift SPM plugin and the bash/zsh/fish CLI wrappers, so an attacker who can influence the download URL (the HTTPS-only --download-url / BROWSERSTACK_A11Y_CLI_DOWNLOAD_URL override, or TLS interception) could serve a decompression bomb that exhausts the developer/CI disk.

Swift plugin (BrowserStackAccessibilityLint.swift):
- curl now passes --max-filesize (100 MB) to cap the compressed download.
- A background watchdog terminates bsdtar once the *decompressed* footprint on disk exceeds 200 MB (a pipe-level cap would only bound compressed bytes, which is useless against a bomb). Applied to both the remote and local extraction paths.
- locateExecutable now bounds enumeration at 10,000 entries.

Shell wrappers (bash/zsh/fish cli.sh):
- curl --max-filesize caps the compressed download.
- bsdtar output is piped through `head -c` (200 MB) with pipefail so an oversized archive aborts instead of filling the disk.

Real CLI artifact is ~34 MB compressed / ~64 MB decompressed, so the caps leave ~3x headroom and do not affect legitimate downloads.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
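
To make "decompressed footprint on disk" concrete, one way to measure it is to walk the extraction directory and sum file sizes while counting entries. The sketch below is an assumption about the approach, not the plugin's code: `Footprint` and `footprint(of:maxEntries:)` are hypothetical names, and it sums logical file sizes, which may differ from the blocks actually allocated on disk.

```swift
import Foundation

// Hypothetical result type: total bytes and number of entries seen.
struct Footprint: Equatable {
    var bytes: Int64 = 0
    var entries: Int = 0
}

// Walks `directory` recursively and sums the logical size of regular files.
// Enumeration stops once `maxEntries` entries have been seen, so a
// "millions of tiny files" archive cannot make the measurement itself unbounded.
func footprint(of directory: URL, maxEntries: Int) -> Footprint {
    var result = Footprint()
    let keys: [URLResourceKey] = [.isRegularFileKey, .fileSizeKey]
    guard let walker = FileManager.default.enumerator(
        at: directory,
        includingPropertiesForKeys: keys
    ) else {
        return result
    }
    for case let url as URL in walker {
        result.entries += 1
        if let values = try? url.resourceValues(forKeys: Set(keys)),
           values.isRegularFile == true {
            result.bytes += Int64(values.fileSize ?? 0)
        }
        if result.entries >= maxEntries {
            break
        }
    }
    return result
}
```

A watchdog would call a function like this on each poll and compare `bytes` against the 200 MB cap described above.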

…on guard [DEVA11Y-484]

Adds local integration tests (no mocks) that exercise the decompression-bomb guards against real curl/bsdtar/head and the real Swift watchdog, plus hardens the guard itself based on what the tests surfaced.

Guard hardening (Plugins/BrowserStackAccessibilityLint.swift):
- The watchdog now also terminates bsdtar on an entry-count ceiling, closing the "millions of tiny files" bomb that stays small on disk (previously only locateExecutable caught it, after the fact).
- Added a post-extraction footprint check so detection is deterministic on fast disks: a bomb that finishes decompressing within a single 200ms poll interval is now caught and cleaned up rather than slipping past the live watchdog.
- Refactored the guard into a self-contained, marked block of free functions so it can be mirrored and drift-checked.

Tests (scripts/test/, run via run_tests.sh):
- Shell: extracts the REAL download_binary from bash/zsh/fish verbatim and runs it against a local server (only the hardcoded URL is redirected, via a curl shim).
- Swift: a mirror harness compiles the guard block verbatim and drives real curl/bsdtar; check_drift.sh fails CI if the mirror diverges from the plugin (SwiftPM command plugins can't be imported by a test target).
- Scenarios: legit (downloads/extracts/runs), 400MB bomb, 20k-entry bomb, oversized (>100MB) download, corrupt archive, multi-file, missing URL.
- Fixtures are bounded (≤400MB, gitignored) and bomb tests use a small cap, so a regressed guard can never exhaust the disk. Full run ~9s, disk usage flat.
- CI: .github/workflows/extraction-guard-tests.yml runs the suite on macOS for PRs touching the download/extract path.

53/53 assertions green locally; real production artifact (34MB/64MB) verified to pass through the new extraction path and run.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
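
The post-extraction check described in this commit can be sketched as a function that runs after the extractor has exited, rejects an oversized result, and cleans up. All names here (`ExtractionRejected`, `verifyExtraction`, and its parameters) are hypothetical, and the measured byte and entry counts are passed in rather than computed, to keep the sketch short.

```swift
import Foundation

// Hypothetical error thrown when an extracted tree exceeds a cap.
struct ExtractionRejected: Error, Equatable {
    let reason: String
}

// Post-extraction backstop: called once extraction has finished, so it still
// catches a bomb that completed between two watchdog polls.
// On rejection the extracted directory is removed before throwing.
func verifyExtraction(
    at directory: URL,
    bytes: Int64,
    entries: Int,
    maxBytes: Int64,
    maxEntries: Int
) throws {
    let reason: String?
    if bytes > maxBytes {
        reason = "decompressed size \(bytes) exceeds cap \(maxBytes)"
    } else if entries > maxEntries {
        reason = "entry count \(entries) exceeds cap \(maxEntries)"
    } else {
        reason = nil
    }
    guard let why = reason else {
        return
    }
    // Best-effort cleanup so the rejected payload does not stay on disk.
    try? FileManager.default.removeItem(at: directory)
    throw ExtractionRejected(reason: why)
}
```

Keeping this check separate from the live watchdog is what makes detection deterministic: it does not depend on when a poll happens to land.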

… live termination [DEVA11Y-484]

Addresses gaps found by stress-testing the guard rather than just asserting the happy path:
- Measured overshoot: at a 200ms poll, bsdtar could write ~270-380MB past the cap on a fast disk before the watchdog tripped (the cap was far softer than the "200 MB" message implied). Tightened the poll to 50ms: a 10MB cap now peaks at ~34MB and a 2GB bomb is killed at ~224MB. Documented the cap as an explicit SOFT ceiling whose purpose is preventing disk *exhaustion*, not exact byte enforcement.
- Windows Expand-Archive path was completely unguarded. Added a platform-agnostic post-extraction footprint backstop in the common path (typecheckable on macOS) so Windows rejects + cleans up a bomb before the binary is used.
- Strengthened tests to assert the LIVE watchdog fires (bsdtar SIGTERM, status 15) and that peak disk stays bounded below the bomb size. Previously the bomb tests would have passed even if only the post-extraction check worked (which would let a multi-GB bomb fill the disk).
- Added test_large_bomb.sh (opt-in via DEVA11Y_DEEP=1): proves a 2GB bomb is bounded to ~224MB. Kept out of the default CI run to keep it fast/bounded.
- README now documents the real limitations: soft cap + overshoot, Windows is post-hoc only, the Swift suite tests a mirror (not the compiled plugin) with the call sites typecheck-only, and locateExecutable's cap is defense-in-depth.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
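
The poll-and-terminate behaviour behind these numbers can be illustrated with a small loop around `Process`. This is a simplified sketch under stated assumptions: the commit describes a background watchdog, whereas this version polls on the calling thread, and `WatchdogOutcome`, `runWithWatchdog`, and the `limitExceeded` closure are hypothetical names.

```swift
import Foundation

// Hypothetical outcome type for the sketch.
enum WatchdogOutcome: Equatable {
    case finished(Int32)   // child exited on its own with this status
    case terminated        // watchdog sent SIGTERM because a limit was exceeded
}

// Launches `process` and checks `limitExceeded` every `pollInterval` seconds.
// The cap is soft by construction: the child keeps writing between polls,
// which is why a shorter interval reduces the overshoot.
func runWithWatchdog(
    _ process: Process,
    pollInterval: TimeInterval,
    limitExceeded: () -> Bool
) throws -> WatchdogOutcome {
    try process.run()
    while process.isRunning {
        if limitExceeded() {
            process.terminate()        // Process.terminate() sends SIGTERM
            process.waitUntilExit()
            return .terminated
        }
        Thread.sleep(forTimeInterval: pollInterval)
    }
    process.waitUntilExit()
    return .finished(process.terminationStatus)
}
```

In the guard described above, `limitExceeded` would compare the current extraction footprint (bytes and entry count) against the caps, with a poll interval of 0.05 seconds.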

…[DEVA11Y-484]

Eliminates the mirror/drift-check gap from the previous test approach. SwiftPM command plugins can't link a library target, so the download/extract/guard logic is moved into a real library (BrowserStackCLIKit) that the shipped plugin runs via a thin `browserstack-accessibility-runner` executable. The library is exercised by real unit tests instead of a hand-maintained copy.

- Sources/BrowserStackCLIKit: all logic (Foundation-only). Diagnostics.remark → injected logger; forwardExit → thrown CLIExit (library never exits the process).
- Sources/browserstack-accessibility-runner: thin executable the plugin invokes.
- Plugins/BrowserStackAccessibilityLint: now a ~30-line shim that resolves the runner tool and forwards args + working directory, propagating the exit code.
- Sources/cli-kit-tests: plain executable test harness (no XCTest, so it runs under Command Line Tools and CI) hitting the real library against live bsdtar + crafted bombs. 28 checks: extractLocalArchive legit/bomb, watchdog size + entry + bounded peak + SIGTERM, locateExecutable entry cap, parseOverride/parseArguments/sanitize.
- Retires scripts/test/swift-harness + check_drift.sh + test_swift_extraction.sh (the mirror is no longer needed). Keeps the shell-wrapper integration tests.
- Verified end-to-end from a consumer package: `swift package scan` builds the runner under the SPM sandbox, downloads the real CLI over the network, extracts it through the guard, and runs it.

Tradeoff: the plugin now triggers a one-time build of the runner executable on first use (cached after), the price of making the shipped code importable/testable.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
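
The target layout listed in this commit could be declared in a manifest along these lines. This is a hypothetical sketch, not the repository's Package.swift: the target names and the `scan` verb come from the PR text, while the package name, tools version, description string, and the absence of any plugin permissions are assumptions.

```swift
// swift-tools-version:5.9
// Hypothetical manifest sketch; only the target names and the "scan" verb
// are taken from the PR text.
import PackageDescription

let package = Package(
    name: "AccessibilityDevTools",
    products: [
        .plugin(
            name: "BrowserStackAccessibilityLint",
            targets: ["BrowserStackAccessibilityLint"]
        ),
    ],
    targets: [
        // Foundation-only library holding the download/extract/guard/run logic.
        .target(name: "BrowserStackCLIKit"),
        // Thin executable that calls the library.
        .executableTarget(
            name: "browserstack-accessibility-runner",
            dependencies: ["BrowserStackCLIKit"]
        ),
        // The plugin depends on the runner executable as a tool,
        // not on the library itself.
        .plugin(
            name: "BrowserStackAccessibilityLint",
            capability: .command(
                intent: .custom(
                    verb: "scan",
                    description: "Run the BrowserStack accessibility CLI"
                )
            ),
            dependencies: ["browserstack-accessibility-runner"]
        ),
        // Plain executable harness that links the real library.
        .executableTarget(
            name: "cli-kit-tests",
            dependencies: ["BrowserStackCLIKit"]
        ),
    ]
)
```

Declaring the runner as a plugin dependency is what causes SwiftPM to build it before the plugin runs, which corresponds to the one-time build cost noted in the tradeoff.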

maunilm · 2026-06-02T14:36:40Z

Retargeted to main so CI runs and this stands as a complete, self-contained PR.

Two alternatives for DEVA11Y-484 — pick one:

fix(security): cap bsdtar extraction size to prevent decompression bomb DoS [DEVA11Y-484] #25 — minimal security fix + mirror-based tests. Lower risk, no per-user build cost, but the plugin's guard call-sites are typecheck-only (tested via a drift-checked mirror).
refactor(security): make the DEVA11Y-484 extraction guard directly unit-testable #26 (this) — same security fix, but the logic moves into a library the shipped plugin runs via a thin runner executable, so it's covered by real unit tests (no mirror). Cost: a one-time runner build on first plugin use.

Net diff vs main is clean (the mirror added in #25 is removed here, so it nets out).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

maunilm and others added 4 commits May 29, 2026 17:55

maunilm requested a review from a team as a code owner June 2, 2026 14:35

maunilm changed the base branch from fix/DEVA11Y-484-bsdtar-size-limit to main June 2, 2026 14:36

ci: trigger extraction-guard workflow on main-targeted PR [DEVA11Y-484]

b1ba3b0

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>


maunilm wants to merge 5 commits into main from refactor/DEVA11Y-484-testable-extraction
