
refactor(security): make the DEVA11Y-484 extraction guard directly unit-testable #26

Open
maunilm wants to merge 5 commits into main from refactor/DEVA11Y-484-testable-extraction

maunilm (Collaborator) commented Jun 2, 2026

Stacked on #25. Targets the fix branch so this shows only the architecture delta; review/merge #25 first (or merge this in its place).

Why

#25 tested the guard via a hand-maintained mirror of the guard block plus a drift check, because SwiftPM command plugins can't be imported by a test target. That left the plugin's call-site wiring typecheck-only. This PR eliminates that gap by making the shipped code directly unit-testable.

How

SwiftPM command plugins can't link a library target (verified), so the logic moves into a library that the plugin invokes through a separate executable:

  • Sources/BrowserStackCLIKit — all download/extract/guard/run logic, Foundation-only. Diagnostics.remark → injected logger; forwardExit → thrown CLIExit (the library never calls exit()).
  • Sources/browserstack-accessibility-runner — thin executable that calls the library.
  • Plugins/BrowserStackAccessibilityLint — now a ~30-line shim: resolves the runner tool, forwards --working-directory + args, propagates the exit code.
  • Sources/cli-kit-tests — a plain executable test harness (no XCTest, so it runs under Command Line Tools and CI) that exercises the real library against live bsdtar and crafted bombs.

The previous mirror harness + check_drift.sh are deleted (no longer needed). The shell-wrapper integration tests are kept.
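
For reviewers unfamiliar with the pattern, here is a minimal sketch of "thrown `CLIExit` plus injected logger". Only the name `CLIExit` and the idea of an injected logger come from this PR; `Logger`, `runTool`, and `exitCode(for:log:)` are hypothetical names, and the real library's signatures may well differ.

```swift
import Foundation

// `CLIExit` is named in the PR; its shape here is an assumption.
struct CLIExit: Error, Equatable {
    let code: Int32
}

// Hypothetical logger type: the library reports through whatever the caller injects.
typealias Logger = (String) -> Void

// Library-side entry point (hypothetical). It throws instead of calling exit(),
// so a test can observe both the log output and the requested exit code.
func runTool(arguments: [String], log: Logger) throws {
    guard let first = arguments.first else {
        log("no arguments supplied")
        throw CLIExit(code: 2)
    }
    log("would run: \(first)")
}

// Executable-side wrapper (hypothetical): the only place that turns a thrown
// CLIExit into a process exit code.
func exitCode(for arguments: [String], log: Logger) -> Int32 {
    do {
        try runTool(arguments: arguments, log: log)
        return 0
    } catch let failure as CLIExit {
        return failure.code
    } catch {
        log("unexpected error: \(error)")
        return 1
    }
}
```

In a runner's `main.swift` the last step would be something like `exit(exitCode(for: Array(CommandLine.arguments.dropFirst()), log: { print($0) }))`, which keeps `exit()` out of the library entirely.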

Verification

  • swift run cli-kit-tests: 28/28 green. Covers extractLocalArchive legit/bomb, watchdog size + entry caps + bounded peak + SIGTERM mid-stream, locateExecutable entry cap, parseOverride/parseArguments/sanitizeArguments/extractVersion/footprint.
  • Shell suite still 36/36 green.
  • End-to-end from a consumer package: swift package scan built the runner under the SPM sandbox, downloaded the real CLI (1.34.5) over the network, extracted it through the guard, and ran it — proving the full plugin → runner → library chain works live, including sandbox network + cache-write.
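
As an illustration of what a plain executable harness without XCTest can look like, the sketch below counts checks and derives a process exit status. `Harness`, `check`, and `exitStatus` are hypothetical names, not the contents of Sources/cli-kit-tests.

```swift
import Foundation

// Minimal check-counting harness; every name here is illustrative.
struct Harness {
    private(set) var passed = 0
    private(set) var failed = 0

    // Evaluates the condition lazily; a thrown error counts as a failure.
    mutating func check(_ name: String, _ condition: @autoclosure () throws -> Bool) {
        let ok = (try? condition()) ?? false
        if ok { passed += 1 } else { failed += 1 }
        print("\(ok ? "PASS" : "FAIL") \(name)")
    }

    // Status a main.swift would hand to exit(): 0 only when nothing failed.
    var exitStatus: Int32 { failed == 0 ? 0 : 1 }
}
```

Because the harness is an ordinary executable, CI only needs its exit status, so no test framework has to be present on the machine.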

Tradeoff (please weigh)

The plugin now triggers a one-time build of the runner executable on first invocation (cached afterward; pure-Foundation, no deps). The previous in-process plugin had no such cost. This is the price of making the shipped code importable and directly testable. If the team prefers to avoid any per-user build cost, #25's mirror-based approach remains the alternative.

Jira

DEVA11Y-484

🤖 Generated with Claude Code

maunilm and others added 4 commits May 29, 2026 17:55
…mb DoS [DEVA11Y-484]

CWE-400 / OWASP A05. bsdtar was invoked with no decompressed-size or
entry-count limit in both the Swift SPM plugin and the bash/zsh/fish CLI
wrappers, so an attacker who can influence the download URL (the
HTTPS-only --download-url / BROWSERSTACK_A11Y_CLI_DOWNLOAD_URL override,
or TLS interception) could serve a decompression bomb that exhausts the
developer/CI disk.

Swift plugin (BrowserStackAccessibilityLint.swift):
- curl now passes --max-filesize (100 MB) to cap the compressed download.
- A background watchdog terminates bsdtar once the *decompressed* footprint
  on disk exceeds 200 MB (a pipe-level cap would only bound compressed
  bytes, which is useless against a bomb). Applied to both the remote and
  local extraction paths.
- locateExecutable now bounds enumeration at 10,000 entries.
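
A rough sketch of measuring a decompressed footprint with a bounded directory walk, assuming Foundation's `FileManager` enumerator. `Footprint` and `footprint(of:maxEntries:)` are hypothetical names, and the sketch sums logical file sizes, which may differ from allocated disk blocks; the plugin's actual measurement is not shown on this page.

```swift
import Foundation

// Hypothetical result type: total bytes and number of entries seen so far.
struct Footprint: Equatable {
    var bytes: Int = 0
    var entries: Int = 0
}

// Walks `directory` recursively, summing the logical sizes of regular files.
// Stops as soon as `maxEntries` entries have been visited, so a directory
// holding millions of tiny files cannot make the walk itself unbounded.
func footprint(of directory: URL, maxEntries: Int) -> Footprint {
    var result = Footprint()
    let keys: [URLResourceKey] = [.fileSizeKey, .isRegularFileKey]
    guard let walker = FileManager.default.enumerator(
        at: directory,
        includingPropertiesForKeys: keys
    ) else { return result }

    for case let url as URL in walker {
        result.entries += 1
        if let values = try? url.resourceValues(forKeys: Set(keys)),
           values.isRegularFile == true {
            result.bytes += values.fileSize ?? 0
        }
        if result.entries >= maxEntries { break }
    }
    return result
}
```

A caller would compare `bytes` against the 200 MB ceiling and `entries` against the 10,000-entry bound described above.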

Shell wrappers (bash/zsh/fish cli.sh):
- curl --max-filesize caps the compressed download.
- bsdtar output is piped through `head -c` (200 MB) with pipefail so an
  oversized archive aborts instead of filling the disk.

Real CLI artifact is ~34 MB compressed / ~64 MB decompressed, so the caps
leave ~3x headroom and do not affect legitimate downloads.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…on guard [DEVA11Y-484]

Adds local integration tests (no mocks) that exercise the decompression-bomb
guards against real curl/bsdtar/head and the real Swift watchdog, plus hardens
the guard itself based on what the tests surfaced.

Guard hardening (Plugins/BrowserStackAccessibilityLint.swift):
- The watchdog now also terminates bsdtar on an entry-count ceiling, closing the
  "millions of tiny files" bomb that stays small on disk (previously only
  locateExecutable caught it, after the fact).
- Added a post-extraction footprint check so detection is deterministic on fast
  disks: a bomb that finishes decompressing within a single 200ms poll interval
  is now caught and cleaned up rather than slipping past the live watchdog.
- Refactored the guard into a self-contained, marked block of free functions so
  it can be mirrored and drift-checked.
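
The combination of a live watchdog and a post-extraction check can be sketched as below. This is a simplified illustration, not the plugin's code: it polls inline rather than on a background thread, `GuardOutcome`, `runGuarded`, and `limitExceeded` are hypothetical names, and the limit test is passed in as a closure so the sketch stays independent of how the footprint is measured. It also assumes the child exits on SIGTERM.

```swift
import Foundation

// Hypothetical outcome type for a guarded child process.
enum GuardOutcome: Equatable {
    case completed(status: Int32)
    case terminatedByWatchdog
    case rejectedAfterExit
}

func runGuarded(
    executable: URL,
    arguments: [String],
    pollInterval: TimeInterval,
    limitExceeded: () -> Bool
) throws -> GuardOutcome {
    let process = Process()
    process.executableURL = executable
    process.arguments = arguments
    try process.run()

    // Live watchdog: poll while the child runs and stop it once a limit trips.
    while process.isRunning {
        if limitExceeded() {
            process.terminate()        // Process.terminate() sends SIGTERM
            process.waitUntilExit()
            return .terminatedByWatchdog
        }
        Thread.sleep(forTimeInterval: pollInterval)
    }
    process.waitUntilExit()

    // Post-exit check: catches a child that finished between two polls.
    if limitExceeded() { return .rejectedAfterExit }
    return .completed(status: process.terminationStatus)
}
```

The post-exit branch is what makes detection deterministic: even if the child finishes inside one poll interval, the limit is still evaluated once after it exits.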

Tests (scripts/test/, run via run_tests.sh):
- Shell: extracts the REAL download_binary from bash/zsh/fish verbatim and runs it
  against a local server (only the hardcoded URL is redirected, via a curl shim).
- Swift: a mirror harness compiles the guard block verbatim and drives real
  curl/bsdtar; check_drift.sh fails CI if the mirror diverges from the plugin
  (SwiftPM command plugins can't be imported by a test target).
- Scenarios: legit (downloads/extracts/runs), 400MB bomb, 20k-entry bomb,
  oversized (>100MB) download, corrupt archive, multi-file, missing URL.
- Fixtures are bounded (≤400MB, gitignored) and bomb tests use a small cap, so a
  regressed guard can never exhaust the disk. Full run ~9s, disk usage flat.
- CI: .github/workflows/extraction-guard-tests.yml runs the suite on macOS for PRs
  touching the download/extract path.

53/53 assertions green locally; real production artifact (34MB/64MB) verified to
pass through the new extraction path and run.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… live termination [DEVA11Y-484]

Addresses gaps found by stress-testing the guard rather than just asserting the
happy path:

- Measured overshoot: at a 200ms poll, bsdtar could write ~270-380MB past the cap
  on a fast disk before the watchdog tripped (the cap was far softer than the
  "200 MB" message implied). Tightened the poll to 50ms — a 10MB cap now peaks at
  ~34MB and a 2GB bomb is killed at ~224MB. Documented the cap as an explicit SOFT
  ceiling whose purpose is preventing disk *exhaustion*, not exact byte enforcement.
- Windows Expand-Archive path was completely unguarded. Added a platform-agnostic
  post-extraction footprint backstop in the common path (typecheckable on macOS)
  so Windows rejects + cleans up a bomb before the binary is used.
- Strengthened tests to assert the LIVE watchdog fires (bsdtar SIGTERM, status 15)
  and that peak disk stays bounded below the bomb size — previously the bomb tests
  would have passed even if only the post-extraction check worked (which would let
  a multi-GB bomb fill the disk).
- Added test_large_bomb.sh (opt-in via DEVA11Y_DEEP=1): proves a 2GB bomb is
  bounded to ~224MB. Kept out of the default CI run to keep it fast/bounded.
- README now documents the real limitations: soft cap + overshoot, Windows is
  post-hoc only, the Swift suite tests a mirror (not the compiled plugin) with the
  call sites typecheck-only, and locateExecutable's cap is defense-in-depth.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…[DEVA11Y-484]

Eliminates the mirror/drift-check gap from the previous test approach. SwiftPM
command plugins can't link a library target, so the download/extract/guard logic
is moved into a real library (BrowserStackCLIKit) that the shipped plugin runs via
a thin `browserstack-accessibility-runner` executable. The library is exercised by
real unit tests instead of a hand-maintained copy.

- Sources/BrowserStackCLIKit: all logic (Foundation-only). Diagnostics.remark →
  injected logger; forwardExit → thrown CLIExit (library never exits the process).
- Sources/browserstack-accessibility-runner: thin executable the plugin invokes.
- Plugins/BrowserStackAccessibilityLint: now a ~30-line shim that resolves the
  runner tool and forwards args + working directory, propagating the exit code.
- Sources/cli-kit-tests: plain executable test harness (no XCTest, so it runs under
  Command Line Tools and CI) hitting the real library against live bsdtar + crafted
  bombs — 28 checks: extractLocalArchive legit/bomb, watchdog size + entry + bounded
  peak + SIGTERM, locateExecutable entry cap, parseOverride/parseArguments/sanitize.
- Retires scripts/test/swift-harness + check_drift.sh + test_swift_extraction.sh
  (the mirror is no longer needed). Keeps the shell-wrapper integration tests.
- Verified end-to-end from a consumer package: `swift package scan` builds the
  runner under the SPM sandbox, downloads the real CLI over the network, extracts it
  through the guard, and runs it.

Tradeoff: the plugin now triggers a one-time build of the runner executable on first
use (cached after) — the price of making the shipped code importable/testable.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@maunilm maunilm requested a review from a team as a code owner June 2, 2026 14:35
@maunilm maunilm changed the base branch from fix/DEVA11Y-484-bsdtar-size-limit to main June 2, 2026 14:36
maunilm (Collaborator, Author) commented Jun 2, 2026

Retargeted to main so CI runs and this stands as a complete, self-contained PR.

Two alternatives for DEVA11Y-484 — pick one:

Net diff vs main is clean (the mirror added in #25 is removed here, so it nets out).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>