Run StreamableHTTP transport tests in process instead of over sockets #2767

Draft
maxisbey wants to merge 1 commit into main from maxisbey/deflake-streamable-http-tests

Contributor

@maxisbey maxisbey commented Jun 2, 2026

Final installment of the in-process test migration (#2764, #2765): tests/shared/test_streamable_http.py was the last file spawning uvicorn subprocesses on bind-then-close ports, the pattern that races under pytest-xdist when two workers pick the same ephemeral port. Two of its tests have flaked exactly this way under parallel load (test_server_validates_protocol_version_header, test_streamable_http_client_mcp_headers_override_defaults).
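The bind-then-close pattern described above can be sketched with the standard library alone. This is an illustration of the race mechanism, not the repository's helper code; `pick_free_port` and `port_is_claimable` are hypothetical names.

```python
import socket


def pick_free_port() -> int:
    """Bind-then-close: ask the OS for an ephemeral port, then release it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 lets the OS choose
        # The socket closes when the with-block exits, so nothing
        # reserves the port after this function returns.
        return s.getsockname()[1]


def port_is_claimable(port: int) -> bool:
    """Return True if some other socket can bind the port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind(("127.0.0.1", port))
        except OSError:
            return False
        return True


port = pick_free_port()
# Between this point and the server's own bind(), any other process
# (for example a second pytest-xdist worker) may take the same port.
claimable = port_is_claimable(port)
```

Because the port number is only a hint once the probing socket is closed, the window between choosing it and the subprocess binding it is where two workers can collide.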

Closes #2704.

Motivation and Context

Same mechanism and fix as the previous two PRs. All four subprocess servers (basic, JSON-response, event-store, context) become in-process apps driven through StreamingASGITransport; with this, no test in the repo binds a port outside the one pre-bound websocket smoke test.
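To show what "in-process" means here, the sketch below drives a tiny ASGI app by calling it directly, with no port and no subprocess. It is a stdlib-only illustration under assumptions: `app` and `call_in_process` are hypothetical names, the response is buffered rather than streamed, and it says nothing about how StreamingASGITransport itself is implemented.

```python
import asyncio
from typing import Any, Awaitable, Callable

Message = dict[str, Any]


async def app(
    scope: Message,
    receive: Callable[[], Awaitable[Message]],
    send: Callable[[Message], Awaitable[None]],
) -> None:
    """Hypothetical stand-in for a server app under test."""
    assert scope["type"] == "http"
    await receive()  # consume the (empty) request body
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"ok"})


async def call_in_process(asgi_app: Any, method: str, path: str, body: bytes = b"") -> tuple[int, bytes]:
    """Call the ASGI app directly: no socket, no readiness polling."""
    scope: Message = {
        "type": "http",
        "asgi": {"version": "3.0"},
        "http_version": "1.1",
        "method": method,
        "path": path,
        "raw_path": path.encode(),
        "query_string": b"",
        "headers": [],
    }
    sent: list[Message] = []

    async def receive() -> Message:
        return {"type": "http.request", "body": body, "more_body": False}

    async def send(message: Message) -> None:
        sent.append(message)

    await asgi_app(scope, receive, send)
    status = next(m["status"] for m in sent if m["type"] == "http.response.start")
    payload = b"".join(m.get("body", b"") for m in sent if m["type"] == "http.response.body")
    return status, payload


status, payload = asyncio.run(call_in_process(app, "GET", "/mcp"))
```

Since the handler runs in the test process, it is also visible to the coverage tracer, which is why the pragma removals listed below become possible.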

What changed beyond the plumbing swap:

  • The 19 raw-HTTP validation tests were synchronous requests-library tests; they're now anyio + httpx over the bridge with the same methods, headers, bodies, and assertions (including suppressing the library-default Accept header where the old tests did).
  • wait_for_server is deleted from tests/test_helpers.py — zero users remain. run_uvicorn_in_thread stays for tests/shared/test_ws.py.
  • Three # pragma: no cover in src/mcp/server/streamable_http.py are removed (close_standalone_sse_stream, its session-message callback, the JSON-mode Accept rejection): those lines were only ever executed inside the untraced subprocess; the migrated tests now cover them, and keeping the pragmas would fail strict-no-cover.
  • One stale src docstring note is dropped: close_standalone_sse_stream claimed client reconnection for standalone GET streams "is NOT implemented — this is a known gap", but the note was self-contradictory from the day it landed — the same commit (Add SSE polling support (SEP-1699) #1654) implemented the auto-reconnect and the test that proves it.
  • The long_running_with_checkpoints tool and the slow:// resource branch had no callers anywhere; they're deleted, which moves the tools/list count assertions from 10 to 9 in five tests. Unreachable handler branches elsewhere become dispatch asserts, since they now fail branch coverage instead of hiding in an untraced subprocess.
  • test_streamable_http_client_resumption's flag-polling loop (while not flag: sleep(0.1), unbounded) is now two anyio.Events awaited under fail_after(5).
  • test_get_sse_stream's 409 assertion is deterministic by construction (the first GET is held open across the second), where the old version's comment admitted it could race.
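The move from an unbounded flag-polling loop to events awaited under a timeout can be sketched as follows. This uses `asyncio.Event` and `asyncio.wait_for` as a stdlib analogue; the PR itself names `anyio.Event` and `fail_after(5)`, and the event names and worker below are hypothetical.

```python
import asyncio


async def main() -> list[str]:
    log: list[str] = []
    first_notification = asyncio.Event()  # hypothetical: set once the server has emitted
    may_finish = asyncio.Event()          # hypothetical: lets the worker run to completion

    async def worker() -> None:
        log.append("notified")
        first_notification.set()
        await may_finish.wait()
        log.append("finished")

    task = asyncio.create_task(worker())
    # Bounded wait: a timeout error is raised after 5 s, instead of
    # `while not flag: sleep(0.1)` spinning forever on a hung test.
    await asyncio.wait_for(first_notification.wait(), timeout=5)
    log.append("observed")
    may_finish.set()
    await asyncio.wait_for(task, timeout=5)
    return log


log = asyncio.run(main())
```

Waiting on an event wakes the test as soon as the condition holds, so the common case is faster than polling, and the failure case surfaces as a timeout rather than a stalled run.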

Deliberately retained: the genuinely time-based tests (SSE polling driven by retry_interval=500, the elapsed >= 0.4 retry-interval assertion, tool sleeps that create the disconnect windows) keep their real-clock behaviour and byte-identical assertions — in-process execution strictly tightens their margins versus the subprocess version. A follow-up may convert the reconnection choreography to event-store-sequenced waits (retry_interval=0 + the interaction suite's wait_until_stored pattern) and shave most of the file's remaining ~6s runtime; that's a behaviour-preserving quality pass kept out of this PR so the harness swap stays auditable.

How Has This Been Tested?

  • ./scripts/test green: 1526 passed, 100% line+branch coverage including tests/, strict-no-cover clean
  • Ran this file 3× solo, and 3× together with the four previously-migrated files under pytest-xdist -n 4: stable. Also ran the nine resumption/reconnection tests 5× solo.
  • Targeted coverage runs confirm the three de-pragma'd src lines (and both arcs of the branchy one) are executed by this file alone

Breaking Changes

None — test-only, plus pragma/docstring removals in src.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

Several tests here overlap with tests/interaction/transports/ coverage (resumption-token replay, stream-close auto-reconnect, priming events, standalone GET delivery); deduplication is inventoried for a follow-up ticket, kept out of scope so this diff remains a pure harness swap.

AI Disclaimer

Final installment of the in-process test migration: this was the last
file spawning uvicorn subprocesses on bind-then-close ports with
readiness polling, which races under pytest-xdist when two workers pick
the same ephemeral port. Two tests in this file have flaked exactly
that way under parallel load.

All four subprocess servers (basic, JSON-response, event-store,
context-aware) become in-process apps served through the interaction
suite's StreamingASGITransport, held open by the session manager's
run() context. Raw `requests` calls become httpx calls against the
bridge client; the sync request-validation tests become anyio tests.
The second-GET-409 test now holds the first stream open by
construction, where the subprocess version noted it "might fail if the
first stream fully closed before this runs".

Assertions are unchanged, with documented exceptions now that the
server handlers run as traced in-process code:

- The long_running_with_checkpoints tool and the slow:// resource
  branch had no callers and are removed, so the expected tools/list
  count drops from 10 to 9 in five tests.
- Dead defensive arms become asserts (sampling non-text fallback,
  close_sse_stream truthiness checks, the context server's unknown-tool
  fallthrough and request checks), and the event store's
  replay-from-unknown-event arm becomes a lookup that requires a stored
  event, since unreachable branches now fail branch coverage instead of
  hiding in an untraced subprocess.
- test_client_crash_handled no longer sleeps between crashing clients;
  the bridge drains each client's teardown before the next connects.
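As an illustration of a defensive arm becoming a dispatch assert, here is a minimal sketch. The dispatcher and tool names are hypothetical and are not taken from the test file; the assumption is that the fallthrough arm is unreachable in the test suite, so it would otherwise be reported as an uncovered branch.

```python
def call_tool(name: str, arguments: dict[str, str]) -> str:
    """Hypothetical test-server tool dispatcher."""
    if name == "echo":
        return arguments["text"]
    # Previously: `elif name == "shout": ... else: return "unknown tool"`.
    # The else arm never ran, so it counted as a missed branch once the
    # handler was traced. An assert states the same expectation without
    # leaving an untaken branch behind.
    assert name == "shout", f"unexpected tool: {name}"
    return arguments["text"].upper()
```

If a test ever does call an unknown tool, the assert fails loudly at the dispatch point instead of returning a fallback value that a later assertion might not notice.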

Three pragmas in src/mcp/server/streamable_http.py covered only by the
formerly untraced subprocess (close_standalone_sse_stream, its session
message callback, and the JSON-mode Accept rejection) are now executed
by traced tests and removed.

With the last wait_for_server user migrated, the helper is deleted from
tests/test_helpers.py; run_uvicorn_in_thread stays for the websocket
smoke test.

Development

Successfully merging this pull request may close these issues.

Flaky streamable-HTTP/SSE tests: TOCTOU port race under pytest -n auto