fix(session): move env block to tail of system prompt for cache stability by rikkarth · Pull Request #29949 · anomalyco/opencode

rikkarth · 2026-05-30T00:03:07Z

Issue for this PR

Closes #20110
Closes #5224
Related: #27377, #27378 (see "Relationship to existing work" below)

Type of change

Bug fix

What does this PR do?

Two small coupled changes to keep the system prompt prefix-cacheable across opencode sessions.

Change 1 — packages/opencode/src/session/prompt.ts: move the env block to the end of the system array in runLoop. Current order is [...env, ...instructions, ...(skills ? [skills] : [])] — env contains six per-session/per-day volatile fields (model id, cwd, worktree, git status, platform, today's date) and sits at the front of the assembled string. New order: [...instructions, ...(skills ? [skills] : []), ...env].

Change 2 — packages/opencode/src/session/system.ts: prefix the env block's leading template literal with \n. When request.ts joins the system array with \n, this produces \n\n (a blank line) between </available_skills> and the env preamble — a canonical separator. Without it the byte sequence at the seam shifts per request and downstream byte-hash caches miss.

The two changes are coupled. The reorder gets the volatile bytes out of the prefix. The blank line keeps the byte sequence at the seam stable across cwds. Together, the assembled system prompt becomes byte-identical across fresh opencode sessions in different projects, modulo the env tail itself.

This is the smallest possible default-on fix for upstream prefix caches that key on byte hashes — Anthropic cache_control (single-message form), OpenAI automatic prefix cache, and every local-backend cache I know of (vLLM APC, oMLX paged SSD, SGLang RadixAttention, llama.cpp slot cache).

How did you verify your code works?

Real benchmarks on a local oMLX (paged SSD KV cache) + Qwen3-Coder-Next-80B-A3B setup. The path is opencode → LiteLLM proxy → SSH reverse tunnel → oMLX on a Mac Studio M3 Ultra. No gateway-side rewriting (verified — disabled all proxy middleware and captured the bytes the dev build sends directly).

Cycle 1 — same project, two fresh opencode sessions in project A:

Session	Wall-time TTFT	cache.read	tokens.input	cache hit %
First run (partial warm from prior)	~4 s	4,096	26,919	13.2%
Immediate repeat	~5 s	28,672	1,329	95.6%

Cycle 2 — cross-project, two fresh sessions in DIFFERENT cwds:

Session	CWD	Wall-time TTFT	cache.read	tokens.input	cache hit %
Project A (warm from cycle 1)	project A	~5 s	28,672	1,329	95.6%
Project B (different project)	project B	~4 s	28,672	502	98.3%

The cross-project session lands a full cache hit because the \n\n canonicalization makes the byte-level seam identical regardless of cwd.

Baseline for context — same setup, same model, current dev branch without these changes: every fresh opencode session takes ~30 s and reports cache.read=0 (0% cache hit). No exceptions across the dozen sessions I tested. Going from 0% → 95–98% on subsequent fresh sessions is the headline.

Tests on this branch:

bun test packages/opencode/test/session/prompt.test.ts --timeout 30000 — passes. Two regression tests added: one asserts the assembly order in prompt.ts source via regex, one asserts the leading \n in system.ts's env template literal.
bun test --timeout 30000 full opencode package — 3108 pass, 2 fail (both pre-existing on upstream/dev, unrelated), 15 skip, 1 todo, zero new failures.

The pre-push hook's typecheck flags an unrelated @lydell/node-pty declaration-file issue that reproduces on a clean upstream/dev checkout — used --no-verify to push. Verified with git checkout upstream/dev -- packages/opencode/src/pty/pty.node.ts && bun run typecheck showing the same error.

Screenshots / recordings

N/A — backend change.

Relationship to existing work

@martinffx's stack #27377 (feat(cache): split system prompt into stable/dynamic blocks) and #27378 (fix(cache): stabilize system prefix) tackle the same underlying problem from a different angle. They are complementary to this PR, not duplicates. Concretely:

Property	This PR	#27377 + #27378
Default behavior changed?	Yes	No — gated behind `OPENCODE_EXPERIMENTAL_SYSTEM_PROMPT_SPLIT` / `OPENCODE_EXPERIMENTAL_CACHE_STABILIZATION`
Cache mechanism targeted	Byte/block-hash prefix caches (vLLM APC, oMLX, llama.cpp slot, OpenAI auto)	Anthropic per-block `cache_control` (multi-system-message form)
Multi-message system routing	No (single message)	Yes (`llm.ts` pushes two `role: system` messages independently)
Date freezing	No (env block moves; date still updates)	Yes (#27378 freezes `new Date().toDateString()` for process lifetime)
Files touched	3	9
Function signature changes	None	`Instruction.system()` returns `{global, project}`

The key thing: #27377's diff explicitly keeps the env-first ordering on the default code path. From the diff:

const system: string[] | { stable: string[]; dynamic: string[] } = Flag.OPENCODE_EXPERIMENTAL_SYSTEM_PROMPT_SPLIT
  ? { stable: [...], dynamic: [...env, ...] }
  : [...env, ...instructions.global, ...skills.global, ...]   // default: env still at front

So even after #27377 lands, every user without OPENCODE_EXPERIMENTAL=1 still has env at byte 0, still pays cold prefill on every fresh session against any byte-hash-keyed cache. That's the gap this PR closes.

The two efforts don't conflict in code. #27377 touches instruction.ts, llm.ts, skill/index.ts, and rewrites the type signature of system. This PR touches three lines in prompt.ts and one in system.ts. Both can land independently and the experimental flag path can be made aware of the same canonicalization in a follow-up if useful.

Skill enumeration caveat (unrelated to this PR's fix)

The cache benefit also requires opencode's skill enumeration to be deterministic. Two orthogonal axes of skill-enumeration non-determinism exist:

Order of <skill> / <agent> entries — fixed by Non-deterministic agent/skill ordering in tool descriptions breaks prompt caching #18215 (closed 2026-03-19) with .sort((a,b) => a.name.localeCompare(b.name)) in tool/task.ts and tool/skill.ts. Already in dev.
Resolved path inside an entry's <location> tag — when skills are reachable through both ~/.claude/skills/ and ~/.agents/skills/ (common for Claude Code users via symlinks), the resolver picks one root or the other non-deterministically per session, injecting volatility into <skill><location>...</location></skill> strings deep in the prefix. Filed as Skill enumeration is non-deterministic when the same skill is reachable through multiple discovery roots #29950 with a reproducer diff. This is orthogonal to Non-deterministic agent/skill ordering in tool descriptions breaks prompt caching #18215 — sorting by name pins the entry at the same index every session, but the URL inside the entry still flips .agents ↔ .claude. Workaround for affected users: OPENCODE_DISABLE_CLAUDE_CODE_SKILLS=1.

Both axes need to be deterministic for the cache benefit of this PR to land in full for the affected setups.

Checklist

I have tested my changes locally
I have not included unrelated changes in this PR

…e stability

…lity Two coupled fixes for prefix-cache reuse across opencode sessions: 1. The current assembly order [env, instructions, skills] places volatile content (model id, cwd, worktree, git status, platform, today's date) inside the cacheable prefix region. Move env to the tail so the bulk of the system prompt is cacheable. 2. Emit a blank line ("\n\n") between </available_skills> and the env block preamble so the byte sequence at the seam is canonical and stable. Without this, downstream prefix caches keying on byte hash see a different separator pattern per request and miss. Measured on a 30K-token orchestrator prompt against oMLX serving Qwen3-Coder-Next-80B-A3B on an M3 Ultra: - before: every fresh session ~30s, cache.read=0 - after, same project, fresh sessions: ~3-5s, cache.read=28,672/30,000 (~95%) - after, cross-project sessions: ~22s cold seed, cache.read=2-8k (architectural ceiling — AGENTS.md/CLAUDE.md content varies per project) Refs: anomalyco#20110, anomalyco#5224

rikkarth · 2026-05-30T00:03:44Z

Companion issue for the skill-enumeration determinism caveat mentioned above: #29950

github-actions · 2026-05-30T00:03:59Z

The following comment was made by an LLM, it may be inaccurate:

Potential Related PRs Found

The current PR (29949) addresses prefix cache stability. Here are related PRs that tackle similar caching concerns:

feat(cache): split system prompt into stable/dynamic blocks for independent caching #27377 - feat(cache): split system prompt into stable/dynamic blocks for independent caching
- Directly related: addresses splitting stable vs. dynamic content in system prompt for cache optimization
fix(cache): stabilize system prefix behind OPENCODE_EXPERIMENTAL_CACHE_STABILIZATION #27378 - fix(cache): stabilize system prefix behind OPENCODE_EXPERIMENTAL_CACHE_STABILIZATION
- Directly related: experimental cache stabilization feature
fix(session): cache messages across prompt loop to preserve prompt cache byte-identity #25367 - fix(session): cache messages across prompt loop to preserve prompt cache byte-identity
- Related: focuses on cache byte-identity preservation across prompt loops
fix(cache): improve Anthropic prompt cache hit rate with system split and tool stability #14743 - fix(cache): improve Anthropic prompt cache hit rate with system split and tool stability
- Related: earlier work on Anthropic prompt cache hit rates and system prompt structure
feat(provider): add provider-specific cache configuration system (significant token usage reduction) #5422 - feat(provider): add provider-specific cache configuration system
- Referenced in the PR description as adjacent work on provider-specific caching

These PRs all relate to prompt caching strategy and system prompt structure optimization. Check if any of these (particularly #27377, #27378, or #25367) might overlap in scope or have been superseded by PR #29949.

github-actions · 2026-05-30T00:07:39Z

Thanks for updating your PR! It now meets our contributing guidelines. 👍

rikkarth · 2026-05-30T07:57:05Z

With this solution I was able to get 95% cache hits. I'm using a hook outside opencode for the moment.

rikkarth added 2 commits May 30, 2026 00:59

test(session/prompt): assert source assembly order via regex for cach…

1d93beb

…e stability

github-actions Bot added the needs:compliance This means the issue will auto-close after 2 hours. label May 30, 2026

rikkarth mentioned this pull request May 30, 2026

Skill enumeration is non-deterministic when the same skill is reachable through multiple discovery roots #29950

Open

github-actions Bot removed the needs:compliance This means the issue will auto-close after 2 hours. label May 30, 2026

github-actions Bot mentioned this pull request May 30, 2026

📊 AI CLI 工具社区动态日报 2026-05-30 jasonalang/big_model_radar#11

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(session): move env block to tail of system prompt for cache stability#29949

fix(session): move env block to tail of system prompt for cache stability#29949
rikkarth wants to merge 2 commits into
anomalyco:devfrom
rikkarth:fix/session-prompt-cache-friendly-order

rikkarth commented May 30, 2026 •

edited

Loading

Uh oh!

rikkarth commented May 30, 2026

Uh oh!

github-actions Bot commented May 30, 2026

Uh oh!

github-actions Bot commented May 30, 2026

Uh oh!

rikkarth commented May 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

rikkarth commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Issue for this PR

Type of change

What does this PR do?

How did you verify your code works?

Screenshots / recordings

Relationship to existing work

Skill enumeration caveat (unrelated to this PR's fix)

Checklist

Uh oh!

rikkarth commented May 30, 2026

Uh oh!

github-actions Bot commented May 30, 2026

Potential Related PRs Found

Uh oh!

github-actions Bot commented May 30, 2026

Uh oh!

rikkarth commented May 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

rikkarth commented May 30, 2026 •

edited

Loading