Fix extended thinking and allow per-repo context files (AGENTS.md, CLAUDE.md, etc.) to be included as context#2387
Conversation
Review Summary by QodoSupport Claude model overrides and repository context files in AI prompts
WalkthroughsDescription• Add support for overriding Claude extended thinking models via config • Include repository context files (AGENTS.md, CLAUDE.md, etc.) in AI prompts • Extend git provider interface with repo file content retrieval capability • Add comprehensive tests for new features and fix minor code issues Diagramflowchart LR
A["Config Settings"] -->|claude_extended_thinking_models_override| B["LiteLLM Handler"]
C["Repo Context Files"] -->|repo_context_files config| D["build_repo_context"]
D -->|fetches files| E["Git Provider"]
D -->|renders context| F["Instruction Files"]
F -->|injected into| G["AI Prompts"]
B -->|uses models| G
File Changes1. pr_agent/algo/__init__.py
|
Code Review by Qodo
Context used 1. test_apply_repo_settings leaks env var
|
|
Persistent review updated to latest commit 2b313ad |
2b313ad to
50de2c6
Compare
|
Persistent review updated to latest commit 50de2c6 |
50de2c6 to
e93cad2
Compare
|
Persistent review updated to latest commit e93cad2 |
| cache_key = _get_repo_context_cache_key(context_files, max_lines) | ||
| process_cache_key = _get_repo_context_process_cache_key(git_provider, context_files, max_lines) | ||
| if process_cache_key is not None and process_cache_key in _repo_context_process_cache: | ||
| return _repo_context_process_cache[process_cache_key] | ||
|
|
There was a problem hiding this comment.
1. Unbounded context process cache 🐞 Bug ☼ Reliability
build_repo_context() stores repo context in a module-level _repo_context_process_cache keyed by PR URL without TTL/eviction, so long-lived servers can accumulate entries indefinitely and can keep serving stale instruction files for a PR until process restart.
Agent Prompt
## Issue description
`build_repo_context()` uses a module-level dict (`_repo_context_process_cache`) as a process-wide cache keyed by PR URL. Because it has no TTL/max-size eviction, long-lived servers will:
1) grow memory unbounded as more PR URLs are processed, and
2) potentially serve stale instruction-file contents for the same PR URL even if the repo default-branch instruction files change.
## Issue Context
This cache is independent of any single provider instance and persists for the lifetime of the Python process. The repo runs as a long-lived FastAPI/uvicorn app in webhook deployments, so the cache lifetime is effectively “forever”.
## Fix Focus Areas
- pr_agent/algo/repo_context.py[13-14]
- pr_agent/algo/repo_context.py[136-140]
- pr_agent/algo/repo_context.py[175-185]
## How to fix
- Replace `_repo_context_process_cache` with a bounded/evicting structure (e.g., `functools.lru_cache(maxsize=...)`, or a TTL cache).
- If keeping cross-instance caching, include an invalidation strategy (TTL is simplest) so updated repo instruction files eventually refresh.
- Ensure empty-string entries also follow the eviction policy to avoid permanently pinning “missing” results for long-running processes.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
|
Hey, could you adjust your implementation for Gitea in I made it manually for now, but I think it would be interesting to include it in this PR as well. |
done! |
aae35a9 to
2e11c1f
Compare
|
Persistent review updated to latest commit 2e11c1f |
| import pytest | ||
| from jinja2 import Environment, StrictUndefined |
There was a problem hiding this comment.
1. test_repo_context.py import order 📘 Rule violation ⚙ Maintainability
The new test file has third-party imports out of isort order (pytest imported before jinja2), which will violate the repository’s Ruff/isort checks. This can fail CI linting and introduce style churn.
Agent Prompt
## Issue description
The imports in `tests/unittest/test_repo_context.py` are not ordered as isort/Ruff expects for third-party imports.
## Issue Context
`pyproject.toml` enables Ruff isort checks (`I001`/`I002`). The file currently imports `pytest` before `jinja2`, which is not deterministic/alphabetical.
## Fix Focus Areas
- tests/unittest/test_repo_context.py[3-5]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
There was a problem hiding this comment.
believe this is a false positive; ruff puts back the order they currently are in
|
Persistent review updated to latest commit 145b93f |
|
Persistent review updated to latest commit a92f7b3 |
|
Persistent review updated to latest commit 982821f |
982821f to
817fa77
Compare
|
Persistent review updated to latest commit 817fa77 |
| class _RepoContextCache: | ||
| def __init__(self, max_size: int = REPO_CONTEXT_CACHE_MAX_SIZE, ttl_seconds: int = REPO_CONTEXT_CACHE_TTL_SECONDS): | ||
| self._max_size = max(1, int(max_size)) | ||
| self._ttl_seconds = max(0, int(ttl_seconds)) | ||
| self._entries = OrderedDict() | ||
|
|
||
| def copy(self): | ||
| cache = type(self)(max_size=self._max_size, ttl_seconds=self._ttl_seconds) | ||
| cache._entries = self._entries.copy() | ||
| return cache | ||
|
|
||
| def get(self, key, default=None): | ||
| entry = self._entries.get(key) | ||
| if entry is None: | ||
| return default | ||
|
|
||
| value, expires_at = entry | ||
| if expires_at <= time.monotonic(): | ||
| del self._entries[key] | ||
| return default | ||
|
|
||
| self._entries.move_to_end(key) | ||
| return value | ||
|
|
||
| def __setitem__(self, key, value): | ||
| self._entries[key] = (value, time.monotonic() + self._ttl_seconds) | ||
| self._entries.move_to_end(key) | ||
| while len(self._entries) > self._max_size: | ||
| self._entries.popitem(last=False) | ||
|
|
||
|
|
||
| _repo_context_process_cache = _RepoContextCache() | ||
|
|
||
|
|
||
| def _get_markdown_fence(content: str) -> str: | ||
| fence = MARKDOWN_FENCE | ||
| while fence in content: | ||
| fence += "`" | ||
| return fence | ||
|
|
||
|
|
||
| def _get_repo_context_cache_key(context_files: list, max_lines: int) -> tuple[tuple[tuple[str, str], ...], int]: | ||
| return tuple((type(file_path).__name__, str(file_path)) for file_path in context_files), max_lines | ||
|
|
||
|
|
||
| def _get_repo_context_process_cache_key(git_provider, context_files: list, max_lines: int) -> tuple | None: |
There was a problem hiding this comment.
1. Long type-hinted signatures 📘 Rule violation ⚙ Maintainability
pr_agent/algo/repo_context.py introduces function signatures that likely exceed the repo’s Ruff 120-character line-length limit, which can fail CI linting. Wrap these signatures across multiple lines (or refactor the type hints) to conform to the configured style.
Agent Prompt
## Issue description
New code adds long function signatures/type annotations that likely exceed Ruff's configured `line-length = 120`, which can cause lint/CI failures.
## Issue Context
Ruff is configured in `pyproject.toml` with `line-length = 120`, and the PR adds at least two long `def ... -> ...:` lines in the new `repo_context.py` module.
## Fix Focus Areas
- pr_agent/algo/repo_context.py[21-23]
- pr_agent/algo/repo_context.py[62-66]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
817fa77 to
f30a860
Compare
|
Code review by qodo was updated up to the latest commit f30a860 |
| if not self.owner or not self.repo: | ||
| self.logger.warning(f"Cannot get repo file content: owner or repo not set") | ||
| return "" | ||
|
|
||
| ref = self.base_sha or self.base_ref or self.sha | ||
| if not ref: | ||
| self.logger.warning(f"Cannot get repo file content: ref not set") | ||
| return "" |
There was a problem hiding this comment.
1. Useless f-strings in giteaprovider.get_repo_file_content() 📘 Rule violation ⚙ Maintainability
self.logger.warning() uses f-strings without any interpolated values, which triggers Ruff/Pyflakes F541 and can fail CI lint checks. Replace them with regular string literals.
Agent Prompt
## Issue description
`GiteaProvider.get_repo_file_content()` logs warnings using `f"..."` strings that contain no `{}` placeholders, which Ruff/Pyflakes flags as `F541`.
## Issue Context
The repo uses Ruff with `F` (Pyflakes) enabled (`[tool.ruff].lint.select` includes `F`), so this can break lint/CI.
## Fix Focus Areas
- pr_agent/git_providers/gitea_provider.py[751-758]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| with patch("pr_agent.algo.repo_context.time.monotonic", side_effect=[100, 100, 2000, 2000, 2000]): | ||
| first_context = build_repo_context(first_provider) | ||
| second_context = build_repo_context(second_provider) | ||
|
|
||
| assert "Repo purpose" in first_context | ||
| assert "Changed repo purpose" in second_context | ||
| assert first_provider.requested_paths == ["AGENTS.md"] | ||
| assert second_provider.requested_paths == ["AGENTS.md"] | ||
|
|
||
|
|
||
| def test_build_repo_context_refreshes_empty_process_cache_after_ttl(repo_context_settings): | ||
| repo_context_settings.set("CONFIG.REPO_CONTEXT_FILES", ["AGENTS.md"]) | ||
| repo_context_settings.set("CONFIG.REPO_CONTEXT_MAX_LINES", 500) | ||
| first_provider = FakeProvider({}, pr_url="https://example.com/org/repo/pull/1") | ||
| second_provider = FakeProvider({"AGENTS.md": "Repo purpose"}, pr_url="https://example.com/org/repo/pull/1") | ||
|
|
||
| with patch("pr_agent.algo.repo_context.time.monotonic", side_effect=[100, 100, 2000, 2000, 2000]): | ||
| first_context = build_repo_context(first_provider) | ||
| second_context = build_repo_context(second_provider) | ||
|
|
There was a problem hiding this comment.
2. Repo context ttl tests fail 🐞 Bug ≡ Correctness
The TTL refresh tests patch time.monotonic() with only 5 side_effect values, but two build_repo_context() calls can invoke time.monotonic() 6 times (cache gets on the second call plus storing into both provider and process caches), leading to StopIteration and failing CI.
Agent Prompt
### Issue description
Two new TTL-related tests patch `pr_agent.algo.repo_context.time.monotonic` with a finite `side_effect` list of length 5. The production code can call `time.monotonic()` at least 6 times across the two `build_repo_context()` invocations in the test, so the mock runs out of values and raises `StopIteration`, breaking CI.
### Issue Context
`_RepoContextCache.__setitem__` always calls `time.monotonic()`, and `_RepoContextCache.get()` calls `time.monotonic()` when an entry exists. The tests do two `build_repo_context()` calls, which (after the first call populates caches) can trigger both cache `get()` checks and both cache `__setitem__` writes.
### Fix Focus Areas
- tests/unittest/test_repo_context.py[109-138]
### Suggested change
Update both `side_effect=[100, 100, 2000, 2000, 2000]` occurrences to include at least one more value, e.g.:
- `side_effect=[100, 100, 2000, 2000, 2000, 2000]`
(or replace with a callable/iterator that won’t exhaust, e.g. `itertools.chain([100, 100], itertools.repeat(2000))`).
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
|
Code review by qodo was updated up to the latest commit 5dde4dc |
| def test_apply_repo_settings_merges_global_before_local_settings(monkeypatch): | ||
| settings = get_settings() | ||
| original_extra_instructions = settings.pr_reviewer.extra_instructions | ||
| original_enable_intro_text = settings.pr_reviewer.enable_intro_text | ||
| monkeypatch.setattr(utils, "get_git_provider_with_context", lambda pr_url: FakeSettingsProvider()) | ||
|
|
||
| try: | ||
| apply_repo_settings("https://github.example.com/org/service/pull/1") | ||
|
|
||
| assert settings.pr_reviewer.extra_instructions == "local" | ||
| assert settings.pr_reviewer.enable_intro_text is False | ||
| finally: | ||
| settings.pr_reviewer.extra_instructions = original_extra_instructions | ||
| settings.pr_reviewer.enable_intro_text = original_enable_intro_text | ||
|
|
There was a problem hiding this comment.
1. test_apply_repo_settings leaks env var 📘 Rule violation ☼ Reliability
The new unit test calls apply_repo_settings(), which mutates os.environ["AUTO_CAST_FOR_DYNACONF"], but the test does not explicitly set/reset that env var, leaking process-global state across tests and risking order-dependent failures.
Agent Prompt
## Issue description
`test_apply_repo_settings_merges_global_before_local_settings()` calls `apply_repo_settings()`, which sets `os.environ["AUTO_CAST_FOR_DYNACONF"]`. The test does not explicitly manage this environment variable, so the process-global state can leak into other tests.
## Issue Context
`apply_repo_settings()` mutates `AUTO_CAST_FOR_DYNACONF` unconditionally. Per the test hygiene rule, tests must explicitly set/delete env vars they affect and ensure cleanup to avoid flaky, order-dependent outcomes.
## Fix Focus Areas
- tests/unittest/test_git_provider_utils.py[46-60]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
|
/describe |
|
PR Description updated to latest commit (5dde4dc)
|
User description
Resolves #2357.
I backed out my other PR because it was too many changes (previously included "de-dupe inline comments") in one PR.
PR Type
Enhancement, Bug fix
Description
Add support for per-repository context files (AGENTS.md, CLAUDE.md, etc.) included in AI prompts
Allow overriding Claude extended thinking models via configuration
Support GitHub organization-level global settings repository
Implement repo context caching with TTL and size limits
Add repo file fetching for Gitea, GitLab, and GitHub providers
Diagram Walkthrough
File Walkthrough
15 files
Expand Claude extended thinking model supportAdd configurable Claude extended thinking model overrideNew module for repository context file handlingAdd base method for repository file content retrievalImplement repository file content fetching for GiteaAdd GitHub org-level settings and repo file fetchingImplement repository file content fetching for GitLabSupport multiple repository settings files with mergingInclude repository context in code suggestions promptsInclude repository context in description promptsInclude repository context in review promptsAdd repository context section to code suggestions templateAdd repository context section to non-decoupled templateAdd repository context section to description templateAdd repository context section to review template1 files
Add repo context and Claude model override configuration options6 files
Add tests for merged global and local settingsAdd tests for Gitea repository file content fetchingNew tests for GitHub org-level settings supportAdd tests for GitLab repository file content fetchingNew tests for Claude extended thinking model overrideComprehensive tests for repository context functionality2 files
Document GitHub organization-level settings supportDocument global settings and GitHub Enterprise support