prio-queue: use cascade-down sift for faster extract-min by spkrka · Pull Request #2132 · gitgitgadget/git

spkrka · 2026-05-30T17:07:59Z

Summary

Replace the standard sift-down in prio_queue_get() with a cascade-down approach that halves the number of comparisons and reduces data movement from 3 copies per level (swap) to 1 copy per level.

The standard extract-min places the last array element at the root, then sifts it down. At each level this requires two comparisons (left vs right child, then element vs winner) and, when the element is larger, a swap (three 16-byte copies).

The cascade approach saves the root, then promotes the smaller child into the vacant slot at each level — one comparison and one copy. The hole sinks to a leaf, where the saved element is placed and sifted up if needed (typically zero levels, since the last array element tends to be a low-priority leaf value).

For a heap of depth d, work per extract drops from 2d comparisons + 3d copies to d comparisons + d copies.

prio_queue_replace() is simplified to a plain get+put sequence. The previous implementation shared sift_down_root() with get, but the cascade approach no longer accommodates that cleanly. This is fine in practice: replace is only called from pop_most_recent_commit() (fetch-pack, object-name, walker) and show-branch — none of which appear in any hot path.

Performance

Profiling git rev-list --count on a 2.5M-commit monorepo shows sift_down_root dropping from 8.2% to 0.4% of total runtime, effectively eliminated as significant overhead.

Synthetic benchmark

10M put+get cycles with ascending integer keys, CPU-pinned, 3 runs per data point. Consistent improvement across all queue sizes, no regressions:

queue width	baseline	cascade	speedup
10	0.457s	0.428s	1.07x
100	0.845s	0.692s	1.22x
1,000	1.207s	0.980s	1.23x
10,000	1.652s	1.338s	1.23x
100,000	2.208s	1.795s	1.23x

End-to-end benchmarks

All benchmarks use git-bench with 1 warmup run followed by 10 timed runs. Each configuration is built from the same source tree and tested on the same repo in alternating order.

Large monorepo (2.5M commits, wide DAG, 8.4K remote branches) — range of ~200 first-parent / ~434 total commits:

Command	baseline	cascade	speedup
rev-list --count A..B	2.08s	1.97s	1.03x
log --oneline A..B	2.06s	2.01s	1.04x

linux kernel (1.4M commits) — range v5.0..v6.0 (311K commits):

Command	baseline	cascade	speedup
rev-list --count v5.0..v6.0	455ms	440ms	1.04x

git.git (81K commits) — range v2.0.0..v2.45.0 (37K commits):

Command	baseline	cascade	speedup
rev-list --count v2.0..v2.45	81ms	80ms	~1.00x

The improvement scales with DAG width: wider DAGs produce larger priority queues, amplifying the per-level savings. In small or narrow repositories the priority queues stay shallow and the sift-down cost is already negligible, so the change is not noticeable.

Testing

Existing test coverage is thorough:

t/unit-tests/u-prio-queue.c (7 tests): verifies ordering invariants for sorted put/get, interleaved operations, replace, empty-queue edge cases, and LIFO mode at heap depths of 3-4.
t6600-test-reach.sh (50 tests): exercises real commit-graph traversals through merge-base, is-ancestor, rev-list topo-order, for-each-ref ahead-behind, and branch --merged/--contains.

All tests pass.

Replace the standard sift-down in prio_queue_get() with a cascade-down approach. The standard approach places the last array element at the root, then sifts it down. At each level this requires two comparisons (left vs right child, then element vs winner) and, when the element is larger, a swap (three 16-byte copies). The cascade approach instead promotes the smaller child into the vacant root slot at each level — one comparison and one copy. The vacancy sinks to a leaf, where the last array element is placed and sifted up if needed — typically zero levels since the last array element tends to be large. For a heap of depth d, work per extract drops from 2d comparisons + 3d copies to d comparisons + d copies: roughly half the comparisons and a third of the data movement. Simplify prio_queue_replace() to a plain get+put sequence. The previous implementation shared sift_down_root() with get, but the cascade approach no longer accommodates that cleanly. This is fine in practice: replace is only called from pop_most_recent_commit() (fetch-pack, object-name, walker) and show-branch — none of which appear in any hot path. A synthetic benchmark (10M put+get cycles with ascending keys, CPU-pinned, 3 runs per data point) shows consistent improvement across all queue sizes, with no regressions: queue width baseline cascade speedup ------------------------------------------------ 10 0.457s 0.428s 1.07x 100 0.845s 0.692s 1.22x 1,000 1.207s 0.980s 1.23x 10,000 1.652s 1.338s 1.23x 100,000 2.208s 1.795s 1.23x In end-to-end git commands the improvement is modest because sift_down_root is only ~8% of total runtime. Profiling rev-list --count on a 2.5M-commit monorepo shows sift_down_root dropping from 8.2% to 0.4% of total runtime. The improvement scales with DAG width: wider DAGs produce larger priority queues, amplifying the per-level savings. In small or narrow repos the queues stay shallow and the effect is negligible. Signed-off-by: Kristofer Karlsson <krka@spotify.com>

spkrka force-pushed the cascade-sift-down branch 6 times, most recently from df494d3 to 2038c08 Compare May 30, 2026 17:26

spkrka force-pushed the cascade-sift-down branch from 2038c08 to a45f027 Compare May 30, 2026 19:47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

prio-queue: use cascade-down sift for faster extract-min#2132

prio-queue: use cascade-down sift for faster extract-min#2132
spkrka wants to merge 1 commit into
gitgitgadget:masterfrom
spkrka:cascade-sift-down

spkrka commented May 30, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

spkrka commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Performance

Synthetic benchmark

End-to-end benchmarks

Testing

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

spkrka commented May 30, 2026 •

edited

Loading