Skip to content

Duplicate entry hardening#2096

Open
newren wants to merge 5 commits into
gitgitgadget:masterfrom
newren:duplicate-entry-hardening
Open

Duplicate entry hardening#2096
newren wants to merge 5 commits into
gitgitgadget:masterfrom
newren:duplicate-entry-hardening

Conversation

@newren
Copy link
Copy Markdown

@newren newren commented Apr 20, 2026

We had some corrupt trees with duplicate entries in real world repositories, which triggered an assertion failure in merge-ort. Further, the corrupt tree creation in the third party tool would have been avoided had verify_cache() correctly checked for D/F conflicts. Provide fixes for both issues, including 3 preparatory changes for the merge-ort fix.

cc: Patrick Steinhardt ps@pks.im

newren added 5 commits April 20, 2026 14:50
traverse_trees_wrapper() saves entries from a first pass through
traverse_trees() and then replays them through the real callback
(collect_merge_info_callback).  However, the replay loop silently
discards the callback return value.  This means any error reported by
the callback during replay -- including a future check for malformed
trees -- would be ignored, allowing the merge to proceed with corrupt
state.

Capture the return value, stop the loop on negative (error) returns,
and propagate the error to the caller.  Note that the callback returns
a positive mask value on success, so we normalize non-negative returns
to 0 for the caller.

Signed-off-by: Elijah Newren <newren@gmail.com>
collect_merge_info() has set info.show_all_errors = 1 since
d2bc199 (merge-ort: implement a very basic collect_merge_info(),
2020-12-13).  This setting was copied from unpack-trees.c where it
controls batching of error messages for porcelain display, but
merge-ort has no such error-batching logic and never needed it.

With show_all_errors set, traverse_trees() captures a negative callback
return but continues processing remaining entries rather than stopping
immediately.  Removing the setting restores the default behavior where
a negative return from collect_merge_info_callback() breaks out of the
traversal loop right away, allowing a future commit to exit early when
a corrupt tree is detected.

Signed-off-by: Elijah Newren <newren@gmail.com>
clear_or_reinit_internal_opts() is responsible for cleaning up the
various data structures in merge_options_internal.  It already handles
many renames-related structures (dirs_removed, dir_renames,
relevant_sources, cached_pairs, deferred, etc.) but does not free
renames->pairs[].queue.

In the normal code path, resolve_and_process_renames() frees
pairs[s].queue and reinitializes it with diff_queue_init() before
clear_or_reinit_internal_opts() runs, so the omission is harmless.
However, if collect_merge_info() encounters an error and returns early
(before resolve_and_process_renames() is ever called), any diff pairs
already queued by collect_rename_info()/add_pair() will have their
backing array leaked.

Fix this by freeing renames->pairs[].queue in the cleanup function.
In the normal path the pointer is already NULL (from the earlier
diff_queue_init() in resolve_and_process_renames()), so free(NULL) is
a safe no-op.

Signed-off-by: Elijah Newren <newren@gmail.com>
Trees with duplicate entries are malformed; fsck reports "contains
duplicate file entries" for them.  merge-ort has from the beginning
assumed that we would never hit such trees.  It was written with the
assumption that traverse_trees() calls collect_merge_info_callback() at
most once per path.  The "sanity checks" in that callback (added in
d2bc199 (merge-ort: implement a very basic collect_merge_info(),
2020-12-13)) verify properties of each individual call but not that
invariant.  The strmap_put() in setup_path_info() silently overwrites
the entry from any prior call for the same path, because it assumed
there would be no other path.  Unfortunately, supplemental data
structures for various optimizations could still be tweaked before the
extra paths were overwritten, and those data structures not matching
expected state could trip various assertions.

Change the return type of setup_path_info() from void to int to allow us
to detect this case, and abort the merge with a clear error message when
it occurs.

Signed-off-by: Elijah Newren <newren@gmail.com>
verify_cache() checks that the index does not contain both "path" and
"path/file" before writing a tree.  It does this by comparing only
adjacent entries, relying on the assumption that "path/file" would
immediately follow "path" in sorted order.  Unfortunately, this
assumption does not always hold.  For example:

    docs                     <-- submodule entry
    docs-internal/README.md  <-- intervening entry
    docs/requirements.txt    <-- D/F conflict, NOT adjacent to "docs"

When this happens, verify_cache() silently misses the D/F conflict and
write-tree produces a corrupt tree object containing duplicate entries
(one for the submodule "docs" and one for the tree "docs").

I could not find any caller in current git that both allows the index to
get into this state and then tries to write it out without doing other
checks beyond the verify_cache() call in cache_tree_update(), but
verify_cache() is documented as a safety net for preventing corrupt
trees and should actually provide that guarantee.  A downstream consumer
that relied solely on cache_tree_update()'s internal checking via
verify_cache() to prevent duplicate tree entries was bitten by the gap.

Add a test that constructs a corrupt index directly (bypassing the D/F
checks in add_index_entry) and verifies that write-tree now rejects it.

Signed-off-by: Elijah Newren <newren@gmail.com>
@newren
Copy link
Copy Markdown
Author

newren commented Apr 21, 2026

/submit

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented Apr 21, 2026

Submitted as pull.2096.git.1776731171.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-2096/newren/duplicate-entry-hardening-v1

To fetch this version to local tag pr-2096/newren/duplicate-entry-hardening-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-2096/newren/duplicate-entry-hardening-v1

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented Apr 26, 2026

This branch is now known as en/ort-harden-against-corrupt-trees.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented Apr 26, 2026

This patch series was integrated into seen via git@5ceb988.

@gitgitgadget gitgitgadget Bot added the seen label Apr 26, 2026
@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented Apr 28, 2026

This patch series was integrated into seen via git@b2979e0.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 1, 2026

This patch series was integrated into seen via git@5a8e1e0.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 3, 2026

There was a status update in the "Cooking" section about the branch en/ort-harden-against-corrupt-trees on the Git mailing list:

"ort" merge backend handles merging corrupt trees better by
aborting when it should.

Needs review.
source: <pull.2096.git.1776731171.gitgitgadget@gmail.com>

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 4, 2026

This patch series was integrated into seen via git@62d334b.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 9, 2026

This patch series was integrated into seen via git@6099d30.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 9, 2026

This patch series was integrated into seen via git@1d317d6.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 11, 2026

This patch series was integrated into seen via git@a1e3c69.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 11, 2026

There was a status update in the "Cooking" section about the branch en/ort-harden-against-corrupt-trees on the Git mailing list:

"ort" merge backend handles merging corrupt trees better by
aborting when it should.

Needs review.
source: <pull.2096.git.1776731171.gitgitgadget@gmail.com>

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 11, 2026

This patch series was integrated into seen via git@b7807b6.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 11, 2026

This patch series was integrated into seen via git@8cb15cc.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 12, 2026

This patch series was integrated into seen via git@a090bdb.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 12, 2026

There was a status update in the "Cooking" section about the branch en/ort-harden-against-corrupt-trees on the Git mailing list:

"ort" merge backend handles merging corrupt trees better by
aborting when it should.

Needs review.
source: <pull.2096.git.1776731171.gitgitgadget@gmail.com>

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 13, 2026

This patch series was integrated into seen via git@64f0aa6.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 14, 2026

This patch series was integrated into seen via git@1300a5a.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 15, 2026

This patch series was integrated into seen via git@b3f3685.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 18, 2026

This patch series was integrated into seen via git@fc35baf.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 18, 2026

There was a status update in the "Cooking" section about the branch en/ort-harden-against-corrupt-trees on the Git mailing list:

"ort" merge backend handles merging corrupt trees better by
aborting when it should.

Needs review.
source: <pull.2096.git.1776731171.gitgitgadget@gmail.com>

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 19, 2026

This patch series was integrated into seen via git@5edc804.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 20, 2026

This patch series was integrated into seen via git@a510b62.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 20, 2026

There was a status update in the "Cooking" section about the branch en/ort-harden-against-corrupt-trees on the Git mailing list:

"ort" merge backend handles merging corrupt trees better by
aborting when it should.

Needs review.
source: <pull.2096.git.1776731171.gitgitgadget@gmail.com>

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 21, 2026

This patch series was integrated into seen via git@3924c8d.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 21, 2026

This patch series was integrated into seen via git@4944388.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 21, 2026

This patch series was integrated into seen via git@c7d9b18.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 23, 2026

This patch series was integrated into seen via git@798f6ed.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 23, 2026

There was a status update in the "Cooking" section about the branch en/ort-harden-against-corrupt-trees on the Git mailing list:

"ort" merge backend handles merging corrupt trees better by
aborting when it should.

Needs review.
source: <pull.2096.git.1776731171.gitgitgadget@gmail.com>

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 25, 2026

This patch series was integrated into seen via git@87efa79.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 25, 2026

This patch series was integrated into seen via git@4333158.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 25, 2026

This patch series was integrated into seen via git@e4a467a.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 26, 2026

There was a status update in the "Cooking" section about the branch en/ort-harden-against-corrupt-trees on the Git mailing list:

"ort" merge backend handles merging corrupt trees better by
aborting when it should.

Needs review.
source: <pull.2096.git.1776731171.gitgitgadget@gmail.com>

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 27, 2026

This patch series was integrated into seen via git@167edf0.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 27, 2026

There was a status update in the "Cooking" section about the branch en/ort-harden-against-corrupt-trees on the Git mailing list:

"ort" merge backend handles merging corrupt trees better by
aborting when it should.

Needs review.
source: <pull.2096.git.1776731171.gitgitgadget@gmail.com>

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 29, 2026

This patch series was integrated into seen via git@0e818b5.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 30, 2026

This patch series was integrated into seen via git@817b54e.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 31, 2026

This patch series was integrated into seen via git@bd6e6ee.

Comment thread merge-ort.c
@@ -1008,18 +1008,20 @@ static int traverse_trees_wrapper(struct index_state *istate,
info->traverse_path = renames->callback_data_traverse_path;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Junio C Hamano wrote on the Git mailing list (how to reply to this email):

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> traverse_trees_wrapper() saves entries from a first pass through
> traverse_trees() and then replays them through the real callback
> (collect_merge_info_callback).  However, the replay loop silently
> discards the callback return value.  This means any error reported by
> the callback during replay -- including a future check for malformed
> trees -- would be ignored, allowing the merge to proceed with corrupt
> state.
>
> Capture the return value, stop the loop on negative (error) returns,
> and propagate the error to the caller.  Note that the callback returns
> a positive mask value on success, so we normalize non-negative returns
> to 0 for the caller.

All makes perfect sense.

How would the externally visible behaviour change at this step?

Upon an error from the callback, we used to keep going and processed
other callback data in the renames structure.  We now leave the rest
unprocessed.

The caller of this helper would never have seen a failure, but now
they will.  Both callers, collect_merge_info_callback() and
handle_deferred_entries(), are reacting to a negative "error" return
well (perhaps because they sometimes call traverse_trees() in the
same control flow, which does return an error already), so
presumably there is no downside caused by aborting the innermost
process upon the first error return.



> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  merge-ort.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/merge-ort.c b/merge-ort.c
> index 00923ce3cd..4b8e32209d 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -1008,18 +1008,20 @@ static int traverse_trees_wrapper(struct index_state *istate,
>  	info->traverse_path = renames->callback_data_traverse_path;
>  	info->fn = old_fn;
>  	for (i = old_offset; i < renames->callback_data_nr; ++i) {
> -		info->fn(n,
> -			 renames->callback_data[i].mask,
> -			 renames->callback_data[i].dirmask,
> -			 renames->callback_data[i].names,
> -			 info);
> +		ret = info->fn(n,
> +			       renames->callback_data[i].mask,
> +			       renames->callback_data[i].dirmask,
> +			       renames->callback_data[i].names,
> +			       info);
> +		if (ret < 0)
> +			break;
>  	}
>  
>  	renames->callback_data_nr = old_offset;
>  	free(renames->callback_data_traverse_path);
>  	renames->callback_data_traverse_path = old_callback_data_traverse_path;
>  	info->traverse_path = NULL;
> -	return 0;
> +	return ret < 0 ? ret : 0;
>  }
>  
>  static void setup_path_info(struct merge_options *opt,

Comment thread merge-ort.c
@@ -1738,7 +1740,6 @@ static int collect_merge_info(struct merge_options *opt,
setup_traverse_info(&info, opt->priv->toplevel_dir);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Junio C Hamano wrote on the Git mailing list (how to reply to this email):

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> collect_merge_info() has set info.show_all_errors = 1 since
> d2bc1994f363 (merge-ort: implement a very basic collect_merge_info(),
> 2020-12-13).  This setting was copied from unpack-trees.c where it
> controls batching of error messages for porcelain display, but
> merge-ort has no such error-batching logic and never needed it.
>
> With show_all_errors set, traverse_trees() captures a negative callback
> return but continues processing remaining entries rather than stopping
> immediately.  Removing the setting restores the default behavior where
> a negative return from collect_merge_info_callback() breaks out of the
> traversal loop right away, allowing a future commit to exit early when
> a corrupt tree is detected.

Nice spotting.  As the error handling eventually is to die without
making any further damange, returning early without seeing "more
errors" is a good change.

>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  merge-ort.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/merge-ort.c b/merge-ort.c
> index 4b8e32209d..74e9636020 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -1740,7 +1740,6 @@ static int collect_merge_info(struct merge_options *opt,
>  	setup_traverse_info(&info, opt->priv->toplevel_dir);
>  	info.fn = collect_merge_info_callback;
>  	info.data = opt;
> -	info.show_all_errors = 1;
>  
>  	if (repo_parse_tree(opt->repo, merge_base) < 0 ||
>  	    repo_parse_tree(opt->repo, side1) < 0 ||

Comment thread merge-ort.c
renames->callback_data_traverse_path = old_callback_data_traverse_path;
info->traverse_path = NULL;
return 0;
return ret < 0 ? ret : 0;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Junio C Hamano wrote on the Git mailing list (how to reply to this email):

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> Trees with duplicate entries are malformed; fsck reports "contains
> duplicate file entries" for them.  merge-ort has from the beginning
> assumed that we would never hit such trees.  It was written with the
> assumption that traverse_trees() calls collect_merge_info_callback() at
> most once per path.  The "sanity checks" in that callback (added in
> d2bc1994f363 (merge-ort: implement a very basic collect_merge_info(),
> 2020-12-13)) verify properties of each individual call but not that
> invariant.  The strmap_put() in setup_path_info() silently overwrites
> the entry from any prior call for the same path, because it assumed
> there would be no other path.  Unfortunately, supplemental data
> structures for various optimizations could still be tweaked before the
> extra paths were overwritten, and those data structures not matching
> expected state could trip various assertions.
>
> Change the return type of setup_path_info() from void to int to allow us
> to detect this case, and abort the merge with a clear error message when
> it occurs.

OK.

> @@ -1081,9 +1081,11 @@ static void setup_path_info(struct merge_options *opt,
>  			 */
>  			mi->is_null = 1;
>  	}
> -	strmap_put(&opt->priv->paths, fullpath, mi);
> +	if (strmap_put(&opt->priv->paths, fullpath, mi))
> +		return error(_("tree has duplicate entries for '%s'"), fullpath);

OK.  I was wondering what _other_ kind of malformed trees would the
updated code by this change is prepared to handle (most notably,
tree entries must be sorted, and one way to detect duplicate is to
remember one single path that we saw earlier, which would work as
long as the entries are sorted).  This "ah, we saw that path already"
approach is much more robust in that it does not have to depend on a
sorted tree.

Makes sense.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented Jun 1, 2026

Junio C Hamano wrote on the Git mailing list (how to reply to this email):

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> We had some corrupt trees with duplicate entries in real world repositories,
> which triggered an assertion failure in merge-ort. Further, the corrupt tree
> creation in the third party tool would have been avoided had verify_cache()
> correctly checked for D/F conflicts. Provide fixes for both issues,
> including 3 preparatory changes for the merge-ort fix.
>
> Elijah Newren (5):
>   merge-ort: propagate callback errors from traverse_trees_wrapper()
>   merge-ort: drop unnecessary show_all_errors from collect_merge_info()
>   merge-ort: free diff pairs queue in clear_or_reinit_internal_opts()
>   merge-ort: abort merge when trees have duplicate entries
>   cache-tree: fix verify_cache() to catch non-adjacent D/F conflicts

This is a fix to an important corner of our system, but somehow left
in "Needs review" state for much longer than I would have liked, so
even though I am officially on vacation ;-), I took some time to
read these through (by the way it was a pleasant read, thank you).

I wonder if we create a rule like

    Those of you who have more than 30 commits in our project are
    expected to review one topic (or more) from other contributors
    for every three patches you send and ask for reviews by others.

it would help balance the patch vs review ratio, perhaps?

Comment thread cache-tree.c
@@ -192,22 +192,62 @@ static int verify_cache(struct index_state *istate, int flags)
for (i = 0; i + 1 < istate->cache_nr; i++) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Junio C Hamano wrote on the Git mailing list (how to reply to this email):

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> I could not find any caller in current git that both allows the index to
> get into this state and then tries to write it out without doing other
> checks beyond the verify_cache() call in cache_tree_update(), but
> verify_cache() is documented as a safety net for preventing corrupt
> trees and should actually provide that guarantee.

Oh, absolutely.  This kind of tightening is very much appreciated.

> diff --git a/cache-tree.c b/cache-tree.c
> index 7881b42aa2..f11844fe72 100644
> --- a/cache-tree.c
> +++ b/cache-tree.c
> @@ -192,22 +192,62 @@ static int verify_cache(struct index_state *istate, int flags)
>  	for (i = 0; i + 1 < istate->cache_nr; i++) {
>  		/* path/file always comes after path because of the way
>  		 * the cache is sorted.  Also path can appear only once,
> -		 * which means conflicting one would immediately follow.
> +		 * so path/file is likely the immediately following path
> +		 * but might be separated if there is e.g. a
> +		 * path-internal/... file.
>  		 */
>  		const struct cache_entry *this_ce = istate->cache[i];
>  		const struct cache_entry *next_ce = istate->cache[i + 1];
>  		const char *this_name = this_ce->name;
>  		const char *next_name = next_ce->name;
>  		int this_len = ce_namelen(this_ce);
> +		const char *conflict_name = NULL;
> +
>  		if (this_len < ce_namelen(next_ce) &&
> -		    next_name[this_len] == '/' &&
> +		    next_name[this_len] <= '/' &&
>  		    strncmp(this_name, next_name, this_len) == 0) {
> +			if (next_name[this_len] == '/') {
> +				conflict_name = next_name;
> +			} else if (next_name[this_len] < '/') {
> +				/*
> +				 * The immediately next entry shares our
> +				 * prefix but sorts before "path/" (e.g.,
> +				 * "path-internal" between "path" and
> +				 * "path/file", since '-' (0x2D) < '/'
> +				 * (0x2F)).  Binary search to find where
> +				 * "path/" would be and check for a D/F
> +				 * conflict there.
> +				 */
> +				struct cache_entry *other;
> +				struct strbuf probe = STRBUF_INIT;
> +				int pos;
> +
> +				strbuf_add(&probe, this_name, this_len);
> +				strbuf_addch(&probe, '/');
> +				pos = index_name_pos_sparse(istate,
> +							    probe.buf,
> +							    probe.len);
> +				strbuf_release(&probe);
> +
> +				if (pos < 0)
> +					pos = -pos - 1;
> +				if (pos >= (int)istate->cache_nr)
> +					continue;
> +				other = istate->cache[pos];
> +				if (ce_namelen(other) > this_len &&
> +				    other->name[this_len] == '/' &&
> +				    !strncmp(this_name, other->name, this_len))
> +					conflict_name = other->name;
> +			}
> +		}

The narrow and tall comment block is a sign that this loop is
getting too deeply nested.  I wonder if it makes it easier to follow
if we extract this new logic into a small helper function on its
own?

What the code checks and how it does so both make sense to me, though.

Thanks.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented Jun 1, 2026

Patrick Steinhardt wrote on the Git mailing list (how to reply to this email):

On Mon, Jun 01, 2026 at 09:33:10PM +0900, Junio C Hamano wrote:
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
> > We had some corrupt trees with duplicate entries in real world repositories,
> > which triggered an assertion failure in merge-ort. Further, the corrupt tree
> > creation in the third party tool would have been avoided had verify_cache()
> > correctly checked for D/F conflicts. Provide fixes for both issues,
> > including 3 preparatory changes for the merge-ort fix.
> >
> > Elijah Newren (5):
> >   merge-ort: propagate callback errors from traverse_trees_wrapper()
> >   merge-ort: drop unnecessary show_all_errors from collect_merge_info()
> >   merge-ort: free diff pairs queue in clear_or_reinit_internal_opts()
> >   merge-ort: abort merge when trees have duplicate entries
> >   cache-tree: fix verify_cache() to catch non-adjacent D/F conflicts
> 
> This is a fix to an important corner of our system, but somehow left
> in "Needs review" state for much longer than I would have liked, so
> even though I am officially on vacation ;-), I took some time to
> read these through (by the way it was a pleasant read, thank you).

Honestly, I always shy away from the merge-related subsystems. It has a
lot of subtleties that I don't have any experience with, so I never
really consider my input to be helpful here.

> I wonder if we create a rule like
> 
>     Those of you who have more than 30 commits in our project are
>     expected to review one topic (or more) from other contributors
>     for every three patches you send and ask for reviews by others.

Heh, that would make me condense patch series into fewer patches ;)

> it would help balance the patch vs review ratio, perhaps?

It's a good question. I typically try to aim for reviewing series on the
mailing list at least every second day, and I always encourage other
folks in my team to do the same. But recently I (well, rather we)
haven't really been able to due to the current situation at GitLab,
which forces us to put almost all of our focus towards a different
project for a while.

Overall I agree that everyone who is a core contributor should also make
reviews part of their regular worflow. At least for corporate
contributors that might also make it easier to communicate this to their
respective employers. Regardless of that, my expectation is that there
will be times where it works well, and other times where it works less
well.

Patrick

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented Jun 1, 2026

User Patrick Steinhardt <ps@pks.im> has been added to the cc: list.

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented Jun 2, 2026

This patch series was integrated into seen via git@db3b688.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant