Duplicate entry hardening#2096
Conversation
traverse_trees_wrapper() saves entries from a first pass through traverse_trees() and then replays them through the real callback (collect_merge_info_callback). However, the replay loop silently discards the callback return value. This means any error reported by the callback during replay -- including a future check for malformed trees -- would be ignored, allowing the merge to proceed with corrupt state. Capture the return value, stop the loop on negative (error) returns, and propagate the error to the caller. Note that the callback returns a positive mask value on success, so we normalize non-negative returns to 0 for the caller. Signed-off-by: Elijah Newren <newren@gmail.com>
collect_merge_info() has set info.show_all_errors = 1 since d2bc199 (merge-ort: implement a very basic collect_merge_info(), 2020-12-13). This setting was copied from unpack-trees.c where it controls batching of error messages for porcelain display, but merge-ort has no such error-batching logic and never needed it. With show_all_errors set, traverse_trees() captures a negative callback return but continues processing remaining entries rather than stopping immediately. Removing the setting restores the default behavior where a negative return from collect_merge_info_callback() breaks out of the traversal loop right away, allowing a future commit to exit early when a corrupt tree is detected. Signed-off-by: Elijah Newren <newren@gmail.com>
clear_or_reinit_internal_opts() is responsible for cleaning up the various data structures in merge_options_internal. It already handles many renames-related structures (dirs_removed, dir_renames, relevant_sources, cached_pairs, deferred, etc.) but does not free renames->pairs[].queue. In the normal code path, resolve_and_process_renames() frees pairs[s].queue and reinitializes it with diff_queue_init() before clear_or_reinit_internal_opts() runs, so the omission is harmless. However, if collect_merge_info() encounters an error and returns early (before resolve_and_process_renames() is ever called), any diff pairs already queued by collect_rename_info()/add_pair() will have their backing array leaked. Fix this by freeing renames->pairs[].queue in the cleanup function. In the normal path the pointer is already NULL (from the earlier diff_queue_init() in resolve_and_process_renames()), so free(NULL) is a safe no-op. Signed-off-by: Elijah Newren <newren@gmail.com>
Trees with duplicate entries are malformed; fsck reports "contains duplicate file entries" for them. merge-ort has from the beginning assumed that we would never hit such trees. It was written with the assumption that traverse_trees() calls collect_merge_info_callback() at most once per path. The "sanity checks" in that callback (added in d2bc199 (merge-ort: implement a very basic collect_merge_info(), 2020-12-13)) verify properties of each individual call but not that invariant. The strmap_put() in setup_path_info() silently overwrites the entry from any prior call for the same path, because it assumed there would be no other path. Unfortunately, supplemental data structures for various optimizations could still be tweaked before the extra paths were overwritten, and those data structures not matching expected state could trip various assertions. Change the return type of setup_path_info() from void to int to allow us to detect this case, and abort the merge with a clear error message when it occurs. Signed-off-by: Elijah Newren <newren@gmail.com>
verify_cache() checks that the index does not contain both "path" and
"path/file" before writing a tree. It does this by comparing only
adjacent entries, relying on the assumption that "path/file" would
immediately follow "path" in sorted order. Unfortunately, this
assumption does not always hold. For example:
docs <-- submodule entry
docs-internal/README.md <-- intervening entry
docs/requirements.txt <-- D/F conflict, NOT adjacent to "docs"
When this happens, verify_cache() silently misses the D/F conflict and
write-tree produces a corrupt tree object containing duplicate entries
(one for the submodule "docs" and one for the tree "docs").
I could not find any caller in current git that both allows the index to
get into this state and then tries to write it out without doing other
checks beyond the verify_cache() call in cache_tree_update(), but
verify_cache() is documented as a safety net for preventing corrupt
trees and should actually provide that guarantee. A downstream consumer
that relied solely on cache_tree_update()'s internal checking via
verify_cache() to prevent duplicate tree entries was bitten by the gap.
Add a test that constructs a corrupt index directly (bypassing the D/F
checks in add_index_entry) and verifies that write-tree now rejects it.
Signed-off-by: Elijah Newren <newren@gmail.com>
|
/submit |
|
Submitted as pull.2096.git.1776731171.gitgitgadget@gmail.com To fetch this version into To fetch this version to local tag |
|
This branch is now known as |
|
This patch series was integrated into seen via git@5ceb988. |
|
This patch series was integrated into seen via git@b2979e0. |
|
This patch series was integrated into seen via git@5a8e1e0. |
|
There was a status update in the "Cooking" section about the branch "ort" merge backend handles merging corrupt trees better by aborting when it should. Needs review. source: <pull.2096.git.1776731171.gitgitgadget@gmail.com> |
|
This patch series was integrated into seen via git@62d334b. |
|
This patch series was integrated into seen via git@6099d30. |
|
This patch series was integrated into seen via git@1d317d6. |
|
This patch series was integrated into seen via git@a1e3c69. |
|
There was a status update in the "Cooking" section about the branch "ort" merge backend handles merging corrupt trees better by aborting when it should. Needs review. source: <pull.2096.git.1776731171.gitgitgadget@gmail.com> |
|
This patch series was integrated into seen via git@b7807b6. |
|
This patch series was integrated into seen via git@8cb15cc. |
|
This patch series was integrated into seen via git@a090bdb. |
|
There was a status update in the "Cooking" section about the branch "ort" merge backend handles merging corrupt trees better by aborting when it should. Needs review. source: <pull.2096.git.1776731171.gitgitgadget@gmail.com> |
|
This patch series was integrated into seen via git@64f0aa6. |
|
This patch series was integrated into seen via git@1300a5a. |
|
This patch series was integrated into seen via git@b3f3685. |
|
This patch series was integrated into seen via git@fc35baf. |
|
There was a status update in the "Cooking" section about the branch "ort" merge backend handles merging corrupt trees better by aborting when it should. Needs review. source: <pull.2096.git.1776731171.gitgitgadget@gmail.com> |
|
This patch series was integrated into seen via git@5edc804. |
|
This patch series was integrated into seen via git@a510b62. |
|
There was a status update in the "Cooking" section about the branch "ort" merge backend handles merging corrupt trees better by aborting when it should. Needs review. source: <pull.2096.git.1776731171.gitgitgadget@gmail.com> |
|
This patch series was integrated into seen via git@3924c8d. |
|
This patch series was integrated into seen via git@4944388. |
|
This patch series was integrated into seen via git@c7d9b18. |
|
This patch series was integrated into seen via git@798f6ed. |
|
There was a status update in the "Cooking" section about the branch "ort" merge backend handles merging corrupt trees better by aborting when it should. Needs review. source: <pull.2096.git.1776731171.gitgitgadget@gmail.com> |
|
This patch series was integrated into seen via git@87efa79. |
|
This patch series was integrated into seen via git@4333158. |
|
This patch series was integrated into seen via git@e4a467a. |
|
There was a status update in the "Cooking" section about the branch "ort" merge backend handles merging corrupt trees better by aborting when it should. Needs review. source: <pull.2096.git.1776731171.gitgitgadget@gmail.com> |
|
This patch series was integrated into seen via git@167edf0. |
|
There was a status update in the "Cooking" section about the branch "ort" merge backend handles merging corrupt trees better by aborting when it should. Needs review. source: <pull.2096.git.1776731171.gitgitgadget@gmail.com> |
|
This patch series was integrated into seen via git@0e818b5. |
|
This patch series was integrated into seen via git@817b54e. |
|
This patch series was integrated into seen via git@bd6e6ee. |
| @@ -1008,18 +1008,20 @@ static int traverse_trees_wrapper(struct index_state *istate, | |||
| info->traverse_path = renames->callback_data_traverse_path; | |||
There was a problem hiding this comment.
Junio C Hamano wrote on the Git mailing list (how to reply to this email):
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Elijah Newren <newren@gmail.com>
>
> traverse_trees_wrapper() saves entries from a first pass through
> traverse_trees() and then replays them through the real callback
> (collect_merge_info_callback). However, the replay loop silently
> discards the callback return value. This means any error reported by
> the callback during replay -- including a future check for malformed
> trees -- would be ignored, allowing the merge to proceed with corrupt
> state.
>
> Capture the return value, stop the loop on negative (error) returns,
> and propagate the error to the caller. Note that the callback returns
> a positive mask value on success, so we normalize non-negative returns
> to 0 for the caller.
All makes perfect sense.
How would the externally visible behaviour change at this step?
Upon an error from the callback, we used to keep going and processed
other callback data in the renames structure. We now leave the rest
unprocessed.
The caller of this helper would never have seen a failure, but now
they will. Both callers, collect_merge_info_callback() and
handle_deferred_entries(), are reacting to a negative "error" return
well (perhaps because they sometimes call traverse_trees() in the
same control flow, which does return an error already), so
presumably there is no downside caused by aborting the innermost
process upon the first error return.
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
> merge-ort.c | 14 ++++++++------
> 1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/merge-ort.c b/merge-ort.c
> index 00923ce3cd..4b8e32209d 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -1008,18 +1008,20 @@ static int traverse_trees_wrapper(struct index_state *istate,
> info->traverse_path = renames->callback_data_traverse_path;
> info->fn = old_fn;
> for (i = old_offset; i < renames->callback_data_nr; ++i) {
> - info->fn(n,
> - renames->callback_data[i].mask,
> - renames->callback_data[i].dirmask,
> - renames->callback_data[i].names,
> - info);
> + ret = info->fn(n,
> + renames->callback_data[i].mask,
> + renames->callback_data[i].dirmask,
> + renames->callback_data[i].names,
> + info);
> + if (ret < 0)
> + break;
> }
>
> renames->callback_data_nr = old_offset;
> free(renames->callback_data_traverse_path);
> renames->callback_data_traverse_path = old_callback_data_traverse_path;
> info->traverse_path = NULL;
> - return 0;
> + return ret < 0 ? ret : 0;
> }
>
> static void setup_path_info(struct merge_options *opt,| @@ -1738,7 +1740,6 @@ static int collect_merge_info(struct merge_options *opt, | |||
| setup_traverse_info(&info, opt->priv->toplevel_dir); | |||
There was a problem hiding this comment.
Junio C Hamano wrote on the Git mailing list (how to reply to this email):
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Elijah Newren <newren@gmail.com>
>
> collect_merge_info() has set info.show_all_errors = 1 since
> d2bc1994f363 (merge-ort: implement a very basic collect_merge_info(),
> 2020-12-13). This setting was copied from unpack-trees.c where it
> controls batching of error messages for porcelain display, but
> merge-ort has no such error-batching logic and never needed it.
>
> With show_all_errors set, traverse_trees() captures a negative callback
> return but continues processing remaining entries rather than stopping
> immediately. Removing the setting restores the default behavior where
> a negative return from collect_merge_info_callback() breaks out of the
> traversal loop right away, allowing a future commit to exit early when
> a corrupt tree is detected.
Nice spotting. As the error handling eventually is to die without
making any further damange, returning early without seeing "more
errors" is a good change.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
> merge-ort.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/merge-ort.c b/merge-ort.c
> index 4b8e32209d..74e9636020 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -1740,7 +1740,6 @@ static int collect_merge_info(struct merge_options *opt,
> setup_traverse_info(&info, opt->priv->toplevel_dir);
> info.fn = collect_merge_info_callback;
> info.data = opt;
> - info.show_all_errors = 1;
>
> if (repo_parse_tree(opt->repo, merge_base) < 0 ||
> repo_parse_tree(opt->repo, side1) < 0 ||| renames->callback_data_traverse_path = old_callback_data_traverse_path; | ||
| info->traverse_path = NULL; | ||
| return 0; | ||
| return ret < 0 ? ret : 0; |
There was a problem hiding this comment.
Junio C Hamano wrote on the Git mailing list (how to reply to this email):
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Elijah Newren <newren@gmail.com>
>
> Trees with duplicate entries are malformed; fsck reports "contains
> duplicate file entries" for them. merge-ort has from the beginning
> assumed that we would never hit such trees. It was written with the
> assumption that traverse_trees() calls collect_merge_info_callback() at
> most once per path. The "sanity checks" in that callback (added in
> d2bc1994f363 (merge-ort: implement a very basic collect_merge_info(),
> 2020-12-13)) verify properties of each individual call but not that
> invariant. The strmap_put() in setup_path_info() silently overwrites
> the entry from any prior call for the same path, because it assumed
> there would be no other path. Unfortunately, supplemental data
> structures for various optimizations could still be tweaked before the
> extra paths were overwritten, and those data structures not matching
> expected state could trip various assertions.
>
> Change the return type of setup_path_info() from void to int to allow us
> to detect this case, and abort the merge with a clear error message when
> it occurs.
OK.
> @@ -1081,9 +1081,11 @@ static void setup_path_info(struct merge_options *opt,
> */
> mi->is_null = 1;
> }
> - strmap_put(&opt->priv->paths, fullpath, mi);
> + if (strmap_put(&opt->priv->paths, fullpath, mi))
> + return error(_("tree has duplicate entries for '%s'"), fullpath);
OK. I was wondering what _other_ kind of malformed trees would the
updated code by this change is prepared to handle (most notably,
tree entries must be sorted, and one way to detect duplicate is to
remember one single path that we saw earlier, which would work as
long as the entries are sorted). This "ah, we saw that path already"
approach is much more robust in that it does not have to depend on a
sorted tree.
Makes sense.|
Junio C Hamano wrote on the Git mailing list (how to reply to this email): "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> We had some corrupt trees with duplicate entries in real world repositories,
> which triggered an assertion failure in merge-ort. Further, the corrupt tree
> creation in the third party tool would have been avoided had verify_cache()
> correctly checked for D/F conflicts. Provide fixes for both issues,
> including 3 preparatory changes for the merge-ort fix.
>
> Elijah Newren (5):
> merge-ort: propagate callback errors from traverse_trees_wrapper()
> merge-ort: drop unnecessary show_all_errors from collect_merge_info()
> merge-ort: free diff pairs queue in clear_or_reinit_internal_opts()
> merge-ort: abort merge when trees have duplicate entries
> cache-tree: fix verify_cache() to catch non-adjacent D/F conflicts
This is a fix to an important corner of our system, but somehow left
in "Needs review" state for much longer than I would have liked, so
even though I am officially on vacation ;-), I took some time to
read these through (by the way it was a pleasant read, thank you).
I wonder if we create a rule like
Those of you who have more than 30 commits in our project are
expected to review one topic (or more) from other contributors
for every three patches you send and ask for reviews by others.
it would help balance the patch vs review ratio, perhaps?
|
| @@ -192,22 +192,62 @@ static int verify_cache(struct index_state *istate, int flags) | |||
| for (i = 0; i + 1 < istate->cache_nr; i++) { | |||
There was a problem hiding this comment.
Junio C Hamano wrote on the Git mailing list (how to reply to this email):
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> I could not find any caller in current git that both allows the index to
> get into this state and then tries to write it out without doing other
> checks beyond the verify_cache() call in cache_tree_update(), but
> verify_cache() is documented as a safety net for preventing corrupt
> trees and should actually provide that guarantee.
Oh, absolutely. This kind of tightening is very much appreciated.
> diff --git a/cache-tree.c b/cache-tree.c
> index 7881b42aa2..f11844fe72 100644
> --- a/cache-tree.c
> +++ b/cache-tree.c
> @@ -192,22 +192,62 @@ static int verify_cache(struct index_state *istate, int flags)
> for (i = 0; i + 1 < istate->cache_nr; i++) {
> /* path/file always comes after path because of the way
> * the cache is sorted. Also path can appear only once,
> - * which means conflicting one would immediately follow.
> + * so path/file is likely the immediately following path
> + * but might be separated if there is e.g. a
> + * path-internal/... file.
> */
> const struct cache_entry *this_ce = istate->cache[i];
> const struct cache_entry *next_ce = istate->cache[i + 1];
> const char *this_name = this_ce->name;
> const char *next_name = next_ce->name;
> int this_len = ce_namelen(this_ce);
> + const char *conflict_name = NULL;
> +
> if (this_len < ce_namelen(next_ce) &&
> - next_name[this_len] == '/' &&
> + next_name[this_len] <= '/' &&
> strncmp(this_name, next_name, this_len) == 0) {
> + if (next_name[this_len] == '/') {
> + conflict_name = next_name;
> + } else if (next_name[this_len] < '/') {
> + /*
> + * The immediately next entry shares our
> + * prefix but sorts before "path/" (e.g.,
> + * "path-internal" between "path" and
> + * "path/file", since '-' (0x2D) < '/'
> + * (0x2F)). Binary search to find where
> + * "path/" would be and check for a D/F
> + * conflict there.
> + */
> + struct cache_entry *other;
> + struct strbuf probe = STRBUF_INIT;
> + int pos;
> +
> + strbuf_add(&probe, this_name, this_len);
> + strbuf_addch(&probe, '/');
> + pos = index_name_pos_sparse(istate,
> + probe.buf,
> + probe.len);
> + strbuf_release(&probe);
> +
> + if (pos < 0)
> + pos = -pos - 1;
> + if (pos >= (int)istate->cache_nr)
> + continue;
> + other = istate->cache[pos];
> + if (ce_namelen(other) > this_len &&
> + other->name[this_len] == '/' &&
> + !strncmp(this_name, other->name, this_len))
> + conflict_name = other->name;
> + }
> + }
The narrow and tall comment block is a sign that this loop is
getting too deeply nested. I wonder if it makes it easier to follow
if we extract this new logic into a small helper function on its
own?
What the code checks and how it does so both make sense to me, though.
Thanks.|
Patrick Steinhardt wrote on the Git mailing list (how to reply to this email): On Mon, Jun 01, 2026 at 09:33:10PM +0900, Junio C Hamano wrote:
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > We had some corrupt trees with duplicate entries in real world repositories,
> > which triggered an assertion failure in merge-ort. Further, the corrupt tree
> > creation in the third party tool would have been avoided had verify_cache()
> > correctly checked for D/F conflicts. Provide fixes for both issues,
> > including 3 preparatory changes for the merge-ort fix.
> >
> > Elijah Newren (5):
> > merge-ort: propagate callback errors from traverse_trees_wrapper()
> > merge-ort: drop unnecessary show_all_errors from collect_merge_info()
> > merge-ort: free diff pairs queue in clear_or_reinit_internal_opts()
> > merge-ort: abort merge when trees have duplicate entries
> > cache-tree: fix verify_cache() to catch non-adjacent D/F conflicts
>
> This is a fix to an important corner of our system, but somehow left
> in "Needs review" state for much longer than I would have liked, so
> even though I am officially on vacation ;-), I took some time to
> read these through (by the way it was a pleasant read, thank you).
Honestly, I always shy away from the merge-related subsystems. It has a
lot of subtleties that I don't have any experience with, so I never
really consider my input to be helpful here.
> I wonder if we create a rule like
>
> Those of you who have more than 30 commits in our project are
> expected to review one topic (or more) from other contributors
> for every three patches you send and ask for reviews by others.
Heh, that would make me condense patch series into fewer patches ;)
> it would help balance the patch vs review ratio, perhaps?
It's a good question. I typically try to aim for reviewing series on the
mailing list at least every second day, and I always encourage other
folks in my team to do the same. But recently I (well, rather we)
haven't really been able to due to the current situation at GitLab,
which forces us to put almost all of our focus towards a different
project for a while.
Overall I agree that everyone who is a core contributor should also make
reviews part of their regular worflow. At least for corporate
contributors that might also make it easier to communicate this to their
respective employers. Regardless of that, my expectation is that there
will be times where it works well, and other times where it works less
well.
Patrick |
|
User |
|
This patch series was integrated into seen via git@db3b688. |
We had some corrupt trees with duplicate entries in real world repositories, which triggered an assertion failure in merge-ort. Further, the corrupt tree creation in the third party tool would have been avoided had verify_cache() correctly checked for D/F conflicts. Provide fixes for both issues, including 3 preparatory changes for the merge-ort fix.
cc: Patrick Steinhardt ps@pks.im