git-commit-vandalism

Author	SHA1	Message	Date
Ævar Arnfjörð Bjarmason	6269f8eaad	treewide: always have a valid "index_state.repo" member When the "repo" member was added to "the_index" in [1] the repo_read_index() was made to populate it, but the unpopulated "the_index" variable didn't get the same treatment. Let's do that in initialize_the_repository() when we set it up, and likewise for all of the current callers initialized an empty "struct index_state". This simplifies code that needs to deal with "the_index" or a custom "struct index_state", we no longer need to second-guess this part of the "index_state" deep in the stack. A recent example of such second-guessing is the "istate->repo ? istate->repo : the_repository" code in [2]. We can now simply use "istate->repo". We're doing this by making use of the INDEX_STATE_INIT() macro (and corresponding function) added in [3], which now have mandatory "repo" arguments. Because we now call index_state_init() in repository.c's initialize_the_repository() we don't need to handle the case where we have a "repo->index" whose "repo" member doesn't match the "repo" we're setting up, i.e. the "Complete the double-reference" code in repo_read_index() being altered here. That logic was originally added in [1], and was working around the lack of what we now have in initialize_the_repository(). For "fsmonitor-settings.c" we can remove the initialization of a NULL "r" argument to "the_repository". This was added back in [4], and was needed at the time for callers that would pass us the "r" from an "istate->repo". Before this change such a change to "fsmonitor-settings.c" would segfault all over the test suite (e.g. in t0002-gitfile.sh). This change has wider eventual implications for "fsmonitor-settings.c". The reason the other lazy loading behavior in it is required (starting with "if (!r->settings.fsmonitor) ..." is because of the previously passed "r" being "NULL". I have other local changes on top of this which move its configuration reading to "prepare_repo_settings()" in "repo-settings.c", as we could now start to rely on it being called for our "r". But let's leave all of that for now, and narrowly remove this particular part of the lazy-loading. 1. `1fd9ae517c` (repository: add repo reference to index_state, 2021-01-23) 2. `ee1f0c242e` (read-cache: add index.skipHash config option, 2023-01-06) 3. `2f6b1eb794` (cache API: add a "INDEX_STATE_INIT" macro/function, add release_index(), 2023-01-12) 4. `1e0ea5c431` (fsmonitor: config settings are repository-specific, 2022-03-25) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-01-17 14:32:06 -08:00
Ævar Arnfjörð Bjarmason	2f6b1eb794	cache API: add a "INDEX_STATE_INIT" macro/function, add release_index() Hopefully in some not so distant future, we'll get advantages from always initializing the "repo" member of the "struct index_state". To make that easier let's introduce an initialization macro & function. The various ad-hoc initialization of the structure can then be changed over to it, and we can remove the various "0" assignments in discard_index() in favor of calling index_state_init() at the end. While not strictly necessary, let's also change the CALLOC_ARRAY() of various "struct index_state *" to use an ALLOC_ARRAY() followed by index_state_init() instead. We're then adding the release_index() function and converting some callers (including some of these allocations) over to it if they either won't need to use their "struct index_state" again, or are just about to call index_state_init(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-01-16 10:46:58 -08:00
Kyle Meyer	42db324c0f	merge-recursive: fix variable typo in error message Signed-off-by: Kyle Meyer <kyle@kyleam.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-11-27 10:26:10 +09:00
Ævar Arnfjörð Bjarmason	5cf88fd8b0	git-compat-util.h: use "UNUSED", not "UNUSED(var)" As reported in [1] the "UNUSED(var)" macro introduced in 2174b8c75de (Merge branch 'jk/unused-annotation' into next, 2022-08-24) breaks coccinelle's parsing of our sources in files where it occurs. Let's instead partially go with the approach suggested in [2] of making this not take an argument. As noted in [1] "coccinelle" will ignore such tokens in argument lists that it doesn't know about, and it's less of a surprise to syntax highlighters. This undoes the "help us notice when a parameter marked as unused is actually use" part of `9b24034754` (git-compat-util: add UNUSED macro, 2022-08-19), a subsequent commit will further tweak the macro to implement a replacement for that functionality. 1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/ 2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:49:48 -07:00
Jeff King	555ff1c8a4	mark unused read_tree_recursive() callback parameters We pass a callback to read_tree_recursive(), but not every callback needs every parameter. Let's mark the unused ones to satisfy -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 12:18:56 -07:00
Jeff King	02c3c59e62	hashmap: mark unused callback parameters Hashmap comparison functions must conform to a particular callback interface, but many don't use all of their parameters. Especially the void cmp_data pointer, but some do not use keydata either (because they can easily form a full struct to pass when doing lookups). Let's mark these to make -Wunused-parameter happy. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 12:18:55 -07:00
Junio C Hamano	2da81d1efb	Merge branch 'ab/plug-leak-in-revisions' Plug the memory leaks from the trickiest API of all, the revision walker. * ab/plug-leak-in-revisions: (27 commits) revisions API: add a TODO for diff_free(&revs->diffopt) revisions API: have release_revisions() release "topo_walk_info" revisions API: have release_revisions() release "date_mode" revisions API: call diff_free(&revs->pruning) in revisions_release() revisions API: release "reflog_info" in release revisions() revisions API: clear "boundary_commits" in release_revisions() revisions API: have release_revisions() release "prune_data" revisions API: have release_revisions() release "grep_filter" revisions API: have release_revisions() release "filter" revisions API: have release_revisions() release "cmdline" revisions API: have release_revisions() release "mailmap" revisions API: have release_revisions() release "commits" revisions API users: use release_revisions() for "prune_data" users revisions API users: use release_revisions() with UNLEAK() revisions API users: use release_revisions() in builtin/log.c revisions API users: use release_revisions() in http-push.c revisions API users: add "goto cleanup" for release_revisions() stash: always have the owner of "stash_info" free it revisions API users: use release_revisions() needing REV_INFO_INIT revision.[ch]: document and move code declared around "init" ...	2022-06-07 14:10:56 -07:00
Junio C Hamano	538dc459a0	Merge branch 'ep/maint-equals-null-cocci' Introduce and apply coccinelle rule to discourage an explicit comparison between a pointer and NULL, and applies the clean-up to the maintenance track. * ep/maint-equals-null-cocci: tree-wide: apply equals-null.cocci tree-wide: apply equals-null.cocci contrib/coccinnelle: add equals-null.cocci	2022-05-20 15:26:59 -07:00
Junio C Hamano	afe8a9070b	tree-wide: apply equals-null.cocci Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-02 09:50:37 -07:00
Ævar Arnfjörð Bjarmason	2108fe4a19	revisions API users: add straightforward release_revisions() Add a release_revisions() to various users of "struct rev_list" in those straightforward cases where we only need to add the release_revisions() call to the end of a block, and don't need to e.g. refactor anything to use a "goto cleanup" pattern. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-13 23:56:08 -07:00
Ævar Arnfjörð Bjarmason	f260505142	string_list API users: use string_list_init_{no,}dup Follow-up on the introduction of string_list_init_nodup() and string_list_init_dup() in the series merged in `bd4232fac3` (Merge branch 'ab/struct-init', 2021-07-16) and convert code that implicitly relied on xcalloc() being equivalent to the initializer to use xmalloc() and string_list_init_{no,}dup() instead. In the case of get_unmerged() in merge-recursive.c we used the combination of xcalloc() and assigning "1" to "strdup_strings" to get what we'd get via string_list_init_dup(), let's use that instead. Adjacent code in cmd_format_patch() will be changed in a subsequent commit, since we're changing that let's change the other in-tree patterns that do the same. Let's also convert a "x == NULL" to "!x" per our CodingGuidelines, as we need to change the "if" line anyway. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-13 23:56:08 -07:00
Junio C Hamano	430883a70c	Merge branch 'ab/object-file-api-updates' Object-file API shuffling. * ab/object-file-api-updates: object-file API: pass an enum to read_object_with_reference() object-file.c: add a literal version of write_object_file_prepare() object-file API: have hash_object_file() take "enum object_type" object API: rename hash_object_file_literally() to write_() object-file API: split up and simplify check_object_signature() object API users + docs: check <0, not !0 with check_object_signature() object API docs: move check_object_signature() docs to cache.h object API: correct "buf" v.s. "map" mismatch in .c and *.h object-file API: have write_object_file() take "enum object_type" object-file API: add a format_object_header() function object-file API: return "void", not "int" from hash_object_file() object-file.c: split up declaration of unrelated variables	2022-03-16 17:53:08 -07:00
Ævar Arnfjörð Bjarmason	c80d226a04	object-file API: have write_object_file() take "enum object_type" Change the write_object_file() function to take an "enum object_type" instead of a "const char type". Its callers either passed {commit,tree,blob,tag}_type and can pass the corresponding OBJ_ type instead, or were hardcoding strings like "blob". This avoids the back & forth fragility where the callers of write_object_file() would have the enum type, and convert it themselves via type_name(). We do have to now do that conversion ourselves before calling write_object_file_prepare(), but those codepaths will be similarly adjusted in subsequent commits. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 17:16:31 -08:00
Elijah Newren	6054d1aac3	merge-ort: format messages slightly different for use in headers When users run git show --remerge-diff $MERGE_COMMIT or git log -p --remerge-diff ... stdout is not an appropriate location to dump conflict messages, but we do want to provide them to users. We will include them in the diff headers instead...but for that to work, we need for any multiline messages to replace newlines with both a newline and a space. Add a new flag to signal when we want these messages modified in such a fashion, and use it in path_msg() to modify these messages this way. Also, allow a special prefix to be specified for these headers. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-02 10:02:27 -08:00
Elijah Newren	35f6967161	ll-merge: make callers responsible for showing warnings Since some callers may want to send warning messages to somewhere other than stdout/stderr, stop printing "warning: Cannot merge binary files" from ll-merge and instead modify the return status of ll_merge() to indicate when a merge of binary files has occurred. Message printing probably does not belong in a "low-level merge" anyway. This commit continues printing the message as-is, just from the callers instead of within ll_merge(). Future changes will start handling the message differently in the merge-ort codepath. There was one special case here: the callers in rerere.c do NOT check for and print such a message; since those code paths explicitly skip over binary files, there is no reason to check for a return status of LL_MERGE_BINARY_CONFLICT or print the related message. Note that my methodology included first modifying ll_merge() to return a struct, so that the compiler would catch all the callers for me and ensure I had modified all of them. After modifying all of them, I then changed the struct to an enum. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-02 10:02:27 -08:00
Junio C Hamano	162a13b855	Merge branch 'jt/no-abuse-alternate-odb-for-submodules' Follow through the work to use the repo interface to access submodule objects in-process, instead of abusing the alternate object database interface. * jt/no-abuse-alternate-odb-for-submodules: submodule: trace adding submodule ODB as alternate submodule: pass repo to check_has_commit() object-file: only register submodule ODB if needed merge-{ort,recursive}: remove add_submodule_odb() refs: peeling non-the_repository iterators is BUG refs: teach arbitrary repo support to iterators refs: plumb repo into ref stores	2021-10-25 16:06:56 -07:00
Junio C Hamano	a7c2daa06d	Merge branch 'en/removing-untracked-fixes' Various fixes in code paths that move untracked files away to make room. * en/removing-untracked-fixes: Documentation: call out commands that nuke untracked files/directories Comment important codepaths regarding nuking untracked files/dirs unpack-trees: avoid nuking untracked dir in way of locally deleted file unpack-trees: avoid nuking untracked dir in way of unmerged file Change unpack_trees' 'reset' flag into an enum Remove ignored files by default when they are in the way unpack-trees: make dir an internal-only struct unpack-trees: introduce preserve_ignored to unpack_trees_options read-tree, merge-recursive: overwrite ignored files by default checkout, read-tree: fix leak of unpack_trees_options.dir t2500: add various tests for nuking untracked files	2021-10-13 15:15:57 -07:00
Jonathan Tan	155b517d5c	merge-{ort,recursive}: remove add_submodule_odb() After the parent commit and some of its ancestors, the only place commits are being accessed through alternates is in the user-facing message formatting code. Fix those, and remove the add_submodule_odb() calls. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-08 15:06:06 -07:00
Junio C Hamano	921c795c25	Merge branch 'jt/add-submodule-odb-clean-up' More code paths that use the hack to add submodule's object database to the set of alternate object store have been cleaned up. * jt/add-submodule-odb-clean-up: revision: remove "submodule" from opt struct repository: support unabsorbed in repo_submodule_init submodule: remove unnecessary unabsorbed fallback	2021-10-06 13:40:11 -07:00
Elijah Newren	04988c8d18	unpack-trees: introduce preserve_ignored to unpack_trees_options Currently, every caller of unpack_trees() that wants to ensure ignored files are overwritten by default needs to: * allocate unpack_trees_options.dir * flip the DIR_SHOW_IGNORED flag in unpack_trees_options.dir->flags * call setup_standard_excludes AND then after the call to unpack_trees() needs to * call dir_clear() * deallocate unpack_trees_options.dir That's a fair amount of boilerplate, and every caller uses identical code. Make this easier by instead introducing a new boolean value where the default value (0) does what we want so that new callers of unpack_trees() automatically get the appropriate behavior. And move all the handling of unpack_trees_options.dir into unpack_trees() itself. While preserve_ignored = 0 is the behavior we feel is the appropriate default, we defer fixing commands to use the appropriate default until a later commit. So, this commit introduces several locations where we manually set preserve_ignored=1. This makes it clear where code paths were previously preserving ignored files when they should not have been; a future commit will flip these to instead use a value of 0 to get the behavior we want. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 13:38:37 -07:00
Elijah Newren	491a7575f1	read-tree, merge-recursive: overwrite ignored files by default This fixes a long-standing patchwork of ignored files handling in read-tree and merge-recursive, called out and suggested by Junio long ago. Quoting from commit `dcf0c16ef1` ("core.excludesfile clean-up" 2007-11-16): git-read-tree takes --exclude-per-directory=<gitignore>, not because the flexibility was needed. Again, this was because the option predates the standardization of the ignore files. ... On the other hand, I think it makes perfect sense to fix git-read-tree, git-merge-recursive and git-clean to follow the same rule as other commands. I do not think of a valid use case to give an exclude-per-directory that is nonstandard to read-tree command, outside a "negative" test in the t1004 test script. This patch is the first step to untangle this mess. The next step would be to teach read-tree, merge-recursive and clean (in C) to use setup_standard_excludes(). History shows each of these were partially or fully fixed: * clean was taught the new trick in `1617adc7a0` ("Teach git clean to use setup_standard_excludes()", 2007-11-14). * read-tree was primarily used by checkout & merge scripts. checkout and merge later became builtins and were both fixed to use the new setup_standard_excludes() handling in `fc001b526c` ("checkout,merge: loosen overwriting untracked file check based on info/exclude", 2011-11-27). So the primary users were fixed, though read-tree itself was not. * merge-recursive has now been replaced as the default merge backend by merge-ort. merge-ort fixed this by using setup_standard_excludes() starting early in its implementation; see commit `6681ce5cf6` ("merge-ort: add implementation of checkout()", 2020-12-13), largely due to its design depending on checkout() and thus being influenced by the checkout code. However, merge-recursive itself was not fixed here, in part because its design meant it had difficulty differentiating between untracked files, ignored files, leftover tracked files that haven't been removed yet due to order of processing files, and files written by itself due to collisions). Make the conversion more complete by now handling read-tree and handling at least the unpack_trees() portion of merge-recursive. While merge-recursive is on its way out, fixing the unpack_trees() portion is easy and facilitates some of the later changes in this series. Note that fixing read-tree makes the --exclude-per-directory option to read-tree useless, so we remove it from the documentation (though we continue to accept it if passed). The read-tree changes happen to fix a bug in t1013. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 13:38:37 -07:00
Junio C Hamano	6295f87b5f	Merge branch 'jt/add-submodule-odb-clean-up' into jt/no-abuse-alternate-odb-for-submodules * jt/add-submodule-odb-clean-up: revision: remove "submodule" from opt struct repository: support unabsorbed in repo_submodule_init submodule: remove unnecessary unabsorbed fallback	2021-09-22 17:11:09 -07:00
Junio C Hamano	a16dd13740	Merge branch 'ds/mergies-with-sparse-index' Various mergy operations have been prepared to work efficiently with the sparse index. * ds/mergies-with-sparse-index: sparse-index: integrate with cherry-pick and rebase sequencer: ensure full index if not ORT strategy t1092: add cherry-pick, rebase tests merge-ort: expand only for out-of-cone conflicts merge: make sparse-aware with ORT diff: ignore sparse paths in diffstat	2021-09-20 15:20:45 -07:00
Derrick Stolee	a33806398a	merge: make sparse-aware with ORT Allow 'git merge' to operate without expanding a sparse index, at least not immediately. The index still will be expanded in a few cases: 1. If the merge strategy is 'recursive', then we enable command_requires_full_index at the start of the merge_recursive() method. We expect sparse-index users to also have the 'ort' strategy enabled. 2. With the 'ort' strategy, if the merge results in a conflicted file, then we expand the index before updating the working tree. The loop that iterates over the worktree replaces index entries and tracks 'origintal_cache_nr' which can become completely wrong if the index expands in the middle of the operation. This safety valve is important before that loop starts. A later change will focus this to only expand if we indeed have a conflict outside of the sparse-checkout cone. 3. Other merge strategies are executed as a 'git merge-X' subcommand, and those strategies are currently protected with the 'command_requires_full_index' guard. Some test updates are required, including a mistaken 'git checkout -b' that did not specify the base branch, causing merges to be fast-forward merges. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-09 15:49:04 -07:00
Jonathan Tan	10a0d6ae64	revision: remove "submodule" from opt struct Clean up a TODO in revision.h by removing the "submodule" field from struct setup_revision_opt. This field is only used to specify the ref store to use, so use rev_info->repo to determine the ref store instead. The only users of this field are merge-ort.c and merge-recursive.c. However, both these files specify the superproject as rev_info->repo and the submodule as setup_revision_opt->submodule. In order to be able to pass the submodule as rev_info->repo, all commits must be parsed with the submodule explicitly specified; this patch does that as well. (An incremental solution in which only some commits are parsed with explicit submodule will not work, because if the same commit is parsed twice in different repositories, there will be 2 heap-allocated object structs corresponding to that commit, and any flag set by the revision walking mechanism on one of them will not be reflected onto the other.) Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-09 14:09:30 -07:00
René Scharfe	2dee7e6105	merge-recursive: use fspathcmp() in path_hashmap_cmp() Call fspathcmp() instead of open-coding it. This shortens the code and makes it less repetitive. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-08-30 09:44:12 -07:00
René Scharfe	7431842325	use fspathhash() everywhere `cf2dc1c238` (speed up alt_odb_usable() with many alternates, 2021-07-07) introduced the function fspathhash() for calculating path hashes while respecting the configuration option core.ignorecase. Call it instead of open-coding it; the resulting code is shorter and less repetitive. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-30 12:14:27 -07:00
Junio C Hamano	268055bfde	Merge branch 'en/rename-limits-doc' Documentation on "git diff -l<n>" and diff.renameLimit have been updated, and the defaults for these limits have been raised. * en/rename-limits-doc: rename: bump limit defaults yet again diffcore-rename: treat a rename_limit of 0 as unlimited doc: clarify documentation for rename/copy limits diff: correct warning message when renameLimit exceeded	2021-07-28 13:18:03 -07:00
Junio C Hamano	dd6d3c90ee	Merge branch 'ab/attribute-format' Many "printf"-like helper functions we have have been annotated with __attribute__() to catch placeholder/parameter mismatches. * ab/attribute-format: advice.h: add missing __attribute__((format)) & fix usage .h: add a few missing __attribute__((format)) .c static functions: add missing __attribute__((format)) sequencer.c: move static function to avoid forward decl *.c static functions: don't forward-declare __attribute__	2021-07-28 13:17:59 -07:00
Junio C Hamano	bd4232fac3	Merge branch 'ab/struct-init' Code cleanup around struct_type_init() functions. * ab/struct-init: string-list.h users: change to use _{nodup,dup}() string-list.[ch]: add a string_list_init_{nodup,dup}() dir.[ch]: replace dir_init() with DIR_INIT .c _init(): define in terms of corresponding _INIT macro .h: move some _INIT to designated initializers	2021-07-16 17:42:53 -07:00
Junio C Hamano	d3b88be1b4	Merge branch 'en/merge-dir-rename-corner-case-fix' The merge code had funny interactions between content based rename detection and directory rename detection. * en/merge-dir-rename-corner-case-fix: merge-recursive: handle rename-to-self case merge-ort: ensure we consult df_conflict and path_conflicts t6423: test directory renames causing rename-to-self	2021-07-16 17:42:45 -07:00
Elijah Newren	94b82d5686	rename: bump limit defaults yet again These were last bumped in commit `92c57e5c1d` (bump rename limit defaults (again), 2011-02-19), and were bumped both because processors had gotten faster, and because people were getting ugly merges that caused problems and reporting it to the mailing list (suggesting that folks were willing to spend more time waiting). Since that time: * Linus has continued recommending kernel folks to set diff.renameLimit=0 (maps to 32767, currently) * Folks with repositories with lots of renames were happy to set merge.renameLimit above 32767, once the code supported that, to get correct cherry-picks * Processors have gotten faster * It has been discovered that the timing methodology used last time probably used too large example files. The last point is probably worth explaining a bit more: * The "average" file size used appears to have been average blob size in the linux kernel history at the time (probably v2.6.25 or something close to it). * Since bigger files are modified more frequently, such a computation weights towards larger files. * Larger files may be more likely to be modified over time, but are not more likely to be renamed -- the mean and median blob size within a tree are a bit higher than the mean and median of blob sizes in the history leading up to that version for the linux kernel. * The mean blob size in v2.6.25 was half the average blob size in history leading to that point * The median blob size in v2.6.25 was about 40% of the mean blob size in v2.6.25. * Since the mean blob size is more than double the median blob size, any file as big as the mean will not be compared to any files of median size or less (because they'd be more than 50% dissimilar). * Since it is the number of files compared that provides the O(n^2) behavior, median-sized files should matter more than mean-sized ones. The combined effect of the above is that the file size used in past calculations was likely about 5x too large. Combine that with a CPU performance improvement of ~30%, and we can increase the limits by a factor of sqrt(5/(1-.3)) = 2.67, while keeping the original stated time limits. Keeping the same approximate time limit probably makes sense for diff.renameLimit (there is no progress feedback in e.g. git log -p), but the experience above suggests merge.renameLimit could be extended significantly. In fact, it probably would make sense to have an unlimited default setting for merge.renameLimit, but that would likely need to be coupled with changes to how progress is displayed. (See https://lore.kernel.org/git/YOx+Ok%2FEYvLqRMzJ@coredump.intra.peff.net/ for details in that area.) For now, let's just bump the approximate time limit from 10s to 1m. (Note: We do not want to use actual time limits, because getting results that depend on how loaded your system is that day feels bad, and because we don't discover that we won't get all the renames until after we've put in a lot of work rather than just upfront telling the user there are too many files involved.) Using the original time limit of 2s for diff.renameLimit, and bumping merge.renameLimit from 10s to 60s, I found the following timings using the simple script at the end of this commit message (on an AWS c5.xlarge which reports as "Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz"): N Timing 1300 1.995s 7100 59.973s So let's round down to nice even numbers and bump the limits from 400->1000, and from 1000->7000. Here is the measure_rename_perf script (adapted from https://lore.kernel.org/git/20080211113516.GB6344@coredump.intra.peff.net/ in particular to avoid triggering the linear handling from basename-guided rename detection): #!/bin/bash n=$1; shift rm -rf repo mkdir repo && cd repo git init -q -b main mkdata() { mkdir $1 for i in `seq 1 $2`; do (sed "s/^/$i /" <../sample echo tag: $1 ) >$1/$i done } mkdata initial $n git add . git commit -q -m initial mkdata new $n git add . cd new for i in *; do git mv $i $i.renamed; done cd .. git rm -q -rf initial git commit -q -m new time git diff-tree -M -l0 --summary HEAD^ HEAD Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-15 16:54:34 -07:00
Ævar Arnfjörð Bjarmason	48ca53cac4	*.c static functions: add missing __attribute__((format)) Add missing __attribute__((format)) function attributes to various "static" functions that take printf arguments. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-13 15:20:20 -07:00
Ævar Arnfjörð Bjarmason	bc40dfb10a	string-list.h users: change to use *_{nodup,dup}() Change all in-tree users of the string_list_init(LIST, BOOL) API to use string_list_init_{nodup,dup}(LIST) instead. As noted in the preceding commit let's leave the now-unused string_list_init() wrapper in-place for any in-flight users, it can be removed at some later date. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-01 12:32:22 -07:00
Elijah Newren	3585d0ea23	merge-recursive: handle rename-to-self case Directory rename detection can cause transitive renames, e.g. if the two different sides of history each do one half of: A/file -> B/file B/ -> C/ then directory rename detection transitively renames to give us A/file -> C/file However, when C/ == A/, note that this gives us A/file -> A/file. merge-recursive assumed that any rename D -> E would have D != E. While that is almost always true, the above is a special case where it is not. So we cannot do things like delete the rename source, we cannot assume that a file existing at path E implies a rename/add conflict and we have to be careful about what stages end up in the output. This change feels a bit hackish. It took me surprisingly many hours to find, and given merge-recursive's design causing it to attempt to enumerate all combinations of edge and corner cases with special code for each combination, I'm worried there are other similar fixes needed elsewhere if we can just come up with the right special testcase. Perhaps an audit would rule it out, but I have not the energy. merge-recursive deserves to die, and since it is on its way out anyway, fixing this particular bug narrowly will have to be good enough. Reported-by: Anders Kaseorg <andersk@mit.edu> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-30 14:40:10 -07:00
Andrei Rybak	abcb66c614	*: fix typos which duplicate a word Fix typos in documentation, code comments, and RelNotes which repeat various words. In trivial cases, just delete the duplicated word and rewrap text, if needed. Reword the affected sentence in Documentation/RelNotes/1.8.4.txt for it to make sense. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-14 10:16:06 +09:00
Junio C Hamano	5feebddd86	Merge branch 'js/merge-already-up-to-date-message-reword' A few variants of informational message "Already up-to-date" has been rephrased. * js/merge-already-up-to-date-message-reword: merge: fix swapped "up to date" message components merge(s): apply consistent punctuation to "up to date" messages	2021-05-11 15:27:22 +09:00
Junio C Hamano	aaa3c8065d	Merge branch 'bc/hash-transition-interop-part-1' SHA-256 transition. * bc/hash-transition-interop-part-1: hex: print objects using the hash algorithm member hex: default to the_hash_algo on zero algorithm value builtin/pack-objects: avoid using struct object_id for pack hash commit-graph: don't store file hashes as struct object_id builtin/show-index: set the algorithm for object IDs hash: provide per-algorithm null OIDs hash: set, copy, and use algo field in struct object_id builtin/pack-redundant: avoid casting buffers to struct object_id Use the final_oid_fn to finalize hashing of object IDs hash: add a function to finalize object IDs http-push: set algorithm when reading object ID Always use oidread to read into struct object_id hash: add an algo member to struct object_id	2021-05-10 16:59:46 +09:00
Eric Sunshine	80cde95eec	merge(s): apply consistent punctuation to "up to date" messages Although the various "Already up to date" messages resulting from merge attempts share identical phrasing, they use a mix of punctuation ranging from "." to "!" and even "Yeeah!", which leads to extra work for translators. Ease the job of translators by settling upon "." as punctuation for all such messages. While at it, take advantage of printf_ln() to further ease the translation task so translators need not worry about line termination, and fix a case of missing line termination in the (unused) merge_ort_nonrecursive() function. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 14:14:56 +09:00
Junio C Hamano	8e97852919	Merge branch 'ds/sparse-index-protections' Builds on top of the sparse-index infrastructure to mark operations that are not ready to mark with the sparse index, causing them to fall back on fully-populated index that they always have worked with. * ds/sparse-index-protections: (47 commits) name-hash: use expand_to_path() sparse-index: expand_to_path() name-hash: don't add directories to name_hash revision: ensure full index resolve-undo: ensure full index read-cache: ensure full index pathspec: ensure full index merge-recursive: ensure full index entry: ensure full index dir: ensure full index update-index: ensure full index stash: ensure full index rm: ensure full index merge-index: ensure full index ls-files: ensure full index grep: ensure full index fsck: ensure full index difftool: ensure full index commit: ensure full index checkout: ensure full index ...	2021-04-30 13:50:26 +09:00
brian m. carlson	14228447c9	hash: provide per-algorithm null OIDs Up until recently, object IDs did not have an algorithm member, only a hash. Consequently, it was possible to share one null (all-zeros) object ID among all hash algorithms. Now that we're going to be handling objects from multiple hash algorithms, it's important to make sure that all object IDs have a correct algorithm field. Introduce a per-algorithm null OID, and add it to struct hash_algo. Introduce a wrapper function as well, and use it everywhere we used to use the null_oid constant. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:39 +09:00
Junio C Hamano	7bec8e7fa6	Merge branch 'en/ort-readiness' Plug the ort merge backend throughout the rest of the system, and start testing it as a replacement for the recursive backend. * en/ort-readiness: Add testing with merge-ort merge strategy t6423: mark remaining expected failure under merge-ort as such Revert "merge-ort: ignore the directory rename split conflict for now" merge-recursive: add a bunch of FIXME comments documenting known bugs merge-ort: write $GIT_DIR/AUTO_MERGE whenever we hit a conflict t: mark several submodule merging tests as fixed under merge-ort merge-ort: implement CE_SKIP_WORKTREE handling with conflicted entries t6428: new test for SKIP_WORKTREE handling and conflicts merge-ort: support subtree shifting merge-ort: let renormalization change modify/delete into clean delete merge-ort: have ll_merge() use a special attr_index for renormalization merge-ort: add a special minimal index just for renormalization merge-ort: use STABLE_QSORT instead of QSORT where required	2021-04-16 13:53:34 -07:00
Derrick Stolee	f7ef64be0c	merge-recursive: ensure full index Before iterating over all cache entries, ensure that a sparse index is expanded to a full index to avoid unexpected behavior. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:47:37 -07:00
Derrick Stolee	847a9e5d4f	*: remove 'const' qualifier for struct index_state Several methods specify that they take a 'struct index_state' pointer with the 'const' qualifier because they intend to only query the data, not change it. However, we will be introducing a step very low in the method stack that might modify a sparse-index to become a full index in the case that our queries venture inside a sparse-directory entry. This change only removes the 'const' qualifiers that are necessary for the following change which will actually modify the implementation of index_name_stage_pos(). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:46:00 -07:00
Junio C Hamano	ad16f748f2	Merge branch 'ab/read-tree' Code simplification by removing support for a caller that is long gone. * ab/read-tree: tree.h API: simplify read_tree_recursive() signature tree.h API: expose read_tree_1() as read_tree_at() archive: stop passing "stage" through read_tree_recursive() ls-files: refactor away read_tree() ls-files: don't needlessly pass around stage variable tree.c API: move read_tree() into builtin/ls-files.c ls-files tests: add meaningful --with-tree tests show tests: add test for "git show <tree>"	2021-03-30 14:35:37 -07:00
Ævar Arnfjörð Bjarmason	47957485b3	tree.h API: simplify read_tree_recursive() signature Simplify the signature of read_tree_recursive() to omit the "base", "baselen" and "stage" arguments. No callers of it use these parameters for anything anymore. The last function to call read_tree_recursive() with a non-"" path was read_tree_recursive() itself, but that was changed in `ffd31f661d` (Reimplement read_tree_recursive() using tree_entry_interesting(), 2011-03-25). The last user of the "stage" parameter went away in the last commit, and even that use was mere boilerplate. So let's remove those and rename the read_tree_recursive() function to just read_tree(). We had another read_tree() function that I've refactored away in preceding commits, since all in-tree users read trees recursively with a callback we can change the name to signify that this is the norm. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 16:09:26 -07:00
Elijah Newren	816147e7ba	merge-recursive: add a bunch of FIXME comments documenting known bugs The plan is to just delete merge-recursive, but not until everyone is comfortable with merge-ort as a replacement. Given that I haven't switched all callers of merge-recursive over yet (e.g. git-am still uses merge-recursive), maybe there's some value documenting known bugs in the algorithm in case we end up keeping it or someone wants to dig it up in the future. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 12:35:40 -07:00
René Scharfe	ca56dadb4b	use CALLOC_ARRAY Add and apply a semantic patch for converting code that open-codes CALLOC_ARRAY to use it instead. It shortens the code and infers the element size automatically. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-13 16:00:09 -08:00
Elijah Newren	b0ca120554	commit: move reverse_commit_list() from merge-recursive Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-16 21:56:39 -08:00
Elijah Newren	6da1a25814	hashmap: provide deallocation function names hashmap_free(), hashmap_free_entries(), and hashmap_free_() have existed for a while, but aren't necessarily the clearest names, especially with hashmap_partial_clear() being added to the mix and lazy-initialization now being supported. Peff suggested we adopt the following names[1]: - hashmap_clear() - remove all entries and de-allocate any hashmap-specific data, but be ready for reuse - hashmap_clear_and_free() - ditto, but free the entries themselves - hashmap_partial_clear() - remove all entries but don't deallocate table - hashmap_partial_clear_and_free() - ditto, but free the entries This patch provides the new names and converts all existing callers over to the new naming scheme. [1] https://lore.kernel.org/git/20201030125059.GA3277724@coredump.intra.peff.net/ Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-02 12:15:50 -08:00

1 2 3 4 5 ...

718 Commits