Plug various leaks reported by LSAN.
* ah/plugleaks:
builtin/rm: avoid leaking pathspec and seen
builtin/rebase: release git_format_patch_opt too
builtin/for-each-ref: free filter and UNLEAK sorting.
mailinfo: also free strbuf lists when clearing mailinfo
builtin/checkout: clear pending objects after diffing
builtin/check-ignore: clear_pathspec before returning
builtin/bugreport: don't leak prefixed filename
branch: FREE_AND_NULL instead of NULL'ing real_ref
bloom: clear each bloom_key after use
ls-files: free max_prefix when done
wt-status: fix multiple small leaks
revision: free remainder of old commit list in limit_list
"git rev-list" learns the "--filter=object:type=<type>" option,
which can be used to exclude objects of the given kind from the
packfile generated by pack-objects.
* ps/rev-list-object-type-filter:
rev-list: allow filtering of provided items
pack-bitmap: implement combined filter
pack-bitmap: implement object type filter
list-objects: implement object type filter
list-objects: support filtering by tag and commit
list-objects: move tag processing into its own function
revision: mark commit parents as NOT_USER_GIVEN
uploadpack.txt: document implication of `uploadpackfilter.allow`
"git add" and "git rm" learned not to touch those paths that are
outside of sparse checkout.
* mt/add-rm-in-sparse-checkout:
rm: honor sparse checkout patterns
add: warn when asked to update SKIP_WORKTREE entries
refresh_index(): add flag to ignore SKIP_WORKTREE entries
pathspec: allow to ignore SKIP_WORKTREE entries on index matching
add: make --chmod and --renormalize honor sparse checkouts
t3705: add tests for `git add` in sparse checkouts
add: include magic part of pathspec on --refresh error
Replace GIT_CONFIG_NOSYSTEM mechanism to decline from reading the
system-wide configuration file with GIT_CONFIG_SYSTEM that lets
users specify from which file to read the system-wide configuration
(setting it to an empty file would essentially be the same as
setting NOSYSTEM), and introduce GIT_CONFIG_GLOBAL to override the
per-user configuration in $HOME/.gitconfig.
* ps/config-global-override:
t1300: fix unset of GIT_CONFIG_NOSYSTEM leaking into subsequent tests
config: allow overriding of global and system configuration
config: unify code paths to get global config paths
config: rename `git_etc_config()`
"git (branch|tag) --format=..." has been micro-optimized.
* zh/format-ref-array-optim:
ref-filter: reuse output buffer
ref-filter: get rid of show_ref_array_item
The checkout machinery has been taught to perform the actual
write-out of the files in parallel when able.
* mt/parallel-checkout-part-2:
parallel-checkout: add design documentation
parallel-checkout: support progress displaying
parallel-checkout: add configuration options
parallel-checkout: make it truly parallel
unpack-trees: add basic support for parallel checkout
Builds on top of the sparse-index infrastructure to mark operations
that are not ready to work with the sparse index, causing them to
fall back on the fully-populated index that they have always worked with.
* ds/sparse-index-protections: (47 commits)
name-hash: use expand_to_path()
sparse-index: expand_to_path()
name-hash: don't add directories to name_hash
revision: ensure full index
resolve-undo: ensure full index
read-cache: ensure full index
pathspec: ensure full index
merge-recursive: ensure full index
entry: ensure full index
dir: ensure full index
update-index: ensure full index
stash: ensure full index
rm: ensure full index
merge-index: ensure full index
ls-files: ensure full index
grep: ensure full index
fsck: ensure full index
difftool: ensure full index
commit: ensure full index
checkout: ensure full index
...
The prefetch task in "git maintenance" assumed that "git fetch"
from any remote would fetch all its local branches, which would
fetch too much if the user is interested in only a subset of
branches there.
* ds/maintenance-prefetch-fix:
maintenance: respect remote.*.skipFetchAll
maintenance: use 'git fetch --prefetch'
fetch: add --prefetch option
maintenance: simplify prefetch logic
Handling of "promisor packs" that allows certain objects to be
missing and lazily retrievable has been optimized (a bit).
* jk/promisor-optim:
revision: avoid parsing with --exclude-promisor-objects
lookup_unknown_object(): take a repository argument
is_promisor_object(): free tree buffer after parsing
parse_pathspec() populates pathspec, hence we need to clear it once it's
no longer needed. seen is xcalloc'd within the same function and
likewise needs to be freed once it's no longer needed.
cmd_rm() has multiple early returns, therefore we need to clear or free
as soon as this data is no longer needed, as opposed to doing a cleanup
at the end.
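The early-return cleanup described above can be sketched in isolation as
follows. This is only an illustration under stated assumptions, not the
code in builtin/rm.c: struct demo_pathspec, demo_parse(), demo_clear() and
demo_rm() are hypothetical stand-ins for the real pathspec API, and the
allocation counter exists purely to make the absence of a leak observable.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for git's struct pathspec. */
struct demo_pathspec {
	char **items;
	size_t nr;
};

/* Counts live allocations so the sketch can show that nothing leaks. */
static int demo_live_allocs;

static void *demo_alloc(size_t n)
{
	void *p = calloc(1, n ? n : 1);
	if (!p)
		abort();
	demo_live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	if (p) {
		demo_live_allocs--;
		free(p);
	}
}

/* Populates ps with copies of argv, much like a parse step would. */
static void demo_parse(struct demo_pathspec *ps, const char **argv, size_t argc)
{
	size_t i;

	ps->nr = argc;
	ps->items = demo_alloc(argc * sizeof(*ps->items));
	for (i = 0; i < argc; i++) {
		ps->items[i] = demo_alloc(strlen(argv[i]) + 1);
		strcpy(ps->items[i], argv[i]);
	}
}

static void demo_clear(struct demo_pathspec *ps)
{
	size_t i;

	for (i = 0; i < ps->nr; i++)
		demo_free(ps->items[i]);
	demo_free(ps->items);
	ps->items = NULL;
	ps->nr = 0;
}

/* Returns 0 on success, 1 when nothing matched (an early return). */
static int demo_rm(const char **argv, size_t argc)
{
	struct demo_pathspec ps;
	char *seen;
	size_t i, matched = 0;

	demo_parse(&ps, argv, argc);
	seen = demo_alloc(ps.nr);
	for (i = 0; i < ps.nr; i++) {
		if (ps.items[i][0] != '\0') {
			seen[i] = 1;
			matched++;
		}
	}

	/* Last use of ps and seen: release them before any return below. */
	demo_clear(&ps);
	demo_free(seen);

	if (!matched)
		return 1;
	return 0;
}
```

Releasing right after the last use means that every later return, early
or not, is already leak-free.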
LSAN output from t0020:
Direct leak of 112 byte(s) in 1 object(s) allocated from:
#0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9ac0a4 in do_xmalloc wrapper.c:41:8
#2 0x9ac07a in xmalloc wrapper.c:62:9
#3 0x873277 in parse_pathspec pathspec.c:582:2
#4 0x646ffa in cmd_rm builtin/rm.c:266:2
#5 0x4cd91d in run_builtin git.c:467:11
#6 0x4cb5f3 in handle_builtin git.c:719:3
#7 0x4ccf47 in run_argv git.c:808:4
#8 0x4caf49 in cmd_main git.c:939:19
#9 0x69dc0e in main common-main.c:52:11
#10 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 65 byte(s) in 1 object(s) allocated from:
#0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9ac2a6 in xrealloc wrapper.c:126:8
#2 0x93b14d in strbuf_grow strbuf.c:98:2
#3 0x93ccf6 in strbuf_vaddf strbuf.c:392:3
#4 0x93f726 in xstrvfmt strbuf.c:979:2
#5 0x93f8b3 in xstrfmt strbuf.c:989:8
#6 0x92ad8a in prefix_path_gently setup.c:115:15
#7 0x873a8d in init_pathspec_item pathspec.c:439:11
#8 0x87334f in parse_pathspec pathspec.c:589:3
#9 0x646ffa in cmd_rm builtin/rm.c:266:2
#10 0x4cd91d in run_builtin git.c:467:11
#11 0x4cb5f3 in handle_builtin git.c:719:3
#12 0x4ccf47 in run_argv git.c:808:4
#13 0x4caf49 in cmd_main git.c:939:19
#14 0x69dc0e in main common-main.c:52:11
#15 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 15 byte(s) in 1 object(s) allocated from:
#0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
#1 0x9ac048 in xstrdup wrapper.c:29:14
#2 0x873ba2 in init_pathspec_item pathspec.c:468:20
#3 0x87334f in parse_pathspec pathspec.c:589:3
#4 0x646ffa in cmd_rm builtin/rm.c:266:2
#5 0x4cd91d in run_builtin git.c:467:11
#6 0x4cb5f3 in handle_builtin git.c:719:3
#7 0x4ccf47 in run_argv git.c:808:4
#8 0x4caf49 in cmd_main git.c:939:19
#9 0x69dc0e in main common-main.c:52:11
#10 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 1 byte(s) in 1 object(s) allocated from:
#0 0x49a9d2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
#1 0x9ac392 in xcalloc wrapper.c:140:8
#2 0x647108 in cmd_rm builtin/rm.c:294:9
#3 0x4cd91d in run_builtin git.c:467:11
#4 0x4cb5f3 in handle_builtin git.c:719:3
#5 0x4ccf47 in run_argv git.c:808:4
#6 0x4caf49 in cmd_main git.c:939:19
#7 0x69dbfe in main common-main.c:52:11
#8 0x7f4fac1b0349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
options.git_format_patch_opt can be populated during cmd_rebase's setup,
and will therefore leak on return. Although we could just UNLEAK all of
options, we choose to strbuf_release() the individual member, which matches
the existing pattern (where we're freeing individual members of options).
Leak found when running t0021:
Direct leak of 24 byte(s) in 1 object(s) allocated from:
#0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9ac296 in xrealloc wrapper.c:126:8
#2 0x93b13d in strbuf_grow strbuf.c:98:2
#3 0x93bd3a in strbuf_add strbuf.c:295:2
#4 0x60ae92 in strbuf_addstr strbuf.h:304:2
#5 0x605f17 in cmd_rebase builtin/rebase.c:1759:3
#6 0x4cd91d in run_builtin git.c:467:11
#7 0x4cb5f3 in handle_builtin git.c:719:3
#8 0x4ccf47 in run_argv git.c:808:4
#9 0x4caf49 in cmd_main git.c:939:19
#10 0x69dbfe in main common-main.c:52:11
#11 0x7f66dae91349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 24 byte(s) leaked in 1 allocation(s).
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
sorting might be a list allocated in ref_default_sorting() (in this case
it's a fixed single item list, which has nevertheless been xcalloc'd),
or it might be a list allocated in parse_opt_ref_sorting(). In either
case we could free these lists - but instead we UNLEAK as we're at the
end of cmd_for_each_ref. (There's no existing implementation of
clear_ref_sorting(), and writing a loop to free the list seems more
trouble than it's worth.)
filter.with_commit/no_commit are populated via
OPT_CONTAINS/OPT_NO_CONTAINS, both of which create new entries via
parse_opt_commits(), and also need to be free'd or UNLEAK'd. Because
free_commit_list() already exists, we choose to use that over an UNLEAK.
LSAN output from t0041:
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x49a9d2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
#1 0x9ac252 in xcalloc wrapper.c:140:8
#2 0x8a4a55 in ref_default_sorting ref-filter.c:2486:32
#3 0x56c6b1 in cmd_for_each_ref builtin/for-each-ref.c:72:13
#4 0x4cd91d in run_builtin git.c:467:11
#5 0x4cb5f3 in handle_builtin git.c:719:3
#6 0x4ccf47 in run_argv git.c:808:4
#7 0x4caf49 in cmd_main git.c:939:19
#8 0x69dabe in main common-main.c:52:11
#9 0x7f2bdc570349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9abf54 in do_xmalloc wrapper.c:41:8
#2 0x9abf2a in xmalloc wrapper.c:62:9
#3 0x717486 in commit_list_insert commit.c:540:33
#4 0x8644cf in parse_opt_commits parse-options-cb.c:98:2
#5 0x869bb5 in get_value parse-options.c:181:11
#6 0x8677dc in parse_long_opt parse-options.c:378:10
#7 0x8659bd in parse_options_step parse-options.c:817:11
#8 0x867fcd in parse_options parse-options.c:870:10
#9 0x56c62b in cmd_for_each_ref builtin/for-each-ref.c:59:2
#10 0x4cd91d in run_builtin git.c:467:11
#11 0x4cb5f3 in handle_builtin git.c:719:3
#12 0x4ccf47 in run_argv git.c:808:4
#13 0x4caf49 in cmd_main git.c:939:19
#14 0x69dabe in main common-main.c:52:11
#15 0x7f2bdc570349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
add_pending_object() populates rev.pending, so we need to take care of
clearing it once we're done.
This code is run close to the end of a checkout, therefore this leak
seems like it would have very little impact. See also LSAN output
from t0020 below:
Direct leak of 2048 byte(s) in 1 object(s) allocated from:
#0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9acc46 in xrealloc wrapper.c:126:8
#2 0x83e3a3 in add_object_array_with_path object.c:337:3
#3 0x8f672a in add_pending_object_with_path revision.c:329:2
#4 0x8eaeab in add_pending_object_with_mode revision.c:336:2
#5 0x8eae9d in add_pending_object revision.c:342:2
#6 0x5154a0 in show_local_changes builtin/checkout.c:602:2
#7 0x513b00 in merge_working_tree builtin/checkout.c:979:3
#8 0x512cb3 in switch_branches builtin/checkout.c:1242:9
#9 0x50f8de in checkout_branch builtin/checkout.c:1646:9
#10 0x50ba12 in checkout_main builtin/checkout.c:2003:9
#11 0x5086c0 in cmd_checkout builtin/checkout.c:2055:8
#12 0x4cd91d in run_builtin git.c:467:11
#13 0x4cb5f3 in handle_builtin git.c:719:3
#14 0x4ccf47 in run_argv git.c:808:4
#15 0x4caf49 in cmd_main git.c:939:19
#16 0x69e43e in main common-main.c:52:11
#17 0x7f5dd1d50349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 2048 byte(s) leaked in 1 allocation(s).
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse_pathspec() allocates new memory into pathspec, therefore we need
to free it when we're done.
An UNLEAK would probably be just as good here - but clear_pathspec() is
not much more work so we might as well use it. check_ignore() is either
called once directly from cmd_check_ignore() (in which case the leak
really doesn't matter), or it can be called multiple times in a loop from
check_ignore_stdin_paths(), in which case we're potentially leaking
multiple times - but even in this scenario the leak is so small as to
have no real consequence.
Found while running t0008:
Direct leak of 112 byte(s) in 1 object(s) allocated from:
#0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9aca44 in do_xmalloc wrapper.c:41:8
#2 0x9aca1a in xmalloc wrapper.c:62:9
#3 0x873c17 in parse_pathspec pathspec.c:582:2
#4 0x503eb8 in check_ignore builtin/check-ignore.c:90:2
#5 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17
#6 0x4cd91d in run_builtin git.c:467:11
#7 0x4cb5f3 in handle_builtin git.c:719:3
#8 0x4ccf47 in run_argv git.c:808:4
#9 0x4caf49 in cmd_main git.c:939:19
#10 0x69e43e in main common-main.c:52:11
#11 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 65 byte(s) in 1 object(s) allocated from:
#0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9acc46 in xrealloc wrapper.c:126:8
#2 0x93baed in strbuf_grow strbuf.c:98:2
#3 0x93d696 in strbuf_vaddf strbuf.c:392:3
#4 0x9400c6 in xstrvfmt strbuf.c:979:2
#5 0x940253 in xstrfmt strbuf.c:989:8
#6 0x92b72a in prefix_path_gently setup.c:115:15
#7 0x87442d in init_pathspec_item pathspec.c:439:11
#8 0x873cef in parse_pathspec pathspec.c:589:3
#9 0x503eb8 in check_ignore builtin/check-ignore.c:90:2
#10 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17
#11 0x4cd91d in run_builtin git.c:467:11
#12 0x4cb5f3 in handle_builtin git.c:719:3
#13 0x4ccf47 in run_argv git.c:808:4
#14 0x4caf49 in cmd_main git.c:939:19
#15 0x69e43e in main common-main.c:52:11
#16 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 2 byte(s) in 1 object(s) allocated from:
#0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
#1 0x9ac9e8 in xstrdup wrapper.c:29:14
#2 0x874542 in init_pathspec_item pathspec.c:468:20
#3 0x873cef in parse_pathspec pathspec.c:589:3
#4 0x503eb8 in check_ignore builtin/check-ignore.c:90:2
#5 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17
#6 0x4cd91d in run_builtin git.c:467:11
#7 0x4cb5f3 in handle_builtin git.c:719:3
#8 0x4ccf47 in run_argv git.c:808:4
#9 0x4caf49 in cmd_main git.c:939:19
#10 0x69e43e in main common-main.c:52:11
#11 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 179 byte(s) leaked in 3 allocation(s).
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
prefix_filename() returns newly allocated memory, and strbuf_addstr()
doesn't take ownership of its inputs. Therefore we have to make sure to
store and free prefix_filename()'s result.
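The ownership rule above can be illustrated with a small standalone
sketch. The helpers demo_prefix_filename(), demo_addstr() and
demo_report_path() are hypothetical and only mimic the shape of the
problem; they are not git's prefix_filename() or strbuf_addstr().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated "prefix + path" string. */
static char *demo_prefix_filename(const char *prefix, const char *path)
{
	size_t len = strlen(prefix) + strlen(path) + 1;
	char *out = malloc(len);

	if (!out)
		abort();
	snprintf(out, len, "%s%s", prefix, path);
	return out;
}

/* Appends a copy of src to dst (bounded); it does not take ownership. */
static void demo_addstr(char *dst, size_t dstlen, const char *src)
{
	size_t used = strlen(dst);

	if (used < dstlen)
		snprintf(dst + used, dstlen - used, "%s", src);
}

static void demo_report_path(char *dst, size_t dstlen,
			     const char *prefix, const char *path)
{
	/* Keep the pointer so that it can be freed after the copy. */
	char *prefixed = demo_prefix_filename(prefix, path);

	demo_addstr(dst, dstlen, prefixed);
	free(prefixed);
}
```

Passing the helper's result straight into the append call would leave no
pointer to free, which is the shape of the leak described above.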
As this leak is in cmd_bugreport(), we could just as well UNLEAK the
prefix - but there's no good reason not to just free it properly. This
leak was found while running t0091, see output below:
Direct leak of 24 byte(s) in 1 object(s) allocated from:
#0 0x49ab79 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9acc66 in xrealloc wrapper.c:126:8
#2 0x93baed in strbuf_grow strbuf.c:98:2
#3 0x93c6ea in strbuf_add strbuf.c:295:2
#4 0x69f162 in strbuf_addstr ./strbuf.h:304:2
#5 0x69f083 in prefix_filename abspath.c:277:2
#6 0x4fb275 in cmd_bugreport builtin/bugreport.c:146:9
#7 0x4cd91d in run_builtin git.c:467:11
#8 0x4cb5f3 in handle_builtin git.c:719:3
#9 0x4ccf47 in run_argv git.c:808:4
#10 0x4caf49 in cmd_main git.c:939:19
#11 0x69df9e in main common-main.c:52:11
#12 0x7f523a987349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
common_prefix() returns a new string, which we store in max_prefix -
this string needs to be freed to avoid a leak. This leak is happening
in cmd_ls_files, hence is of no real consequence - an UNLEAK would be
just as good, but we might as well free the string properly.
Leak found while running t0002, see output below:
Direct leak of 8 byte(s) in 1 object(s) allocated from:
#0 0x49a85d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9ab1b4 in do_xmalloc wrapper.c:41:8
#2 0x9ab248 in do_xmallocz wrapper.c:75:8
#3 0x9ab22a in xmallocz wrapper.c:83:9
#4 0x9ab2d7 in xmemdupz wrapper.c:99:16
#5 0x78d6a4 in common_prefix dir.c:191:15
#6 0x5aca48 in cmd_ls_files builtin/ls-files.c:669:16
#7 0x4cd92d in run_builtin git.c:453:11
#8 0x4cb5fa in handle_builtin git.c:704:3
#9 0x4ccf57 in run_argv git.c:771:4
#10 0x4caf49 in cmd_main git.c:902:19
#11 0x69ce2e in main common-main.c:52:11
#12 0x7f64d4d94349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git pack-objects" makes a literal copy of a part of existing
packfile using the reachability bitmaps, its update to the progress
meter was broken.
* jk/pack-objects-bitmap-progress-fix:
pack-objects: update "nr_seen" progress based on pack-reused count
When we use `git for-each-ref`, every ref allocates its own output
strbuf and error strbuf. Instead, we can reuse a single output strbuf
for every ref. The error buffer is reused as well, even though git
exits as soon as `format_ref_array_item()` returns a non-zero value
and prints the contents of the error buffer.
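The buffer reuse described above can be sketched as follows. This is a
minimal illustration: struct demo_buf and its helpers are hypothetical and
only loosely modelled on the idea of a strbuf, not on git's actual
strbuf API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer. */
struct demo_buf {
	char *data;
	size_t len, alloc;
	int grow_count;	/* how many times the storage was (re)allocated */
};

static void demo_buf_add(struct demo_buf *b, const char *s)
{
	size_t n = strlen(s);

	if (b->len + n + 1 > b->alloc) {
		b->alloc = (b->len + n + 1) * 2;
		b->data = realloc(b->data, b->alloc);
		if (!b->data)
			abort();
		b->grow_count++;
	}
	memcpy(b->data + b->len, s, n + 1);
	b->len += n;
}

/* Forget the contents but keep the allocation for the next item. */
static void demo_buf_reset(struct demo_buf *b)
{
	b->len = 0;
	if (b->data)
		b->data[0] = '\0';
}

static void demo_buf_release(struct demo_buf *b)
{
	free(b->data);
	memset(b, 0, sizeof(*b));
}

/* Formats every ref into the same buffer; returns the number of growths. */
static int demo_format_refs(const char **refs, size_t nr)
{
	struct demo_buf out = { 0 };
	size_t i;
	int grows;

	for (i = 0; i < nr; i++) {
		demo_buf_reset(&out);
		demo_buf_add(&out, refs[i]);
		demo_buf_add(&out, "\n");
		/* a real caller would write out.data here */
	}
	grows = out.grow_count;
	demo_buf_release(&out);
	return grows;
}
```

With three short ref names the storage is allocated once and then reused,
whereas a per-ref buffer would allocate and free once per ref.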
The performance for `git for-each-ref` on the Git repository
itself with performance testing tool `hyperfine` changes from
23.7 ms ± 0.9 ms to 22.2 ms ± 1.0 ms. Optimization is relatively
minor.
At the same time, we apply this optimization to `git tag -l`
and `git branch -l`.
This approach is similar to the one used by 79ed0a5
(cat-file: use a single strbuf for all output, 2018-08-14)
to speed up the cat-file builtin.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Inlining the exported function `show_ref_array_item()`,
which is not providing the right level of abstraction,
simplifies the API and can unlock improvements at the
former call sites.
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two callsites which assemble global config paths, once in the
config loading code and once in the git-config(1) builtin. We're about
to implement a way to override global config paths via an environment
variable which would require us to adjust both sites.
Unify both code paths into a single `git_global_config()` function which
returns both paths for `~/.gitconfig` and the XDG config file. This will
make the subsequent patch which introduces the new envvar easier to
implement.
No functional changes are expected from this patch.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `git_etc_gitconfig()` function retrieves the system-level path of
the configuration file. We're about to introduce a way to override it
via an environment variable, at which point the name of this function
would start to become misleading.
Rename the function to `git_system_config()` as a preparatory step.
While at it, the function is also refactored to pass memory ownership to
the caller. This is done to better match semantics of
`git_global_config()`, which is going to be introduced in the next
commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When providing an object filter, it is currently impossible to also
filter provided items. E.g. when executing `git rev-list HEAD`, the
commit this reference points to will be treated as user-provided and is
thus excluded from the filtering mechanism. This makes it harder than
necessary to properly use the new `--filter=object:type` filter given
that even if the user wants to only see blobs, they'll still see commits
of provided references.
Improve this by introducing a new `--filter-provided-objects` option
to the git-rev-list(1) command. If given, then all user-provided
references will be subject to filtering.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use multiple worker processes to distribute the queued entries and call
write_pc_item() in parallel for them. The items are distributed
uniformly in contiguous chunks. This minimizes the chances of two
workers writing to the same directory simultaneously, which could affect
performance due to lock contention in the kernel. Work stealing (or any
other form of re-distribution) is not implemented yet.
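One way to compute such contiguous chunks is sketched below. The names
struct demo_chunk and demo_chunk_for() are hypothetical, and the choice to
hand the remainder to the first workers is an assumption made for the
illustration; the text above only states that the split is uniform and
contiguous.

```c
#include <stddef.h>

/* Hypothetical description of one worker's contiguous slice. */
struct demo_chunk {
	size_t start;
	size_t count;
};

/*
 * Splits nr_items into nr_workers contiguous slices whose sizes differ
 * by at most one, and returns the slice for the given worker.
 */
static struct demo_chunk demo_chunk_for(size_t nr_items, size_t nr_workers,
					size_t worker)
{
	struct demo_chunk c = { 0, 0 };
	size_t base, extra;

	if (!nr_workers || worker >= nr_workers)
		return c;
	base = nr_items / nr_workers;
	extra = nr_items % nr_workers;
	/* Assumption: the first "extra" workers take one more item each. */
	c.count = base + (worker < extra ? 1 : 0);
	c.start = worker * base + (worker < extra ? worker : extra);
	return c;
}
```

For 7 queued items and 2 workers this yields the slices [0, 4) and [4, 7),
so neighbouring entries tend to stay with the same worker.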
The protocol between the main process and the workers is quite simple.
They exchange binary messages packed in pkt-line format, and use
PKT-FLUSH to mark the end of input (from both sides). The main process
starts the communication by sending N pkt-lines, each corresponding to
an item that needs to be written. These packets contain all the
necessary information to load, smudge, and write the blob associated
with each item. Then it waits for the worker to send back N pkt-lines
containing the results for each item. The resulting packet must contain:
the identification number of the item that it refers to, the status of
the operation, and the lstat() data gathered after writing the file (iff
the operation was successful).
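The framing used for these messages can be sketched as below. Only the
generic pkt-line envelope is shown (four hex digits giving the total
length, header included, and "0000" as the flush packet); the layout of
the item and result payloads is not reproduced here. demo_pkt_write() and
demo_pkt_flush() are hypothetical helpers, not git's pkt-line functions,
and rejecting empty payloads is a simplification of this sketch.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_PKT_MAX_PAYLOAD 65516

/*
 * Writes one pkt-line into out: four hex digits giving the total length
 * (header included), followed by the payload. Returns the number of bytes
 * written, or 0 if the payload or the output buffer is unsuitable.
 */
static size_t demo_pkt_write(char *out, size_t outlen,
			     const void *payload, size_t len)
{
	char hdr[5];

	if (!len || len > DEMO_PKT_MAX_PAYLOAD || outlen < len + 4)
		return 0;
	snprintf(hdr, sizeof(hdr), "%04x", (unsigned int)(len + 4));
	memcpy(out, hdr, 4);
	memcpy(out + 4, payload, len);
	return len + 4;
}

/* A flush packet is the special length "0000" with no payload. */
static size_t demo_pkt_flush(char *out, size_t outlen)
{
	if (outlen < 4)
		return 0;
	memcpy(out, "0000", 4);
	return 4;
}
```

A sender would emit N such packets followed by one flush, and the reader
stops when it sees the flush.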
For now, checkout always uses a hardcoded value of 2 workers, only to
demonstrate that the parallel checkout framework correctly divides and
writes the queued entries. The next patch will add user configurations
and define a more reasonable default, based on tests with the said
settings.
Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
New log.diffMerges configuration variable sets the format that
--diff-merges=on will be using. The default is "separate".
t4013: add the following tests for log.diffMerges config:
* Test that wrong values are denied.
* Test that the value of log.diffMerges properly affects both
--diff-merges=on and -m.
t9902: fix completion tests for log.d* to match log.diffMerges.
Added documentation for log.diffMerges.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Plug the ort merge backend throughout the rest of the system, and
start testing it as a replacement for the recursive backend.
* en/ort-readiness:
Add testing with merge-ort merge strategy
t6423: mark remaining expected failure under merge-ort as such
Revert "merge-ort: ignore the directory rename split conflict for now"
merge-recursive: add a bunch of FIXME comments documenting known bugs
merge-ort: write $GIT_DIR/AUTO_MERGE whenever we hit a conflict
t: mark several submodule merging tests as fixed under merge-ort
merge-ort: implement CE_SKIP_WORKTREE handling with conflicted entries
t6428: new test for SKIP_WORKTREE handling and conflicts
merge-ort: support subtree shifting
merge-ort: let renormalization change modify/delete into clean delete
merge-ort: have ll_merge() use a special attr_index for renormalization
merge-ort: add a special minimal index just for renormalization
merge-ort: use STABLE_QSORT instead of QSORT where required
If a remote has the skipFetchAll setting enabled, then that remote is
not intended for frequent fetching. It makes sense to not fetch that
data during the 'prefetch' maintenance task. Skip that remote in the
iteration without error. The skip_default_update member is initialized
in remote.c:handle_config() as part of initializing the 'struct remote'.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'prefetch' maintenance task previously forced the following refspec
for each remote:
+refs/heads/*:refs/prefetch/<remote>/*
If a user has specified a more strict refspec for the remote, then this
prefetch task downloads more objects than necessary.
The previous change introduced the '--prefetch' option to 'git fetch'
which manipulates the remote's refspec to place all resulting refs into
refs/prefetch/, with further partitioning based on the destinations of
those refspecs.
Update the documentation to be more generic about the destination refs.
Do not mention custom refspecs explicitly, as that does not need to be
highlighted in this documentation. The important part of placing refs in
refs/prefetch/ remains.
Reported-by: Tom Saeger <tom.saeger@oracle.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --prefetch option will be used by the 'prefetch' maintenance task
instead of sending refspecs explicitly across the command-line. The
intention is to modify the refspec to place all results in
refs/prefetch/ instead of anywhere else.
Create helper method filter_prefetch_refspec() to modify a given refspec
to fit the rules expected of the prefetch task:
* Negative refspecs are preserved.
* Refspecs without a destination are removed.
* Refspecs whose source starts with "refs/tags/" are removed.
* Other refspecs are placed within "refs/prefetch/".
Finally, we add the 'force' option to ensure that prefetch refs are
replaced as necessary.
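The rules above can be sketched for a single refspec as follows. This is
not filter_prefetch_refspec() itself: struct demo_refspec and
demo_filter_prefetch() are hypothetical, and stripping a leading "refs/"
from the destination before re-rooting it is an assumption of this sketch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified refspec: git's real struct has more fields. */
struct demo_refspec {
	const char *src;
	const char *dst;	/* NULL when there is no destination */
	int negative;
	int force;
};

enum demo_action { DEMO_KEEP, DEMO_DROP, DEMO_REWRITE };

/*
 * Applies the prefetch rules to one refspec. On DEMO_REWRITE the new
 * destination is written to newdst and rs->dst points at it.
 */
static enum demo_action demo_filter_prefetch(struct demo_refspec *rs,
					     char *newdst, size_t newdst_len)
{
	const char *tail;

	/* Negative refspecs are preserved as they are. */
	if (rs->negative)
		return DEMO_KEEP;
	/* No destination, or a source under refs/tags/: remove. */
	if (!rs->dst || !strncmp(rs->src, "refs/tags/", 10))
		return DEMO_DROP;

	/* Assumption: a leading "refs/" is dropped before re-rooting. */
	tail = rs->dst;
	if (!strncmp(tail, "refs/", 5))
		tail += 5;
	snprintf(newdst, newdst_len, "refs/prefetch/%s", tail);
	rs->dst = newdst;
	rs->force = 1;
	return DEMO_REWRITE;
}
```

A caller would walk its refspec array, delete the entries reported as
DEMO_DROP and keep the rest.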
There are some interesting cases that are worth testing.
An earlier version of this change dropped the "i--" from the loop that
deletes a refspec item and shifts the remaining entries down. This
allowed some refspecs to not be modified. The subtle part about the
first --prefetch test is that the "refs/tags/*" refspec appears directly
before the "refs/heads/bogus/*" refspec. Without that "i--", this
ordering would remove the "refs/tags/*" refspec and leave the last one
unmodified, placing the result in "refs/heads/*".
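The role of the "i--" can be shown with a generic delete-while-iterating
loop. demo_remove_prefixed() is a hypothetical helper written for this
illustration and operates on plain strings rather than refspec items.

```c
#include <string.h>

/*
 * Removes every string starting with prefix from items[0..nr) and returns
 * the new count. After shifting the tail down, index i holds an element
 * that has not been examined yet, hence the "i--".
 */
static int demo_remove_prefixed(const char **items, int nr,
				const char *prefix)
{
	size_t plen = strlen(prefix);
	int i;

	for (i = 0; i < nr; i++) {
		if (strncmp(items[i], prefix, plen))
			continue;
		/* Shift the remaining entries down over the deleted one. */
		memmove(&items[i], &items[i + 1],
			(nr - i - 1) * sizeof(*items));
		nr--;
		i--;	/* re-examine the entry that moved into slot i */
	}
	return nr;
}
```

Without the "i--", an entry that slides into the just-vacated slot is
skipped, which is exactly what happens to the entry that directly follows
a deleted one.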
It is possible to have an empty refspec. This is typically the case for
remotes other than the origin, where users want to fetch a specific tag
or branch. To correctly test this case, we need to further remove the
upstream remote for the local branch. Thus, we are testing a refspec
that will be deleted, leaving nothing to fetch.
Helped-by: Tom Saeger <tom.saeger@oracle.com>
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one to avoid missing files.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one so we do not miss blobs to scan. Later, this can
integrate more carefully with sparse indexes with proper testing.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When verifying all blobs reachable from the index, ensure that a sparse
index has been expanded to a full one to avoid missing some blobs.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index has
been expanded to a full one to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These two loops iterate over all cache entries, so ensure that a sparse
index is expanded to a full index before we do so.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries in the checkout builtin, ensure
that we have a full index to avoid any unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before we iterate over all cache entries, ensure that the index is not
sparse. This loop in checkout_all() might be safe to iterate over a
sparse index, but let's put this protection here until it can be
carefully tested.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several methods specify that they take a 'struct index_state' pointer
with the 'const' qualifier because they intend to only query the data,
not change it. However, we will be introducing a step very low in the
method stack that might modify a sparse-index to become a full index in
the case that our queries venture inside a sparse-directory entry.
This change only removes the 'const' qualifiers that are necessary for
the following change which will actually modify the implementation of
index_name_stage_pos().
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A configuration variable has been added to force tips of certain
refs to be given a reachability bitmap.
* tb/pack-preferred-tips-to-give-bitmap:
builtin/pack-objects.c: respect 'pack.preferBitmapTips'
t/helper/test-bitmap.c: initial commit
pack-bitmap: add 'test_bitmap_commits()' helper
All of the other lookup_foo() functions take a repository argument, but
lookup_unknown_object() was never converted, and it uses the_repository
internally. Let's fix that.
We could leave a wrapper that uses the_repository, but there aren't that
many calls, so we'll just convert them all. I looked briefly at each
site to see if we had a repository struct (besides the_repository) we
could pass, but none of them do (so this conversion to pass
the_repository is a pure noop in each case, though it does take us one
step closer to eventually getting rid of the_repository).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When serving a clone or fetch with bitmaps, after deciding which objects
need to be sent our "pack reuse" mechanism kicks in: we try to send
more-or-less verbatim a bunch of objects from the beginning of the
bitmapped packfile without even adding them to the to_pack.objects
array.
After deciding which objects will be in the "reused" portion, we update
nr_result to account for those, and then trigger display_progress() to
show the user (who is undoubtedly dazzled that we managed to enumerate
so many objects so quickly).
But then something confusing happens: the "Enumerating objects" progress
meter jumps _backwards_, counting up from zero the number of objects we
actually add into to_pack.objects.
This worked correctly once upon a time, but was broken in 5af050437a
(pack-objects: show some progress when counting kept objects,
2018-04-15), when the latter half of that progress meter switched to
using a separate nr_seen counter, rather than nr_result. Nobody noticed
for two reasons:
- prior to the pack-reuse fixes from a14aebeac3 (Merge branch
'jk/packfile-reuse-cleanup', 2020-02-14), the reuse code almost
never kicked in anyway
- the output looks _kind of_ correct. The "backwards" moment is hard
to catch, because we overwrite the old progress number with the new
one, and the larger number is displayed only for a second. So unless
you look at that exact second, you just see the much smaller value,
counting up to the number of non-reused objects (though of course if
you catch it in stderr, or look at GIT_TRACE_PACKET from a server
with bitmaps, you can see both values).
This smaller output isn't wrong per se, but isn't counting what we ever
intended to. We should give the user the whole number of objects we
considered (which, as per 5af050437a's original purpose, is already
_not_ a count of what goes into to_pack.objects). The follow-on
"Counting objects" meter shows the actual number of objects we feed into
that array.
We can easily fix this by bumping (and showing) nr_seen for the
pack-reused objects. When the included test is run without this patch,
the second pack-objects invocation produces "Enumerating objects: 1" to
show the one loose object, even though the resulting pack has hundreds
of objects in it. With it, we jump to "Enumerating objects: 674" after
deciding on reuse, and then "675" when we add in the loose object.
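
The counting described above can be modeled with a small sketch. The
struct and helpers below are hypothetical stand-ins, not the actual
pack-objects code; they only show the reused objects being added to
nr_seen as well as nr_result, so that the displayed value never moves
backwards.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the counters in pack-objects. */
struct toy_counters {
	uint32_t nr_seen;    /* objects considered; what the meter shows */
	uint32_t nr_result;  /* objects that end up in the resulting pack */
};

/* Account for objects sent via pack reuse; returns the value to show. */
static uint32_t toy_account_reused(struct toy_counters *c, uint32_t reuse_nr)
{
	assert(c);
	c->nr_result += reuse_nr;
	c->nr_seen += reuse_nr;  /* the fix: reused objects count as seen */
	return c->nr_seen;
}

/* Account for one object added individually; returns the value to show. */
static uint32_t toy_account_added(struct toy_counters *c)
{
	assert(c);
	c->nr_result++;
	return ++c->nr_seen;
}
```

With 674 reused objects followed by one loose object, this model shows
674 and then 675, matching the numbers quoted for the included test.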
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git add` refrains from adding or updating index entries that are
outside the current sparse checkout, but `git rm` doesn't follow the
same restriction. This is somewhat counter-intuitive and inconsistent.
So make `rm` honor the sparsity rules and advise on how to remove
SKIP_WORKTREE entries just like `add` does. Also add some tests for the
new behavior.
Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git add` already refrains from updating SKIP_WORKTREE entries, but it
silently exits with zero code when it is asked to do so. Instead, let's
warn the user and display a hint on how to update these entries.
Note that we only warn the user when they give a pathspec item that
matches no eligible path for updating, but it does match one or more
SKIP_WORKTREE entries. A warning was chosen over erroring out right away
to reproduce the same behavior `add` already exhibits with ignored
files. This also allows users to continue their workflow without having
to invoke `add` again with only the eligible paths (as those will have
already been added).
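
The rule for when to warn can be sketched as follows. This is only an
illustration under stated assumptions: the toy_* names are made up, and
a plain prefix comparison stands in for git's real pathspec matching.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical index entry: a path plus its SKIP_WORKTREE bit. */
struct toy_entry {
	const char *name;
	int skip_worktree;
};

enum toy_outcome {
	TOY_UPDATED,      /* at least one eligible path matched */
	TOY_WARN_SPARSE,  /* only SKIP_WORKTREE entries matched */
	TOY_NO_MATCH      /* nothing matched at all */
};

/* Assumption: prefix matching replaces real pathspec matching. */
static enum toy_outcome toy_classify(const struct toy_entry *entries,
				     size_t nr, const char *pathspec)
{
	size_t i, eligible = 0, skipped = 0;
	size_t len;

	assert(pathspec && (entries || !nr));
	len = strlen(pathspec);

	for (i = 0; i < nr; i++) {
		if (strncmp(entries[i].name, pathspec, len))
			continue;
		if (entries[i].skip_worktree)
			skipped++;
		else
			eligible++;
	}

	/* Any eligible match means no warning for this pathspec item. */
	if (eligible)
		return TOY_UPDATED;
	return skipped ? TOY_WARN_SPARSE : TOY_NO_MATCH;
}
```

A pathspec item that matches both an eligible path and a SKIP_WORKTREE
entry falls in the first case, so the eligible path is updated quietly.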
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new enum parameter to `add_pathspec_matches_against_index()` and
`find_pathspecs_matching_against_index()`, allowing callers to specify
whether these functions should attempt to match SKIP_WORKTREE entries or
not. This will be used in a future patch to make `git add` display a
warning when it is asked to update SKIP_WORKTREE entries.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `git add --refresh <pathspec>` doesn't find any matches for the
given pathspec, it prints an error message using the `match` field of
the `struct pathspec_item`. However, this field doesn't contain the
magic part of the pathspec. Instead, let's use the `original` field.
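
To make the difference concrete, here is a minimal sketch with a
cut-down struct that keeps only the two fields discussed above. The
toy_* names are hypothetical and the message wording is illustrative;
the real `struct pathspec_item` has more members.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, cut-down version of the two relevant fields. */
struct toy_pathspec_item {
	const char *match;     /* pattern with the magic part stripped */
	const char *original;  /* pathspec exactly as the user typed it */
};

/*
 * Format the error using 'original', so that magic such as
 * ":(icase)" is still visible to the user.
 */
static int toy_format_no_match(char *buf, size_t len,
			       const struct toy_pathspec_item *item)
{
	assert(buf && item && item->original);
	return snprintf(buf, len, "pathspec '%s' did not match any files",
			item->original);
}
```

For an item whose `original` is ":(icase)foo" and whose `match` is
"foo", the message now quotes the full ":(icase)foo".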
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>