The formatting in t7900-subtree.sh isn't even consistent throughout the
file. Fix that; make it consistent throughout the file.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use test-lib.sh's `test_count`, instead instead of having
t7900-subtree.sh do its own book-keeping with `subtree_test_count` that
has to be explicitly incremented by calling `next_test`.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of the tests had been converted to support
`GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`, but `contrib/subtree/t/`
hadn't.
Convert it. Most of the mentions of 'master' can just be replaced with
'HEAD'.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running `make -C contrib/subtree/ test` creates a `git-subtree` executable
in the root of the repo. Add it to the .gitignore so that anyone hacking
on subtree won't have to deal with that noise.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `git repack -A -d` is run in a partial clone, `pack-objects`
is invoked twice: once to repack all promisor objects, and once to
repack all non-promisor objects. The latter `pack-objects` invocation
is with --exclude-promisor-objects and --unpack-unreachable, which
loosens all objects unused during this invocation. Unfortunately,
this includes promisor objects.
Because the -d argument to `git repack` subsequently deletes all loose
objects also in packs, these just-loosened promisor objects will be
immediately deleted. However, this extra disk churn is unnecessary in
the first place. For example, in a newly-cloned partial repo that
filters all blob objects (e.g. `--filter=blob:none`), `repack` ends up
unpacking all trees and commits into the filesystem because every
object, in this particular case, is a promisor object. Depending on
the repo size, this increases the disk usage considerably: In my copy
of the linux.git, the object directory peaked 26GB of more disk usage.
In order to avoid this extra disk churn, pass the names of the promisor
packfiles as --keep-pack arguments to the second invocation of
`pack-objects`. This informs `pack-objects` that the promisor objects
are already in a safe packfile and, therefore, do not need to be
loosened.
For testing, we need to validate whether any object was loosened.
However, the "evidence" (loosened objects) is deleted during the
process which prevents us from inspecting the object directory.
Instead, let's teach `pack-objects` to count loosened objects and
emit via trace2 thus allowing inspecting the debug events after the
process is finished. This new event is used on the added regression
test.
Lastly, add a new perf test to evaluate the performance impact
made by this changes (tested on git.git):
Test HEAD^ HEAD
----------------------------------------------------------
5600.3: gc 134.38(41.93+90.95) 7.80(6.72+1.35) -94.2%
For a bigger repository, such as linux.git, the improvement is
even bigger:
Test HEAD^ HEAD
-------------------------------------------------------------------
5600.3: gc 6833.00(918.07+3162.74) 268.79(227.02+39.18) -96.1%
These improvements are particular big because every object in the
newly-cloned partial repository is a promisor object.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
From the documentation for generating patch text with diff-related
commands, refer to the documentation for the diff attribute.
This attribute influences the way that patches are generated, but this
was previously not mentioned in e.g., the git-diff manpage.
Signed-off-by: Peter Oliver <git@mavit.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse_pathspec() populates pathspec, hence we need to clear it once it's
no longer needed. seen is xcalloc'd within the same function and
likewise needs to be freed once its no longer needed.
cmd_rm() has multiple early returns, therefore we need to clear or free
as soon as this data is no longer needed, as opposed to doing a cleanup
at the end.
LSAN output from t0020:
Direct leak of 112 byte(s) in 1 object(s) allocated from:
#0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9ac0a4 in do_xmalloc wrapper.c:41:8
#2 0x9ac07a in xmalloc wrapper.c:62:9
#3 0x873277 in parse_pathspec pathspec.c:582:2
#4 0x646ffa in cmd_rm builtin/rm.c:266:2
#5 0x4cd91d in run_builtin git.c:467:11
#6 0x4cb5f3 in handle_builtin git.c:719:3
#7 0x4ccf47 in run_argv git.c:808:4
#8 0x4caf49 in cmd_main git.c:939:19
#9 0x69dc0e in main common-main.c:52:11
#10 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 65 byte(s) in 1 object(s) allocated from:
#0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9ac2a6 in xrealloc wrapper.c:126:8
#2 0x93b14d in strbuf_grow strbuf.c:98:2
#3 0x93ccf6 in strbuf_vaddf strbuf.c:392:3
#4 0x93f726 in xstrvfmt strbuf.c:979:2
#5 0x93f8b3 in xstrfmt strbuf.c:989:8
#6 0x92ad8a in prefix_path_gently setup.c:115:15
#7 0x873a8d in init_pathspec_item pathspec.c:439:11
#8 0x87334f in parse_pathspec pathspec.c:589:3
#9 0x646ffa in cmd_rm builtin/rm.c:266:2
#10 0x4cd91d in run_builtin git.c:467:11
#11 0x4cb5f3 in handle_builtin git.c:719:3
#12 0x4ccf47 in run_argv git.c:808:4
#13 0x4caf49 in cmd_main git.c:939:19
#14 0x69dc0e in main common-main.c:52:11
#15 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 15 byte(s) in 1 object(s) allocated from:
#0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
#1 0x9ac048 in xstrdup wrapper.c:29:14
#2 0x873ba2 in init_pathspec_item pathspec.c:468:20
#3 0x87334f in parse_pathspec pathspec.c:589:3
#4 0x646ffa in cmd_rm builtin/rm.c:266:2
#5 0x4cd91d in run_builtin git.c:467:11
#6 0x4cb5f3 in handle_builtin git.c:719:3
#7 0x4ccf47 in run_argv git.c:808:4
#8 0x4caf49 in cmd_main git.c:939:19
#9 0x69dc0e in main common-main.c:52:11
#10 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 1 byte(s) in 1 object(s) allocated from:
#0 0x49a9d2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
#1 0x9ac392 in xcalloc wrapper.c:140:8
#2 0x647108 in cmd_rm builtin/rm.c:294:9
#3 0x4cd91d in run_builtin git.c:467:11
#4 0x4cb5f3 in handle_builtin git.c:719:3
#5 0x4ccf47 in run_argv git.c:808:4
#6 0x4caf49 in cmd_main git.c:939:19
#7 0x69dbfe in main common-main.c:52:11
#8 0x7f4fac1b0349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
options.git_format_patch_opt can be populated during cmd_rebase's setup,
and will therefore leak on return. Although we could just UNLEAK all of
options, we choose to strbuf_release() the individual member, which matches
the existing pattern (where we're freeing invidual members of options).
Leak found when running t0021:
Direct leak of 24 byte(s) in 1 object(s) allocated from:
#0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9ac296 in xrealloc wrapper.c:126:8
#2 0x93b13d in strbuf_grow strbuf.c:98:2
#3 0x93bd3a in strbuf_add strbuf.c:295:2
#4 0x60ae92 in strbuf_addstr strbuf.h:304:2
#5 0x605f17 in cmd_rebase builtin/rebase.c:1759:3
#6 0x4cd91d in run_builtin git.c:467:11
#7 0x4cb5f3 in handle_builtin git.c:719:3
#8 0x4ccf47 in run_argv git.c:808:4
#9 0x4caf49 in cmd_main git.c:939:19
#10 0x69dbfe in main common-main.c:52:11
#11 0x7f66dae91349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 24 byte(s) leaked in 1 allocation(s).
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
sorting might be a list allocated in ref_default_sorting() (in this case
it's a fixed single item list, which has nevertheless been xcalloc'd),
or it might be a list allocated in parse_opt_ref_sorting(). In either
case we could free these lists - but instead we UNLEAK as we're at the
end of cmd_for_each_ref. (There's no existing implementation of
clear_ref_sorting(), and writing a loop to free the list seems more
trouble than it's worth.)
filter.with_commit/no_commit are populated via
OPT_CONTAINS/OPT_NO_CONTAINS, both of which create new entries via
parse_opt_commits(), and also need to be free'd or UNLEAK'd. Because
free_commit_list() already exists, we choose to use that over an UNLEAK.
LSAN output from t0041:
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x49a9d2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
#1 0x9ac252 in xcalloc wrapper.c:140:8
#2 0x8a4a55 in ref_default_sorting ref-filter.c:2486:32
#3 0x56c6b1 in cmd_for_each_ref builtin/for-each-ref.c:72:13
#4 0x4cd91d in run_builtin git.c:467:11
#5 0x4cb5f3 in handle_builtin git.c:719:3
#6 0x4ccf47 in run_argv git.c:808:4
#7 0x4caf49 in cmd_main git.c:939:19
#8 0x69dabe in main common-main.c:52:11
#9 0x7f2bdc570349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9abf54 in do_xmalloc wrapper.c:41:8
#2 0x9abf2a in xmalloc wrapper.c:62:9
#3 0x717486 in commit_list_insert commit.c:540:33
#4 0x8644cf in parse_opt_commits parse-options-cb.c:98:2
#5 0x869bb5 in get_value parse-options.c:181:11
#6 0x8677dc in parse_long_opt parse-options.c:378:10
#7 0x8659bd in parse_options_step parse-options.c:817:11
#8 0x867fcd in parse_options parse-options.c:870:10
#9 0x56c62b in cmd_for_each_ref builtin/for-each-ref.c:59:2
#10 0x4cd91d in run_builtin git.c:467:11
#11 0x4cb5f3 in handle_builtin git.c:719:3
#12 0x4ccf47 in run_argv git.c:808:4
#13 0x4caf49 in cmd_main git.c:939:19
#14 0x69dabe in main common-main.c:52:11
#15 0x7f2bdc570349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
mailinfo.p_hdr_info/s_hdr_info are null-terminated lists of strbuf's,
with entries pointing either to NULL or an allocated strbuf. Therefore
we need to free those strbuf's (and not just the data they contain)
whenever we're done with a given entry. (See handle_header() where those
new strbufs are malloc'd.)
Once we no longer need the list (and not just its entries) we can switch
over to strbuf_list_free() instead of manually iterating over the list,
which takes care of those additional details for us. We can only do this
in clear_mailinfo() - in handle_commit_message() we are only clearing the
array contents but want to reuse the array itself, hence we can't use
strbuf_list_free() there.
However, strbuf_list_free() cannot handle a NULL input, and the lists we
are freeing might be NULL. Therefore we add a NULL check in
strbuf_list_free() to make it safe to use with a NULL input (which is a
pattern used by some of the other *_free() functions around git).
Leak output from t0023:
Direct leak of 72 byte(s) in 3 object(s) allocated from:
#0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9ac9f4 in do_xmalloc wrapper.c:41:8
#2 0x9ac9ca in xmalloc wrapper.c:62:9
#3 0x7f6cf7 in handle_header mailinfo.c:205:10
#4 0x7f5abf in check_header mailinfo.c:583:4
#5 0x7f5524 in mailinfo mailinfo.c:1197:3
#6 0x4dcc95 in parse_mail builtin/am.c:1167:6
#7 0x4d9070 in am_run builtin/am.c:1732:12
#8 0x4d5b7a in cmd_am builtin/am.c:2398:3
#9 0x4cd91d in run_builtin git.c:467:11
#10 0x4cb5f3 in handle_builtin git.c:719:3
#11 0x4ccf47 in run_argv git.c:808:4
#12 0x4caf49 in cmd_main git.c:939:19
#13 0x69e43e in main common-main.c:52:11
#14 0x7fc1fadfa349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 72 byte(s) leaked in 3 allocation(s).
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
add_pending_object() populates rev.pending, we need to take care of
clearing it once we're done.
This code is run close to the end of a checkout, therefore this leak
seems like it would have very little impact. See also LSAN output
from t0020 below:
Direct leak of 2048 byte(s) in 1 object(s) allocated from:
#0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9acc46 in xrealloc wrapper.c:126:8
#2 0x83e3a3 in add_object_array_with_path object.c:337:3
#3 0x8f672a in add_pending_object_with_path revision.c:329:2
#4 0x8eaeab in add_pending_object_with_mode revision.c:336:2
#5 0x8eae9d in add_pending_object revision.c:342:2
#6 0x5154a0 in show_local_changes builtin/checkout.c:602:2
#7 0x513b00 in merge_working_tree builtin/checkout.c:979:3
#8 0x512cb3 in switch_branches builtin/checkout.c:1242:9
#9 0x50f8de in checkout_branch builtin/checkout.c:1646:9
#10 0x50ba12 in checkout_main builtin/checkout.c:2003:9
#11 0x5086c0 in cmd_checkout builtin/checkout.c:2055:8
#12 0x4cd91d in run_builtin git.c:467:11
#13 0x4cb5f3 in handle_builtin git.c:719:3
#14 0x4ccf47 in run_argv git.c:808:4
#15 0x4caf49 in cmd_main git.c:939:19
#16 0x69e43e in main common-main.c:52:11
#17 0x7f5dd1d50349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 2048 byte(s) leaked in 1 allocation(s).
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse_pathspec() allocates new memory into pathspec, therefore we need
to free it when we're done.
An UNLEAK would probably be just as good here - but clear_pathspec() is
not much more work so we might as well use it. check_ignore() is either
called once directly from cmd_check_ignore() (in which case the leak
really doesnt matter), or it can be called multiple times in a loop from
check_ignore_stdin_paths(), in which case we're potentially leaking
multiple times - but even in this scenario the leak is so small as to
have no real consequence.
Found while running t0008:
Direct leak of 112 byte(s) in 1 object(s) allocated from:
#0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9aca44 in do_xmalloc wrapper.c:41:8
#2 0x9aca1a in xmalloc wrapper.c:62:9
#3 0x873c17 in parse_pathspec pathspec.c:582:2
#4 0x503eb8 in check_ignore builtin/check-ignore.c:90:2
#5 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17
#6 0x4cd91d in run_builtin git.c:467:11
#7 0x4cb5f3 in handle_builtin git.c:719:3
#8 0x4ccf47 in run_argv git.c:808:4
#9 0x4caf49 in cmd_main git.c:939:19
#10 0x69e43e in main common-main.c:52:11
#11 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 65 byte(s) in 1 object(s) allocated from:
#0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9acc46 in xrealloc wrapper.c:126:8
#2 0x93baed in strbuf_grow strbuf.c:98:2
#3 0x93d696 in strbuf_vaddf strbuf.c:392:3
#4 0x9400c6 in xstrvfmt strbuf.c:979:2
#5 0x940253 in xstrfmt strbuf.c:989:8
#6 0x92b72a in prefix_path_gently setup.c:115:15
#7 0x87442d in init_pathspec_item pathspec.c:439:11
#8 0x873cef in parse_pathspec pathspec.c:589:3
#9 0x503eb8 in check_ignore builtin/check-ignore.c:90:2
#10 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17
#11 0x4cd91d in run_builtin git.c:467:11
#12 0x4cb5f3 in handle_builtin git.c:719:3
#13 0x4ccf47 in run_argv git.c:808:4
#14 0x4caf49 in cmd_main git.c:939:19
#15 0x69e43e in main common-main.c:52:11
#16 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 2 byte(s) in 1 object(s) allocated from:
#0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
#1 0x9ac9e8 in xstrdup wrapper.c:29:14
#2 0x874542 in init_pathspec_item pathspec.c:468:20
#3 0x873cef in parse_pathspec pathspec.c:589:3
#4 0x503eb8 in check_ignore builtin/check-ignore.c:90:2
#5 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17
#6 0x4cd91d in run_builtin git.c:467:11
#7 0x4cb5f3 in handle_builtin git.c:719:3
#8 0x4ccf47 in run_argv git.c:808:4
#9 0x4caf49 in cmd_main git.c:939:19
#10 0x69e43e in main common-main.c:52:11
#11 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 179 byte(s) leaked in 3 allocation(s).
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
prefix_filename() returns newly allocated memory, and strbuf_addstr()
doesn't take ownership of its inputs. Therefore we have to make sure to
store and free prefix_filename()'s result.
As this leak is in cmd_bugreport(), we could just as well UNLEAK the
prefix - but there's no good reason not to just free it properly. This
leak was found while running t0091, see output below:
Direct leak of 24 byte(s) in 1 object(s) allocated from:
#0 0x49ab79 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9acc66 in xrealloc wrapper.c:126:8
#2 0x93baed in strbuf_grow strbuf.c:98:2
#3 0x93c6ea in strbuf_add strbuf.c:295:2
#4 0x69f162 in strbuf_addstr ./strbuf.h:304:2
#5 0x69f083 in prefix_filename abspath.c:277:2
#6 0x4fb275 in cmd_bugreport builtin/bugreport.c:146:9
#7 0x4cd91d in run_builtin git.c:467:11
#8 0x4cb5f3 in handle_builtin git.c:719:3
#9 0x4ccf47 in run_argv git.c:808:4
#10 0x4caf49 in cmd_main git.c:939:19
#11 0x69df9e in main common-main.c:52:11
#12 0x7f523a987349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
real_ref was previously populated by dwim_ref(), which allocates new
memory. We need to make sure to free real_ref when discarding it.
(real_ref is already being freed at the end of create_branch() - but
if we discard it early then it will leak.)
This fixes the following leak found while running t0002-t0099:
Direct leak of 5 byte(s) in 1 object(s) allocated from:
#0 0x486954 in strdup /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
#1 0xdd6484 in xstrdup wrapper.c:29:14
#2 0xc0f658 in expand_ref refs.c:671:12
#3 0xc0ecf1 in repo_dwim_ref refs.c:644:22
#4 0x8b1184 in dwim_ref ./refs.h:162:9
#5 0x8b0b02 in create_branch branch.c:284:10
#6 0x550cbb in update_refs_for_switch builtin/checkout.c:1046:4
#7 0x54e275 in switch_branches builtin/checkout.c:1274:2
#8 0x548828 in checkout_branch builtin/checkout.c:1668:9
#9 0x541306 in checkout_main builtin/checkout.c:2025:9
#10 0x5395fa in cmd_checkout builtin/checkout.c:2077:8
#11 0x4d02a8 in run_builtin git.c:467:11
#12 0x4cbfe9 in handle_builtin git.c:719:3
#13 0x4cf04f in run_argv git.c:808:4
#14 0x4cb85a in cmd_main git.c:939:19
#15 0x820cf6 in main common-main.c:52:11
#16 0x7f30bd9dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fill_bloom_key() allocates memory into bloom_key, we need to clean that
up once the key is no longer needed.
This leak was found while running t0002-t0099. Although this leak is
happening in code being called from a test-helper, the same code is also
used in various locations around git, and can therefore happen during
normal usage too. Gabor's analysis shows that peak-memory usage during
'git commit-graph write' is reduced on the order of 10% for a selection
of larger repos (along with an even larger reduction if we override
modified path bloom filter limits):
https://lore.kernel.org/git/20210411072651.GF2947267@szeder.dev/
LSAN output:
Direct leak of 308 byte(s) in 11 object(s) allocated from:
#0 0x49a5e2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
#1 0x6f4032 in xcalloc wrapper.c:140:8
#2 0x4f2905 in fill_bloom_key bloom.c:137:28
#3 0x4f34c1 in get_or_compute_bloom_filter bloom.c:284:4
#4 0x4cb484 in get_bloom_filter_for_commit t/helper/test-bloom.c:43:11
#5 0x4cb072 in cmd__bloom t/helper/test-bloom.c:97:3
#6 0x4ca7ef in cmd_main t/helper/test-tool.c:121:11
#7 0x4caace in main common-main.c:52:11
#8 0x7f798af95349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 308 byte(s) leaked in 11 allocation(s).
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
common_prefix() returns a new string, which we store in max_prefix -
this string needs to be freed to avoid a leak. This leak is happening
in cmd_ls_files, hence is of no real consequence - an UNLEAK would be
just as good, but we might as well free the string properly.
Leak found while running t0002, see output below:
Direct leak of 8 byte(s) in 1 object(s) allocated from:
#0 0x49a85d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9ab1b4 in do_xmalloc wrapper.c:41:8
#2 0x9ab248 in do_xmallocz wrapper.c:75:8
#3 0x9ab22a in xmallocz wrapper.c:83:9
#4 0x9ab2d7 in xmemdupz wrapper.c:99:16
#5 0x78d6a4 in common_prefix dir.c:191:15
#6 0x5aca48 in cmd_ls_files builtin/ls-files.c:669:16
#7 0x4cd92d in run_builtin git.c:453:11
#8 0x4cb5fa in handle_builtin git.c:704:3
#9 0x4ccf57 in run_argv git.c:771:4
#10 0x4caf49 in cmd_main git.c:902:19
#11 0x69ce2e in main common-main.c:52:11
#12 0x7f64d4d94349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
rev.prune_data is populated (in multiple functions) via copy_pathspec,
and therefore needs to be cleared after running the diff in those
functions.
rev(_info).pending is populated indirectly via setup_revisions, and also
needs to be cleared once diffing is done.
These leaks were found while running t0008 or t0021. The rev.prune_data
leaks are small (80B) but noisy, hence I won't bother including their
logs - the rev.pending leaks are bigger, and can happen early in the
course of other commands, and therefore possibly more valuable to fix -
see example log from a rebase below:
Direct leak of 2048 byte(s) in 1 object(s) allocated from:
#0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9ac2a6 in xrealloc wrapper.c:126:8
#2 0x83da03 in add_object_array_with_path object.c:337:3
#3 0x8f5d8a in add_pending_object_with_path revision.c:329:2
#4 0x8ea50b in add_pending_object_with_mode revision.c:336:2
#5 0x8ea4fd in add_pending_object revision.c:342:2
#6 0x8ea610 in add_head_to_pending revision.c:354:2
#7 0x9b55f5 in has_uncommitted_changes wt-status.c:2474:2
#8 0x9b58c4 in require_clean_work_tree wt-status.c:2553:6
#9 0x606bcc in cmd_rebase builtin/rebase.c:1970:6
#10 0x4cd91d in run_builtin git.c:467:11
#11 0x4cb5f3 in handle_builtin git.c:719:3
#12 0x4ccf47 in run_argv git.c:808:4
#13 0x4caf49 in cmd_main git.c:939:19
#14 0x69dc0e in main common-main.c:52:11
#15 0x7f2d18909349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 5 byte(s) in 1 object(s) allocated from:
#0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
#1 0x9ac048 in xstrdup wrapper.c:29:14
#2 0x83da8d in add_object_array_with_path object.c:349:17
#3 0x8f5d8a in add_pending_object_with_path revision.c:329:2
#4 0x8ea50b in add_pending_object_with_mode revision.c:336:2
#5 0x8ea4fd in add_pending_object revision.c:342:2
#6 0x8ea610 in add_head_to_pending revision.c:354:2
#7 0x9b55f5 in has_uncommitted_changes wt-status.c:2474:2
#8 0x9b58c4 in require_clean_work_tree wt-status.c:2553:6
#9 0x606bcc in cmd_rebase builtin/rebase.c:1970:6
#10 0x4cd91d in run_builtin git.c:467:11
#11 0x4cb5f3 in handle_builtin git.c:719:3
#12 0x4ccf47 in run_argv git.c:808:4
#13 0x4caf49 in cmd_main git.c:939:19
#14 0x69dc0e in main common-main.c:52:11
#15 0x7f2d18909349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 2053 byte(s) leaked in 2 allocation(s).
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
limit_list() iterates over the original revs->commits list, and consumes
many of its entries via pop_commit. However we might stop iterating over
the list early (e.g. if we realise that the rest of the list is
uninteresting). If we do stop iterating early, list will be pointing to
the unconsumed portion of revs->commits - and we need to free this list
to avoid a leak. (revs->commits itself will be an invalid pointer: it
will have been free'd during the first pop_commit.)
However the list pointer is later reused to iterate over our new list,
but only for the limiting_can_increase_treesame() branch. We therefore
need to introduce a new variable for that branch - and while we're here
we can rename the original list to original_list as that makes its
purpose more obvious.
This leak was found while running t0090. It's not likely to be very
impactful, but it can happen quite early during some checkout
invocations, and hence seems to be worth fixing:
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9ac084 in do_xmalloc wrapper.c:41:8
#2 0x9ac05a in xmalloc wrapper.c:62:9
#3 0x7175d6 in commit_list_insert commit.c:540:33
#4 0x71800f in commit_list_insert_by_date commit.c:604:9
#5 0x8f8d2e in process_parents revision.c:1128:5
#6 0x8f2f2c in limit_list revision.c:1418:7
#7 0x8f210e in prepare_revision_walk revision.c:3577:7
#8 0x514170 in orphaned_commit_warning builtin/checkout.c:1185:6
#9 0x512f05 in switch_branches builtin/checkout.c:1250:3
#10 0x50f8de in checkout_branch builtin/checkout.c:1646:9
#11 0x50ba12 in checkout_main builtin/checkout.c:2003:9
#12 0x5086c0 in cmd_checkout builtin/checkout.c:2055:8
#13 0x4cd91d in run_builtin git.c:467:11
#14 0x4cb5f3 in handle_builtin git.c:719:3
#15 0x4ccf47 in run_argv git.c:808:4
#16 0x4caf49 in cmd_main git.c:939:19
#17 0x69dc0e in main common-main.c:52:11
#18 0x7faaabd0e349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 48 byte(s) in 3 object(s) allocated from:
#0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9ac084 in do_xmalloc wrapper.c:41:8
#2 0x9ac05a in xmalloc wrapper.c:62:9
#3 0x717de6 in commit_list_append commit.c:1609:35
#4 0x8f1f9b in prepare_revision_walk revision.c:3554:12
#5 0x514170 in orphaned_commit_warning builtin/checkout.c:1185:6
#6 0x512f05 in switch_branches builtin/checkout.c:1250:3
#7 0x50f8de in checkout_branch builtin/checkout.c:1646:9
#8 0x50ba12 in checkout_main builtin/checkout.c:2003:9
#9 0x5086c0 in cmd_checkout builtin/checkout.c:2055:8
#10 0x4cd91d in run_builtin git.c:467:11
#11 0x4cb5f3 in handle_builtin git.c:719:3
#12 0x4ccf47 in run_argv git.c:808:4
#13 0x4caf49 in cmd_main git.c:939:19
#14 0x69dc0e in main common-main.c:52:11
#15 0x7faaabd0e349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that all code paths correctly set the hash algorithm member of
struct object_id, write an object's hex representation using the hash
algorithm member embedded in it.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are numerous places in the codebase where we assume we can
initialize data by zeroing all its bytes. However, when we do that with
a struct object_id, it leaves the structure with a zero value for the
algorithm, which is invalid.
We could forbid this pattern and require that all struct object_id
instances be initialized using oidclr, but this seems burdensome and
it's unnatural to most C programmers. Instead, if the algorithm is
zero, assume we wanted to use the default hash algorithm instead.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use struct object_id for the names of objects. It isn't intended to
be used for other hash values that don't name objects such as the pack
hash.
Because struct object_id will soon need to have its algorithm member
set, using it in this code path would mean that we didn't set that
member, only the hash member, which would result in a crash. For both
of these reasons, switch to using an unsigned char array of size
GIT_MAX_RAWSZ.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The idea behind struct object_id is that it is supposed to represent the
identifier of a standard Git object or a special pseudo-object like the
all-zeros object ID. In this case, we have file hashes, which, while
similar, are distinct from the identifiers of objects.
Switch these code paths to use an unsigned char array. This is both
more logically consistent and it means that we need not set the
algorithm identifier for the struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In most cases, when we load the hash of an object into a struct
object_id, we load it using one of the oid* or *_oid_hex functions.
However, for git show-index, we read it in directly using fread. As a
consequence, set the algorithm correctly so the objects can be used
correctly both now and in the future.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Up until recently, object IDs did not have an algorithm member, only a
hash. Consequently, it was possible to share one null (all-zeros)
object ID among all hash algorithms. Now that we're going to be
handling objects from multiple hash algorithms, it's important to make
sure that all object IDs have a correct algorithm field.
Introduce a per-algorithm null OID, and add it to struct hash_algo.
Introduce a wrapper function as well, and use it everywhere we used to
use the null_oid constant.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that struct object_id has an algorithm field, we should populate it.
This will allow us to handle object IDs in any supported algorithm and
distinguish between them. Ensure that the field is written whenever we
write an object ID by storing it explicitly every time we write an
object. Set values for the empty blob and tree values as well.
In addition, use the algorithm field to compare object IDs. Note that
because we zero-initialize struct object_id in many places throughout
the codebase, we default to the default algorithm in cases where the
algorithm field is zero rather than explicitly initialize all of those
locations.
This leads to a branch on every comparison, but the alternative is to
compare the entire buffer each time and padding the buffer for SHA-1.
That alternative ranges up to 3.9% worse than this approach on the perf
t0001, t1450, and t1451.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we need our instances of struct object_id to be zero padded, we
can no longer cast unsigned char buffers to be pointers to struct
object_id. This file reads data out of the pack objects and then
inserts it directly into a linked list item which is a pointer to struct
object_id. Instead, let's have the linked list item hold its own struct
object_id and copy the data into it.
In addition, since these are not really pointers to struct object_id,
stop passing them around as such, and call them what they really are:
pointers to unsigned char.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're hashing a value which is going to be an object ID, we want to
zero-pad that value if necessary. To do so, use the final_oid_fn
instead of the final_fn anytime we're going to create an object ID to
ensure we perform this operation.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To avoid the penalty of having to branch in hash comparison functions,
we'll want to always compare the full hash member in a struct object_id,
which will require that SHA-1 object IDs be zero-padded. To do so, add
a function which finalizes a hash context and writes it into an object
ID that performs this padding.
Move the definition of struct object_id and the constant definitions
higher up so we they are available for us to use.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In most places in the codebase, we use oidread to properly read an
object ID into a struct object_id. However, in the HTTP code, we end up
needing to parse a loose object path with a slash in it, so we can't do
that. Let's instead explicitly set the algorithm in this function so we
can rely on it in the future.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we'll want oidread to automatically set the hash
algorithm member for an object ID we read into it, so ensure we use
oidread instead of hashcpy everywhere we're copying a hash value into a
struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we're working with multiple hash algorithms in the same repo,
it's best if we label each object ID with its algorithm so we can
determine how to format a given object ID. Add a member called algo to
struct object_id.
Performance testing on object ID-heavy workloads doesn't reveal a clear
change in performance. Out of performance tests t0001 and t1450, there
are slight variations in performance both up and down, but all
measurements are within the margin of error.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the placeholders %ah and %ch to format author date and committer
date, like --date=human does, which provides more humanity date output.
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the description for the --date/format equivalency tests added
in 466fb6742d (pretty: provide a strict ISO 8601 date format,
2014-08-29) and 0df621172d (pretty: provide short date format,
2019-11-19) to be more meaningful.
This allows us to reword the comment added in the former commit to
refer to both tests, and any other future test, such as the in-flight
--date=human format being proposed in [1].
1. http://lore.kernel.org/git/pull.939.v2.git.1619275340051.gitgitgadget@gmail.com
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change a needlessly complex test for the %aI/%cI date
formats (iso-strict) added in 466fb6742d (pretty: provide a strict
ISO 8601 date format, 2014-08-29) to instead use the same pattern used
to test %as/%cs since 0df621172d (pretty: provide short date format,
2019-11-19).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The $subcommand case statement in _git_stash() is quite repetitive.
Consolidate the cases together into one catch-all case to reduce the
repetition.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the introduction of the $__git_cmd_idx variable in e94fb44042
(git-completion.bash: pass $__git_subcommand_idx from __git_main(),
2021-03-24), completion functions were able to know the index at which
the git command is listed, allowing them to skip options that are given
to the underlying git itself, not the corresponding command (e.g.
`-C asdf` in `git -C asdf branch`).
While most of the changes here are self-explanatory, some bear further
explanation.
For the __git_find_on_cmdline() and __git_find_last_on_cmdline() pair of
functions, these functions are only ever called in the context of a git
command completion function. These functions will only care about words
after the command so we can safely ignore the words before this.
For _git_worktree(), this change is technically a no-op (once the
__git_find_last_on_cmdline change is also applied). It was in poor style
to have hard-coded on the index right after `worktree`. In case
`git worktree` were to ever learn to accept options, the current
situation would be inflexible.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In e94fb44042 (git-completion.bash: pass $__git_subcommand_idx from
__git_main(), 2021-03-24), the $__git_subcommand_idx variable was
introduced. Naming it after the index of the subcommand is needlessly
confusing as, when this variable is used, it is in the completion
functions for commands (e.g. _git_remote()) where for `git remote add`,
the `remote` is referred to as the command and `add` is referred to as
the subcommand.
Rename this variable so that it's obvious it's about git commands. While
we're at it, shorten up its name so that it's still readable without
being a handful to type.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to test whether the new GIT_CONFIG_SYSTEM environment variable
behaves as expected, we unset GIT_CONFIG_NOSYSTEM in one of our tests in
t1300. But because tests are not executed in a subshell, this unset
leaks into all subsequent tests and may thus cause them to fail in some
environments. These failures are easily reproducable with `make
prefix=/root test`.
Fix the issue by not using `sane_unset GIT_CONFIG_NOSYSTEM`, but instead
just manually add it to the environment of the two command invocations
which need it.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The dependencies for config-list.h and command-list.h were broken
when the former was split out of the latter, which has been
corrected.
* sg/bugreport-fixes:
Makefile: add missing dependencies of 'config-list.h'
Documentation updates, with unrelated comment updates, too.
* ab/usage-error-docs:
api docs: document that BUG() emits a trace2 error event
api docs: document BUG() in api-error-handling.txt
usage.c: don't copy/paste the same comment three times
When "git pack-objects" makes a literal copy of a part of existing
packfile using the reachability bitmaps, its update to the progress
meter was broken.
* jk/pack-objects-bitmap-progress-fix:
pack-objects: update "nr_seen" progress based on pack-reused count
A bit of code clean-up and a lot of test clean-up around userdiff
area.
* ab/userdiff-tests:
blame tests: simplify userdiff driver test
blame tests: don't rely on t/t4018/ directory
userdiff: remove support for "broken" tests
userdiff tests: list builtin drivers via test-tool
userdiff tests: explicitly test "default" pattern
userdiff: add and use for_each_userdiff_driver()
userdiff style: normalize pascal regex declaration
userdiff style: declare patterns with consistent style
userdiff style: re-order drivers in alphabetical order