The exit code of the upstream command in a pipe is ignored, so we
should avoid piping from it. By writing the output of the git command
to a file, we can test the exit codes of both commands.
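
As a minimal sketch of the pattern (the `producer` function is a
hypothetical stand-in for the git command, not code from the test
suite):

```shell
# Hypothetical stand-in for a git command that prints output, then fails.
producer () {
    echo "some output"
    return 3
}

# Piped: the pipeline reports the status of the last command (wc),
# so the failure of producer goes unnoticed.
pipe_status=0
producer | wc -l >/dev/null || pipe_status=$?

# Via a file: each command's status can be inspected on its own.
tmp=$(mktemp)
producer_status=0
producer >"$tmp" || producer_status=$?
consumer_status=0
wc -l <"$tmp" >/dev/null || consumer_status=$?
rm -f "$tmp"

echo "pipe=$pipe_status producer=$producer_status consumer=$consumer_status"
```

This assumes the shell's default behaviour, that is, no `pipefail`
option in effect.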
Signed-off-by: Boxuan Li <liboxuan@connect.hku.hk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Free the commit graph when verify_commit_graph_lite() reports an error.
Credit to OSS-Fuzz for finding this leak.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The multi-pack-index allows searching for objects across multiple
packs using one object list. The original design gains many of
these performance benefits by keeping the packs in the
multi-pack-index out of the packed_git list.
Unfortunately, this has one major drawback. If the multi-pack-index
covers thousands of packs, and a command loads many of those packs,
then we can hit the limit for open file descriptors. The
close_one_pack() method is used to limit this resource, but it
only looks at the packed_git list, and uses an LRU cache to prevent
thrashing.
Instead of complicating this close_one_pack() logic to include
direct references to the multi-pack-index, simply add the packs
opened by the multi-pack-index to the packed_git list. This
immediately solves the file-descriptor limit problem, but requires
some extra steps to avoid performance issues or other problems:
 1. Create a multi_pack_index bit in the packed_git struct that is
    one if and only if the pack was loaded from a multi-pack-index.

 2. Skip packs with the multi_pack_index bit when doing object
    lookups and abbreviations. These algorithms already check the
    multi-pack-index before the packed_git struct. This has a very
    small performance hit, as we need to walk more packed_git
    structs. This is acceptable, since these operations run binary
    search on the other packs, so this walk-and-ignore logic is
    very fast by comparison.

 3. When closing a multi-pack-index file, do not close its packs,
    as those packs will be closed using close_all_packs(). In some
    cases, such as 'git repack', we run 'close_midx()' without also
    closing the packs, so we need to un-set the multi_pack_index bit
    in those packs. This is necessary, and caught by running
    t6501-freshen-objects.sh with GIT_TEST_MULTI_PACK_INDEX=1.
To manually test this change, I inserted trace2 logging into
close_pack_fd() and set pack_max_fds to 10, then ran 'git rev-list
--all --objects' on a copy of the Git repo with 300+ pack-files and
a multi-pack-index. The logs verified the packs are closed as
we read them beyond the file descriptor limit.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Much of the multi-pack-index code focuses on the multi_pack_index
struct, and so we only pass a pointer to the current one. However,
we will insert a dependency on the packed_git linked list in a
future change, so we will need a repository reference. Inserting
these parameters is a significant enough change to split out.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we ran something like
$ git checkout -b test master...
it would fail with the message
fatal: Not a valid object name: 'master...'.
This was caused by the call to `create_branch` where `start_name` is
expected to be a valid rev. However, git-checkout allows the branch to
be a valid _merge base_ rev (i.e. with a "...") so it was possible for
an invalid rev to be passed in.
Make `create_branch` accept a merge base rev so that this case does not
error out.
As a side-effect, teach git-branch how to handle merge base revs as
well.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, in t2018, if do_checkout failed to create `branch2`, the next
test case would run `git branch -D branch2` and then fail because it
expected `branch2` to exist when it did not. As a result, an early
failure could cause a cascade of test failures.
Make each test case responsible for cleaning up its own branches so
that later tests can start with a sane environment.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the option parsing machinery so that e.g. "clone --recurs ..."
doesn't error out because "clone" understands both "--recursive" and
"--recurse-submodules" to mean the same thing.
Initially "clone" just understood --recursive until the
--recurse-submodules alias was added in ccdd3da652 ("clone: Add the
--recurse-submodules option as alias for --recursive",
2010-11-04). Since bb62e0a99f ("clone: teach --recurse-submodules to
optionally take a pathspec", 2017-03-17) the longer form has been
promoted to the default.
But due to the way the options parsing machinery works this resulted
in the rather absurd situation of:
$ git clone --recurs [...]
error: ambiguous option: recurs (could be --recursive or --recurse-submodules)
Add OPT_ALIAS() to express this link between two or more options and use
it in git-clone. Multiple aliases of an option could be written as
OPT_ALIAS(0, "alias1", "original-name"),
OPT_ALIAS(0, "alias2", "original-name"),
...
The current implementation is not exactly optimal in this case. But we
can optimize it when it becomes a problem. So far we don't even have two
aliases of any option.
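
The idea can be modelled outside of parse-options with a toy resolver.
This is only a sketch of the concept, not the C implementation: the
option table and the `resolve_abbrev` function are made up for
illustration.

```shell
# Hypothetical option table: each entry is "name:canonical". An alias points
# at the option it stands for; a real option points at itself.
options="recursive:recurse-submodules recurse-submodules:recurse-submodules reference:reference"

# Resolve an abbreviated long option. Matches that share a canonical name
# count as one candidate, so an alias never makes its own target ambiguous.
resolve_abbrev () {
    prefix=$1
    found=
    for entry in $options; do
        name=${entry%%:*}
        canonical=${entry#*:}
        case $name in
        "$prefix"*)
            if [ -z "$found" ]; then
                found=$canonical
            elif [ "$found" != "$canonical" ]; then
                echo "ambiguous"
                return 1
            fi
            ;;
        esac
    done
    if [ -z "$found" ]; then
        echo "unknown"
        return 1
    fi
    echo "$found"
}

resolve_abbrev recurs       # both matches name the same option
resolve_abbrev re || true   # "reference" is a genuinely different option
```

In this model "recurs" resolves cleanly because both candidates map
to the same canonical name, while "re" stays ambiguous.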
A big chunk of code is actually from Junio C Hamano.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In our "make coccicheck" rule, we currently feed each source file to its
own individual invocation of spatch. This has a few downsides:
  - it repeats any overhead spatch has for starting up and reading the
    patch file

  - any included header files may get processed from multiple
    invocations. This is slow (we see the same header files multiple
    times) and may produce a resulting patch with repeated hunks (which
    cannot be applied without further cleanup)
Ideally we'd just invoke a single instance of spatch per rule-file and
feed it all source files. But spatch can be rather memory hungry when
run in this way. I measured the peak RSS going from ~90MB for a single
file to ~1900MB for all files. Multiplied by multiple rule files being
processed at the same time (for "make -j"), this can make things slower
or even cause them to fail (e.g., this is reported to happen on our
Travis builds).
Instead, let's provide a tunable knob. We'll leave the default at "1",
but it can be cranked up to "999" for maximum CPU/memory tradeoff, or
people can find points in between that serve their particular machines.
Here are a few numbers running a single rule via:
    SIZES='1 4 16 999'
    RULE=contrib/coccinelle/object_id.cocci
    for i in $SIZES; do
        make clean
        /usr/bin/time -o $i.out --format='%e | %U | %S | %M' \
            make $RULE.patch SPATCH_BATCH_SIZE=$i
    done
    for i in $SIZES; do
        printf '%4d | %s\n' $i "$(cat $i.out)"
    done
which yields:
       1 | 97.73 | 93.38 | 4.33 | 100128
       4 | 52.80 | 51.14 | 1.69 | 135204
      16 | 35.82 | 35.09 | 0.76 | 284124
     999 | 23.30 | 23.13 | 0.20 | 1903852
The implementation is done with xargs, which should be widely available;
it's in POSIX, and we already rely on it in the test suite. And "coccicheck"
is really a developer-only tool anyway, so it's not a big deal if
obscure systems can't run it.
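
To illustrate how the batch size changes the number of invocations,
here is a rough sketch with `echo` standing in for spatch. The file
list and the `run_batched` helper are made up for the example and are
not the actual Makefile rule.

```shell
# Hypothetical file list; in the real rule these would be the C sources.
files="a.c b.c c.c d.c e.c"

# Run "echo spatch" (a stand-in for the real tool) once per batch of at
# most $1 files, printing one line per invocation.
run_batched () {
    printf '%s\n' $files | xargs -n "$1" echo spatch
}

run_batched 1 | wc -l     # one invocation per file
run_batched 2 | wc -l     # batches of at most two files
run_batched 999 | wc -l   # a single invocation for everything
```

A larger batch size means fewer start-ups, at the cost of each
invocation handling more files at once.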
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In previous patches, extern was mechanically removed from function
declarations without care to formatting, causing parameter lists to be
misaligned. Manually reformat the changed sections so that the parameter
lists are realigned.
Viewing this patch with 'git diff -w' should produce no output.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There has been a push to remove extern from function declarations.
Finish the job by removing all instances of "extern" for function
declarations in headers using sed.
This was done by running the following on my system with sed 4.2.2:
    $ git ls-files \*.{c,h} |
        grep -v ^compat/ |
        xargs sed -i'' -e 's/^\(\s*\)extern \([^(]*([^*]\)/\1\2/'
Files under `compat/` are intentionally excluded as some are directly
copied from external sources and we should avoid churning them as much
as possible.
Then, leftover instances of extern were found by running
$ git grep -w -C3 extern \*.{c,h}
and manually checking the output. No other instances were found.
Note that the regex used specifically excludes function pointer
variables, which _should_ be left as extern.
Not the most elegant way to do it but it gets the job done.
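
The effect of the expression can be checked on a few sample
declarations. In this sketch the GNU-specific `\s` is replaced with
`[[:space:]]` for portability (an assumption of the example, not what
was run above), and the sample declarations are invented.

```shell
# Same substitution as above, but with [[:space:]] instead of the GNU-only \s.
strip_extern () {
    sed -e 's/^\([[:space:]]*\)extern \([^(]*([^*]\)/\1\2/'
}

# A function declaration loses its "extern" ...
echo 'extern int foo(int a);' | strip_extern
# ... while a function pointer variable keeps it, because the first "("
# is followed by "*".
echo 'extern int (*handler)(int a);' | strip_extern
# A plain variable has no "(" at all, so it does not match either.
echo 'extern int counter;' | strip_extern
```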
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There has been a push to remove extern from function declarations.
Remove some instances of "extern" for function declarations which are
caught by Coccinelle. Note that Coccinelle has some difficulty with
processing functions with `__attribute__` or varargs so some `extern`
declarations are left behind to be dealt with in a future patch.
This was the Coccinelle patch used:
    @@
    type T;
    identifier f;
    @@
    - extern
      T f(...);
and it was run with:
    $ git ls-files \*.{c,h} |
        grep -v ^compat/ |
        xargs spatch --sp-file contrib/coccinelle/noextern.cocci --in-place
Files under `compat/` are intentionally excluded as some are directly
copied from external sources and we should avoid churning them as much
as possible.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't cover the partial clone feature at all in t/perf. Let's at
least run a few basic tests so that we'll notice any regressions.
We'll do a no-blob clone, and split it into two parts: the actual object
transfer, and the subsequent checkout (which will of course require
another transfer to get the blobs). That will help us more clearly
assess the performance of each.
There are obviously a lot more possibilities besides just a no-blob
partial clone, but this should serve as a canary that alerts us to any
generic slow-downs (and we can add more tests later for cases that
aren't exercised here).
There are a few non-ideal things here that make this not an entirely
accurate test, but are probably OK for our purposes:
 1. We have to do some extra prep/cleanup work inside the timing tests,
    since they impact the on-disk state and the perf harness may run
    each one multiple times.

    In practice this is probably OK, since these bits should be much
    less expensive than the operations we are measuring.

 2. The clone time is likely to be dominated by the server's object
    enumeration. In the real world, a repo large enough to drive people
    to partial clones is likely to have reachability bitmaps enabled.

    And in the opposite direction, our object transfer is happening at
    the speed of a local pipe, whereas in the real world it would
    bottle-neck on the network.

    So any percentage speedups should be taken with a grain of salt.
    But hopefully any regressions will produce enough of an effect to
    be noticeable.
This script also demonstrates the recent improvement from dfa33a298d
(clone: do faster object check for partial clones, 2019-04-19):
Test                          dfa33a298d^         dfa33a298d
-------------------------------------------------------------------------
5600.2: clone without blobs   18.41(22.72+1.09)   6.83(11.65+0.50) -62.9%
5600.3: checkout of result    1.82(3.24+0.26)     1.84(3.24+0.26) +1.1%
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix trace2_data_json_fl() to check for the presence of pfn_data_json_fl
in its targets, rather than pfn_data_fl, which is not actually called.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On macOS with HFS, git init/clone automatically sets
core.precomposeUnicode to true. When a user creates a simple Unicode
refname (in NFC form) such as españa:
$ git branch españa
different commands would display the branch name differently. For
example, git branch, git log --decorate, and git fast-export all used
65 73 70 61 c3 b1 61 (or "espa\xc3\xb1a")
(NFC form) while show-ref would use
65 73 70 61 6e cc 83 61 (or "espan\xcc\x83a")
(NFD form). A stress test for git filter-repo was tripped up by this
inconsistency, though digging in I found that the problems could
compound; for example, if the user ran
$ git pack-refs --all
and then tried to check out the branch, they would be met with:
    $ git checkout españa
    error: pathspec 'españa' did not match any file(s) known to git
    $ git checkout españa --
    fatal: invalid reference: españa
    $ git branch
      españa
    * master
Note that the user could run the `git branch` command first and copy and
paste the `españa` portion of the output and still see the same two
errors. Also, if the user added --no-prune to the pack-refs command,
then they would see three branches: master, españa, and españa (those
last two are NFC vs. NFD forms, even if they render the same).
Further, if the user had the `españa` branch checked out before
running `git pack-refs --all`, the user would be greeted with (note
that I'm trimming trailing output with an ellipsis):
    $ git rev-parse HEAD
    fatal: ambiguous argument 'HEAD': unknown revision or path...
    $ git status
    On branch españa
    No commits yet...
Or worse, if the user didn't check this stuff first, running `git
commit` will create a new commit with all changes of all of history
being squashed into it.
In addition to pack-refs, one could also get into this state with
upload-pack or anything that calls either pack-refs or upload-pack (e.g.
gc or clone).
Add code in a few places (pack-refs, show-ref, upload-pack) to check and
honor the setting of core.precomposeUnicode to avoid these bugs.
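
The mismatch is easy to reproduce at the byte level. The following
sketch only builds the two spellings quoted above from their raw bytes
and compares them; it does not touch a repository.

```shell
# The two spellings of "españa", written as raw bytes with octal escapes
# (POSIX printf does not guarantee \x escapes).
nfc=$(printf 'espa\303\261a')     # c3 b1: precomposed "ñ"
nfd=$(printf 'espan\314\203a')    # 6e cc 83: "n" + combining tilde

# Dump both so the difference is visible.
printf '%s' "$nfc" | od -An -tx1
printf '%s' "$nfd" | od -An -tx1

# Byte-wise they are different strings, so a name stored under one
# spelling is not found when looked up under the other.
if [ "$nfc" = "$nfd" ]; then
    echo "same"
else
    echo "different"
fi
```

Both strings render identically in most terminals, which is why the
problem is hard to spot from command output alone.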
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On AIX, access(X_OK) may succeed when run as root even if the
execution isn't possible. This behavior is allowed by POSIX
which says:
    ... for a process with appropriate privileges, an implementation
    may indicate success for X_OK even if execute permission is not
    granted to any user.
It can lead hook programs to have their execution refused:
    git commit -m content
    fatal: cannot exec '.git/hooks/pre-commit': Permission denied
Add NEED_ACCESS_ROOT_HANDLER in order to use an access helper function.
It checks with stat whether any executable flag is set when the current
user is root.
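
The check itself can be sketched in shell. This is an illustration of
the idea only: the helper name `has_exec_bit` is made up, it uses find
rather than stat, and unlike the C helper it looks at the mode bits
unconditionally instead of only when running as root.

```shell
# Succeed only if at least one execute bit (user, group or other) is set
# in the file mode, no matter which user is asking.
has_exec_bit () {
    [ -n "$(find "$1" -prune \( -perm -100 -o -perm -010 -o -perm -001 \) -print)" ]
}

tmp=$(mktemp)
chmod 644 "$tmp"
if has_exec_bit "$tmp"; then echo "executable"; else echo "not executable"; fi
chmod 755 "$tmp"
if has_exec_bit "$tmp"; then echo "executable"; else echo "not executable"; fi
rm -f "$tmp"
```

Because the answer comes from the mode bits rather than from the
caller's privileges, it is the same for root and for ordinary users.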
Signed-off-by: Clément Chigot <clement.chigot@atos.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Declare FILENO_IS_A_MACRO on AIX
On AIX, fileno(fp) is a macro, so we need to use the workaround already made for the BSDs.
Signed-off-by: Clément Chigot <clement.chigot@atos.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Building with gettext broke on recent macOS with Homebrew when
/usr/local/bin was not on PATH; this has been corrected.
* js/macos-gettext-build:
macOS: make sure that gettext is found
The recently added feature to add addresses that are on
anything-by: trailers in 'git send-email' was found to be way too
eager and considered nonsense strings as if they can be legitimate
beginning of *-by: trailer. This has been tightened.
* bs/sendemail-tighten-anything-by:
send-email: don't cc *-by lines with '-' prefix
"git send-email" has been taught to use quoted-printable when the
payload contains carriage-return. The use of the mechanism is in
line with the design that originally added the codepath that chooses
QP when the payload has overly long lines.
* bc/send-email-qp-cr:
send-email: default to quoted-printable when CR is present
"git submodule foreach <command> --quiet" did not pass the option
down correctly, which has been corrected.
* nd/submodule-foreach-quiet:
submodule foreach: fix "<command> --quiet" not being respected
The GETTEXT_POISON test option has been quite broken ever since it
was made runtime-tunable, which has been fixed.
* jc/gettext-test-fix:
gettext tests: export the restored GIT_TEST_GETTEXT_POISON
Code clean-up and a fix for "git fetch" by an explicit object name
(as opposed to fetching refs by name).
* jk/fetch-reachability-error-fix:
fetch: do not consider peeled tags as advertised tips
remote.c: make singular free_ref() public
fetch: use free_refs()
pkt-line: prepare buffer before handling ERR packets
upload-pack: send ERR packet for non-tip objects
t5530: check protocol response for "not our ref"
t5516: drop ok=sigpipe from unreachable-want tests
The code is updated to check the result of memory allocation before
it is used in more places, by using xmalloc and/or xcalloc calls.
* jk/xmalloc:
progress: use xmalloc/xcalloc
xdiff: use xmalloc/xrealloc
xdiff: use git-compat-util
test-prio-queue: use xmalloc
An underallocation in the code to read the untracked cache
extension has been corrected.
* js/untracked-cache-allocfix:
untracked cache: fix off-by-one
"git blame -- path" in a non-bare repository starts blaming from
the working tree, and the same command in a bare repository errors
out because there is no working tree by definition. The command
has been taught to instead start blaming from the commit at HEAD,
which is more useful.
* sg/blame-in-bare-start-at-head:
blame: default to HEAD in a bare repo when no start commit is given
Updating the display with progress message has been cleaned up to
deal better with overlong messages.
* sg/overlong-progress-fix:
progress: break too long progress bar lines
progress: clear previous progress update dynamically
progress: assemble percentage and counters in a strbuf before printing
progress: make display_progress() return void
While running "git diff" in a lazy clone, we can upfront know which
missing blobs we will need, instead of waiting for the on-demand
machinery to discover them one by one. Aim to achieve better
performance by batching the request for these promised blobs.
* jt/batch-fetch-blobs-in-diff:
diff: batch fetching of missing blobs
sha1-file: support OBJECT_INFO_FOR_PREFETCH
Performance fix for "rev-list --parents -- pathspec".
* jk/revision-rewritten-parents-in-prio-queue:
revision: use a prio_queue to hold rewritten parents
Performance fix around "git blame", especially in a linear history
(which is the norm we should optimize for).
* dk/blame-keep-origin-blob:
blame.c: don't drop origin blobs as eagerly
Conversion from unsigned char[20] to struct object_id continues.
* bc/hash-transition-16: (35 commits)
gitweb: make hash size independent
Git.pm: make hash size independent
read-cache: read data in a hash-independent way
dir: make untracked cache extension hash size independent
builtin/difftool: use parse_oid_hex
refspec: make hash size independent
archive: convert struct archiver_args to object_id
builtin/get-tar-commit-id: make hash size independent
get-tar-commit-id: parse comment record
hash: add a function to lookup hash algorithm by length
remote-curl: make hash size independent
http: replace sha1_to_hex
http: compute hash of downloaded objects using the_hash_algo
http: replace hard-coded constant with the_hash_algo
http-walker: replace sha1_to_hex
http-push: remove remaining uses of sha1_to_hex
http-backend: allow 64-character hex names
http-push: convert to use the_hash_algo
builtin/pull: make hash-size independent
builtin/am: make hash size independent
...
"git fast-import" update.
* en/fast-import-parsing-fix:
fast-import: fix erroneous handling of get-mark with empty orphan commits
fast-import: only allow cat-blob requests where it makes sense
fast-import: check most prominent commands first
git-fast-import.txt: fix wording about where ls command can appear
t9300: demonstrate bug with get-mark and empty orphan commits
Fix for protocol v2 support in "git fetch-pack" of shallow clones.
* jt/fetch-no-update-shallow-in-proto-v2:
fetch-pack: respect --no-update-shallow in v2
fetch-pack: call prepare_shallow_info only if v0
A progress indicator has been added to the "index-pack" step, which
often makes users wait for completion during "git clone".
* sg/index-pack-progress:
index-pack: show progress while checking objects
Code cleanup with more careful error checking before using data
read from the commit-graph file.
* ab/commit-graph-fixes:
commit-graph: improve & i18n error messages
commit-graph write: don't die if the existing graph is corrupt
commit-graph verify: detect inability to read the graph
commit-graph: don't pass filename to load_commit_graph_one_fd_st()
commit-graph: don't early exit(1) on e.g. "git status"
commit-graph: fix segfault on e.g. "git status"
commit-graph tests: test a graph that's too small
commit-graph tests: split up corrupt_graph_and_verify()
Fix various glitches in "git gc" around reflog handling.
* ab/gc-reflog:
gc: handle & check gc.reflogExpire config
reflog tests: assert lack of early exit with expiry="never"
reflog tests: test for the "points nowhere" warning
reflog tests: make use of "test_config" idiom
gc: refactor a "call me once" pattern
gc: convert to using the_hash_algo
gc: remove redundant check for gc_auto_threshold
"git checkout -m <other>" was about carrying the differences
between HEAD and the working-tree files forward while checking out
another branch, and ignored the differences between HEAD and the
index. The command has been taught to abort when the index and the
HEAD are different.
* nd/checkout-m:
checkout: prevent losing staged changes with --merge
read-tree: add --quiet
unpack-trees: rename "gently" flag to "quiet"
unpack-trees: keep gently check inside add_rejected_path
"git difftool" can now run outside a repository.
* js/difftool-no-index:
difftool: allow running outside Git worktrees with --no-index
parse-options: make OPT_ARGUMENT() more useful
difftool: remove obsolete (and misleading) comment