Add ABNF notation for capability 'report-status-v2' which extends
capability 'report-status' by adding additional option lines.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new introduced "proc-receive" hook may handle a command for a
pseudo-reference with a zero-old as its old-oid, while the hook may
create or update a reference with different name, different new-oid,
and different old-oid (the reference may exist already with a non-zero
old-oid). Current "report-status" protocol cannot report the status for
such reference rewrite.
Add new capability "report-status-v2" and new report protocol which is
not backward compatible for report of git-push.
If a user pushes to a pseudo-reference "refs/for/master/topic", and
"receive-pack" creates two new references "refs/changes/23/123/1" and
"refs/changes/24/124/1", for client without the knowledge of
"report-status-v2", "receive-pack" will only send "ok/ng" directives in
the report, such as:
ok ref/for/master/topic
But for client which has the knowledge of "report-status-v2",
"receive-pack" will use "option" directives to report more attributes
for the reference given by the above "ok/ng" directive.
ok refs/for/master/topic
option refname refs/changes/23/123/1
option new-oid <new-oid>
ok refs/for/master/topic
option refname refs/changes/24/124/1
option new-oid <new-oid>
The client will report two new created references to the end user.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When commands are fed to the "post-receive" hook, report options will
be parsed and the real old-oid, new-oid, reference name will feed to
the "post-receive" hook.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git calls an internal `execute_commands` function to handle commands
sent from client to `git-receive-pack`. Regardless of what references
the user pushes, git creates or updates the corresponding references if
the user has write-permission. A contributor who has no
write-permission, cannot push to the repository directly. So, the
contributor has to write commits to an alternate location, and sends
pull request by emails or by other ways. We call this workflow as a
distributed workflow.
It would be more convenient to work in a centralized workflow like what
Gerrit provided for some cases. For example, a read-only user who
cannot push to a branch directly can run the following `git push`
command to push commits to a pseudo reference (has a prefix "refs/for/",
not "refs/heads/") to create a code review.
git push origin \
HEAD:refs/for/<branch-name>/<session>
The `<branch-name>` in the above example can be as simple as "master",
or a more complicated branch name like "foo/bar". The `<session>` in
the above example command can be the local branch name of the client
side, such as "my/topic".
We cannot implement a centralized workflow elegantly by using
"pre-receive" + "post-receive", because Git will call the internal
function "execute_commands" to create references (even the special
pseudo reference) between these two hooks. Even though we can delete
the temporarily created pseudo reference via the "post-receive" hook,
having a temporary reference is not safe for concurrent pushes.
So, add a filter and a new handler to support this kind of workflow.
The filter will check the prefix of the reference name, and if the
command has a special reference name, the filter will turn a specific
field (`run_proc_receive`) on for the command. Commands with this filed
turned on will be executed by a new handler (a hook named
"proc-receive") instead of the internal `execute_commands` function.
We can use this "proc-receive" command to create pull requests or send
emails for code review.
Suggested by Junio, this "proc-receive" hook reads the commands,
push-options (optional), and send result using a protocol in pkt-line
format. In the following example, the letter "S" stands for
"receive-pack" and letter "H" stands for the hook.
# Version and features negotiation.
S: PKT-LINE(version=1\0push-options atomic...)
S: flush-pkt
H: PKT-LINE(version=1\0push-options...)
H: flush-pkt
# Send commands from server to the hook.
S: PKT-LINE(<old-oid> <new-oid> <ref>)
S: ... ...
S: flush-pkt
# Send push-options only if the 'push-options' feature is enabled.
S: PKT-LINE(push-option)
S: ... ...
S: flush-pkt
# Receive result from the hook.
# OK, run this command successfully.
H: PKT-LINE(ok <ref>)
# NO, I reject it.
H: PKT-LINE(ng <ref> <reason>)
# Fall through, let 'receive-pack' to execute it.
H: PKT-LINE(ok <ref>)
H: PKT-LINE(option fall-through)
# OK, but has an alternate reference. The alternate reference name
# and other status can be given in options
H: PKT-LINE(ok <ref>)
H: PKT-LINE(option refname <refname>)
H: PKT-LINE(option old-oid <old-oid>)
H: PKT-LINE(option new-oid <new-oid>)
H: PKT-LINE(option forced-update)
H: ... ...
H: flush-pkt
After receiving a command, the hook will execute the command, and may
create/update different reference. For example, a command for a pseudo
reference "refs/for/master/topic" may create/update different reference
such as "refs/pull/123/head". The alternate reference name and other
status are given in option lines.
The list of commands returned from "proc-receive" will replace the
relevant commands that are sent from user to "receive-pack", and
"receive-pack" will continue to run the "execute_commands" function and
other routines. Finally, the result of the execution of these commands
will be reported to end user.
The reporting function from "receive-pack" to "send-pack" will be
extended in latter commit just like what the "proc-receive" hook reports
to "receive-pack".
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Topic "proc-receive-hook" will change the workflow and output of
git-push. Add some basic test cases in t5411 before introducing the new
topic.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pushing a new reference (not a head or tag), report it as a new
reference instead of a new branch.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'grep' check in test 4 of t7421 resulted in the failure of t7421 on
Windows due to a different error message
error: cannot spawn git: No such file or directory
instead of
fatal: exec 'rev-parse': cd to 'my-subm' failed: No such file or directory
Tighten up the check to compute 'src_abbrev' by guarding the
'verify_submodule_committish()' call using `p->status !='D'`, so that
the former isn't called in case of non-existent submodule directory,
consequently, there is no such error message on any execution
environment. The same need not be implemented for 'dst_abbrev' and is
rather redundant since the conditional 'if (S_ISGITLINK(p->mod_dst))'
already guards the 'verify_submodule_committish()' when we have a
status of 'D'.
Therefore, eliminate the 'grep' check in t7421. Instead, verify the
absence of an error message by doing a 'test_must_be_empty' on the
file containing the error.
Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Worktree administrative files can become corrupted or outdated due to
external factors. Although, it is often possible to recover from such
situations by hand-tweaking these files, doing so requires intimate
knowledge of worktree internals. While information necessary to make
such repairs manually can be obtained from git-worktree.txt and
gitrepository-layout.txt, we can assist users more directly by teaching
git-worktree how to repair its administrative files itself (at least to
some extent). Therefore, add a "git worktree repair" command which
attempts to correct common problems which may arise due to factors
beyond Git's control.
At this stage, the "repair" command is a mere skeleton; subsequent
commits will flesh out the functionality.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Description suggested --no-abbrev-commit negates --oneline as well as any other
option that implies --abbrev-commit. Fix it to say that it's --abbrev-commit
that is negated, not the option that implies it.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As child_process structure has an embedded strvec args for
formulating the command line, let's use it instead of using
an out-of-line argv[] whose length needs to be maintained
correctly.
Also, when spawning a git subcommand, omit it from the command list
and instead use the .git_cmd bit in the child_process structure.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We allocate a child_env strvec but never free its memory. Instead, let's
just use the strvec that our child_process struct provides, which is
cleaned up automatically when we run the command.
And while we're moving the initialization of the child_process around,
let's switch it to use the official init function (zero-initializing it
works OK, since strvec is happy enough with that, but it sets a bad
example).
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The child_process structure has an embedded strvec for formulating
the command line argument list these days, but code that predates
the wide use of it prepared a separate char *argv[] array and
manually set the child_process.argv pointer point at it.
Teach these old-style code to lose the separate argv[] array.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This ancient script runs "git-foo" all over the place, which is
OK for a scripted Porcelain in the Git suite, but asking "git" to
dispatch to subcommands is the usual way these days.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running it as "git remote-ext" and letting "git" dispatch to
"remote-ext" would just be fine and is more idiomatic.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 525e18c04b (midx: clear midx on repack, 2018-07-12), 'git repack'
learned to remove a multi-pack-index file if it added or removed a pack
from the object store.
This mechanism is a little over-eager, since it is only necessary to
drop a MIDX if 'git repack' removes a pack that the MIDX references.
Adding a pack outside of the MIDX does not require invalidating the
MIDX, and likewise for removing a pack the MIDX does not know about.
Teach 'git repack' to check for this by loading the MIDX, and checking
whether the to-be-removed pack is known to the MIDX. This requires a
slightly odd alternation to a test in t5319, which is explained with a
comment. A new test is added to show that the MIDX is left alone when
both packs known to it are marked as .keep, but two packs unknown to it
are removed and combined into one new pack.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 7ba826290a (revision: add rev_input_given flag, 2017-08-02) added
a flag to rev_info to tell whether we got any revision arguments. As
explained there, this is necessary because some revision arguments may
not produce any pending traversal objects, but should still inhibit
default behaviors (e.g., a glob that matches nothing).
However, it only set the flag in the globbing code, but not for
revisions we get on the command-line or via stdin. This leads to two
problems:
- the command-line code keeps its own separate got_rev_arg flag; this
isn't wrong, but it's confusing and an extra maintenance burden
- even specifically-named rev arguments might end up not adding any
pending objects: if --ignore-missing is set, then specifying a
missing object is a noop rather than an error.
And that leads to some user-visible bugs:
- when deciding whether a default rev like "HEAD" should kick in, we
check both got_rev_arg and rev_input_given. That means that
"--ignore-missing $ZERO_OID" works on the command-line (where we set
got_rev_arg) but not on --stdin (where we don't)
- when rev-list decides whether it should complain that it wasn't
given a starting point, it relies on rev_input_given. So it can't
even get the command-line "--ignore-missing $ZERO_OID" right
Let's consistently set the flag if we got any revision argument. That
lets us clean up the redundant got_rev_arg, and fixes both of those bugs
(but note there are three new tests: we'll confirm the already working
git-log command-line case).
A few implementation notes:
- conceptually we want to set the flag whenever handle_revision_arg()
finds an actual revision arg ("handles" it, you might say). But it
covers a ton of cases with early returns. Rather than annotating
each one, we just wrap it and use its success exit-code to set the
flag in one spot.
- the new rev-list test is in t6018, which is titled to cover globs.
This isn't exactly a glob, but it made sense to stick it with the
other tests that handle the "even though we got a rev, we have no
pending objects" case, which are globs.
- the tests check for the oid of a missing object, which it's pretty
clear --ignore-missing should ignore. You can see the same behavior
with "--ignore-missing a-ref-that-does-not-exist", because
--ignore-missing treats them both the same. That's perhaps less
clearly correct, and we may want to change that in the future. But
the way the code and tests here are written, we'd continue to do the
right thing even if it does.
Reported-by: Bryan Turner <bturner@atlassian.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When adding the reference-transaction hook, there were concerns about
the performance impact it may have on setups which do not make use of
the new hook at all. After all, it gets executed every time a reftx is
prepared, committed or aborted, which linearly scales with the number of
reference-transactions created per session. And as there are code paths
like `git push` which create a new transaction for each reference to be
updated, this may translate to calling `find_hook()` quite a lot.
To address this concern, a cache was added with the intention to not
repeatedly do negative hook lookups. Turns out this cache caused a
regression, which was fixed via e5256c82e5 (refs: fix interleaving hook
calls with reference-transaction hook, 2020-08-07). In the process of
discussing the fix, we realized that the cache doesn't really help even
in the negative-lookup case. While performance tests added to benchmark
this did show a slight improvement in the 1% range, this really doesn't
warrent having a cache. Furthermore, it's quite flaky, too. E.g. running
it twice in succession produces the following results:
Test master pks-reftx-hook-remove-cache
--------------------------------------------------------------------------
1400.2: update-ref 2.79(2.16+0.74) 2.73(2.12+0.71) -2.2%
1400.3: update-ref --stdin 0.22(0.08+0.14) 0.21(0.08+0.12) -4.5%
Test master pks-reftx-hook-remove-cache
--------------------------------------------------------------------------
1400.2: update-ref 2.70(2.09+0.72) 2.74(2.13+0.71) +1.5%
1400.3: update-ref --stdin 0.21(0.10+0.10) 0.21(0.08+0.13) +0.0%
One case notably absent from those benchmarks is a single executable
searching for the hook hundreds of times, which is exactly the case for
which the negative cache was added. p1400.2 will spawn a new update-ref
for each transaction and p1400.3 only has a single reference-transaction
for all reference updates. So this commit adds a third benchmark, which
performs an non-atomic push of a thousand references. This will create a
new reference transaction per reference. But even for this case, the
negative cache doesn't consistently improve performance:
Test master pks-reftx-hook-remove-cache
--------------------------------------------------------------------------
1400.4: nonatomic push 6.63(6.50+0.13) 6.81(6.67+0.14) +2.7%
1400.4: nonatomic push 6.35(6.21+0.14) 6.39(6.23+0.16) +0.6%
1400.4: nonatomic push 6.43(6.31+0.13) 6.42(6.28+0.15) -0.2%
So let's just remove the cache altogether to simplify the code.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The definitions of 'verify_submodule_committish()' and
'print_submodule_summary()' had wrong styling in terms of the asterisk
placement. Amend them.
Also, the warning printed in case of an unexpected file mode printed the
mode in decimal. Print it in octal for enhanced readability.
Reported-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eliminate the parameters 'missing_{src,dst}' from the
'print_submodule_summary()' function call since they are not used
anywhere in the function.
Reported-by: Jeff King <peff@peff.net>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching with packfile URIs and transfer.fsckobjects=1, use the
--fsck-objects instead of the --strict flag when invoking index-pack so
that links are not checked, only objects. This is because incomplete
links are expected. (A subsequent connectivity check will be done when
all the packs have been downloaded regardless of whether
transfer.fsckobjects is set.)
This is similar to 98a2ea46c2 ("fetch-pack: do not check links for
partial fetch", 2018-03-15), but for packfile URIs instead of partial
clones.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dd4b732df7 ("upload-pack: send part of packfile response as uri",
2020-06-10) added the "only_packfile" parameter to get_pack() but did
not document it. Add documentation.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
88e2f9ed8e ("introduce fetch-object: fetch one promisor object",
2017-12-05) plumbed through the from_promisor parameter but did
not document it everywhere it appeared. Add the documentation.
(It also plumbed through the no_dependents parameter, but I have left
that alone because it is being removed in a commit under review [1].)
[1] https://lore.kernel.org/git/e8f16d69089a5011c355d5939c56fa53b7a1eb2d.1597184949.git.jonathantanmy@google.com/
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The patch-id computation did not ignore the "incomplete last line"
marker like whitespaces.
* rs/patch-id-with-incomplete-line:
patch-id: ignore newline at end of file in diff_flush_patch_id()
Doc updates for subtree (in contrib/)
* dl/subtree-docs:
contrib/subtree: document 'push' does not take '--squash'
contrib/subtree: fix "unsure" for --message in the document
The recent addition of SHA-256 support is marked as experimental in
the documentation.
* ma/doc-sha-256-is-experimental:
Documentation: mark `--object-format=sha256` as experimental
Use more buffered I/O where we used to call many small write(2)s.
* rs/more-buffered-io:
upload-pack: use buffered I/O to talk to rev-list
midx: use buffered I/O to talk to pack-objects
connected: use buffered I/O to talk to rev-list
"ls-files -o" mishandled the top-level directory of another git
working tree that hangs in the current git working tree.
* en/dir-nonbare-embedded:
dir: avoid prematurely marking nonbare repositories as matches
t3000: fix some test description typos
The "--batch-size" option of "git multi-pack-index repack" command
is now used to specify that very small packfiles are collected into
one until the total size roughly exceeds it.
* ds/midx-repack-to-batch-size:
multi-pack-index: repack batches below --batch-size
The purpose of "git init --separate-git-dir" is to initialize a
new project with the repository separate from the working tree,
or, in the case of an existing project, to move the repository
(the .git/ directory) out of the working tree. It does not make
sense to use --separate-git-dir with a bare repository for which
there is no working tree, so disallow its use with bare
repositories.
* es/init-no-separate-git-dir-in-bare:
init: disallow --separate-git-dir with bare repository
A subsequent commit will make the quantum of work smaller, necessitating
more locking. This commit allows resolve_delta() to be called outside
the lock.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is refactoring 2 of 2 to simplify struct base_data.
Whenever we make a struct base_data, immediately calculate its delta
children. This eliminates confusion as to when the
{ref,ofs}_{first,last} fields are initialized.
Before this patch, the delta children were calculated at the last
possible moment. This allowed the members of struct base_data to be
populated in any order, superficially useful when we have the object
contents before the struct object_entry. But this makes reasoning about
the state of struct base_data more complicated, hence this patch.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is refactoring 1 of 2 to simplify struct base_data.
In index-pack, each thread maintains a doubly-linked list of the delta
chain that it is currently processing (the "base" and "child" pointers
in struct base_data). When a thread exceeds the delta base cache limit
and needs to reclaim memory, it uses the "child" pointers to traverse
the lineage, reclaiming the memory of the eldest delta bases first.
A subsequent patch will perform memory reclaiming in a different way and
will thus no longer need the "child" pointer. Because the "child"
pointer is redundant even now, remove it so that the aforementioned
subsequent patch will be clearer. In the meantime, reclaim memory in the
reverse order of the "base" pointers.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
find_{ref,ofs}_delta_{,children} take an enum object_type parameter, but
the object type is already present in the name of the function. Remove
that parameter from these functions.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clarify that core.deltaBaseCacheLimit is per-thread, as can be seen from
the fact that cache usage (base_cache_used in struct thread_local in
builtin/index-pack.c) is tracked individually for each thread and
compared against delta_base_cache_limit.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ensure that the [--first-parent] option is listed in the output of
"git bisect -h".
Signed-off-by: Aaron Lipman <alipman88@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pathspec given to git checkout and git restore is used with both
tree_entry_interesting (via read_tree_recursive) and match_pathspec
(via ce_path_match). The latter effectively only supports recursive
matching regardless of the value of the pathspec flag "recursive",
which is unset here.
That causes different match results for pathspecs with wildcards, and
can lead checkout and restore in no-overlay mode to remove entries
instead of modifying them. Enable recursive matching for both checkout
and restore to make matching consistent.
Setting the flag in checkout_main() technically also affects git switch,
but since that command doesn't accept pathspecs at all this has no
actual consequence.
Reported-by: Sergii Shkarnikov <sergii.shkarnikov@globallogic.com>
Initial-test-by: Sergii Shkarnikov <sergii.shkarnikov@globallogic.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If `user.name` and `user.email` have not been configured and the
user invokes:
git commit --author=...
without specifying the committer identity, then Git errors out with
a message asking the user to configure `user.name` and `user.email`
but doesn't tell the user which attribution was missing. This can be
confusing for a user new to Git who isn't aware of the distinction
between user, author, and committer.
Give such users a bit more help by extending the error message to
also say which attribution is expected.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'contents' atom does not show any error if used with 'trailers'
atom and colon is missing before trailers arguments.
e.g %(contents:trailersonly) works, while it shouldn't.
It is definitely not an expected behavior.
Let's fix this bug.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A handful of Git's commands respect `--abbrev' for customizing length
of abbreviation of object names.
For diff-family, Git supports 2 different options for 2 different
purposes, `--full-index' for showing diff-patch object's name in full,
and `--abbrev' to customize the length of object names in diff-raw and
diff-tree header lines, without any options to customise the length of
object names in diff-patch format. When working with diff-patch format,
we only have two options, either full index, or default abbrev length.
Although, that behaviour is documented, it doesn't stop users from
trying to use `--abbrev' with the hope of customising diff-patch's
objects' name's abbreviation.
Let's allow the blob object names shown on the "index" line to be
abbreviated to arbitrary length given via the "--abbrev" option.
To preserve backward compatibility with old script that specify both
`--full-index' and `--abbrev', always show full object id
if `--full-index' is specified.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
From 72f936b1 (t4013: make test hash independent, 2020-02-07),
we started to adjust metadata of git-diff's output in order to
ignore uninteresting metadata which is dependent of underlying hash
algorithm.
However, we forgot to special case all-zero object names, which is
special for missing objects, in consequence, we could't catch
possible future bugs where object names is all-zeros including but
not limited to:
* show intend-to-add entry
* deleted entry
* diff between index and working tree with new file
We also mistakenly munged file-modes as if they were object names
abbreviated to 6 hexadecimal digits.
In addition, in the upcoming change, we would like to test for
customizing the length of abbreviated blob objects on the index line,
which is not supported by current diff-processor logic.
Let's fix the bug for all-zero object names, and file modes.
While we're at it, support abbreviation of object names up to 16 bytes.
Based-on-patch-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, there are different tests for testing %(trailers) and
%(contents:trailers) causing redundant copy.
Its time to get rid of duplicate code.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While YAML allows different indentation styles as long as each block
is consistent, it is rather unusual to mix different indentations in
a single file. Adjust to use two-space indentation everywhere.
Signed-off-by: Adrian Moennich <adrian@planetcoding.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit b8a2486f15 (index-pack: support multithreaded delta resolving,
2012-05-06) describes an experiment that shows that setting the number
of threads for index-pack higher than 3 does not help.
I repeated that experiment using a more modern version of Git and a more
modern CPU and got different results.
Here are timings for p5302 against linux.git run on my laptop, a Core
i9-9880H with 8 cores plus hyperthreading (so online-cpus returns 16):
5302.3: index-pack 0 threads 256.28(253.41+2.79)
5302.4: index-pack 1 threads 257.03(254.03+2.91)
5302.5: index-pack 2 threads 149.39(268.34+3.06)
5302.6: index-pack 4 threads 94.96(294.10+3.23)
5302.7: index-pack 8 threads 68.12(339.26+3.89)
5302.8: index-pack 16 threads 70.90(655.03+7.21)
5302.9: index-pack default number of threads 116.91(290.05+3.21)
You can see that wall-clock times continue to improve dramatically up to
the number of cores, but bumping beyond that (into hyperthreading
territory) does not help (and in fact hurts a little).
Here's the same experiment on a machine with dual Xeon 6230's, totaling
40 cores (80 with hyperthreading):
5302.3: index-pack 0 threads 310.04(302.73+6.90)
5302.4: index-pack 1 threads 310.55(302.68+7.40)
5302.5: index-pack 2 threads 178.17(304.89+8.20)
5302.6: index-pack 5 threads 99.53(315.54+9.56)
5302.7: index-pack 10 threads 72.80(327.37+12.79)
5302.8: index-pack 20 threads 60.68(357.74+21.66)
5302.9: index-pack 40 threads 58.07(454.44+67.96)
5302.10: index-pack 80 threads 59.81(720.45+334.52)
5302.11: index-pack default number of threads 134.18(309.32+7.98)
The results are similar; things stop improving at 40 threads. Curiously,
going from 20 to 40 really doesn't help much, either (and increases CPU
time considerably). So that may represent an actual barrier to
parallelism, where we lose out due to context-switching and loss of
cache locality, but don't reap the wall-clock benefits due to contention
of our coarse-grained locks.
So what's a good default value? It's clear that the current cap of 3 is
too low; our default values are 42% and 57% slower than the best times
on each machine. The results on the 40-core machine imply that 20
threads is an actual barrier regardless of the number of cores, so we'll
take that as a maximum. We get the best results on these machines at
half of the online-cpus value. That's presumably a result of the
hyperthreading. That's common on multi-core Intel processors, but not
necessarily elsewhere. But if we take it as an assumption, we can
perform optimally on hyperthreaded machines and still do much better
than the status quo on other machines, as long as we never half below
the current value of 3.
So that's what this patch does.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>