"git rebase --merge" has been reimplemented by reusing the internal
machinery used for "git rebase -i".
* en/rebase-merge-on-sequencer:
rebase: implement --merge via the interactive machinery
rebase: define linearization ordering and enforce it
git-legacy-rebase: simplify unnecessary triply-nested if
git-rebase, sequencer: extend --quiet option for the interactive machinery
am, rebase--merge: do not overlook --skip'ed commits with post-rewrite
t5407: add a test demonstrating how interactive handles --skip differently
rebase: fix incompatible options error message
rebase: make builtin and legacy script error messages the same
The git-p4 login ticket expiry test causes unreliable test
runs. Since the handling of ticket expiry in git-p4 is far
from polished anyway, let's remove it for now.
A better way to actually run the test is to create a Python
"fake" version of "p4" which returns whatever expiry results
the test requires.
Ideally git-p4 would look at the expiry time before starting
any long operations, and cleanup gracefully if there is not
enough time left. But that's quite hard to do.
Signed-off-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a smart HTTP server sends an error message via pkt-line, we detect
the error due to using PACKET_READ_DIE_ON_ERR_PACKET. This case was
added by 2d103c31c2 (pack-protocol.txt: accept error packets in any
context, 2018-12-29), but not covered by tests.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The getnanos() helper always gets the current time from our
getnanotime() facility. The caller cannot override it via TEST_DATE_NOW,
and hence we simply ignore the "now" parameter to the function. Let's
remove it, as it may mislead callers into thinking it does something.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch was contributed only as a tentative "we could introduce a
convenient short option if we do not want to change the default behavior
in the long run" patch, opening the discussion whether other people
agree with deprecating the current behavior in favor of the rescheduling
behavior.
But the consensus on the Git mailing list was that it would make sense
to show a warning in the near future, and flip the default
rebase.rescheduleFailedExec to reschedule failed `exec` commands by
default. See e.g.
<CAGZ79kZL5CRqCDRb6B-EedUm8Z_i4JuSF2=UtwwdRXMitrrOBw@mail.gmail.com>
So let's back out that patch that added the `-y` short option that we
agreed was not necessary or desirable.
This reverts commit 81ef8ee75d.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When GIT_SEQUENCE_EDITOR is set, the command was incorrectly
started when modes of "git rebase" that implicitly use the
machinery for the interactive rebase are run, which has been
corrected.
* pw/no-editor-in-rebase-i-implicit:
implicit interactive rebase: don't run sequence editor
"git diff --color-moved --cc --stat -p" did not work well due to
funny interaction between a bug in color-moved and the rest, which
has been fixed.
* jk/diff-cc-stat-fixes:
combine-diff: treat --dirstat like --stat
combine-diff: treat --summary like --stat
combine-diff: treat --shortstat like --stat
combine-diff: factor out stat-format mask
diff: clear emitted_symbols flag after use
t4006: resurrect commented-out tests
"git checkout -b <new> [HEAD]" to create a new branch from the
current commit and check it out ought to be a no-op in the index
and the working tree in normal cases, but there are corner cases
that do require updates to the index and the working tree. Running
it immediately after "git clone --no-checkout" is one of these
cases that an earlier optimization kicked in incorrectly, which has
been fixed.
* bp/checkout-new-branch-optim:
checkout: fix regression in checkout -b on intitial checkout
checkout: add test demonstrating regression with checkout -b on initial commit
Asking "git check-attr" about a macro (e.g. "binary") on a specific
path did not work correctly, even though "git check-attr -a" listed
such a macro correctly. This has been corrected.
* jk/attr-macro-fix:
attr: do not mark queried macros as unset
On a case-insensitive filesystem, we failed to compare the part of
the path that is above the worktree directory in an absolute
pathname, which has been corrected.
* js/abspath-part-inside-repo:
abspath_part_inside_repo: respect core.ignoreCase
The codepath to show progress meter while writing out commit-graph
file has been improved.
* ab/commit-graph-write-progress:
commit-graph write: emit a percentage for all progress
commit-graph write: add itermediate progress
commit-graph write: remove empty line for readability
commit-graph write: add more descriptive progress output
commit-graph write: show progress for object search
commit-graph write: more descriptive "writing out" output
commit-graph write: add "Writing out" progress output
commit-graph: don't call write_graph_chunk_extra_edges() unnecessarily
commit-graph: rename "large edges" to "extra edges"
"git add --ignore-errors" did not work as advertised and instead
worked as an unintended synonym for "git add --renormalize", which
has been fixed.
* jk/add-ignore-errors-bit-assignment-fix:
add: use separate ADD_CACHE_RENORMALIZE flag
In Git for Windows, "git clone \\server\share\path" etc. that uses
UNC paths from command line had bad interaction with its shell
emulation.
* js/mingw-unc-path-w-backslashes:
mingw: special-case arguments to `sh`
mingw (t5580): document bug when cloning from backslashed UNC paths
"git fetch" and "git upload-pack" learned to send all exchange over
the sideband channel while talking the v2 protocol.
* jt/fetch-v2-sideband:
tests: define GIT_TEST_SIDEBAND_ALL
{fetch,upload}-pack: sideband v2 fetch response
sideband: reverse its dependency on pkt-line
pkt-line: introduce struct packet_writer
pack-protocol.txt: accept error packets in any context
Use packet_reader instead of packet_read_line
The codepath to read from the commit-graph file attempted to read
past the end of it when the file's table-of-contents was corrupt.
* js/commit-graph-chunk-table-fix:
Makefile: correct example fuzz build
commit-graph: fix buffer read-overflow
commit-graph, fuzz: add fuzzer for commit-graph
"git p4" failed to update a shelved change when there were moved
files, which has been corrected.
* ld/git-p4-shelve-update-fix:
git-p4: handle update of moved/copied files when updating a shelve
git-p4: add failing test for shelved CL update involving move/copy
Update the protocol message specification to allow only the limited
use of scaled quantities. This is to ensure potential compatibility
issues will not get out of hand.
* js/filter-options-should-use-plain-int:
filter-options: expand scaled numbers
tree:<depth>: skip some trees even when collecting omits
list-objects-filter: teach tree:# how to handle >0
The in-core repository instances are passed through more codepaths.
* sb/more-repo-in-api: (23 commits)
t/helper/test-repository: celebrate independence from the_repository
path.h: make REPO_GIT_PATH_FUNC repository agnostic
commit: prepare free_commit_buffer and release_commit_memory for any repo
commit-graph: convert remaining functions to handle any repo
submodule: don't add submodule as odb for push
submodule: use submodule repos for object lookup
pretty: prepare format_commit_message to handle arbitrary repositories
commit: prepare logmsg_reencode to handle arbitrary repositories
commit: prepare repo_unuse_commit_buffer to handle any repo
commit: prepare get_commit_buffer to handle any repo
commit-reach: prepare in_merge_bases[_many] to handle any repo
commit-reach: prepare get_merge_bases to handle any repo
commit-reach.c: allow get_merge_bases_many_0 to handle any repo
commit-reach.c: allow remove_redundant to handle any repo
commit-reach.c: allow merge_bases_many to handle any repo
commit-reach.c: allow paint_down_to_common to handle any repo
commit: allow parse_commit* to handle any repo
object: parse_object to honor its repository argument
object-store: prepare has_{sha1, object}_file to handle any repo
object-store: prepare read_object_file to deal with any repo
...
This ensures that nothing breaks the basic functionality of describe for
bare repositories. Please note that --broken and --dirty need a working
tree.
Signed-off-by: Sebastian Staudt <koraktor@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't use NEED_WORK_TREE when running the git-describe builtin,
since you should be able to describe a commit even in a bare repository.
However, the --dirty flag does need a working tree. Since we don't call
setup_work_tree(), it uses whatever directory we happen to be in. That's
unlikely to match our index, meaning we'd say "dirty" even when the real
working tree is clean.
We can fix that by calling setup_work_tree() once we know that the user
has asked for --dirty.
The --broken option also needs a working tree. But because its
implementation calls git-diff-index we don't have to set up the working
tree in the git-describe process.
Signed-off-by: Sebastian Staudt <koraktor@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test the new "ambiguous" result from cat-file --batch and
--batch-check. This is in t1512 instead of t1006 since
we need a repo with ambiguous object_id names.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users who want UTF-16 files in the working tree set the .gitattributes
like this:
test.txt working-tree-encoding=UTF-16
The Unicode standard itself defines 3 allowed ways to encode UTF-16.
The following 3 versions convert all back to 'g' 'i' 't' in UTF-8:
a) UTF-16, without BOM, big endian:
$ printf "\000g\000i\000t" | iconv -f UTF-16 -t UTF-8 | od -c
0000000 g i t
b) UTF-16, with BOM, little endian:
$ printf "\377\376g\000i\000t\000" | iconv -f UTF-16 -t UTF-8 | od -c
0000000 g i t
c) UTF-16, with BOM, big endian:
$ printf "\376\377\000g\000i\000t" | iconv -f UTF-16 -t UTF-8 | od -c
0000000 g i t
Git uses libiconv to convert from UTF-8 in the index into UTF-16 in the
working tree.
After a checkout, the resulting file has a BOM and is encoded in "UTF-16",
in the version (c) above.
This is what iconv generates, more details follow below.
iconv (and libiconv) can generate UTF-16, UTF-16LE or UTF-16BE:
d) UTF-16
$ printf 'git' | iconv -f UTF-8 -t UTF-16 | od -c
0000000 376 377 \0 g \0 i \0 t
e) UTF-16LE
$ printf 'git' | iconv -f UTF-8 -t UTF-16LE | od -c
0000000 g \0 i \0 t \0
f) UTF-16BE
$ printf 'git' | iconv -f UTF-8 -t UTF-16BE | od -c
0000000 \0 g \0 i \0 t
There is no way to generate version (b) from above in a Git working tree,
but that is what some applications need.
(All fully Unicode-aware applications should be able to read all 3 variants,
but in practice we are not there yet.)
When producing UTF-16 as an output, iconv generates the big endian version
with a BOM. (big endian is probably chosen for historical reasons).
iconv can produce UTF-16 files with little endianness by using "UTF-16LE"
as encoding, and that file does not have a BOM.
Not all users (especially under Windows) are happy with this.
Some tools are not fully Unicode-aware and can only handle version (b).
Today there is no way to produce version (b) with iconv (or libiconv).
Looking into the history of iconv, it seems as if version (c) will
be used in all future iconv versions (for compatibility reasons).
Solve this dilemma and introduce a Git-specific "UTF-16LE-BOM".
libiconv cannot handle the encoding, so Git picks it up, handles the BOM
and uses libiconv to convert the rest of the stream.
(UTF-16BE-BOM is added for consistency)
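As a rough illustration of the idea (this is not Git's C code, and the function names are hypothetical), the Python sketch below handles the BOM by hand and leaves the rest of the stream to the plain little-endian codec, which yields version (b) from above:

```python
import codecs


def encode_utf16le_bom(text: str) -> bytes:
    """Encode text as UTF-16 little endian, preceded by a BOM (version (b))."""
    # The "utf-16-le" codec never writes a BOM, so we prepend it ourselves.
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def decode_utf16le_bom(data: bytes) -> str:
    """Strip the little-endian BOM, then decode the rest of the stream."""
    if not data.startswith(codecs.BOM_UTF16_LE):
        raise ValueError("missing UTF-16LE BOM")
    return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
```

Encoding "git" this way gives the same bytes as the printf input shown for version (b).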
Reported-by: Adrián Gimeno Balaguer <adrigibal@gmail.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the user gives an empty argument to --exec then git creates a todo
list that it cannot parse. The rebase starts to run before erroring out
with
error: missing arguments for exec
error: invalid line 2: exec
You can fix this with 'git rebase --edit-todo' and then run 'git rebase --continue'.
Or you can abort the rebase with 'git rebase --abort'.
Instead check for empty commands before starting the rebase.
Also check that the command does not contain any newlines as the
todo-list format is unable to cope with multiline commands. Note that
this changes the behavior: before this change one could do
git rebase --exec='echo one
exec echo two'
and it would insert two exec lines in the todo list, now it will error
out.
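The two up-front checks can be sketched roughly as follows. This is an illustration in Python rather than the actual C change; the function name and the error strings are made up for the example, and treating a whitespace-only command as empty is an assumption.

```python
def check_exec_command(cmd: str) -> None:
    """Reject --exec arguments that the todo list could not represent."""
    # Assumption: a command made only of whitespace counts as empty.
    if not cmd.strip():
        raise ValueError("empty exec command")
    # One todo line holds one command, so a newline cannot be expressed.
    if "\n" in cmd:
        raise ValueError("exec commands cannot contain newlines")
```

With this in place, check_exec_command("make test") passes silently, while both the empty string and the two-line example above are refused before the rebase starts.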
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Custom userformat "log --format" learned %S atom that stands for
the tip the traversal reached the commit from, i.e. --source.
* it/log-format-source:
log: add %S option (like --source) to log --format
"git fetch --deepen=<more>" has been corrected to work over v2
protocol.
* jt/upload-pack-deepen-relative-proto-v2:
upload-pack: teach deepen-relative in protocol v2
fetch-pack: do not take shallow lock unnecessarily
Debugging help for http transport.
* ms/http-no-more-failonerror:
test: test GIT_CURL_VERBOSE=1 shows an error
remote-curl: unset CURLOPT_FAILONERROR
remote-curl: define struct for CURLOPT_WRITEFUNCTION
http: enable keep_error for HTTP requests
http: support file handles for HTTP_KEEP_ERROR
"git rebase" internally runs "checkout" to switch between branches,
and the command used to call the post-checkout hook, but the
reimplementation stopped doing so, which is getting fixed.
* os/rebase-runs-post-checkout-hook:
rebase: run post-checkout hook on checkout
t5403: simplify by using a single repository
Add sha-256 hash and plug it through the code to allow building Git
with the "NewHash".
* bc/sha-256:
hash: add an SHA-256 implementation using OpenSSL
sha256: add an SHA-256 implementation using libgcrypt
Add a base implementation of SHA-256 support
commit-graph: convert to using the_hash_algo
t/helper: add a test helper to compute hash speed
sha1-file: add a constant for hash block size
t: make the sha1 test-tool helper generic
t: add basic tests for our SHA-1 implementation
cache: make hashcmp and hasheq work with larger hashes
hex: introduce functions to print arbitrary hashes
sha1-file: provide functions to look up hash algorithms
sha1-file: rename algorithm to "sha1"
"git fetch --recurse-submodules" may not fetch the necessary commit
that is bound to the superproject, which is getting corrected.
* sb/submodule-recursive-fetch-gets-the-tip:
fetch: ensure submodule objects fetched
submodule.c: fetch in submodules git directory instead of in worktree
submodule: migrate get_next_submodule to use repository structs
repository: repo_submodule_init to take a submodule struct
submodule: store OIDs in changed_submodule_names
submodule.c: tighten scope of changed_submodule_names struct
submodule.c: sort changed_submodule_names before searching it
submodule.c: fix indentation
sha1-array: provide oid_array_filter
The v2 upload-pack protocol implementation failed to honor
hidden-ref configuration, which has been corrected.
An earlier attempt reverted out of 'next'.
* jk/proto-v2-hidden-refs-fix:
upload-pack: support hidden refs with protocol v2
"git rebase -i" learned to re-execute a command given with 'exec'
to run after it failed the last time.
* js/rebase-i-redo-exec:
rebase: introduce a shortcut for --reschedule-failed-exec
rebase: add a config option to default to --reschedule-failed-exec
rebase: introduce --reschedule-failed-exec
When using `human` several fields are suppressed depending on the time
difference between the reference date and the local computer date. In
cases where the difference is less than a year, the year field is
suppressed. If the difference is less than a day, the month and year are
suppressed.
Use the TEST_DATE_NOW environment variable when using the test-tool to
hold the expected output strings constant.
Signed-off-by: Stephen P. Smith <ischis2@cox.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the human format support to the test tool so that
GIT_TEST_DATE_NOW can be used to specify the current time.
The get_time() helper function was created and checks the
GIT_TEST_DATE_NOW environment variable. If GIT_TEST_DATE_NOW is set,
then that date is used instead of the date returned by
gettimeofday().
All calls to gettimeofday() were replaced by calls to get_time().
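A minimal Python sketch of that override logic, for illustration only: the real helper is C, and the assumption here is that the variable holds seconds since the epoch. The `environ` parameter is hypothetical and exists so the example can be exercised without touching the real environment.

```python
import os
import time


def get_time(environ=None) -> int:
    """Return "now" in epoch seconds, honoring GIT_TEST_DATE_NOW if set."""
    env = os.environ if environ is None else environ
    override = env.get("GIT_TEST_DATE_NOW")
    if override:
        # Tests pin the clock so that expected output stays constant.
        return int(override)
    # Otherwise fall back to the real clock.
    return int(time.time())
```

Calling get_time({"GIT_TEST_DATE_NOW": "1251660000"}) returns 1251660000 no matter when it runs.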
Renamed occurrences of TEST_DATE_NOW to GIT_TEST_DATE_NOW since the
variable is now used in the git binary and not just in the test-tool.
Signed-off-by: Stephen P. Smith <ischis2@cox.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fact that Git's test suite is implemented in Unix shell script that
is as portable as we can muster, combined with the fact that Unix shell
scripting is foreign to Windows (and therefore has to be emulated),
results in pretty abysmal speed of the test suite on that platform, for
pretty much no other reason than that language choice.
For comparison: while the Linux build & test is typically done within
about 8 minutes, the Windows build & test typically lasts about 80
minutes in Azure Pipelines.
To help with that, let's use the Azure Pipeline feature where you can
parallelize jobs, make jobs depend on each other, and pass artifacts
between them.
The tests are distributed using the following heuristic: listing all
test scripts ordered by size in descending order (as a cheap way to
estimate the overall run time), every Nth script is run (where N is the
total number of parallel jobs), starting at the index corresponding to
the parallel job. This slicing is performed by a new function that is
added to the `test-tool`.
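The heuristic can be sketched in a few lines of Python. This is not the `test-tool` code; the function name is hypothetical, and both the zero-based job index and the tie-break by name are assumptions made for the example.

```python
def slice_tests(sizes, total_jobs, job_index):
    """Pick the scripts that one parallel job should run.

    sizes maps script name -> file size in bytes; job_index is zero-based.
    """
    # Largest scripts first, as a cheap stand-in for "slowest first".
    ordered = sorted(sizes, key=lambda name: (-sizes[name], name))
    # Every Nth script, starting at this job's own index.
    return ordered[job_index::total_jobs]
```

Because consecutive entries in the sorted list go to different jobs, each job gets a similar mix of large and small scripts.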
To optimize the overall runtime of the entire Pipeline, we need to move
the Windows jobs to the beginning (otherwise there would be a very
decent chance for the Pipeline to run only the Windows build, while
all the parallel Windows test jobs wait for this single one).
We use Azure Pipelines Artifacts for both the minimal Git for Windows
SDK as well as the built executables, as deduplication and caching close
to the agents makes that really fast. For comparison: while downloading
and unpacking the minimal Git for Windows SDK via PowerShell takes only
one minute (down from anywhere between 2.5 to 7 when using a shallow
clone), uploading it as Pipeline Artifact takes less than 30s and
downloading and unpacking less than 20s (sometimes even as little as
only twelve seconds).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This speeds up the tests by a bit on Windows, where running Unix shell
scripts (and spawning processes) is not exactly a cheap operation.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When building Git with RUNTIME_PREFIX and starting a test helper from
t/helper/, it fails to detect a system prefix. The reason is that the
RUNTIME_PREFIX feature wants to use the location of the Git executable
to determine where the support files can be found, e.g. system-wide Git
config or the translations. This does not make any sense for the test
helpers, though, as they are distinctly not in a directory structure
resembling the final installation location of Git.
That is the reason why the test helpers rely on environment variables to
indicate the location of the needed support files, e.g.
GIT_TEXTDOMAINDIR. If this information is missing, the output will
contain warnings like this one:
RUNTIME_PREFIX requested, but prefix computation failed. [...]
In t0061, we did not expect that to happen, and it actually does not
happen in the regular case, because bin-wrappers/test-tool specifically
sets GIT_TEXTDOMAINDIR (and as a consequence, nothing in test-tool needs
to know anything about any runtime prefix).
However, with --with-dashes, bin-wrappers/test-tool is no longer called,
but t/helper/test-tool is called directly instead.
So let's just ignore the RUNTIME_PREFIX warning.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We really need to be able to find the test helpers... Really. This
change was forgotten when we moved the test helpers into t/helper/.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It seems that every once in a while in the Git for Windows SDK, there
are some transient file locking issues preventing the test cleanup from
deleting the trash directory. Let's be gentle and try again five seconds
later, and only error out if it still fails the second time.
This change helps Windows, and does not hurt any other platform
(normally, it is highly unlikely that said deletion fails, and if it
does, normally it will fail again even 5 seconds later).
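The retry amounts to something like the following Python sketch. The test suite itself is shell, so this only illustrates the control flow; the function name and the injectable `remove` and `sleep` parameters are hypothetical.

```python
import shutil
import time


def remove_trash_directory(path, remove=shutil.rmtree, sleep=time.sleep, delay=5):
    """Delete path, retrying once after a pause if the first attempt fails."""
    try:
        remove(path)
    except OSError:
        # Possibly a transient file lock: wait, then try exactly once more.
        sleep(delay)
        remove(path)  # a second failure propagates to the caller
```

On platforms where the first deletion succeeds, the pause is never reached, so the change costs nothing there.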
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The JUnit XML format lends itself to be presented in a powerful UI,
where you can drill down to the information you are interested in very
quickly.
For test failures, this usually means that you want to see the detailed
trace of the failing tests.
With Travis CI, we passed the `--verbose-log` option to get those
traces. However, that seems excessive, as we do not need/use the logs in
almost all of those cases: only when a test fails do we have a way to
include the trace.
So let's do something different when using Azure DevOps: let's run all
the tests with `--quiet` first, and only if a failure is encountered,
try to trace the commands as they are executed.
Of course, we cannot turn on `--verbose-log` after the fact. So let's
just re-run the test with all the same options, adding `--verbose-log`,
and then munge the output file into the JUnit XML on the fly.
Note: there is an off chance that re-running the test in verbose mode
"fixes" the failures (and this does happen from time to time!). That is
a possibility we should be able to live with. Ideally, we would label
this as "Passed upon rerun", and Azure Pipelines even knows about that
outcome, but it is not available when using the JUnit XML format for
now:
https://github.com/Microsoft/azure-pipelines-agent/blob/master/src/Agent.Worker/TestResults/JunitResultReader.cs
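The quiet-first, verbose-on-failure flow can be sketched as below. This is a Python illustration, not the shell code in the test library; `run_test` is a hypothetical callable that runs one script with the given options and returns its exit code and log text.

```python
def run_with_trace_on_failure(run_test, script, options):
    """Run a test quietly; only on failure, re-run it to capture a trace."""
    quiet_options = list(options) + ["--quiet"]
    code, _ = run_test(script, quiet_options)
    if code == 0:
        # The common case: no trace is needed, so none was produced.
        return {"script": script, "passed": True, "trace": None}
    # Same options again, plus --verbose-log, to get the detailed trace.
    code, log = run_test(script, quiet_options + ["--verbose-log"])
    # Note: the re-run may pass, as the message above points out.
    return {"script": script, "passed": code == 0, "trace": log}
```

The result dictionary is just a stand-in for whatever ends up in the JUnit XML.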
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is a bit ridiculous to spin up a full-blown Perl instance (especially
on Windows, where that means spinning up a full POSIX emulation layer,
AKA the MSYS2 runtime) just to tell how large a given file is.
So let's just use the test-tool to do that job instead.
This command will also be used over the next commits, to allow for
cutting out individual test cases' verbose log from the file generated
via --verbose-log.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This will come in handy when publishing the results of Git's test suite
during an automated Azure DevOps run.
Note: we need to make extra sure that invalid UTF-8 encoding is turned
into valid UTF-8 (using the Replacement Character, \uFFFD) because
t9902's trace contains such invalid byte sequences, and the task in the
Azure Pipeline that uploads the test results would refuse to do anything
if it was asked to parse an .xml file with invalid UTF-8 in it.
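For illustration, the same clean-up can be expressed in Python with the standard "replace" error handler, which substitutes U+FFFD for bytes that do not decode as UTF-8. The function name is hypothetical, and this is not the code added by the patch.

```python
def to_valid_utf8(raw: bytes) -> bytes:
    """Return raw with invalid UTF-8 sequences replaced by U+FFFD."""
    # Decoding with errors="replace" never fails; re-encoding the result
    # is therefore always valid UTF-8.
    return raw.decode("utf-8", errors="replace").encode("utf-8")
```

Input that is already valid UTF-8 passes through unchanged.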
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
From `man sed` (on a Mac OS X box):
The -E, -a and -i options are non-standard FreeBSD extensions and may not be available
on other operating systems.
From `man sed` on a Linux box:
REGULAR EXPRESSIONS
POSIX.2 BREs should be supported, but they aren't completely because of
performance problems. The \n sequence in a regular expression matches the newline
character, and similarly for \a, \t, and other sequences.
The -E option switches to using extended regular expressions instead; the -E option
has been supported for years by GNU sed, and is now included in POSIX.
Well, there are still a lot of systems out there that don't support it.
Besides that, IEEE Std 1003.1-2017, see
http://pubs.opengroup.org/onlinepubs/9699919799/
does not mention -E either.
To be on the safe side, don't allow -E (or -r, which is GNU).
Change check-non-portable-shell.pl to only accept the portable options:
sed [-n] [-e command] [-f command_file]
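To make the rule concrete, here is a small Python sketch that flags any sed option outside that portable set. It is not how check-non-portable-shell.pl (a Perl script) does it; the function name and the simplified option parsing are assumptions for the example.

```python
import shlex

PORTABLE_SED_FLAGS = {"n", "e", "f"}


def nonportable_sed_options(command):
    """Return the sed options in command that are not -n, -e or -f."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] != "sed":
        return []
    bad = []
    i = 1
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("-", "--") or not tok.startswith("-"):
            break  # options are over; the script or file names follow
        if tok.startswith("--"):
            bad.append(tok)  # long options are GNU extensions
            i += 1
            continue
        flags = tok[1:]
        for pos, flag in enumerate(flags):
            if flag not in PORTABLE_SED_FLAGS:
                bad.append("-" + flag)
                break
            if flag in "ef":
                # -e and -f take an argument: the rest of this token,
                # or the next token when nothing follows the flag.
                if pos == len(flags) - 1:
                    i += 1
                break
        i += 1
    return bad
```

For example, "sed -E 's/a+/b/' file" is reported as using -E, while "sed -n -e 's/a/b/p' file" comes back clean.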
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next commit, we want to teach Git's test suite to optionally
output test results in JUnit-style .xml files. These files contain
information about the time spent. So we need a way to measure time.
While we could use `date +%s` for that, this will give us only seconds,
i.e. very coarse-grained timings.
GNU `date` supports `date +%s.%N` (i.e. nanosecond-precision output),
but there is no equivalent in BSD `date` (read: on macOS, we would not
be able to obtain precise timings).
So let's introduce `test-tool date getnanos`, with an optional start
time, that outputs more precise values. Note that this might not actually
give us nanosecond precision on some platforms, but it will give us as
precise information as possible, without the portability issues of shell
commands.
Granted, it is a bit pointless to try measuring times accurately in
shell scripts, certainly to nanosecond precision. But it is better than
second-granularity.
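The behavior described above can be mimicked in Python as follows. This is only a sketch of the idea, not the test-tool implementation; it assumes the value and the optional start time are both expressed in seconds with a fractional part.

```python
import time


def getnanos(start=None):
    """Return the current time in seconds, or the time elapsed since start."""
    # time.time_ns() has nanosecond resolution where the platform allows it.
    now = time.time_ns() / 1e9
    if start is not None:
        now -= start
    return now
```

A caller would record `t0 = getnanos()` before a test and compute `getnanos(t0)` afterwards to get the elapsed time.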
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If GIT_SEQUENCE_EDITOR is set then rebase runs it when executing
implicit interactive rebases which are supposed to appear
non-interactive to the user. Fix this by setting GIT_SEQUENCE_EDITOR=:
rather than GIT_EDITOR=:. A couple of tests relied on the old behavior
so they are updated to work with the new regime.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The timestamp we receive is in epoch time, so there's no need for a
timezone parameter to interpret it. The matching show_date() uses "tz"
to show dates in author local time, but relative dates show only the
absolute time difference. The author's location is irrelevant, barring
relativistic effects from using Git close to the speed of light.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently "--cc --dirstat" will show nothing for a merge. Like
--shortstat and --summary in the previous two patches, it probably makes
sense to treat it like we do --stat, and show a stat against the
first-parent.
This case is less obviously correct than for --shortstat and --summary,
as those are basically variants of --stat themselves. It's possible we
could develop a multi-parent combined dirstat format, in which case we
might regret defining this first-parent behavior. But the same could be
said for --stat, and in the 12+ years of it showing first-parent stats,
nobody has complained.
So showing the first-parent dirstat is at least _useful_, and if we
later develop a clever multi-parent stat format, we'd probably have to
deal with --stat anyway.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently "--cc --summary" on a merge shows nothing. Since we show "--cc
--stat" as a stat against the first parent, and because --summary is
typically used in combination with --stat, it makes sense to treat them
both the same way.
Note that we have to tweak t4013's setup a bit to test this case, as the
existing merges do not have any --summary results against their first
parent. But since the merge at the tip of 'master' does add and remove
files with respect to the second parent, we can just make a reversed
doppelganger merge where the parents are swapped.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --stat of a combined diff is defined as the first-parent stat,
going all the way back to 965f803c32 (combine-diff: show diffstat with
the first parent., 2006-04-17).
Naturally, we gave --numstat the same treatment in 74e2abe5b7 (diff
--numstat, 2006-10-12).
But --shortstat, which is really just the final line of --stat, does
nothing, which produces confusing results:
$ git show --oneline --stat eab7584e37
eab7584e37 Merge branch 'en/show-ref-doc-fix'
Documentation/git-show-ref.txt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
$ git show --oneline --shortstat eab7584e37
eab7584e37 Merge branch 'en/show-ref-doc-fix'
[nothing! We'd expect to see the "1 file changed..." line]
This patch teaches combine-diff to treat the two formats identically.
Reported-by: David Turner <novalis@novalis.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's an odd bug when "log --color-moved" is used with the combination
of "--cc --stat -p": the stat for merge commits is erroneously shown
with the diff of the _next_ commit.
The included test demonstrates the issue. Our history looks something
like this:
A-B-M--D
 \ /
  C
When we run "git log --cc --stat -p --color-moved" starting at D, we get
this sequence of events:
1. The diff for D is using -p, so diff_flush() calls into
diff_flush_patch_all_file_pairs(). There we see that o->color_moved
is in effect, so we point o->emitted_symbols to a static local
struct, causing diff_flush_patch() to queue the symbols instead of
actually writing them out.
We then do our move detection, emit the symbols, and clear the
struct. But we leave o->emitted_symbols pointing to our struct.
2. Next we compute the diff for M. This is a merge, so we use the
combined diff code. In find_paths_generic(), we compute the
pairwise diff between each commit and its parent. Normally this is
done with DIFF_FORMAT_NO_OUTPUT, since we're just looking for
intersecting paths. But since "--stat --cc" shows the first-parent
stat, and since we're computing that diff anyway, we enable
DIFF_FORMAT_DIFFSTAT for the first parent. This outputs the stat
information immediately, saving us from running a separate
first-parent diff later.
But where does that output go? Normally it goes directly to stdout,
but because o->emitted_symbols is set, we queue it. As a result, we
don't actually print the diffstat for the merge commit (yet), which
is wrong.
3. Next we compute the diff for C. We're actually showing a patch
again, so we end up in diff_flush_patch_all_file_pairs(), but this
time we have the queued stat from step 2 waiting in our struct.
We add new elements to it for C's diff, and then flush the whole
thing. And we see the diffstat from M as part of C's diff, which is
wrong.
So triggering the bug really does require the combination of all of
those options.
To fix it, we can simply restore o->emitted_symbols to NULL after
flushing it, so that it does not affect anything outside of
diff_flush_patch_all_file_pairs(). This intuitively makes sense, since
nobody outside of that function is going to bother flushing it, so we
would not want them to write to it either.
In fact, we could take this a step further and turn the local "esm"
struct into a non-static variable that goes away after the function
ends. However, since it contains a dynamically sized array, we benefit
from amortizing the cost of allocations over many calls. So we'll leave
it as static to retain that benefit.
But let's push the zero-ing of esm.nr into the conditional for "if
(o->emitted_symbols)" to make it clear that we do not expect esm to hold
any values if we did not just try to use it. With the code as it is
written now, if we did encounter such a case (which I think would be a
bug), we'd silently leak those values without even bothering to display
them. With this change, we'd at least eventually show them, and somebody
would notice.
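The sequence above can be reproduced with a toy model. This is only a
sketch: Options, emit(), flush_patch() and show_log() are hypothetical
stand-ins that mimic the roles of o->emitted_symbols and
diff_flush_patch_all_file_pairs(), not git's actual code. Commit headers
are assumed to be written directly, while diff output goes through the
queue whenever one is installed.

```python
class Options:
    """Stand-in for the diff options; emitted_symbols mimics o->emitted_symbols."""

    def __init__(self):
        self.emitted_symbols = None  # None means "write straight to the output"
        self.output = []


def emit(o, line):
    # Queue the line if a buffer is installed, otherwise output it right away.
    if o.emitted_symbols is not None:
        o.emitted_symbols.append(line)
    else:
        o.output.append(line)


_esm = []  # reused across calls, like the static "esm" struct


def flush_patch(o, lines, restore):
    """Buffer the patch lines, flush them, and optionally detach the buffer."""
    o.emitted_symbols = _esm
    for line in lines:
        emit(o, line)
    o.output.extend(_esm)
    del _esm[:]
    if restore:
        o.emitted_symbols = None  # the fix: nobody else may write to the buffer


def show_log(restore):
    o = Options()
    o.output.append("commit P")            # step 1: a regular patch
    flush_patch(o, ["patch P"], restore)
    o.output.append("commit M")            # step 2: the merge, stat only
    emit(o, "stat M")
    o.output.append("commit C")            # step 3: another regular patch
    flush_patch(o, ["patch C"], restore)
    return o.output
```

With restore=False the merge's stat line is held back and only appears
after the header of the following commit; with restore=True it is printed
where it belongs.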
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This set of tests was added by 4434e6ba6c (tests: check --[short]stat
output after chmod, 2012-05-01), and is primarily about the handling of
binary versus text files.
Later, 74faaa16f0 (Fix "git diff --stat" for interesting - but empty -
file changes, 2012-10-17) changed the stat output so that the empty text
file is mentioned rather than omitted. That commit just comments out
these tests. There's no discussion in the commit message, but the
original email[1] says:
NOTE! This does break two of our tests, so we clearly did this on
purpose, or at least tested for it. I just uncommented the subtests
that this makes irrelevant, and changed the output of another one.
I don't think they're irrelevant, though. We should be testing this
"mode change only" case and making sure that it has the post-74faaa16f0
behavior. So this commit brings back those tests, with the current
expected output.
[1] https://public-inbox.org/git/CA+55aFz88GPJcfMSqiyY+u0Cdm48bEyrsTGxHVJbGsYsDg=Q5w@mail.gmail.com/
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, index compat macros are off from now on, because they
could hide the_index dependency.
Only code in builtin can use them.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When doing a 'checkout -b', do a full checkout, including updating the working
tree, when doing the initial checkout. As the new test involves a filesystem
access, do it later in the sequence to give other, cheaper tests a chance to
leave early. This fixes the regression in behavior caused by fa655d8411
(checkout: optimize "git checkout -b <new_branch>", 2018-08-16).
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit fa655d8411 (checkout: optimize "git checkout -b <new_branch>",
2018-08-16) introduced an unintentional change in behavior for 'checkout -b'
after doing 'clone --no-checkout'. Add a test to demonstrate the changed
behavior to be used in a later patch to verify the fix.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 60a12722ac (attr: remove maybe-real, maybe-macro from git_attr,
2017-01-27), we will always mark an attribute macro (e.g., "binary")
that is specifically queried for as "unspecified", even though listing
_all_ attributes would display it as set. E.g.:
$ echo "* binary" >.gitattributes
$ git check-attr -a file
file: binary: set
file: diff: unset
file: merge: unset
file: text: unset
$ git check-attr binary file
file: binary: unspecified
The problem stems from an incorrect conversion of the optimization from
06a604e670 (attr: avoid heavy work when we know the specified attr is
not defined, 2014-12-28). There we tried in collect_some_attrs() to
avoid even looking at the attr_stack when the user has asked for "foo"
and we know that "foo" did not ever appear in any .gitattributes file.
It used a flag "maybe_real" in each attribute struct, where "real" meant
that the attribute appeared in an actual file (we have to make this
distinction because we also create an attribute struct for any names
that are being queried). But as explained in that commit message, the
meaning of "real" was tangled with some special cases around macros.
When 60a12722ac later refactored the macro code, it dropped maybe_real
entirely. This missed the fact that "maybe_real" could be unset for two
reasons: because of a macro, or because it was never found during
parsing. This had two results:
- the optimization in collect_some_attrs() ceased doing anything
meaningful, since it no longer kept track of "was it found during
parsing"
- worse, it actually kicked in when the caller _did_ ask about a macro
by name, causing us to mark it as unspecified
It should be possible to salvage this optimization, but let's start with
just removing the remnants. It hasn't been doing anything (except
creating bugs) since 60a12722ac, and nobody seems to have noticed the
performance regression. It's more important to fix the correctness
problem clearly first.
I've added two tests here. The second one actually shows off the bug.
The test of "check-attr -a" is not strictly necessary, but we currently
do not test attribute macros much, and the builtin "binary" not at all.
So this increases our general test coverage, as well as making sure we
didn't mess up this related case.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 8abfdf44c8 (tests: explicitly use `git.exe` on Windows,
2018-11-14), we made sure to use the `.exe` file extension when
using an absolute path to `git.exe`, to avoid getting confused with a
file or directory in the same place that lacks said file extension.
For the same reason, we need to handle test-tool.exe the same way.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The optional 'Large Edge List' chunk of the commit graph file stores
parent information for commits with more than two parents, and the
names of most of the macros, variables, struct fields, and functions
related to this chunk contain the term "large edges", e.g.
write_graph_chunk_large_edges(). However, it's not really a great
term, as the edges to the second and subsequent parents stored in this
chunk are not any larger than the edges to the first and second
parents stored in the "main" 'Commit Data' chunk. It's the number of
edges, IOW number of parents, that is larger compared to non-merge and
"regular" two-parent merge commits. And indeed, two functions in
'commit-graph.c' have a local variable called 'num_extra_edges' that
refers to the same thing, and this "extra edges" term is much better at
describing these edges.
So let's rename all these references to "large edges" in macro,
variable, function, etc. names to "extra edges". There is a
GRAPH_OCTOPUS_EDGES_NEEDED macro as well; for the sake of consistency
rename it to GRAPH_EXTRA_EDGES_NEEDED.
We can do so safely without causing any incompatibility issues,
because the term "large edges" doesn't come up in the file format
itself in any form (the chunk's magic is {'E', 'D', 'G', 'E'}, there
is no 'L' in there), but only in the specification text. The string
"large edges", however, does come up in the output of 'git
commit-graph read' and in tests looking at its output, but that command
is explicitly documented as debugging aid, so we can change its output
and the affected tests safely.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add --gpg-sign option in commit-tree, which was documented, but not
implemented, in 55ca3f99ae. Add tests for the --gpg-sign option.
Signed-off-by: Brandon Richardson <brandon1024.br@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If `git commit-tree HEAD^{tree}` fails on us and produces no output on
stdout, we will substitute that empty string and execute `git tag
ninth-unsigned`, i.e., we will tag HEAD rather than a newly created
object. But we are lucky: we have a signature on HEAD, so we should
eventually fail the next test, where we verify that "ninth-unsigned" is
indeed unsigned.
We have a similar problem a few lines later. If `git commit-tree -S`
fails with no output, we will happily tag HEAD as "tenth-signed". Here,
we are not so lucky. The tag ends up on the same commit as
"eighth-signed-alt", and that's a signed commit, so t7510-signed-commit
will pass, despite `git commit-tree -S` failing.
Make these `git commit-tree` invocations a direct part of the &&-chain,
so that we can rely less on luck and set a better example for future
tests modeled after this one. Fix a 9/10 copy/paste error while at it.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Brandon Richardson <brandon1024.br@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "--format=<placeholder>" option of for-each-ref, branch and tag
learned to show a few more traits of objects that can be learned by
the object_info API.
* ot/ref-filter-object-info:
ref-filter: give uintmax_t to format with %PRIuMAX
ref-filter: add docs for new options
ref-filter: add tests for deltabase
ref-filter: add deltabase option
ref-filter: add tests for objectsize:disk
ref-filter: add check for negative file size
ref-filter: add objectsize:disk option
Flaky tests can now be repeatedly run under load with the
"--stress" option.
* sg/stress-test:
test-lib: add the '--stress' option to run a test repeatedly under load
test-lib-functions: introduce the 'test_set_port' helper function
test-lib: set $TRASH_DIRECTORY earlier
test-lib: consolidate naming of test-results paths
test-lib: parse command line options earlier
test-lib: parse options in a for loop to keep $@ intact
test-lib: extract Bash version check for '-x' tracing
test-lib: translate SIGTERM and SIGHUP to an exit
An inherently racy test that caused intermittent failures has been
removed.
* tg/t5570-drop-racy-test:
Revert "t/lib-git-daemon: record daemon log"
t5570: drop racy test
"git cherry-pick -m1" was forbidden when picking a non-merge
commit, even though there _is_ parent number 1 for such a commit.
This was done to avoid mistakes back when "cherry-pick" was about
picking a single commit, but is no longer useful with "cherry-pick"
that can pick a range of commits. Now the "-m$num" option is
allowed when picking any commit, as long as $num names an existing
parent of the commit.
Technically this is a backward incompatible change; hopefully
nobody is relying on the error-checking behaviour.
* so/cherry-pick-always-allow-m1:
t3506: validate '-m 1 -ff' is now accepted for non-merge commits
t3502: validate '-m 1' argument is now accepted for non-merge commits
cherry-pick: do not error on non-merge commits when '-m 1' is specified
t3510: stop using '-m 1' to force failure mid-sequence of cherry-picks
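The relaxed rule for "-m$num" can be sketched as follows. This is an
illustration only: resolve_mainline() and its arguments are hypothetical
names, and the error text is not the message git prints.

```python
def resolve_mainline(parents, mainline):
    """Return the parent selected by "-m <mainline>" (1-based).

    `parents` is the list of parent ids of the commit being picked. Any
    commit is accepted as long as the requested parent exists, so "-m 1"
    works for a non-merge commit with a single parent.
    """
    if mainline < 1 or mainline > len(parents):
        raise ValueError("commit does not have parent %d" % mainline)
    return parents[mainline - 1]
```

For a non-merge commit, resolve_mainline(["p1"], 1) returns "p1", while
asking for parent 2 of the same commit is still an error.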
"git worktree remove" and "git worktree move" refused to work when
there is a submodule involved. This has been loosened to ignore
uninitialized submodules.
* nd/worktree-remove-with-uninitialized-submodules:
worktree: allow to (re)move worktrees with uninitialized submodules
The test suite tried to see if it is run under bash, but the check
itself failed under some other implementations of shell (notably
under NetBSD). This has been corrected.
* sg/test-bash-version-fix:
test-lib: check Bash version for '-x' without using shell arrays
With zsh, "git cmd path<TAB>" was completed to "git cmd path name"
when the completed path has a special character like SP in it,
without any attempt to keep "path name" a single filename. This
has been fixed to complete it to "git cmd path\ name" just like
Bash completion does.
* cy/zsh-completion-SP-in-path:
completion: treat results of git ls-tree as file paths
zsh: complete unquoted paths with spaces correctly
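The intended result, "path\ name" instead of "path name", amounts to
backslash-escaping characters that the shell would otherwise treat
specially. A rough sketch, where escape_completion() is a hypothetical
helper and the set of escaped characters is an assumption rather than the
list the completion script uses:

```python
# Characters escaped in this sketch; the exact set is an assumption.
SPECIAL_CHARS = " \t\\\"'`$&|;<>()*?[]{}#~"


def escape_completion(path):
    """Backslash-escape special characters so the path stays a single word."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in path)
```

A path without special characters is returned unchanged.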
The core.worktree setting in a submodule repository should not be
pointing at a directory when the submodule loses its working tree
(e.g. getting deinit'ed), but the code did not properly maintain
this invariant.
* sb/submodule-unset-core-worktree-when-worktree-is-lost:
submodule deinit: unset core.worktree
submodule--helper: fix BUG message in ensure_core_worktree
submodule: unset core.worktree if no working tree is present
submodule update: add regression test with old style setups
"git stripspace" should be usable outside a git repository, but
under the "-s" or "-c" mode, it wasn't.
* jn/stripspace-wo-repository:
stripspace: allow -s/-c outside git repository
"git submodule update" ought to use a single job unless asked, but
by mistake used multiple jobs, which has been fixed.
* sb/submodule-fetchjobs-default-to-one:
submodule update: run at most one fetch job unless otherwise set
The MSYS2 runtime does its best to emulate the command-line wildcard
expansion and de-quoting which would be performed by the calling Unix
shell on Unix systems.
Those Unix shell quoting rules differ from the quoting rules applying to
Windows' cmd and Powershell, making it a little awkward to quote
command-line parameters properly when spawning other processes.
In particular, git.exe passes arguments to subprocesses that are *not*
intended to be interpreted as wildcards, and if they contain
backslashes, those are not to be interpreted as escape characters, e.g.
when passing Windows paths.
Note: this is only a problem when calling MSYS2 executables, not when
calling MINGW executables such as git.exe. However, we do call MSYS2
executables frequently, most notably when setting the use_shell flag in
the child_process structure.
There is no elegant way to determine whether the .exe file to be
executed is an MSYS2 program or a MINGW one. But since the use case of
passing a command line through the shell is so prevalent, we need to
work around this issue at least when executing sh.exe.
Let's introduce an ugly, hard-coded test whether argv[0] is "sh", and
whether it refers to the MSYS2 Bash, to determine whether we need to
quote the arguments differently than usual.
That still does not fix the issue completely, but at least it is
something.
Incidentally, this also fixes the problem where `git clone \\server\repo`
failed due to incorrect handling of the backslashes when handing the path
to the git-upload-pack process.
Further, we need to take care to quote not only whitespace and
backslashes, but also curly brackets. As aliases frequently go through
the MSYS2 Bash, and as aliases frequently get parameters such as
HEAD@{yesterday}, this is really important. As an early version of this
patch broke this, let's make sure that this does not regress by adding a
test case for that.
Helped-by: Kim Gybels <kgybels@infogroep.be>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Due to a quirk in Git's method to spawn git-upload-pack, there is a
problem when passing paths with backslashes in them: Git will force the
command-line through the shell, which has different quoting semantics in
Git for Windows (being an MSYS2 program) than regular Win32 executables
such as git.exe itself.
The symptom is that the first of the two backslashes in UNC paths of the
form \\myserver\folder\repository.git is *stripped off*.
Document this bug by introducing a test case.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a user fetches refs/heads/master from a repo with namespace "ns", the
remote is expected to (1) not send the real refs/heads/master, and (2)
send refs/namespaces/ns/refs/heads/master with the name
refs/heads/master. (1) indeed happens now, but not (2) - Git only sends
refs that have the user-given prefix, but it checks them against the
full name of the ref (the one starting with refs/namespaces), and not
the namespace-stripped one.
This is demonstrated by the test in this patch. Currently, it results in
"fatal: couldn't find remote ref refs/heads/master" despite both
unnamespaced and namespaced master being present. With the code change,
it produces the expected result.
Check the ref prefixes against the namespace-stripped name.
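A small model of the two behaviors, with hypothetical helper names
(strip_namespace(), advertised_refs()) and a check_stripped switch that
exists only to contrast the old and new checks; it is a sketch, not the
server code:

```python
def strip_namespace(refname, namespace):
    """Return the ref name as seen inside the namespace, or None if outside it."""
    prefix = "refs/namespaces/%s/" % namespace
    if refname.startswith(prefix):
        return refname[len(prefix):]
    return None


def advertised_refs(all_refs, namespace, ref_prefixes, check_stripped=True):
    """List the namespace-stripped refs that match the client's prefixes."""
    result = []
    for full_name in all_refs:
        name = strip_namespace(full_name, namespace)
        if name is None:
            continue  # (1) refs outside the namespace are never sent
        # Old behavior compared the prefixes with the full name instead.
        candidate = name if check_stripped else full_name
        if not any(candidate.startswith(p) for p in ref_prefixes):
            continue
        result.append(name)  # (2) sent under the stripped name
    return result
```

With both an unnamespaced and a namespaced master present, the old check
returns nothing for the prefix refs/heads/master, while the new one
returns refs/heads/master.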
This bug was discovered through applying patches [1] that override
protocol.version to 2 in repositories when running tests, allowing us to
notice differences in behavior across different protocol versions.
[1] https://public-inbox.org/git/cover.1547677183.git.jonathantanmy@google.com/
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the file system is case-insensitive, we really must be careful to
ignore differences in case only.
This fixes https://github.com/git-for-windows/git/issues/735
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Perforce requires a complete list of files being operated on. If
git is updating an existing shelved changelist, then any files
which are moved or copied were not being added to this list.
Signed-off-by: Luke Diamand <luke@diamand.org>
Acked-by: Andrey Mazo <amazo@checkvideo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Updating a shelved P4 changelist where one or more files have
been moved or copied does not work. Add a test for this.
The problem is that P4 requires a complete list of the files being
changed, and move/copy only includes the _source_ in the case of
updating a shelved changelist. This results in errors from Perforce
such as:
//depot/src - needs tofile //depot/dst
Submit aborted -- fix problems then use 'p4 submit -c 1234'
Signed-off-by: Luke Diamand <luke@diamand.org>
Acked-by: Andrey Mazo <amazo@checkvideo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a test variable GIT_TEST_PACK_SPARSE to enable the sparse
object walk algorithm by default during the test suite. Enabling
this variable ensures coverage in many interesting cases, such as
shallow clones, partial clones, and missing objects.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The '--sparse' flag in 'git pack-objects' changes the algorithm
used to enumerate objects to one that is faster for individual
users pushing new objects that change only a small cone of the
working directory. The sparse algorithm is not recommended for a
server, which likely sends new objects that appear across the
entire working directory.
Create a 'pack.useSparse' setting that enables this new algorithm.
This allows 'git push' to use this algorithm without passing a
'--sparse' flag all the way through four levels of run_command()
calls.
If the '--no-sparse' flag is set, then this config setting is
overridden.
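The precedence described here can be sketched as below. use_sparse() is a
hypothetical helper; None stands for "not given", and a default of off
when neither is given is an assumption of this sketch.

```python
def use_sparse(config_value=None, cli_flag=None):
    """Decide whether to use the sparse walk.

    config_value: value of pack.useSparse, or None if unset.
    cli_flag: True for --sparse, False for --no-sparse, None if neither given.
    """
    sparse = False  # assumed default for this sketch
    if config_value is not None:
        sparse = config_value
    if cli_flag is not None:
        sparse = cli_flag  # the command line overrides the config
    return sparse
```

So use_sparse(config_value=True) enables the algorithm for a plain 'git
push', while use_sparse(config_value=True, cli_flag=False) does not.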
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When enumerating objects to place in a pack-file during 'git
pack-objects --revs', we discover the "frontier" of commits
that we care about and the boundary with the commits we find
uninteresting. From that point, we walk trees to discover which
trees and blobs are uninteresting. Finally, we walk trees from the
interesting commits to find the interesting objects that are
placed in the pack.
This commit introduces a new, "sparse" way to discover the
uninteresting trees. We use the perspective of a single user trying
to push their topic to a large repository. That user likely changed
a very small fraction of the paths in their working directory, but
we spend a lot of time walking all reachable trees.
The way to switch the logic to work in this sparse way is to start
caring about which paths introduce new trees. While it is not
possible to generate a diff between the frontier boundary and all
of the interesting commits, we can simulate that behavior by
inspecting all of the root trees as a whole, then recursing down
to the set of trees at each path.
We already had taken the first step by passing an oidset to
mark_trees_uninteresting_sparse(). We now create a dictionary
whose keys are paths and values are oidsets. We consider the set
of trees that appear at each path. While we inspect a tree, we
add its subtrees to the oidsets corresponding to the tree entry's
path. We also mark trees as UNINTERESTING if the tree we are
parsing is UNINTERESTING.
To actually improve the performance, we need to terminate our
recursion. If the oidset contains only UNINTERESTING trees, then
we do not continue the recursion. This avoids walking trees that
are likely to not be reachable from interesting trees. If the
oidset contains only interesting trees, then we will walk these
trees in the final stage that collects the interesting objects to
place in the pack. Thus, we only recurse if the oidset contains
both interesting and UNINTERESTING trees.
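The recursion can be modeled in a few lines. This is a simplified sketch
under stated assumptions: trees are plain dictionaries mapping an entry
name to a child tree id, blobs are left out, and the UNINTERESTING flag is
a set of ids. The function name follows the text, but the data model and
the return value (the number of parsed trees) are inventions for
illustration.

```python
from collections import defaultdict


def mark_trees_uninteresting_sparse(trees, oidset, uninteresting):
    """Walk the trees in `oidset`, which all sit at the same path.

    trees: dict mapping tree id -> dict of entry name -> child tree id
    oidset: set of tree ids found at the current path
    uninteresting: set of ids flagged UNINTERESTING (updated in place)
    Returns the number of trees parsed.
    """
    has_uninteresting = any(oid in uninteresting for oid in oidset)
    has_interesting = any(oid not in uninteresting for oid in oidset)
    if not (has_uninteresting and has_interesting):
        return 0  # only one kind of tree at this path: stop recursing

    by_path = defaultdict(set)  # the path -> oidset dictionary
    parsed = 0
    for oid in sorted(oidset):
        parsed += 1
        for name, child in trees[oid].items():
            if oid in uninteresting:
                uninteresting.add(child)  # children inherit the flag
            by_path[name].add(child)

    for name in sorted(by_path):
        parsed += mark_trees_uninteresting_sparse(
            trees, by_path[name], uninteresting)
    return parsed
```

In this model a path whose trees are all UNINTERESTING is never descended
into, so trees below it are neither parsed nor flagged.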
There are a few ways that this is not a universally better option.
First, we can pack extra objects. If someone copies a subtree
from one tree to another, the first tree will appear UNINTERESTING
and we will not recurse to see that the subtree should also be
UNINTERESTING. We will walk the new tree and see the subtree as
a "new" object and add it to the pack. A test is modified to
demonstrate this behavior and to verify that the new logic is
being exercised.
Second, we can have extra memory pressure. If instead of being a
single user pushing a small topic we are a server sending new
objects from across the entire working directory, then we will
gain very little (the recursion will rarely terminate early) but
will spend extra time maintaining the path-oidset dictionaries.
Despite these potential drawbacks, the benefits of the algorithm
are clear. By adding a counter to 'add_children_by_path' and
'mark_tree_contents_uninteresting', I measured the number of
parsed trees for the two algorithms in a variety of repos.
For git.git, I used the following input:
v2.19.0
^v2.19.0~10
Objects to pack: 550
Walked (old alg): 282
Walked (new alg): 130
For the Linux repo, I used the following input:
v4.18
^v4.18~10
Objects to pack: 518
Walked (old alg): 4,836
Walked (new alg): 188
The two repos above are rather "wide and flat" compared to
other repos that I have used in the past. As a comparison,
I tested an old topic branch in the Azure DevOps repo, which
has a much deeper folder structure than the Linux repo.
Objects to pack: 220
Walked (old alg): 22,804
Walked (new alg): 129
I used the number of walked trees as the main metric above because
it is consistent across multiple runs. When I ran my tests, the
end-to-end time of the pack-objects command with the same options
could vary by 10x depending on whether the file system was warm.
However, by running the same test repeatedly I could get more
consistent timing results. The git.git and
Linux tests were too fast overall (less than 0.5s) to measure
an end-to-end difference. The Azure DevOps case was slow enough
to see the time improve from 15s to 1s in the warm case. The
cold case was 90s to 9s in my testing.
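To put the figures above side by side, the ratio of old to new can be
computed directly. The numbers are the ones quoted in this message; the
names tree_walks, timings and reduction_factor() are only for
illustration.

```python
# (walked trees with old algorithm, walked trees with new algorithm)
tree_walks = {
    "git.git": (282, 130),
    "Linux": (4836, 188),
    "Azure DevOps": (22804, 129),
}

# (seconds with old algorithm, seconds with new algorithm), Azure DevOps case
timings = {
    "warm": (15, 1),
    "cold": (90, 9),
}


def reduction_factor(old, new):
    """How many times smaller the new figure is than the old one."""
    return old / new


walk_factors = {repo: reduction_factor(*pair) for repo, pair in tree_walks.items()}
time_factors = {case: reduction_factor(*pair) for case, pair in timings.items()}
```

The walk factors grow from roughly 2x for git.git to well over 100x for
the deeper Azure DevOps tree, while the measured time improves by 15x
(warm) and 10x (cold).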
These improvements will have even larger benefits in the super-
large Windows repository. In our experiments, we see the
"Enumerate objects" phase of pack-objects taking 60-80% of the
end-to-end time of non-trivial pushes, taking longer than the
network time to send the pack and the server time to verify the
pack.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When creating a pack-file using 'git pack-objects --revs' we provide
a list of interesting and uninteresting commits. For example, a push
operation would make the local topic branch be interesting and the
known remote refs as uninteresting. We want to discover the set of
new objects to send to the server as a thin pack.
We walk these commits until we discover a frontier of commits such
that every commit walk starting at interesting commits ends in a root
commit or uninteresting commit. We then need to discover which
non-commit objects are reachable from uninteresting commits. This
commit walk is not changing during this series.
The mark_edges_uninteresting() method in list-objects.c iterates on
the commit list and does the following:
* If the commit is UNINTERESTING, then mark its root tree and every
object it can reach as UNINTERESTING.
* If the commit is interesting, then mark the root tree of every
UNINTERESTING parent (and all objects that tree can reach) as
UNINTERESTING.
At the very end, we repeat the process on every commit directly
given to the revision walk from stdin. This helps ensure we properly
cover shallow commits that otherwise were not included in the
frontier.
The logic to recursively follow trees is in the
mark_tree_uninteresting() method in revision.c. The algorithm avoids
duplicate work by not recursing into trees that are already marked
UNINTERESTING.
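The existing, non-sparse logic described above can be sketched like this.
The function names follow the text, but the data model is an assumption
made for illustration: commits map to a (root tree, parents) pair, trees
map to a list of child ids, and the UNINTERESTING flag is a set. The
extra pass over commits given on stdin is left out.

```python
def mark_tree_uninteresting(trees, oid, marked):
    """Flag `oid` and everything it can reach, skipping already flagged trees."""
    if oid in marked:
        return  # avoid duplicate work
    marked.add(oid)
    for child in trees.get(oid, ()):  # blobs have no entry and end the walk
        mark_tree_uninteresting(trees, child, marked)


def mark_edges_uninteresting(commit_list, commits, trees, uninteresting_commits):
    """Return the set of tree and blob ids flagged UNINTERESTING.

    commits: dict mapping commit id -> (root tree id, list of parent ids)
    trees: dict mapping tree id -> list of child ids (trees or blobs)
    """
    marked = set()
    for cid in commit_list:
        root_tree, parents = commits[cid]
        if cid in uninteresting_commits:
            mark_tree_uninteresting(trees, root_tree, marked)
        else:
            for parent in parents:
                if parent in uninteresting_commits:
                    parent_tree = commits[parent][0]
                    mark_tree_uninteresting(trees, parent_tree, marked)
    return marked
```

Note that every tree reachable from an uninteresting root is visited once,
which is the cost the sparse variant tries to avoid.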
Add a new 'sparse' option to the mark_edges_uninteresting() method
that performs this logic in a slightly different way. As we iterate
over the commits, we add all of the root trees to an oidset. Then,
call mark_trees_uninteresting_sparse() on that oidset. Note that we
include interesting trees in this process. The current implementation
of mark_trees_uninteresting_sparse() will walk the same trees as
the old logic, but this will be replaced in a later change.
Add a '--sparse' flag in 'git pack-objects' to call this new logic.
Add a new test script t/t5322-pack-objects-sparse.sh that tests this
option. The tests currently demonstrate that the resulting object
list is the same as the old algorithm. This includes a case where
both algorithms pack an object that is not needed by a remote due to
limits on the explored set of trees. When the sparse algorithm is
changed in a later commit, we will add a test that demonstrates a
change of behavior in some cases.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 9472935d81 (add: introduce "--renormalize", 2017-11-16) taught
git-add to pass HASH_RENORMALIZE to add_to_index(), which then passes
the flag along to index_path(). However, the flags taken by
add_to_index() and the ones taken by index_path() are distinct
namespaces. We cannot take HASH_* flags in add_to_index(), because they
overlap with the ADD_CACHE_* flags we already take (in this case,
HASH_RENORMALIZE conflicts with ADD_CACHE_IGNORE_ERRORS).
We can solve this by adding a new ADD_CACHE_RENORMALIZE flag, and using
it to set HASH_RENORMALIZE within add_to_index(). In order to make it
clear that these two flags come from distinct sets, let's also change
the name "newflags" in the function to "hash_flags".
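The clash and the translation can be illustrated with plain bit flags. The
numeric values below are assumptions chosen so that the two names collide
as described; hash_flags_for() is a hypothetical helper, not the body of
add_to_index().

```python
# Flags understood by add_to_index() (illustrative values).
ADD_CACHE_VERBOSE = 1
ADD_CACHE_PRETEND = 2
ADD_CACHE_IGNORE_ERRORS = 4
ADD_CACHE_RENORMALIZE = 64

# Flags understood by index_path() (illustrative values).
HASH_WRITE_OBJECT = 1
HASH_FORMAT_CHECK = 2
HASH_RENORMALIZE = 4


def hash_flags_for(add_flags):
    """Translate ADD_CACHE_* flags into the HASH_* namespace."""
    hash_flags = 0
    if add_flags & ADD_CACHE_RENORMALIZE:
        hash_flags |= HASH_RENORMALIZE
    return hash_flags
```

Passing HASH_RENORMALIZE straight into the ADD_CACHE_* namespace would be
read as ADD_CACHE_IGNORE_ERRORS; translating at the boundary keeps the two
sets apart.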
Reported-by: Dmitriy Smirnov <dmitriy.smirnov@jetbrains.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When Git determines whether a file has changed, it looks at the mtime
and at the file size. To detect changes even if both of those are the
same (on Windows, the mtime granularity is 100ns, read: if two files are
written within the same 100ns time slot, they have the same mtime), Git
also looks at the inode/device numbers.
This design obviously comes from a Linux background, where `lstat()`
calls were designed to be cheap.
On Windows, there is no `lstat()`. It has to be emulated. And while
obtaining the mtime and the file size is not all that expensive (you can
get both with a single `GetFileAttributesW()` call), obtaining the
equivalent of the inode and device numbers is very expensive (it
requires a call to `GetFileInformationByHandle()`, which in turn
requires a file handle, which is *a lot* more expensive than one might
imagine).
As it is very uncommon for developers to modify files within 100ns time
slots, Git for Windows chooses not to fill inode/device numbers
properly, but simply sets them to 0.
However, in t6042 the files file_v1 and file_v2 are typically written
within the same 100ns time slot, and they do not differ in file size. So
the minor modification is not picked up.
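The effect can be shown with a toy comparison. StatInfo, looks_unchanged()
and windows_like() are hypothetical names, and the field list is reduced
to the four values discussed above; this is a sketch of the reasoning, not
of Git's actual stat matching.

```python
from collections import namedtuple

StatInfo = namedtuple("StatInfo", "mtime_ns size ino dev")


def looks_unchanged(cached, current):
    """A file is treated as unchanged only if every recorded field matches."""
    return cached == current


def windows_like(st):
    """Model the emulated lstat(): 100ns mtime slots, inode/device set to 0."""
    return st._replace(mtime_ns=st.mtime_ns - st.mtime_ns % 100, ino=0, dev=0)


# Two distinct files of equal size, written 40ns apart.
file_v1 = StatInfo(mtime_ns=1000000030, size=12, ino=11, dev=1)
file_v2 = StatInfo(mtime_ns=1000000070, size=12, ino=12, dev=1)
```

With full stat data the two files differ in mtime and inode; after
windows_like() they are indistinguishable, which is the situation t6042
runs into.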
Let's work around this issue by avoiding the `git mv` calls in the
'mod6-setup: chains of rename/rename(1to2) and rename/rename(2to1)' test
case. The target files are overwritten anyway, so it is not like we
really rename those files. This fixes the issue because `git add` will
now add the files as new files (as opposed to existing, just renamed
files).
Functionally, we do not change anything because we replace two `git mv
<old> <new>` calls (where `<new>` is completely overwritten and `git
add`ed later anyway) by `git rm <old>` calls (removing other files, too,
that are also completely overwritten and `git add`ed later).
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Define a GIT_TEST_SIDEBAND_ALL environment variable meant to be used
from tests. When set to true, this overrides uploadpack.allowsidebandall
to true, allowing the entire test suite to be run as if this
configuration is in place for all repositories.
As of this patch, all tests pass whether GIT_TEST_SIDEBAND_ALL is unset
or set to 1.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It looks like it is a copy-paste error made in 80f2a6097c
(t/helper: add test-ref-store to test ref-store functions,
2017-03-26) to pass "old-sha1" instead of "new-sha1" to
notnull() when we get the new sha1 argument from
const char **argv.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fuzz-commit-graph identified a case where Git will read past the end of
a buffer containing a commit graph if the graph's header has an
incorrect chunk count. A simple bounds check in parse_commit_graph()
prevents this.
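The kind of check involved can be sketched as follows. The layout used
here (an 8-byte header whose seventh byte is the chunk count, followed by
12-byte lookup entries) is my reading of the commit-graph format
documentation and should be treated as an assumption; read_chunk_table()
is a hypothetical function, not parse_commit_graph().

```python
import struct

HEADER_SIZE = 8        # signature, version, hash version, chunk count, reserved
CHUNK_ENTRY_SIZE = 12  # 4-byte chunk id + 8-byte offset


def read_chunk_table(buf):
    """Return [(chunk_id, offset), ...], refusing to read past the buffer."""
    if len(buf) < HEADER_SIZE:
        raise ValueError("file too small for a header")
    _sig, _version, _hash_version, num_chunks, _reserved = struct.unpack_from(
        ">4sBBBB", buf, 0)

    entries = []
    for i in range(num_chunks):
        start = HEADER_SIZE + i * CHUNK_ENTRY_SIZE
        # The bounds check: the header may claim more chunks than exist.
        if start + CHUNK_ENTRY_SIZE > len(buf):
            raise ValueError("chunk entry %d extends past end of buffer" % i)
        entries.append(struct.unpack_from(">4sQ", buf, start))
    return entries
```

A header that claims three chunks in a buffer holding only one entry is
rejected instead of being read past its end.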
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When communicating with a remote server or a subprocess, use
expanded numbers rather than numbers with scaling suffix in the
object filter spec (e.g. "blob:limit=1k" becomes
"blob:limit=1024").
Update the protocol docs to note that clients should always perform this
expansion, to allow for more compatibility between server
implementations.
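The expansion itself is simple arithmetic. A sketch, assuming the usual
1024-based k/m/g suffixes; expand_filter_spec() is a hypothetical helper
that only looks at the number after the final "=", so it does not depend
on how the filter name is spelled:

```python
import re

_SCALE = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def expand_filter_spec(spec):
    """Replace a trailing scaled number such as "1k" with its expanded value."""
    match = re.match(r"^(.*=)(\d+)([kmgKMG])$", spec)
    if not match:
        return spec  # nothing to expand
    prefix, digits, suffix = match.groups()
    return prefix + str(int(digits) * _SCALE[suffix.lower()])
```

Specs without a scaling suffix pass through untouched, so the helper can
be applied unconditionally before the spec is sent.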
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a tree has already been recorded as omitted, we don't need to
traverse it again just to collect its omits. Stop traversing trees a
second time when collecting omits.
Signed-off-by: Matthew DeVore <matvore@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Implement positive values for <depth> in the tree:<depth> filter. The
exact semantics are described in Documentation/rev-list-options.txt.
The long-term goal at the end of this is to allow a partial clone to
eagerly fetch an entire directory of files by fetching a tree and
specifying <depth>=1. This, for instance, would make a build operation
fast and convenient. It is fast because the partial clone does not need
to fetch each file individually, and convenient because the user does
not need to supply a sparse-checkout specification.
Another way of considering this feature is as a way to reduce
round-trips, since the client can get any number of levels of
directories in a single request, rather than wait for each level of tree
objects to come back, whose entries are used to construct a new request.
Signed-off-by: Matthew DeVore <matvore@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git log -G<regex>" looked for a hunk in the "git log -p" patch
output that contained a string that matches the given pattern.
Optimize this code to ignore binary files, which by default will
not show any hunk that would match any pattern (unless textconv or
the --text option is in effect, that is).
* tb/log-G-binary:
log -G: ignore binary files
Lines that begin with a certain keyword that come over the wire, as
well as lines that consist only of one of these keywords, ought to
be painted in color for easier eyeballing, but the latter was
broken ever since the feature was introduced in 2.19, which has
been corrected.
* hn/highlight-sideband-keywords:
sideband: color lines with keyword only
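The two cases, a keyword followed by something else and a keyword alone on
the line, can be covered by one check. This is a sketch: the keyword list
and case-insensitive matching are assumptions, and leading_keyword() is a
hypothetical helper rather than the sideband code.

```python
KEYWORDS = ("hint", "warning", "success", "error")  # assumed keyword list


def leading_keyword(line):
    """Return the keyword that starts `line`, or None if there is none.

    The keyword must be followed by a non-alphanumeric character or by the
    end of the line; the latter is the case that used to be missed.
    """
    stripped = line.lstrip()
    lowered = stripped.lower()
    for keyword in KEYWORDS:
        if lowered.startswith(keyword):
            rest = stripped[len(keyword):]
            if rest == "" or not rest[0].isalnum():
                return stripped[:len(keyword)]
    return None
```

A line such as "errors found" is not highlighted, because the keyword is
only a prefix of a longer word.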