Clarify that pathnames recorded in Git trees are most often (but
not necessarily) encoded in UTF-8.
* ab/pathname-encoding-doc:
doc: clarify the filename encoding in git diff
Effort to make the command line completion (in contrib/) safe with
"set -u" continues.
* vs/completion-with-set-u:
completion: avoid aliased command lookup error in nounset mode
Show errno in the trace output in the error codepath that calls
read_raw_ref method.
* hn/refs-trace-errno:
refs: print errno for read_raw_ref if GIT_TRACE_REFS is set
The checkout machinery has been taught to perform the actual
write-out of the files in parallel when able.
* mt/parallel-checkout-part-2:
parallel-checkout: add design documentation
parallel-checkout: support progress displaying
parallel-checkout: add configuration options
parallel-checkout: make it truly parallel
unpack-trees: add basic support for parallel checkout
Builds on top of the sparse-index infrastructure to mark operations
that are not ready to mark with the sparse index, causing them to
fall back on fully-populated index that they always have worked with.
* ds/sparse-index-protections: (47 commits)
name-hash: use expand_to_path()
sparse-index: expand_to_path()
name-hash: don't add directories to name_hash
revision: ensure full index
resolve-undo: ensure full index
read-cache: ensure full index
pathspec: ensure full index
merge-recursive: ensure full index
entry: ensure full index
dir: ensure full index
update-index: ensure full index
stash: ensure full index
rm: ensure full index
merge-index: ensure full index
ls-files: ensure full index
grep: ensure full index
fsck: ensure full index
difftool: ensure full index
commit: ensure full index
checkout: ensure full index
...
The prefetch task in "git maintenance" assumed that "git fetch"
from any remote would fetch all its local branches, which would
fetch too much if the user is interested in only a subset of
branches there.
* ds/maintenance-prefetch-fix:
maintenance: respect remote.*.skipFetchAll
maintenance: use 'git fetch --prefetch'
fetch: add --prefetch option
maintenance: simplify prefetch logic
"git push --quiet --set-upstream" was not quiet when setting the
upstream branch configuration, which has been corrected.
* ow/push-quiet-set-upstream:
transport: respect verbosity when setting upstream
When packet_write() fails, we gave an extra error message
unnecessarily, which has been corrected.
* mt/pkt-write-errors:
pkt-line: do not report packet write errors twice
Handling of "promisor packs" that allows certain objects to be
missing and lazily retrievable has been optimized (a bit).
* jk/promisor-optim:
revision: avoid parsing with --exclude-promisor-objects
lookup_unknown_object(): take a repository argument
is_promisor_object(): free tree buffer after parsing
Commit e4c7b33747 ("bisect--helper: reimplement `bisect_skip` shell
function in C", 2021-02-03), as part of the shell-to-C conversion,
forgot to read the 'terms' file (.git/BISECT_TERMS) during the new
'bisect skip' command implementation. As a result, the 'bisect skip'
command will use the default 'bad'/'good' terms. If the bisection
terms have been set to non-default values (for example by the
'bisect start' command), then the 'bisect skip' command will fail.
In order to correct this problem, we insert a call to the get_terms()
function, which reads the non-default terms from that file (if set),
in the '--bisect-skip' command implementation of 'bisect--helper'.
Also, add a test[1] to protect against potential future regression.
[1] https://lore.kernel.org/git/xmqqim45h585.fsf@gitster.g/T/#m207791568054b0f8cf1a3942878ea36293273c7d
Reported-by: Trygve Aaberge <trygveaa@gmail.com>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The backslash character is not a valid part of a file name on Windows.
If, in Windows, Git attempts to write a file that has a backslash
character in the filename, it will be incorrectly interpreted as a
directory separator.
This caused CVE-2019-1354 in MinGW, as this behaviour can be manipulated
to cause the checkout to write to files it ought not write to, such as
adding code to the .git/hooks directory. This was fixed by e1d911dd4c
(mingw: disallow backslash characters in tree objects' file names,
2019-09-12). However, the vulnerability also exists in Cygwin: while
Cygwin mostly provides a POSIX-like path system, it will still interpret
a backslash as a directory separator.
To avoid this vulnerability, CVE-2021-29468, extend the previous fix to
also apply to Cygwin.
Similarly, extend the test case added by the previous version of the
commit. The test suite doesn't have an easy way to say "run this test
if in MinGW or Cygwin", so add a new test prerequisite that covers both.
As well as checking behaviour in the presence of paths containing
backslashes, the existing test also checks behaviour in the presence of
paths that differ only by the presence of a trailing ".". MinGW follows
normal Windows application behaviour and treats them as the same path,
but Cygwin more closely emulates *nix systems (at the expense of
compatibility with native Windows applications) and will create and
distinguish between such paths. Gate the relevant bit of that test
accordingly.
Reported-by: RyotaK <security@ryotak.me>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While not documented as such, many of the top-level options like
`--git-dir` and `--work-tree` support two syntaxes: they accept both an
equals sign between option and its value, and they do support option and
value as two separate arguments. The recently added `--config-env`
option only supports the syntax with an equals sign.
Mitigate this inconsistency by accepting both syntaxes and add tests to
verify both work.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When executing `git -h`, then the `--config-env` documentation rightly
lists the option as requiring an equals between the option and its
argument: this is the only currently supported format. But the git(1)
manpage incorrectly lists the option as taking a space in between.
Fix the issue by adding the missing space.
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-of-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git apply" specifically calls out when it is falling back to 3way
merge application. Since the order changed to preferring 3way and
falling back to direct application, continue that behavior by
printing whenever 3way fails and git has to fall back.
Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We pass our prune expiration to mark_reachable_objects(), which will
traverse not only the reachable objects, but consider any recent ones as
tips for reachability; see d3038d22f9 (prune: keep objects reachable
from recent objects, 2014-10-15) for details.
However, this interacts badly with the bitmap code path added in
fde67d6896 (prune: use bitmaps for reachability traversal, 2019-02-13).
If we hit the bitmap-optimized path, we return immediately to avoid the
regular traversal, accidentally skipping the "also traverse recent"
code.
Instead, we should do an if-else for the bitmap versus regular
traversal, and then follow up with the "recent" traversal in either
case. This reuses the "rev_info" for a bitmap and then a regular
traversal, but that should work OK (the bitmap code clears the pending
array in the usual way, just like a regular traversal would).
Note that I dropped the comment above the regular traversal here. It
has little explanatory value, and makes the if-else logic much harder to
read.
Here are a few variants that I rejected:
- it seems like both the reachability and recent traversals could be
done in a single traversal. This was rejected by d3038d22f9 (prune:
keep objects reachable from recent objects, 2014-10-15), though the
balance may be different when using bitmaps. However, there's a
subtle correctness issue, too: we use revs->ignore_missing_links for
the recent traversal, but not the reachability one.
- we could try using bitmaps for the recent traversal, too, which
could possibly improve performance. But it would require some fixes
in the bitmap code, which uses ignore_missing_links for its own
purposes. Plus it would probably not help all that much in practice.
We use the reachable tips to generate bitmaps, so those objects are
likely not covered by bitmaps (unless they just became unreachable).
And in general, we expect the set of unreachable objects to be much
smaller anyway, so there's less to gain.
The test in t5304 detects the bug and confirms the fix.
I also beefed up the tests in t6501, which covers the mtime-checking
code more thoroughly, to handle the bitmap case (in addition to just
"loose" and "packed" cases). Interestingly, this test doesn't actually
detect the bug, because it is running "git gc", and not "prune"
directly. And "gc" will call "repack" first, which does not suffer the
same bug. So the old-but-reachable-from-recent objects get scooped up
into the new pack along with the actually-recent objects, which gives
both a recent mtime. But it seemed prudent to get more coverage of the
bitmap case for related code.
Reported-by: David Emett <dave@sp4m.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a bitmap walk has to traverse (to fill in non-bitmapped objects),
we use rev_info's include_check mechanism to let us stop the traversal
early. But after setting the function and its data parameter, we never
clean it up. This means that if the rev_info is used for a subsequent
traversal without bitmaps, it will unexpectedly call into our
include_check function (worse, it will do so pointing to a now-defunct
stack variable in include_check_data, likely resulting in a segfault).
There's no code which does this now, but it's an accident waiting to
happen. Let's clean up after ourselves in the bitmap code.
Reported-by: David Emett <dave@sp4m.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Don't silently ignore a flag that's invalid for a given subcommand. The
user expected it to do something; we should tell the user that they are
mistaken, instead of surprising the user.
It could be argued that this change might break existing users. I'd
argue that those existing users are already broken, and they just don't
know it. Let them know that they're broken.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git subtree split' lets you specify a rev other than HEAD. 'git push'
lets you specify a mapping between a local thing and a remot ref. So
smash those together, and have 'git subtree push' let you specify which
local thing to run split on and push the result of that split to the
remote ref.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'push' does a 'split' internally, but it doesn't pass flags through to the
'split'. This is silly, if you need to pass flags to 'split', then it
means that you can't use 'push'!
So, have 'push' accept 'split' flags, and pass them through to 'split'.
Add tests for this by copying split's tests with minimal modification.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Besides being a genuinely useful thing to do, this also just makes sense
and harmonizes which flags may be used when. `git subtree split
--rejoin` amounts to "automatically go ahead and do a `git subtree
merge` after doing the main `git subtree split`", so it's weird and
arbitrary that you can't pass `--squash` to `git subtree split --rejoin`
like you can `git subtree merge`. It's weird that `git subtree split
--rejoin` inherits `git subtree merge`'s `--message` but not `--squash`.
Reconcile the situation by just having `split --rejoin` actually just
call `merge` internally (or call `add` instead, as appropriate), so it
can get access to the full `merge` behavior, including `--squash`.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just went through the docs looking for anything inaccurate or that can
be improved.
In the '-h' text, in the man page synopsis, and in the man page
description: Normalize the ordering of the list of sub-commands: 'add',
'merge', 'split', 'pull', 'push'. This allows us to kinda separate the
lower-level add/merge/split from the higher-level pull/push.
'-h' text:
- correction: Indicate that split's arg is optional.
- clarity: Emphasize that 'pull' takes the 'add'/'merge' flags.
man page:
- correction: State that all subcommands take options (it seemed to
indicate that only 'split' takes any options other than '-P').
- correction: 'split' only guarantees that the results are identical if
the flags are identical.
- correction: The flag is named '--ignore-joins', not '--ignore-join'.
- completeness: Clarify that 'push' always operates on HEAD, and that
'split' operates on HEAD if no local commit is given.
- clarity: In the description, when listing commands, repeat what their
arguments are. This way the reader doesn't need to flip back and
forth between the command description and the synopsis and the full
description to understand what's being said.
- clarity: In the <variables> used to give command arguments, give
slightly longer, descriptive names. Like <local-commit> instead of
just <commit>.
- clarity: Emphasize that 'pull' takes the 'add'/'merge' flags.
- style: In the synopsis, list options before the subcommand. This
makes things line up and be much more readable when shown
non-monospace (such as in `make html`), and also more closely matches
other man pages (like `git-submodule.txt`).
- style: Use the correct syntax for indicating the options ([<options>]
instead of [OPTIONS]).
- style: In the synopsis, separate 'pull' and 'push' from the other
lower-level commands. I think this helps readability.
- style: Code-quote things in prose that seem like they should be
code-quoted, like '.gitmodules', flags, or full commands.
- style: Minor wording improvements, like more consistent mood (many
of the command descriptions start in the imperative mood and switch
to the indicative mode by the end). That sort of thing.
- style: Capitalize "ID".
- style: Remove the "This option is only valid for XXX command" remarks
from each option, and instead rely on the section headings.
- style: Since that line is getting edited anyway, switch "behaviour" to
American "behavior".
- style: Trim trailing whitespace.
`todo`:
- style: Trim trailing whitespace.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, the $indent variable is just used to track how deeply we're
nested, and the debug log is indented by things like
debug " foo"
That is: The indentation-level is hard-coded. It used to be that the
code couldn't recurse, so the indentation level could be known
statically, so it made sense to just hard-code it in the
output. However, since 315a84f9aa ("subtree: use commits before rejoins
for splits", 2018-09-28), it can now recurse, and the debug log is
misleading.
So fix that. Indent according to $indent.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, debug output (triggered by passing '-d') and progress output
stomp on each other. The debug output is just streamed as lines to
stderr, and the progress output is sent to stderr as '%s\r'. When
writing to a file, it is awkward to read and difficult to distinguish
between the debug output and a progress line. When writing to a
terminal the debug lines hide progress lines.
So, when '-d' has been passed, spit out progress as 'progress: %s\n',
instead of as '%s\r', so that it can be detected, and so that the debug
lines don't overwrite the progress when written to a terminal.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For each function in subtree, add a usage comment saying what the
arguments are, and add an `assert` checking the number of arguments.
In figuring out each thing's arguments in order to write those comments
and assertions, it turns out that find_existing_splits is written as if
it takes multiple 'revs', but it is in fact only ever passed a single
'rev':
unrevs="$(find_existing_splits "$dir" "$rev")" || exit $?
So go ahead and codify that by documenting and asserting that it takes
exactly two arguments, one dir and one rev.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`cmd_add` starts with a check that the directory doesn't yet exist.
However, the `main` function performs the exact same check before
calling `cmd_add`. So remove the check from `cmd_add`.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The main argument parser goes ahead and tries to parse revs to make
things simpler for the sub-command implementations. But, it includes
enough special cases for different sub-commands. And it's difficult
having having to think about "is this info coming from an argument, or a
global variable?". So the main argument parser's effort to make things
"simpler" ends up just making it more confusing and complicated.
Begone with the 'revs' global variable; parse 'rev=$(...)' as needed in
individual 'cmd_*' functions.
Begone with the 'default' global variable. Its would-be value is
knowable just from which function we're in.
Begone with the 'ensure_single_rev' function. Its functionality can be
achieved by passing '--verify' to 'git rev-parse'.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
They are synonyms. Both are used in the file. ^{commit} is clearer, so
"standardize" on that.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Scripts needing to fuss with with adding $(git --exec-prefix) PATH
before loading git-sh-setup is a thing of the past. As far as I can
tell, it's been a thing of the past since since Git v1.2.0 (2006-02-12),
or more specifically, since 77cb17e940 (Exec git programs without using
PATH, 2006-01-10). However, it stuck around in contrib scripts and in
third-party scripts for long enough that it wasn't unusual to see.
Originally `git subtree` didn't fuss with PATH, but when people
(including the original subtree author) had problems, because it was a
common thing to see, it seemed that having subtree fuss with PATH was a
reasonable solution.
Here is an abridged history of fussing with PATH in subtree:
2987e6add3 (Add explicit path of git installation by 'git --exec-path', Gianluca Pacchiella, 2009-08-20)
As pointed out by documentation, the correct use of 'git-sh-setup' is
using $(git --exec-path) to avoid problems with not standard
installations.
-. git-sh-setup
+. $(git --exec-path)/git-sh-setup
33aaa697a2 (Improve patch to use git --exec-path: add to PATH instead, Avery Pennarun, 2009-08-26)
If you (like me) are using a modified git straight out of its source
directory (ie. without installing), then --exec-path isn't actually correct.
Add it to the PATH instead, so if it is correct, it'll work, but if it's
not, we fall back to the previous behaviour.
-. $(git --exec-path)/git-sh-setup
+PATH=$(git --exec-path):$PATH
+. git-sh-setup
9c632ea29c ((Hopefully) fix PATH setting for msysgit, Avery Pennarun, 2010-06-24)
Reported by Evan Shaw. The problem is that $(git --exec-path) includes a
'git' binary which is incompatible with the one in /usr/bin; if you run it,
it gives you an error about libiconv2.dll.
+OPATH=$PATH
PATH=$(git --exec-path):$PATH
. git-sh-setup
+PATH=$OPATH # apparently needed for some versions of msysgit
df2302d774 (Another fix for PATH and msysgit, Avery Pennarun, 2010-06-24)
Evan Shaw tells me the previous fix didn't work. Let's use this one
instead, which he says does work.
This fix is kind of wrong because it will run the "correct" git-sh-setup
*after* the one in /usr/bin, if there is one, which could be weird if you
have multiple versions of git installed. But it works on my Linux and his
msysgit, so it's obviously better than what we had before.
-OPATH=$PATH
-PATH=$(git --exec-path):$PATH
+PATH=$PATH:$(git --exec-path)
. git-sh-setup
-PATH=$OPATH # apparently needed for some versions of msysgit
First of all, I disagree with Gianluca's reading of the documentation:
- I haven't gone back to read what the documentation said in 2009, but
in my reading of the 2021 documentation is that it includes "$(git
--exec-path)/" in the synopsis for illustrative purposes, not to say
it's the proper way.
- After being executed by `git`, the git exec path should be the very
first entry in PATH, so it shouldn't matter.
- None of the scripts that are part of git do it that way.
But secondly, the root reason for fussing with PATH seems to be that
Avery didn't know that he needs to set GIT_EXEC_PATH if he's going to
use git from the source directory without installing.
And finally, Evan's issue is clearly just a bug in msysgit. I assume
that msysgit has since fixed the issue, and also msysgit has been
deprecated for 6 years now, so let's drop the workaround for it.
So, remove the line fussing with PATH. However, since subtree *is* in
'contrib/' and it might get installed in funny ways by users
after-the-fact, add a sanity check to the top of the script, checking
that it is installed correctly.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"$*" is for when you want to concatenate the args together,
whitespace-separated; and "$@" is for when you want them to be separate
strings.
There are several places in subtree that erroneously use $@ when
concatenating args together into an error message.
For instance, if the args are argv[1]="dead" and argv[2]="beef", then
the line
die "You must provide exactly one revision. Got: '$@'"
surely intends to call 'die' with the argument
argv[1]="You must provide exactly one revision. Got: 'dead beef'"
however, because the line used $@ instead of $*, it will actually call
'die' with the arguments
argv[1]="You must provide exactly one revision. Got: 'dead"
argv[2]="beef'"
This isn't a big deal, because 'die' concatenates its arguments together
anyway (using "$*"). But that doesn't change the fact that it was a
mistake to use $@ instead of $*, even though in the end $@ still ended
up doing the right thing.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make it painfully obvious when reading the code which variables are
direct parsings of command line arguments.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
subtree currently defines its own `say` implementation, rather than
using git-sh-setups's implementation. Change that, don't re-invent the
wheel.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of writing a slow `rev_is_descendant_of_branch $a $b` function
in shell, just use the fast `git merge-base --is-ancestor $b $a`.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Suport for Git versions older than 1.7.0 (older than February 2010) was
nice to have when git-subtree lived out-of-tree. But now that it lives
in git.git, it's not necessary to keep around. While it's technically
in contrib, with the standard 'git' packages for common systems
(including Arch Linux and macOS) including git-subtree, it seems
vanishingly likely to me that people are separately installing
git-subtree from git.git alongside an older 'git' install (although it
also seems vanishingly likely that people are still using >11 year old
git installs).
Not that there's much reason to remove it either, it's not much code,
and none of my changes depend on a newer git (to my knowledge, anyway;
I'm not actually testing against older git). I just figure it's an easy
piece of fat to trim, in the journey to making the whole thing easier to
hack on.
"Ignore space change" is probably helpful when viewing this diff.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ensure that every $(subshell) that calls a function (as opposed to an
external executable) is followed by `|| exit $?`. Similarly, ensure that
every `cmd | while read; do ... done` loop is followed by `|| exit $?`.
Both of those constructs mean that it can miss `die` calls, and keep
running when it shouldn't.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Shove all of the loose code inside of a main() function.
This comes down to personal preference more than anything else. A
preference that I've developed over years of maintaining large Bash
scripts, but still a mere personal preference.
In this specific case, it's also moving the `set -- -h`, the `git
rev-parse --parseopt`, and the `. git-sh-setup` to be closer to all
the rest of the argument parsing, which is a readability win on its
own, IMO.
"Ignore space change" is probably helpful when viewing this diff.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'pull' and 'push' subcommands deserve their own sections in the tests.
Add some basic tests for them.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's a dumb test, but it's surprisingly easy to break.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t7900-subtree.sh defines a helper function named last_commit_message.
However, it only returns the subject line of the commit message, not the
entire commit message. So rename it, to make the name less confusing.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As far as I can tell, this test isn't actually testing anything, because
someone forgot to tack on `--name-only` to `git log`. This seems to
have been the case since the test was first written, back in fa16ab36ad
("test.sh: make sure no commit changes more than one file at a time.",
2009-04-26), unless `git log` used to do that by default and didn't need
the flag back then?
Convincing myself that it's not actually testing anything was tricky,
the code is a little hard to reason about. It can be made a lot simpler
if instead of trying to parse all of the info from a single `git log`,
we're OK calling `git log` from inside of a loop. And it's my opinion
that tests are not the place for clever optimized code.
So, fix and simplify the test, so that it's actually testing something
and is simpler to reason about.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t7900-subtree.sh defines its own `check_equal A B` function, instead of
just using `test A = B` like all of the other tests. Don't be special,
get rid of `check_equal` in favor of `test`.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's unclear what the purpose of t7900-subtree.sh's
`subtree_test_create_repo` helper function is. It wraps test-lib.sh's,
`test_create_repo` but follows that up by setting log.date=relative. Why
does it set log.date=relative?
My first guess was that at one point the tests required that, but no
longer do, and that the function is now vestigial. I even wrote a patch
to get rid of it and was moments away from `git send-email`ing it.
However, by chance when looking for something else in the history, I
discovered the true reason, from e7aac44ed2 (contrib/subtree: ignore
log.date configuration, 2015-07-21). It's testing that setting
log.date=relative doesn't break `git subtree`, as at one point in the past
that did break `git subtree`.
So, add a comment about this, to avoid future such confusion.
And while at it, go ahead and (1) touch up the function to avoid a
pointless subshell and (2) update the one test that didn't use it.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The formatting in t7900-subtree.sh isn't even consistent throughout the
file. Fix that; make it consistent throughout the file.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use test-lib.sh's `test_count`, instead instead of having
t7900-subtree.sh do its own book-keeping with `subtree_test_count` that
has to be explicitly incremented by calling `next_test`.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of the tests had been converted to support
`GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`, but `contrib/subtree/t/`
hadn't.
Convert it. Most of the mentions of 'master' can just be replaced with
'HEAD'.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running `make -C contrib/subtree/ test` creates a `git-subtree` executable
in the root of the repo. Add it to the .gitignore so that anyone hacking
on subtree won't have to deal with that noise.
Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `git repack -A -d` is run in a partial clone, `pack-objects`
is invoked twice: once to repack all promisor objects, and once to
repack all non-promisor objects. The latter `pack-objects` invocation
is with --exclude-promisor-objects and --unpack-unreachable, which
loosens all objects unused during this invocation. Unfortunately,
this includes promisor objects.
Because the -d argument to `git repack` subsequently deletes all loose
objects also in packs, these just-loosened promisor objects will be
immediately deleted. However, this extra disk churn is unnecessary in
the first place. For example, in a newly-cloned partial repo that
filters all blob objects (e.g. `--filter=blob:none`), `repack` ends up
unpacking all trees and commits into the filesystem because every
object, in this particular case, is a promisor object. Depending on
the repo size, this increases the disk usage considerably: In my copy
of the linux.git, the object directory peaked 26GB of more disk usage.
In order to avoid this extra disk churn, pass the names of the promisor
packfiles as --keep-pack arguments to the second invocation of
`pack-objects`. This informs `pack-objects` that the promisor objects
are already in a safe packfile and, therefore, do not need to be
loosened.
For testing, we need to validate whether any object was loosened.
However, the "evidence" (loosened objects) is deleted during the
process which prevents us from inspecting the object directory.
Instead, let's teach `pack-objects` to count loosened objects and
emit via trace2 thus allowing inspecting the debug events after the
process is finished. This new event is used on the added regression
test.
Lastly, add a new perf test to evaluate the performance impact
made by this changes (tested on git.git):
Test HEAD^ HEAD
----------------------------------------------------------
5600.3: gc 134.38(41.93+90.95) 7.80(6.72+1.35) -94.2%
For a bigger repository, such as linux.git, the improvement is
even bigger:
Test HEAD^ HEAD
-------------------------------------------------------------------
5600.3: gc 6833.00(918.07+3162.74) 268.79(227.02+39.18) -96.1%
These improvements are particular big because every object in the
newly-cloned partial repository is a promisor object.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
From the documentation for generating patch text with diff-related
commands, refer to the documentation for the diff attribute.
This attribute influences the way that patches are generated, but this
was previously not mentioned in e.g., the git-diff manpage.
Signed-off-by: Peter Oliver <git@mavit.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse_pathspec() populates pathspec, hence we need to clear it once it's
no longer needed. seen is xcalloc'd within the same function and
likewise needs to be freed once its no longer needed.
cmd_rm() has multiple early returns, therefore we need to clear or free
as soon as this data is no longer needed, as opposed to doing a cleanup
at the end.
LSAN output from t0020:
Direct leak of 112 byte(s) in 1 object(s) allocated from:
#0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9ac0a4 in do_xmalloc wrapper.c:41:8
#2 0x9ac07a in xmalloc wrapper.c:62:9
#3 0x873277 in parse_pathspec pathspec.c:582:2
#4 0x646ffa in cmd_rm builtin/rm.c:266:2
#5 0x4cd91d in run_builtin git.c:467:11
#6 0x4cb5f3 in handle_builtin git.c:719:3
#7 0x4ccf47 in run_argv git.c:808:4
#8 0x4caf49 in cmd_main git.c:939:19
#9 0x69dc0e in main common-main.c:52:11
#10 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 65 byte(s) in 1 object(s) allocated from:
#0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9ac2a6 in xrealloc wrapper.c:126:8
#2 0x93b14d in strbuf_grow strbuf.c:98:2
#3 0x93ccf6 in strbuf_vaddf strbuf.c:392:3
#4 0x93f726 in xstrvfmt strbuf.c:979:2
#5 0x93f8b3 in xstrfmt strbuf.c:989:8
#6 0x92ad8a in prefix_path_gently setup.c:115:15
#7 0x873a8d in init_pathspec_item pathspec.c:439:11
#8 0x87334f in parse_pathspec pathspec.c:589:3
#9 0x646ffa in cmd_rm builtin/rm.c:266:2
#10 0x4cd91d in run_builtin git.c:467:11
#11 0x4cb5f3 in handle_builtin git.c:719:3
#12 0x4ccf47 in run_argv git.c:808:4
#13 0x4caf49 in cmd_main git.c:939:19
#14 0x69dc0e in main common-main.c:52:11
#15 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 15 byte(s) in 1 object(s) allocated from:
#0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
#1 0x9ac048 in xstrdup wrapper.c:29:14
#2 0x873ba2 in init_pathspec_item pathspec.c:468:20
#3 0x87334f in parse_pathspec pathspec.c:589:3
#4 0x646ffa in cmd_rm builtin/rm.c:266:2
#5 0x4cd91d in run_builtin git.c:467:11
#6 0x4cb5f3 in handle_builtin git.c:719:3
#7 0x4ccf47 in run_argv git.c:808:4
#8 0x4caf49 in cmd_main git.c:939:19
#9 0x69dc0e in main common-main.c:52:11
#10 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 1 byte(s) in 1 object(s) allocated from:
#0 0x49a9d2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
#1 0x9ac392 in xcalloc wrapper.c:140:8
#2 0x647108 in cmd_rm builtin/rm.c:294:9
#3 0x4cd91d in run_builtin git.c:467:11
#4 0x4cb5f3 in handle_builtin git.c:719:3
#5 0x4ccf47 in run_argv git.c:808:4
#6 0x4caf49 in cmd_main git.c:939:19
#7 0x69dbfe in main common-main.c:52:11
#8 0x7f4fac1b0349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>