`git range-diff` calls `git log` internally and tries to parse its
output. But `git log` output can be customized by the user in their
git config and for certain configurations either an error will be
returned by `git range-diff` or it will crash.
To fix this explicitly set the output format of the internally
executed `git log` with `--pretty=medium`. Because that cancels
`--notes`, add explicitly `--notes` at the end.
Also, make sure we never crash in the same way - trying to dereference
`util` which was never created and has remained NULL. It would happen
if the first line of `git log` output does not begin with 'commit '.
Alternative considered but discarded - somehow disable all git configs
and behave as if no config is present in the internally executed
`git log`, but that does not seem to be possible. GIT_CONFIG_NOSYSTEM
is the closest to it, but even with that we would still read
`.git/config`.
Signed-off-by: Vasil Dimov <vd@FreeBSD.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's unusual to see:
https://example.com?query-parameters
without an intervening slash, like:
https://example.com/some-path?query-parameters
or even:
https://example.com/?query-parameters
but it is a valid end to the hostname (actually "authority component")
according to RFC 3986. Likewise for "#".
And curl will parse the URL according to the standard, meaning it will
contact example.com, but our credential code would ask about a bogus
hostname with a "?" in it. Let's make sure we follow the standard, and
more importantly ask about the same hosts that curl will be talking to.
It would be nice if we could just ask curl to parse the URL for us. But
it didn't grow a URL-parsing API until 7.62, so we'd be stuck with
fallback code either way. Plus we'd need this code in the main Git
binary, where we've tried to avoid having a link dependency on libcurl.
But let's at least fix our parser. Moving to curl's parser would prevent
other potential discrepancies, but this gives us immediate relief for
the known problem, and would help our fallback code if we eventually use
curl.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update freshen_file() to use a NULL `times', semantically equivalent to
the currently setup, with an explicit `actime' and `modtime' set to the
"current time", but with the advantage that it works with other files
not owned by the current user.
Fixes an issue on shared repos with a split index, where eventually a
user's operation creates a shared index, and another user will later do
an operation that will try to update its freshness, but will instead
raise a warning:
$ git status
warning: could not freshen shared index '.git/sharedindex.bd736fa10e0519593fefdb2aec253534470865b2'
Signed-off-by: Luciano Miguel Ferreira Rocha <luciano.rocha@booking.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function won't work anywhere else, so let's mark it as an explicit
bug if it is called on a non-Windows platform.
Let's also rename the function to avoid cluttering the global namespace
with an overly-generic function name.
While at it, we also fix the code comment above that function: the
lower-case `windows` refers to something different than `Windows`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function uses Windows' system tool `attrib` to determine the state
of the hidden flag of a file or directory.
We should not actually expect the first `attrib.exe` in the PATH to
be the one we are looking for. Or that it is in the PATH, for that
matter.
Let's use the full path to the tool instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `is_hidden` function can be used (only on Windows) to determine
whether a directory or file have their `hidden` flag set.
This function is duplicated between two test scripts. It is better to
move it into `test-lib-functions.sh` so that it is reused.
This patch is best viewed with `--color-moved`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using `starts_with()`, the magic number 7, `strlen()` and a
fair number of additions to verify the three parts of the config key
"branch.<branch>.mergeoptions", use `skip_prefix()` to jump through them
more explicitly.
We need to introduce a new variable for this (we certainly can't modify
`k` just because we see "branch."!). With `skip_prefix()` we often use
quite bland names like `p` or `str`. Let's do the same. If and when this
function needs to do more prefix-skipping, we'll have a generic variable
ready for this.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When rebasing against an upstream that has had many commits since the
original branch was created:
O -- O -- ... -- O -- O (upstream)
\
-- O (my-dev-branch)
it must read the contents of every novel upstream commit, in addition to
the tip of the upstream and the merge base, because "git rebase"
attempts to exclude commits that are duplicates of upstream ones. This
can be a significant performance hit, especially in a partial clone,
wherein a read of an object may end up being a fetch.
Add a flag to "git rebase" to allow suppression of this feature. This
flag only works when using the "merge" backend.
This flag changes the behavior of sequencer_make_script(), called from
do_interactive_rebase() <- run_rebase_interactive() <-
run_specific_rebase() <- cmd_rebase(). With this flag, limit_list()
(indirectly called from sequencer_make_script() through
prepare_revision_walk()) will no longer call cherry_pick_list(), and
thus PATCHSAME is no longer set. Refraining from setting PATCHSAME both
means that the intermediate commits in upstream are no longer read (as
shown by the test) and means that no PATCHSAME-caused skipping of
commits is done by sequencer_make_script(), either directly or through
make_script_with_merges().
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the user specifies the apply backend with options that only work
with the merge backend, such as
git rebase --apply --exec /bin/true HEAD~3
the error message has always been
fatal: --exec requires an interactive rebase
This error message is misleading and was one of the reasons we renamed
the interactive backend to the merge backend. Update the error message
to state that these options merely require use of the merge backend.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit d48e5e21da ("rebase (interactive-backend): make --keep-empty the
default", 2020-02-15) turned --keep-empty (for keeping commits which
start empty) into the default. The logic underpinning that commit was:
1) 'git commit' errors out on the creation of empty commits without an
override flag
2) Once someone determines that the override is worthwhile, it's
annoying and/or harmful to required them to take extra steps in
order to keep such commits around (and to repeat such steps with
every rebase).
While the logic on which the decision was made is sound, the result was
a bit of an overcorrection. Instead of jumping to having --keep-empty
being the default, it jumped to making --keep-empty the only available
behavior. There was a simple workaround, though, which was thought to
be good enough at the time. People could still drop commits which
started empty the same way the could drop any commits: by firing up an
interactive rebase and picking out the commits they didn't want from the
list. However, there are cases where external tools might create enough
empty commits that picking all of them out is painful. As such, having
a flag to automatically remove start-empty commits may be beneficial.
Provide users a way to drop commits which start empty using a flag that
existed for years: --no-keep-empty. Interpret --keep-empty as
countermanding any previous --no-keep-empty, but otherwise leaving
--keep-empty as the default.
This might lead to some slight weirdness since commands like
git rebase --empty=drop --keep-empty
git rebase --empty=keep --no-keep-empty
look really weird despite making perfect sense (the first will drop
commits which become empty, but keep commits that started empty; the
second will keep commits which become empty, but drop commits which
started empty). However, --no-keep-empty was named years ago and we are
predominantly keeping it for backward compatibility; also we suspect it
will only be used rarely since folks already have a simple way to drop
commits they don't want with an interactive rebase.
Reported-by: Bryan Turner <bturner@atlassian.com>
Reported-by: Sami Boukortt <sami@boukortt.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While many users who intentionally create empty commits do not want them
thrown away by a rebase, there are third-party tools that generate empty
commits that a user might not want. In the past, users have used rebase
to get rid of such commits (a side-effect of the fact that the --apply
backend is not currently capable of keeping them). While such users
could fire up an interactive rebase and just remove the lines
corresponding to empty commits, that might be difficult if the
third-party tool generates many of them. Simplify this task for users
by marking such lines with a suffix of " # empty" in the todo list.
Suggested-by: Sami Boukortt <sami@boukortt.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the last few commits have made it possible for the config parser
to handle config files up to the limits of size_t, the rest of the code
isn't really ready for this. In particular, we often feed the keys as
strings into printf "%s" format specifiers. And because the printf
family of functions must return an int to specify the result, they
complain. Here are two concrete examples (using glibc; we're in
uncharted territory here so results may vary):
Generate a gigantic .gitmodules file like this:
git submodule add /some/other/repo foo
{
printf '[submodule "'
perl -e 'print "a" x 2**31'
echo '"]path = foo'
} >.gitmodules
git commit -m 'huge gitmodule'
then try this:
$ git show
BUG: strbuf.c:397: your vsnprintf is broken (returned -1)
The problem is that we end up calling:
strbuf_addf(&sb, "submodule.%s.ignore", submodule_name);
which relies on vsnprintf(), and that function has no way to report back
a size larger than INT_MAX.
Taking that same file, try this:
git config --file=.gitmodules --list --name-only
On my system it produces an output with exactly 4GB of spaces. I
confirmed in a debugger that we reach the config callback with the key
intact: it's 2147483663 bytes and full of a's. But when we print it with
this call:
printf("%s%c", key_, term);
we just get the spaces.
So given the fact that these are insane cases which we have no need to
support, the weird behavior from feeding the results to printf even if
the code is careful, and the possibility of uncareful code introducing
its own integer truncation issues, let's just declare INT_MAX as a limit
for parsing config files.
We'll enforce the limit in get_next_char(), which generalizes over all
sources (blobs, files, etc) and covers any element we're parsing
(whether section, key, value, etc). For simplicity, the limit is over
the length of the _whole_ file, so you couldn't have two 1GB values in
the same file. This should be perfectly fine, as the expected size for
config files is generally kilobytes at most.
With this patch both cases above will yield:
fatal: bad config line 1 in file .gitmodules
That's not an amazing error message, but the parser isn't set up to
provide specific messages (it just breaks out of the parsing loop and
gives that generic error even if see a syntactic issue). And we really
wouldn't expect to see this case outside of somebody maliciously probing
the limits of the config system.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of the config parsing infrastructure is limited in what it can
parse only by the size of memory, because it parses character by
character, building up strbufs for keys, values, etc. One exception is
the "baselen" value we keep in git_parse_source(), which is an int.
That stores the length of the section.subsection base, to which we can
then append individual key names (by truncating back to the baselen with
strbuf_setlen(), and then appending characters for the key name). But
because it's an int, if we see an absurdly long section or subsection,
we may overflow the integer, wrapping negative. That negative value is
then implicitly cast to a size_t when we pass it to strbuf_setlen(),
creating a very large value and triggering a BUG. For example:
$ {
printf '[foo "'
perl -e 'print "a" x 2**31'
echo '"]bar = value'
} >huge
$ git config --file=huge --list
fatal: BUG: strbuf_setlen() beyond buffer
While this is obviously a silly case that we don't care about
supporting, it's worth fixing it by switching to a size_t for a few
reasons:
- we should try to avoid hitting BUG assertions at all
- avoiding integer truncation or overflow sets a good example and
makes it easier to audit the code for more important issues
- the BUG outcome is what happens in _this_ instance, because we wrap
negative. If we used a 2**32 subsection, we'd wrap to a small
positive value and actually generate wrong output (the subsection of
our key would be truncated).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with the recent change to parse_config_key(), the best type to return
a string length is a size_t, as it won't cause integer truncation for a
gigantic key. And as with that change, this is mostly a clarity /
hygiene issue for now, as our config parser would choke on such a large
key anyway.
There are a few ripple effects within the config code, as callers switch
to using size_t. I also adjusted a few related variables that iterate
over strings. The most unexpected change is that a call to strbuf_addf()
had to switch to strbuf_add(). We can't use a size_t with "%.*s",
because printf precisions must have type "int" (we could cast, of
course, but that would miss the point of using size_t in the first
place).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We compute the length of a subset of a string, but then use that length
only to feed a "%.*s" printf placeholder for the same string. We can
just use "%s" to achieve the same thing.
The variable became useless in cb891a5989 (Use a strbuf for building up
section header and key/value pair strings., 2007-12-14), which swapped
out a write() which _did_ use the length for a strbuf_addf() call.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We return the length to a subset of a string using an "int *"
out-parameter. This is fine most of the time, as we'd expect config keys
to be relatively short, but it could behave oddly if we had a gigantic
config key. A more appropriate type is size_t.
Let's switch over, which lets our callers use size_t as appropriate
(they are bound by our type because they must pass the out-parameter as
a pointer). This is mostly just a cleanup to make it clear this code
handles long strings correctly. In practice, our config parser already
chokes on long key names (because of a similar int/size_t mixup!).
When doing an int/size_t conversion, we have to be careful that nobody
was trying to assign a negative value to the variable. I manually
confirmed that for each case here. They tend to just feed the result to
xmemdupz() or similar; in a few cases I adjusted the parameter types for
helper functions to make sure the size_t is preserved.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The make_branch() and make_rewrite() functions can take a NUL-terminated
string or a ptr/len pair. They use a sentinel value of "0" for the len
to tell the difference between the two. However, when parsing config
like:
[branch ""]
merge = whatever
whose key flattens to:
branch..merge
we might actually have a zero-length branch name. This is obviously
nonsense, but the current code would consider it as a NUL-terminated
string and use the branch name ".merge".
We could use a better sentinel value here (like "-1"), but that gets in
the way of moving to size_t, which is a more appropriate type for a
ptr/len combo.
Let's instead just drop this feature and have the callers (of which
there are only two total) use strlen() themselves. This simplifies the
code, and lets us move to using size_t.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On certain network filesystems (currently encountered with Isilon, but
in theory more network storage solutions could be causing the same
issue), when the directory in question is missing,
`raceproof_create_file()` fails with an `ERROR_INVALID_PARAMETER`
instead of an `ERROR_PATH_NOT_FOUND`.
Since it is highly unlikely that we produce such an error by mistake
(the parameters we pass are fairly benign), we can be relatively certain
that the directory is missing in this instance. So let's just translate
that error automagically.
This fixes https://github.com/git-for-windows/git/issues/1345.
Signed-off-by: Nathan Sanders <spekbukkem@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At least one interactive command writes a prompt to `stdout` and then
reads user input on `stdin`: `git clean --interactive`. If the prompt is
left in the buffer, the user will not realize the program is waiting for
their input.
So let's just flush `stdout` before reading the user's input.
Signed-off-by: 마누엘 <nalla@hamal.uberspace.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are quite a few code locations (e.g. `git clean --interactive`)
where Git asks the user for an answer. In preparation for fixing a bug
shared by all of them, and also to DRY up the code, let's refactor it.
Please note that most of these callers trimmed white-space both at the
beginning and at the end of the answer, instead of trimming only the
end (as the caller in `add-patch.c` does).
Therefore, technically speaking, we change behavior in this patch. At
the same time, it can be argued that this is actually a bug fix.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
MSYS2's strace facility is very useful for debugging... With this patch,
the bash will be executed through strace if the environment variable
GIT_STRACE_COMMANDS is set, which comes in real handy when investigating
issues in the test suite.
Also support passing a path to a log file via GIT_STRACE_COMMANDS to
force Git to call strace.exe with the `-o <path>` argument, i.e. to log
into a file rather than print the log directly.
That comes in handy when the output would otherwise misinterpreted by a
calling process as part of Git's output.
Note: the values "1", "yes" or "true" are *not* specifying paths, but
tell Git to let strace.exe log directly to the console.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The default file history simplification of "git log -- <path>" or
"git rev-list -- <path>" focuses on providing the smallest set of
commits that first contributed a change. The revision walk greatly
restricts the set of walked commits by visiting only the first
TREESAME parent of a merge commit, when one exists. This means
that portions of the commit-graph are not walked, which can be a
performance benefit, but can also "hide" commits that added changes
but were ignored by a merge resolution.
The --full-history option modifies this by walking all commits and
reporting a merge commit as "interesting" if it has _any_ parent
that is not TREESAME. This tends to be an over-representation of
important commits, especially in an environment where most merge
commits are created by pull request completion.
Suppose we have a commit A and we create a commit B on top that
changes our file. When we merge the pull request, we create a merge
commit M. If no one else changed the file in the first-parent
history between M and A, then M will not be TREESAME to its first
parent, but will be TREESAME to B. Thus, the simplified history
will be "B". However, M will appear in the --full-history mode.
However, suppose that a number of topics T1, T2, ..., Tn were
created based on commits C1, C2, ..., Cn between A and M as
follows:
A----C1----C2--- ... ---Cn----M------P1---P2--- ... ---Pn
\ \ \ \ / / / /
\ \__.. \ \/ ..__T1 / Tn
\ \__.. /\ ..__T2 /
\_____________________B \____________________/
If the commits T1, T2, ... Tn did not change the file, then all of
P1 through Pn will be TREESAME to their first parent, but not
TREESAME to their second. This means that all of those merge commits
appear in the --full-history view, with edges that immediately
collapse into the lower history without introducing interesting
single-parent commits.
The --simplify-merges option was introduced to remove these extra
merge commits. By noticing that the rewritten parents are reachable
from their first parents, those edges can be simplified away. Finally,
the commits now look like single-parent commits that are TREESAME to
their "only" parent. Thus, they are removed and this issue does not
cause issues anymore. However, this also ends up removing the commit
M from the history view! Even worse, the --simplify-merges option
requires walking the entire history before returning a single result.
Many Git users are using Git alongside a Git service that provides
code storage alongside a code review tool commonly called "Pull
Requests" or "Merge Requests" against a target branch. When these
requests are accepted and merged, they typically create a merge
commit whose first parent is the previous branch tip and the second
parent is the tip of the topic branch used for the request. This
presents a valuable order to the parents, but also makes that merge
commit slightly special. Users may want to see not only which
commits changed a file, but which pull requests merged those commits
into their branch. In the previous example, this would mean the
users want to see the merge commit "M" in addition to the single-
parent commit "C".
Users are even more likely to want these merge commits when they
use pull requests to merge into a feature branch before merging that
feature branch into their trunk.
In some sense, users are asking for the "first" merge commit to
bring in the change to their branch. As long as the parent order is
consistent, this can be handled with the following rule:
Include a merge commit if it is not TREESAME to its first
parent, but is TREESAME to a later parent.
These merges look like the merge commits that would result from
running "git pull <topic>" on a main branch. Thus, the option to
show these commits is called "--show-pulls". This has the added
benefit of showing the commits created by closing a pull request or
merge request on any of the Git hosting and code review platforms.
To test these options, extend the standard test example to include
a merge commit that is not TREESAME to its first parent. It is
surprising that that option was not already in the example, as it
is instructive.
In particular, this extension demonstrates a common issue with file
history simplification. When a user resolves a merge conflict using
"-Xours" or otherwise ignoring one side of the conflict, they create
a TREESAME edge that probably should not be TREESAME. This leads
users to become frustrated and complain that "my change disappeared!"
In my experience, showing them history with --full-history and
--simplify-merges quickly reveals the problematic merge. As mentioned,
this option is expensive to compute. The --show-pulls option
_might_ show the merge commit (usually titled "resolving conflicts")
more quickly. Of course, this depends on the user having the correct
parent order, which is backwards when using "git pull master" from a
topic branch.
There are some special considerations when combining the --show-pulls
option with --simplify-merges. This requires adding a new PULL_MERGE
object flag to store the information from the initial TREESAME
comparisons. This helps avoid dropping those commits in later filters.
This is covered by a test, including how the parents can be simplified.
Since "struct object" has already ruined its 32-bit alignment by using
33 bits across parsed, type, and flags member, let's not make it worse.
PULL_MERGE is used in revision.c with the same value (1u<<15) as
REACHABLE in commit-graph.c. The REACHABLE flag is only used when
writing a commit-graph file, and a revision walk using --show-pulls
does not happen in the same process. Care must be taken in the future
to ensure this remains the case.
Update Documentation/rev-list-options.txt with significant details
around this option. This requires updating the example in the
History Simplification section to demonstrate some of the problems
with TREESAME second parents.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "refs" pointer in a struct repository starts life as NULL, but then
is lazily initialized when it is accessed via get_main_ref_store().
However, it's easy for calling code to forget this and access it
directly, leading to code which works _some_ of the time, but fails if
it is called before anybody else accesses the refs.
This was the cause of the bug fixed by 5ff4b920eb (sha1-name: do not
assume that the ref store is initialized, 2020-04-09). In order to
prevent similar bugs, let's more clearly mark the "refs" field as
private.
In addition to helping future code, the name change will help us audit
any existing direct uses. Besides get_main_ref_store() itself, it turns
out there is only one. But we know it's OK as it is on the line directly
after the fix from 5ff4b920eb, which will have initialized the pointer.
However it's still a good idea for it to model the proper use of the
accessing function, so we'll convert it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're comparing a push cert nonce, we currently do so using strcmp.
Most implementations of strcmp short-circuit and exit as soon as they
know whether two values are equal. This, however, is a problem when
we're comparing the output of HMAC, as it leaks information in the time
taken about how much of the two values match if they do indeed differ.
In our case, the nonce is used to prevent replay attacks against our
server via the embedded timestamp and replay attacks using requests from
a different server via the HMAC. Push certs, which contain the nonces,
are signed, so an attacker cannot tamper with the nonces without
breaking validation of the signature. They can, of course, create their
own signatures with invalid nonces, but they can also create their own
signatures with valid nonces, so there's nothing to be gained. Thus,
there is no security problem.
Even though it doesn't appear that there are any negative consequences
from the current technique, for safety and to encourage good practices,
let's use a constant time comparison function for nonce verification.
POSIX does not provide one, but they are easy to write.
The technique we use here is also used in NaCl and the Go standard
library and relies on the fact that bitwise or and xor are constant time
on all known architectures.
We need not be concerned about exiting early if the actual and expected
lengths differ, since the standard cryptographic assumption is that
everyone, including an attacker, knows the format of and algorithm used
in our nonces (and in any event, they have the source code and can
determine it easily). As a result, we assume everyone knows how long
our nonces should be. This philosophy is also taken by the Go standard
library and other cryptographic libraries when performing constant time
comparisons on HMAC values.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
c931ba4e (sha1-name.c: remove the_repo from handle_one_ref(),
2019-04-16) replaced the use of for_each_ref() helper, which works
with the main ref store of the default repository instance, with
refs_for_each_ref(), which can work on any ref store instance, by
assuming that the repository instance the function is given has its
ref store already initialized.
But it is possible that nobody has initialized it, in which case,
the code ends up dereferencing a NULL pointer.
Reported-by: Érico Rolim <erico.erc@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 1925fe0c8a ("Documentation: wrap config listings in "----"",
2019-09-07) wrapped this fairly large block of example config directives
in "----". The closing "----" ended up a few lines too early though.
Make sure to include the trailing "IncludeIf.onbranch:..." example, too.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When trying to stash part of the worktree changes by splitting a hunk
and then only partially accepting the split bits and pieces, the user
is presented with a rather cryptic error:
error: patch failed: <file>:<line>
error: test: patch does not apply
Cannot remove worktree changes
and the command would fail to stash the desired parts of the worktree
changes (even if the `stash` ref was actually updated correctly).
We even have a test case demonstrating that failure, carrying it for
four years already.
The explanation: when splitting a hunk, the changed lines are no longer
separated by more than 3 lines (which is the amount of context lines
Git's diffs use by default), but less than that. So when staging only
part of the diff hunk for stashing, the resulting diff that we want to
apply to the worktree in reverse will contain those changes to be
dropped surrounded by three context lines, but since the diff is
relative to HEAD rather than to the worktree, these context lines will
not match.
Example time. Let's assume that the file README contains these lines:
We
the
people
and the worktree added some lines so that it contains these lines
instead:
We
are
the
kind
people
and the user tries to stash the line containing "are", then the command
will internally stage this line to a temporary index file and try to
revert the diff between HEAD and that index file. The diff hunk that
`git stash` tries to revert will look somewhat like this:
@@ -1776,3 +1776,4
We
+are
the
people
It is obvious, now, that the trailing context lines overlap with the
part of the original diff hunk that the user did *not* want to stash.
Keeping in mind that context lines in diffs serve the primary purpose of
finding the exact location when the diff does not apply precisely (but
when the exact line number in the file to be patched differs from the
line number indicated in the diff), we work around this by reducing the
amount of context lines: the diff was just generated.
Note: this is not a *full* fix for the issue. Just as demonstrated in
t3701's 'add -p works with pathological context lines' test case, there
are ambiguities in the diff format. It is very rare in practice, of
course, to encounter such repeated lines.
The full solution for such cases would be to replace the approach of
generating a diff from the stash and then applying it in reverse by
emulating `git revert` (i.e. doing a 3-way merge). However, in `git
stash -p` it would not apply to `HEAD` but instead to the worktree,
which makes this non-trivial to implement as long as we also maintain a
scripted version of `add -i`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 7e9e048661 (stash -p: demonstrate failure of split with mixed y/n,
2015-04-16), a regression test for a known breakage that was added to
the test script `t3904-stash-patch.sh` that demonstrated that splitting
a hunk and trying to stash only part of that split hunk fails (but
shouldn't).
As expected, it still fails, but for the wrong reason: once the bug is
fixed, we would expect stderr to show nothing, yet the regression test
expects stderr to show something.
Let's fix that by telling that regression test case to expect nothing to
be printed to stderr.
While at it, also drop the obvious left-over from debugging where the
regression test did not mind `git stash -p` to return a non-zero exit
status.
Of course, the regression test still fails, but this time for the
correct reason.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 4dc42c6c18 (mingw: refuse paths containing reserved names,
2019-12-21), we started disallowing file names that are reserved, e.g.
`NUL`, `CONOUT$`, etc.
This included `COM<n>` where `<n>` is a digit. Unfortunately, this
includes `COM0` but only `COM1`, ..., `COM9` are reserved, according to
the official documentation, `COM0` is mentioned in the "NT Namespaces"
section but it is explicitly _omitted_ from the list of reserved names:
https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file#naming-conventions
Tests corroborate this: it is totally possible to write a file called
`com0.c` on Windows 10, but not `com1.c`.
So let's tighten the code to disallow only the reserved `COM<n>` file
names, but to allow `COM0` again.
This fixes https://github.com/git-for-windows/git/issues/2470.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Microsoft introduced a new "Universal C Runtime Library" (UCRT) with
Visual Studio 2015. The UCRT comes with a new strftime() implementation
that supports more date formats. We link git against the older
"Microsoft Visual C Runtime Library" (MSVCRT), so to use the UCRT
strftime() we need to load it from ucrtbase.dll using
DECLARE_PROC_ADDR()/INIT_PROC_ADDR().
Most supported Windows systems should have recieved the UCRT via Windows
update, but in some cases only MSVCRT might be available. In that case
we fall back to using that implementation.
With this change, it is possible to use e.g. the `%g` and `%V` date
format specifiers, e.g.
git show -s --format=%cd --date=format:‘%g.%V’ HEAD
Without this change, the user would see this error message on Windows:
fatal: invalid strftime format: '‘%g.%V’'
This fixes https://github.com/git-for-windows/git/issues/2495
Signed-off-by: Matthias Aßhauer <mha1993@live.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a (late) companion for f6461b82b9 (Documentation: fix build
with Asciidoctor 2, 2019-09-15).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When commit subjects or authors have non-ASCII characters, git
format-patch Q-encodes them so they can be safely sent over email.
However, if the patch transfer method is something other than email (web
review tools, sneakernet), this only serves to make the patch metadata
harder to read without first applying it (unless you can decode RFC 2047
in your head). git am as well as some email software supports
non-Q-encoded mail as described in RFC 6531.
Add --[no-]encode-email-headers and format.encodeEmailHeaders to let the
user control this behavior.
Signed-off-by: Emma Brooks <me@pluvano.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 7fbbcb21b1 ("diff: batch fetching of missing blobs", 2019-04-08)
optimized "diff" by prefetching blobs in a partial clone, but there are
some cases wherein blobs do not need to be prefetched. In these cases,
any command that uses the diff machinery will unnecessarily fetch blobs.
diffcore_std() may read blobs when it calls the following functions:
(1) diffcore_skip_stat_unmatch() (controlled by the config variable
diff.autorefreshindex)
(2) diffcore_break() and diffcore_merge_broken() (for break-rewrite
detection)
(3) diffcore_rename() (for rename detection)
(4) diffcore_pickaxe() (for detecting addition/deletion of specified
string)
Instead of always prefetching blobs, teach diffcore_skip_stat_unmatch(),
diffcore_break(), and diffcore_rename() to prefetch blobs upon the first
read of a missing object. This covers (1), (2), and (3): to cover the
rest, teach diffcore_std() to prefetch if the output type is one that
includes blob data (and hence blob data will be required later anyway),
or if it knows that (4) will be run.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the object reads in diff_populate_filespec() to have the first
object read not be in an if/else branch, because in a future patch, a
retry will be added to that first object read.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The behavior of diff_populate_filespec() currently can be customized
through a bitflag, but a subsequent patch requires it to support a
non-boolean option. Replace the bitflag with an options struct.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a lot of code to honor GIT_REFLOG_ACTION throughout git,
including some in sequencer.c; unfortunately, reflog_message() and its
callers ignored it. Instruct reflog_message() to check the existing
environment variable, and use it when present as an override to
action_name().
Also restructure pick_commits() to only temporarily modify
GIT_REFLOG_ACTION for a short duration and then restore the old value,
so that when we do this setting within a loop we do not keep adding "
(pick)" substrings and end up with a reflog message of the form
rebase (pick) (pick) (pick) (finish): returning to refs/heads/master
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation refers to "initialized" or "populated" submodules,
to explain which submodules are affected by '--recurse-submodules', but
the real terminology here is 'active' submodules. Update the
documentation accordingly.
Some terminology:
- Active is defined in gitsubmodules(7), it only involves the
configuration variables 'submodule.active', 'submodule.<name>.active'
and 'submodule.<name>.url'. The function
submodule.c::is_submodule_active checks that a submodule is active.
- Populated means that the submodule's working tree is present (and the
gitfile correctly points to the submodule repository), i.e. either the
superproject was cloned with ` --recurse-submodules`, or the user ran
`git submodule update --init`, or `git submodule init [<path>]` and
`git submodule update [<path>]` separately which populated the
submodule working tree. This does not involve the 3 configuration
variables above.
- Initialized (at least in the context of the man pages involved in this
patch) means both "populated" and "active" as defined above, i.e. what
`git submodule update --init` does.
The --recurse-submodules option mostly affects active submodules. An
exception is `git fetch` where the option affects populated submodules.
As a consequence, in `git pull --recurse-submodules` the fetch affects
populated submodules, but the resulting working tree update only affects
active submodules.
In the documentation of `git-pull`, let's distinguish between the
fetching part which affects populated submodules, and the updating of
worktrees, which only affects active submodules.
Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
Helped-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The default value also depends on the value of submodule.recurse.
Use this opportunity to correct some grammar mistakes in
Documentation/config/fetch.txt signaled by Robert P. J. Day.
Also mention `fetch.recurseSubmodules` in fetch-options.txt. In
git-push.txt, `push.recurseSubmodules` is implicitly mentioned (by
explaining how to disable it), so no need to add it there.
Lastly add a link to `git-fetch` in `git-pull.txt` to explain the
meaning of `--recurse-submodules` there.
Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also unify the formulation about --no-recurse-submodules for checkout
and switch, which we reuse for restore.
And correct the formulation about submodules' HEAD in read-tree, which
we reuse in reset.
Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Note that `ls-files` is not affected, even though it has a
`--recurse-submodules` option, so list it as an exception too.
Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use a custom hash in fast-import to store the set of objects we've
imported so far. It has a fixed set of 2^16 buckets and chains any
collisions with a linked list. As the number of objects grows larger
than that, the load factor increases and we degrade to O(n) lookups and
O(n^2) insertions.
We can scale better by using our hashmap.c implementation, which will
resize the bucket count as we grow. This does incur an extra memory cost
of 8 bytes per object, as hashmap stores the integer hash value for each
entry in its hashmap_entry struct (which we really don't care about
here, because we're just reusing the embedded object hash). But I think
the numbers below justify this (and our per-object memory cost is
already much higher).
I also looked at using khash, but it seemed to perform slightly worse
than hashmap at all sizes, and worse even than the existing code for
small sizes. It's also awkward to use here, because we want to look up a
"struct object_entry" from a "struct object_id", and it doesn't handle
mismatched keys as well. Making a mapping of object_id to object_entry
would be more natural, but that would require pulling the embedded oid
out of the object_entry or incurring an extra 32 bytes per object.
In a synthetic test creating as many cheap, tiny objects as possible
perl -e '
my $bits = shift;
my $nr = 2**$bits;
for (my $i = 0; $i < $nr; $i++) {
print "blob\n";
print "data 4\n";
print pack("N", $i);
}
' $bits | git fast-import
I got these results:
nr_objects master khash hashmap
2^20 0m4.317s 0m5.109s 0m3.890s
2^21 0m10.204s 0m9.702s 0m7.933s
2^22 0m27.159s 0m17.911s 0m16.751s
2^23 1m19.038s 0m35.080s 0m31.963s
2^24 4m18.766s 1m10.233s 1m6.793s
which points to hashmap as the winner. We didn't have any perf tests for
fast-export or fast-import, so I added one as a more real-world case.
It uses an export without blobs since that's significantly cheaper than
a full one, but still is an interesting case people might use (e.g., for
rewriting history). It will emphasize this change in some ways (as a
percentage we spend more time making objects and less shuffling blob
bytes around) and less in others (the total object count is lower).
Here are the results for linux.git:
Test HEAD^ HEAD
----------------------------------------------------------------------------
9300.1: export (no-blobs) 67.64(66.96+0.67) 67.81(67.06+0.75) +0.3%
9300.2: import (no-blobs) 284.04(283.34+0.69) 198.09(196.01+0.92) -30.3%
It only has ~5.2M commits and trees, so this is a larger effect than I
expected (the 2^23 case above only improved by 50s or so, but here we
gained almost 90s). This is probably due to actually performing more
object lookups in a real import with trees and commits, as opposed to
just dumping a bunch of blobs into a pack.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since f269048754 (fetch: opportunistically update tracking refs,
2013-05-11), the underlying `git fetch` in `git pull <remote> <branch>`
updates the configured remote-tracking branch for <branch>.
However, an example in the 'Examples' section of the `git pull`
documentation still states that this is not the case.
Correct the description of this example.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for the `<refspec>` parameter in the `git fetch`
documentation refers to the section "CONFIGURED REMOTE-TRACKING
BRANCHES" in this same documentation page.
In the `git pull` documentation, let's also refer specifically to this
section instead of just linking to the `git fetch` documentation.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For more discussion about these hooks, their history relative to rebase,
and logical consistency between different types of operations, see
https://lore.kernel.org/git/CABPp-BG0bFKUage5cN_2yr2DkmS04W2Z9Pg5WcROqHznV3XBdw@mail.gmail.com/
and the links to some threads referenced therein.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge with --gpg-sign option, and clarify that --no-gpg-sign also
override earlier --gpg-sign.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>