Since f269048754 (fetch: opportunistically update tracking refs,
2013-05-11), the underlying `git fetch` in `git pull <remote> <branch>`
updates the configured remote-tracking branch for <branch>.
However, an example in the 'Examples' section of the `git pull`
documentation still states that this is not the case.
Correct the description of this example.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for the `<refspec>` parameter in the `git fetch`
documentation refers to the section "CONFIGURED REMOTE-TRACKING
BRANCHES" in this same documentation page.
In the `git pull` documentation, let's also refer specifically to this
section instead of just linking to the `git fetch` documentation.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 94a88e2524 (Makefile: avoid running curl-config multiple times,
2020-03-26) put the call to $(CURL_CONFIG) into a "simple" variable
which is expanded immediately, rather than expanding it each time it's
needed. However, that also means that we expand it whenever the Makefile
is parsed, whether we need it or not.
This is wasteful, but also breaks the ci/test-documentation.sh job, as
it does not have curl at all and complains about the extra messages to
stderr. An easy way to see it is just:
$ make CURL_CONFIG=does-not-work check-builtins
make: does-not-work: Command not found
make: does-not-work: Command not found
GIT_VERSION = 2.26.0.108.gb3f3f45f29
make: does-not-work: Command not found
make: does-not-work: Command not found
./check-builtins.sh
We can get the best of both worlds if we're willing to accept a little
Makefile hackery. Courtesy of the article at:
http://make.mad-scientist.net/deferred-simple-variable-expansion/
this patch uses a lazily-evaluated recursive variable which replaces its
contents with an immediately assigned simple one on first use.
Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For more discussion about these hooks, their history relative to rebase,
and logical consistency between different types of operations, see
https://lore.kernel.org/git/CABPp-BG0bFKUage5cN_2yr2DkmS04W2Z9Pg5WcROqHznV3XBdw@mail.gmail.com/
and the links to some threads referenced therein.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In read_populate_opts(), we call read_oneliner() _after_ calling
strbuf_release(). This means that `buf` is leaked at the end of the
function.
Always clean up the strbuf by going to `done_rebase_i` whether or not
we return an error.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge with --gpg-sign option, and clarify that --no-gpg-sign also
override earlier --gpg-sign.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
{cherry-pick,revert} --edit hasn't honoured --no-gpg-sign yet.
Pass this option down to git-commit to honour it.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are 3 callers to promisor_remote_get_direct() that first check if
the number of objects to be fetched is equal to 0. Fold that check into
promisor_remote_get_direct(), and in doing so, be explicit as to what
promisor_remote_get_direct() does if oid_nr is 0 (it returns 0, success,
immediately).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have an environment variable `jobs=16` defined in our CI system, and
this environment makes our build job failed with the following message:
error: pathspec '16' did not match any file(s) known to git
The pathspec '16' for Git command is from the environment variable
"jobs".
This is because "git-submodule" command is implemented in shell script,
and environment variables may change its behavior. Set values for
uninitialized variables, such as "jobs" and "recommend_shallow" will
fix this issue.
Helped-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Li Xuejiang <xuejiang@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-update-ref(1) command can only handle queueing transactions
right now via its "--stdin" parameter, but there is no way for users to
handle the transaction itself in a more explicit way. E.g. in a
replicated scenario, one may imagine a coordinator that spawns
git-update-ref(1) for multiple repositories and only if all agree that
an update is possible will the coordinator send a commit. Such a
transactional session could look like
> start
< start: ok
> update refs/heads/master $OLD $NEW
> prepare
< prepare: ok
# All nodes have returned "ok"
> commit
< commit: ok
or
> start
< start: ok
> create refs/heads/master $OLD $NEW
> prepare
< fatal: cannot lock ref 'refs/heads/master': reference already exists
# On all other nodes:
> abort
< abort: ok
In order to allow for such transactional sessions, this commit
introduces four new commands for git-update-ref(1), which matches those
we have internally already with the exception of "start":
- start: start a new transaction
- prepare: prepare the transaction, that is try to lock all
references and verify their current value matches the
expected one
- commit: explicitly commit a session, that is update references to
match their new expected state
- abort: abort a session and roll back all changes
By design, git-update-ref(1) will commit as soon as standard input is
being closed. While fine in a non-transactional world, it is definitely
unexpected in a transactional world. Because of this, as soon as any of
the new transactional commands is used, the default will change to
aborting without an explicit "commit". To avoid a race between queueing
updates and the first "prepare" that starts a transaction, the "start"
command has been added to start an explicit transaction.
Add some tests to exercise this new functionality.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-update-ref(1) supports a `--stdin` mode that allows it to read
all reference updates from standard input. This is mainly used to allow
for atomic reference updates that are all or nothing, so that either all
references will get updated or none.
Currently, git-update-ref(1) reads all commands as a single block of up
to 1000 characters and only starts processing after stdin gets closed.
This is less flexible than one might wish for, as it doesn't really
allow for longer-lived transactions and doesn't allow any verification
without committing everything. E.g. one may imagine the following
exchange:
> start
< start: ok
> update refs/heads/master $NEWOID1 $OLDOID1
> update refs/heads/branch $NEWOID2 $OLDOID2
> prepare
< prepare: ok
> commit
< commit: ok
When reading all input as a whole block, the above interactive protocol
is obviously impossible to achieve. But by converting the command to
read commands linewise, we can make it more interactive than before.
Obviously, the linewise interface is only a first step in making
git-update-ref(1) work in a more transaction-oriented way. Missing is
most importantly support for transactional commands that manage the
current transaction.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the actual logic to update the transaction is handled in
`update_refs_stdin()`, the transaction itself is started and committed
in `cmd_update_ref()` itself. This makes it hard to handle transaction
abortion and commits as part of `update_refs_stdin()` itself, which is
required in order to introduce transaction handling features to `git
update-refs --stdin`.
Refactor the code to move all transaction handling into
`update_refs_stdin()` to prepare for transaction handling features.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently pass both an `strbuf` containing the current command line
as well as the `next` pointer pointing to the first argument to
commands. This is both confusing and makes code more intertwined.
Convert this to use a simple pointer as well as a pointer pointing to
the end of the input as a preparatory step to line-wise reading of
stdin.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `parse_refname` function accepts a `struct strbuf *input` argument
that isn't used at all. As we're about to convert commands to not use a
strbuf anymore but instead an end pointer, let's drop this argument now
to make the converting commit easier to review.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently manually wire up all commands known to `git-update-ref
--stdin`, making it harder than necessary to preprocess arguments after
the command is determined. To make this more extensible, let's refactor
the code to use an array of known commands instead. While this doesn't
add a lot of value now, it is a preparatory step to implement line-wise
reading of commands.
As we're going to introduce commands without trailing spaces, this
commit also moves whitespace parsing into the respective commands.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once upon a time we ran 'make --jobs=2 ...' to build Git, its
documentation, or to apply Coccinelle semantic patches. Then commit
eaa62291ff (ci: inherit --jobs via MAKEFLAGS in run-build-and-tests,
2019-01-27) came along, and started using the MAKEFLAGS environment
variable to centralize setting the number of parallel jobs in
'ci/libs.sh'. Alas, it forgot to update 'ci/run-linux32-docker.sh' to
make MAKEFLAGS available inside the Docker container running the 32
bit Linux job, and, consequently, since then that job builds Git
sequentially, and it ignores any Makefile knobs that we might set in
MAKEFLAGS (though we don't set any for the 32 bit Linux job at the
moment).
So update the 'docker run' invocation in 'ci/run-linux32-docker.sh' to
make MAKEFLAGS available inside the Docker container as well. Set
CC=gcc for the 32 bit Linux job, because that's the compiler installed
in the 32 bit Linux Docker image that we use (Travis CI nowadays sets
CC=clang by default, but clang is not installed in this image).
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The use of 'touch -m' to modify a file's mtime is slightly less
portable than using our own 'test-tool chmtime'. The important
thing is that these pack-files are ordered in a special way to
ensure the multi-pack-index selects some as the "newer" pack-files
when resolving duplicate objects.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph builtin has an --expire-time option that takes a
datetime using OPT_EXPIRY_DATE(). However, the implementation inside
expire_commit_graphs() was treating a non-zero value as a number of
seconds to subtract from "now".
Update t5323-split-commit-graph.sh to demonstrate the correct value
of the --expire-time option by actually creating a crud .graph file
with mtime earlier than the expire time. Instead of using a super-
early time (1980) we use an explicit, and recent, time. Using
test-tool chmtime to create two files on either end of an exact
second, we create a test that catches this failure no matter the
current time. Using a fixed date is more portable than trying to
format a relative date string into the --expiry-date input.
I noticed this when inspecting some Scalar repos that had an excess
number of commit-graph files. In Scalar, we were using this second
interpretation by using "--expire-time=3600" to mean "delete graphs
older than one hour ago" to avoid deleting a commit-graph that a
foreground process may be trying to load.
Also I noticed that the help text was copied from the --max-commits
option. Fix that help text.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As reported on the git mailing list, since git-2.25,
git add untracked-dir/
has been tab completing to
git add untracked-dir/./
The cause for this was that with commit b9670c1f5e (dir: fix checks on
common prefix directory, 2019-12-19),
git ls-files -o --directory untracked-dir/
(or the equivalent `git -C untracked-dir ls-files -o --directory`) began
reporting
untracked-dir/
instead of listing paths underneath that directory. It may also be
worth noting that the real command in question was
git -C untracked-dir ls-files -o --directory '*'
which is equivalent to
git ls-files -o --directory 'untracked-dir/*'
which behaves the same for the purposes of this issue (the '*' can match
the empty string), but becomes relevant for the proposed fix.
At first, based on the report, I decided to try to view this as a
regression and tried to find a way to recover the old behavior without
breaking other stuff, or at least breaking as little as possible.
However, in the end, I couldn't figure out a way to do it that wouldn't
just cause lots more problems than it solved. The old behavior was a
bug:
* Although older git would avoid cleaning anything with `git clean -f
.git`, it would wipe out everything under that direcotry with `git
clean -f .git/`. Despite the difference in command used, this is
relevant because the exact same change that fixed clean changed the
behavior of ls-files.
* Older git would report different results based solely on presence or
absence of a trailing slash for $SUBDIR in the command `git ls-files
-o --directory $SUBDIR`.
* Older git violated the documented behavior of not recursing into
directories that matched the pathspec when --directory was
specified.
* And, after all, commit b9670c1f5e (dir: fix checks on common prefix
directory, 2019-12-19) didn't overlook this issue; it explicitly
stated that the behavior of the command was being changed to bring
it inline with the docs.
(Also, if it helps, despite that commit being merged during the 2.25
series, this bug was not reported during the 2.25 cycle, nor even during
most of the 2.26 cycle -- it was reported a day before 2.26 was
released. So the impact of the change is at least somewhat small.)
Instead of relying on a bug of ls-files in reporting the wrong content,
change the invocation of ls-files used by git-completion to make it grab
paths one depth deeper. Do this by changing '$DIR/*' (match $DIR/ plus
0 or more characters) into '$DIR/?*' (match $DIR/ plus 1 or more
characters). Note that the '?' character should not be added when
trying to complete a filename (e.g. 'git ls-files -o --directory
"merge.c?*"' would not correctly return "merge.c" when such a file
exists), so we have to make sure to add the '?' character only in cases
where the path specified so far is a directory.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Traditionally, the expected calling convention for the dir.c API was:
fill_directory(&dir, ..., pathspec)
foreach entry in dir->entries:
if (dir_path_match(entry, pathspec))
process_or_display(entry)
This may have made sense once upon a time, because the fill_directory() call
could use cheap checks to avoid doing full pathspec matching, and an external
caller may have wanted to do other post-processing of the results anyway.
However:
* this structure makes it easy for users of the API to get it wrong
* this structure actually makes it harder to understand
fill_directory() and the functions it uses internally. It has
tripped me up several times while trying to fix bugs and
restructure things.
* relying on post-filtering was already found to produce wrong
results; pathspec matching had to be added internally for multiple
cases in order to get the right results (see commits 404ebceda0
(dir: also check directories for matching pathspecs, 2019-09-17)
and 89a1f4aaf7 (dir: if our pathspec might match files under a
dir, recurse into it, 2019-09-17))
* it's bad for performance: fill_directory() already has to do lots
of checks and knows the subset of cases where it still needs to do
more checks. Forcing external callers to do full pathspec
matching means they must re-check _every_ path.
So, add the pathspec matching within the fill_directory() internals, and
remove it from external callers.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
treat_directory() had a call to both do_match_pathspec() and
match_pathspec(). These calls have migrated through the code somewhat
since their introduction, but we don't actually need both. Replace the
two calls with one, and while at it, move the check earlier in order to
reduce the need for callers of fill_directory() to do post-filtering of
results.
The next patch will address post-filtering more forcefully and provide
more relevant history and context.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Handling DIR_KEEP_UNTRACKED_CONTENTS within treat_directory() instead of
as a post-processing step in read_directory():
* allows us to directly access and remove the relevant entries instead
of needing to calculate which ones need to be removed
* keeps the logic for directory handling in one location (and puts it
closer the the logic for stripping out extra ignored entries, which
seems logical).
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dir's read_directory_recursive() naturally operates recursively in order
to walk the directory tree. Treating of directories is sometimes weird
because there are so many different permutations about how to handle
directories. Some examples:
* 'git ls-files -o --directory' only needs to know that a directory
itself is untracked; it doesn't need to recurse into it to see what
is underneath.
* 'git status' needs to recurse into an untracked directory, but only
to determine whether or not it is empty. If there are no files
underneath, the directory itself will be omitted from the output.
If it is not empty, only the directory will be listed.
* 'git status --ignored' needs to recurse into untracked directories
and report all the ignored entries and then report the directory as
untracked -- UNLESS all the entries under the directory are
ignored, in which case we don't print any of the entries under the
directory and just report the directory itself as ignored. (Note
that although this forces us to walk all untracked files underneath
the directory as well, we strip them from the output, except for
users like 'git clean' who also set DIR_KEEP_TRACKED_CONTENTS.)
* For 'git clean', we may need to recurse into a directory that
doesn't match any specified pathspecs, if it's possible that there
is an entry underneath the directory that can match one of the
pathspecs. In such a case, we need to be careful to omit the
directory itself from the list of paths (see commit 404ebceda0
("dir: also check directories for matching pathspecs", 2019-09-17))
Part of the tension noted above is that the treatment of a directory can
change based on the files within it, and based on the various settings
in dir->flags. Trying to keep this in mind while reading over the code,
it is easy to think in terms of "treat_directory() tells us what to do
with a directory, and read_directory_recursive() is the thing that
recurses". Since we need to look into a directory to know how to treat
it, though, it is quite easy to decide to (also) recurse into the
directory from treat_directory() by adding a read_directory_recursive()
call. Adding such a call is actually fine, IF we make sure that
read_directory_recursive() does not also recurse into that same
directory.
Unfortunately, commit df5bcdf83a ("dir: recurse into untracked dirs
for ignored files", 2017-05-18), added exactly such a case to the code,
meaning we'd have two calls to read_directory_recursive() for an
untracked directory. So, if we had a file named
one/two/three/four/five/somefile.txt
and nothing in one/ was tracked, then 'git status --ignored' would
call read_directory_recursive() twice on the directory 'one/', and
each of those would call read_directory_recursive() twice on the
directory 'one/two/', and so on until read_directory_recursive() was
called 2^5 times for 'one/two/three/four/five/'.
Avoid calling read_directory_recursive() twice per level by moving a
lot of the special logic into treat_directory().
Since dir.c is somewhat complex, extra cruft built up around this over
time. While trying to unravel it, I noticed several instances where the
first call to read_directory_recursive() would return e.g.
path_untracked for some directory and a later one would return e.g.
path_none, despite the fact that the directory clearly should have been
considered untracked. The code happened to work due to the side-effect
from the first invocation of adding untracked entries to dir->entries;
this allowed it to get the correct output despite the supposed override
in return value by the later call.
I am somewhat concerned that there are still bugs and maybe even
testcases with the wrong expectation. I have tried to carefully
document treat_directory() since it becomes more complex after this
change (though much of this complexity came from elsewhere that probably
deserved better comments to begin with). However, much of my work felt
more like a game of whackamole while attempting to make the code match
the existing regression tests than an attempt to create an
implementation that matched some clear design. That seems wrong to me,
but the rules of existing behavior had so many special cases that I had
a hard time coming up with some overarching rules about what correct
behavior is for all cases, forcing me to hope that the regression tests
are correct and sufficient. Such a hope seems likely to be ill-founded,
given my experience with dir.c-related testcases in the last few months:
Examples where the documentation was hard to parse or even just wrong:
* 3aca58045f (git-clean.txt: do not claim we will delete files with
-n/--dry-run, 2019-09-17)
* 09487f2cba (clean: avoid removing untracked files in a nested git
repository, 2019-09-17)
* e86bbcf987 (clean: disambiguate the definition of -d, 2019-09-17)
Examples where testcases were declared wrong and changed:
* 09487f2cba (clean: avoid removing untracked files in a nested git
repository, 2019-09-17)
* e86bbcf987 (clean: disambiguate the definition of -d, 2019-09-17)
* a2b13367fe (Revert "dir.c: make 'git-status --ignored' work within
leading directories", 2019-12-10)
Examples where testcases were clearly inadequate:
* 502c386ff9 (t7300-clean: demonstrate deleting nested repo with an
ignored file breakage, 2019-08-25)
* 7541cc5302 (t7300: add testcases showing failure to clean specified
pathspecs, 2019-09-17)
* a5e916c745 (dir: fix off-by-one error in match_pathspec_item,
2019-09-17)
* 404ebceda0 (dir: also check directories for matching pathspecs,
2019-09-17)
* 09487f2cba (clean: avoid removing untracked files in a nested git
repository, 2019-09-17)
* e86bbcf987 (clean: disambiguate the definition of -d, 2019-09-17)
* 452efd11fb (t3011: demonstrate directory traversal failures,
2019-12-10)
* b9670c1f5e (dir: fix checks on common prefix directory, 2019-12-19)
Examples where "correct behavior" was unclear to everyone:
https://lore.kernel.org/git/20190905154735.29784-1-newren@gmail.com/
Other commits of note:
* 902b90cf42 (clean: fix theoretical path corruption, 2019-09-17)
However, on the positive side, it does make the code much faster. For
the following simple shell loop in an empty repository:
for depth in $(seq 10 25)
do
dirs=$(for i in $(seq 1 $depth) ; do printf 'dir/' ; done)
rm -rf dir
mkdir -p $dirs
>$dirs/untracked-file
/usr/bin/time --format="$depth: %e" git status --ignored >/dev/null
done
I saw the following timings, in seconds (note that the numbers are a
little noisy from run-to-run, but the trend is very clear with every
run):
10: 0.03
11: 0.05
12: 0.08
13: 0.19
14: 0.29
15: 0.50
16: 1.05
17: 2.11
18: 4.11
19: 8.60
20: 17.55
21: 33.87
22: 68.71
23: 140.05
24: 274.45
25: 551.15
For the above run, using strace I can look for the number of untracked
directories opened and can verify that it matches the expected
2^($depth+1)-2 (the sum of 2^1 + 2^2 + 2^3 + ... + 2^$depth).
After this fix, with strace I can verify that the number of untracked
directories that are opened drops to just $depth, and the timings all
drop to 0.00. In fact, it isn't until a depth of 190 nested directories
that it sometimes starts reporting a time of 0.01 seconds and doesn't
consistently report 0.01 seconds until there are 240 nested directories.
The previous code would have taken
17.55 * 2^220 / (60*60*24*365) = 9.4 * 10^59 YEARS
to have completed the 240 nested directories case. It's not often
that you get to speed something up by a factor of 3*10^69.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The logic in treat_directory() is handled by a multi-case
switch statement, but this switch is very asymmetrical, as
the first two cases are simple but the third is more
complicated than the rest of the method. In fact, the third
case includes a "break" statement that leads to the block
of code outside the switch statement. That is the only way
to reach that block, as the switch handles all possible
values from directory_exists_in_index();
Extract the switch statement into a series of "if" statements.
This simplifies the trivial cases, while clarifying how to
reach the "show_other_directories" case. This is particularly
important as the "show_other_directories" case will expand
in a later change.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Despite having contributed several fixes in this area, I have for months
(years?) assumed that the "exclude" variable was a directive; this
caused me to think of it as a different mode we operate in and left me
confused as I tried to build up a mental model around why we'd need such
a directive. I mostly tried to ignore it while focusing on the pieces I
was trying to understand.
Then I finally traced this variable all back to a call to is_excluded(),
meaning it was actually functioning as an adjective. In particular, it
was a checked property ("Does this path match a rule in .gitignore?"),
rather than a mode passed in from the caller. Change the variable name
to match the part of speech used by the function called to define it,
which will hopefully make these bits of code slightly clearer to the
next reader.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 16e2cfa909 ("read_directory(): further split treat_path()",
2010-01-08) split treat_one_path() out of treat_path(), because
treat_leading_path() would not have access to a dirent but wanted to
re-use as much of treat_path() as possible. Not re-using all of
treat_path() caused other bugs, as noted in commit b9670c1f5e ("dir:
fix checks on common prefix directory", 2019-12-19). Finally, in commit
ad6f2157f9 ("dir: restructure in a way to avoid passing around a
struct dirent", 2020-01-16), dirents were removed from treat_path() and
other functions entirely.
Since the only reason for splitting these functions was the lack of a
dirent -- which no longer applies to either function -- and since the
split caused problems in the past resulting in us not using
treat_one_path() separately anymore, just undo the split.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds seven new ls-files tests. While currently all seven test
pass, my earlier rounds of restructuring dir.c to replace an exponential
algorithm with a linear one passed all the tests in the testsuite but
failed six of these seven new tests. Add these tests to increase our
case coverage.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It turns out the t7063 has some testcases that even without using the
untracked cache cover situations that nothing else in the testsuite
handles. Checking the results of
git status --porcelain
both with and without the untracked cache, and comparing both against
our expected results helped uncover a critical bug in some dir.c
restructuring.
Unfortunately, it's not easy to run status and tell it to ignore the
untracked cache; the only knob we have is core.untrackedCache=false,
which is used to instruct git to *delete* the untracked cache (which
might also ignore the untracked cache when it operates, but that isn't
specified in the docs).
Create a simple helper that will create a clone of the index that is
missing the untracked cache bits, and use it to compare that the results
with the untracked cache match the results we get without the untracked
cache.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cloning with --single-branch, we implement git-fetch's usual
tag-following behavior, grabbing any tag objects that point to objects
we have locally.
When we're a partial clone, though, our has_object_file() check will
actually lazy-fetch each tag. That not only defeats the purpose of
--single-branch, but it does it incredibly slowly, potentially kicking
off a new fetch for each tag. This is even worse for a shallow clone,
which implies --single-branch, because even tags which are supersets of
each other will be fetched individually.
We can fix this by passing OBJECT_INFO_SKIP_FETCH_OBJECT to the call,
which is what git-fetch does in this case.
Likewise, let's include OBJECT_INFO_QUICK, as that's what git-fetch
does. The rationale is discussed in 5827a03545 (fetch: use "quick"
has_sha1_file for tag following, 2016-10-13), but here the tradeoff
would apply even more so because clone is very unlikely to be racing
with another process repacking our newly-created repository.
This may provide a very small speedup even in the non-partial case case,
as we'd avoid calling reprepare_packed_git() for each tag (though in
practice, we'd only have a single packfile, so that reprepare should be
quite cheap).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the config file we use when we build the user manual with
AsciiDoc. The comment at the top of this chunk that we're removing says
the following:
"unbreak" docbook-xsl v1.68 for manpages (sic!). v1.69 works with or
without this.
This comes from d19fbc3c17 ("Documentation: add git user's manual",
2007-01-07), where it looks like this conf file in general and this
snippet in particular was copy-pasted from asciidoc.conf.
This chunk is very similar to something we just got rid of for the
manpages, and because this appears to be aimed at v1.68 -- which we no
longer support for the manpages as of a few commits ago --, it's
tempting to get rid of this. That reveals an interesting aspect of
"works with or without this": it turns out it actually works /better/
without!
Dropping this makes us render code snippets and shell listings using
<screen> rather than <literallayout>, just like Asciidoctor does. In
user-manual.pdf, this puts the contents into dimmed-background,
easy-to-distinguish-from-the-surrounding-text boxes, as opposed to
white-background (transparent) boxes.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fast forwarding, `git --merge' should act the same whether
`rebase.abbreviateCommands' is set or not, but so far it was not the
case. This duplicates the tests ensuring that `--merge' works when fast
forwarding to check if it also works with abbreviated commands.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the sequencer is requested to abbreviate commands, it will replace
those that do not have a short form (eg. `noop') by a comment mark.
`noop' serves no purpose, except when fast-forwarding (ie. by running
`git rebase'). Removing it will break this command when
`rebase.abbreviateCommands' is set to true.
Teach todo_list_to_strbuf() to check if a command has an actual
short form, and to ignore it if not.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the given example, `commit` cannot be `NULL` (because this is the
loop condition: if it was `NULL`, the loop body would not be entered at
all). It took this developer a moment or two to see that this is
therefore dead code.
Let's remove it, to avoid puzzling future readers.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ths has been oid_array for some time, though the source only recently
moved from sha1-array.[ch] to oid-array.[ch]. In either case, we should
say "oid-array" here.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A comment refers to the "sha1s in the given sha1 array". But this became
an oid_array along with everywhere else in 910650d2f8 (Rename sha1_array
to oid_array, 2017-03-31). Plus there's an extra line of leftover
editing cruft we can drop.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our join_sha1_array_hex() function long ago switched to using an
oid_array; let's change the name to match.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This matches the actual data structure name, as well as the source file
that contains the code we're testing. The test scripts need updating to
use the new name, as well.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We renamed the actual data structure in 910650d2f8 (Rename sha1_array to
oid_array, 2017-03-31), but the file is still called sha1-array. Besides
being slightly confusing, it makes it more annoying to grep for leftover
occurrences of "sha1" in various files, because the header is included
in so many places.
Let's complete the transition by renaming the source and header files
(and fixing up a few comment references).
I kept the "-" in the name, as that seems to be our style; cf.
fc1395f4a4 (sha1_file.c: rename to use dash in file name, 2018-04-10).
We also have oidmap.h and oidset.h without any punctuation, but those
are "struct oidmap" and "struct oidset" in the code. We _could_ make
this "oidarray" to match, but somehow it looks uglier to me because of
the length of "array" (plus it would be a very invasive patch for little
gain).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit started using size_t for our allocations. There are
some iterations that use int or unsigned, though. These aren't dangerous
with respect to memory, but they could produce incorrect results.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The oid_array object uses an "int" to store the number of items and the
allocated size. It's rather unlikely for somebody to have more than 2^31
objects in a repository (the sha1's alone would be 40GB!), but if they
do, we'd overflow our alloc variable.
You can reproduce this case with something like:
git init repo
cd repo
# make a pack with 2^24 objects
perl -e '
my $nr = 2**24;
for (my $i = 0; $i < $nr; $i++) {
print "blob\n";
print "data 4\n";
print pack("N", $i);
}
' | git fast-import
# now make 256 copies of it; most of these objects will be duplicates,
# but oid_array doesn't de-dup until all values are read and it can
# sort the result.
cd .git/objects/pack/
pack=$(echo *.pack)
idx=$(echo *.idx)
for i in $(seq 0 255); do
# no need to waste disk space
ln "$pack" "pack-extra-$i.pack"
ln "$idx" "pack-extra-$i.idx"
done
# and now force an oid_array to store all of it
git cat-file --batch-all-objects --batch-check
which results in:
fatal: size_t overflow: 32 * 18446744071562067968
So the good news is that st_mult() sees the problem (the large number is
because our int wraps negative, and then that gets cast to a size_t),
doing the job it was meant to: bailing in crazy situations rather than
causing an undersized buffer.
But we should avoid hitting this case at all, and instead limit
ourselves based on what malloc() is willing to give us. We can easily do
that by switching to size_t.
The cat-file process above made it to ~120GB virtual set size before the
integer overflow (our internal hash storage is 32-bytes now in
preparation for sha256, so we'd expect ~128GB total needed, plus
potentially more to copy from one realloc'd block to another)). After
this patch (and about 130GB of RAM+swap), it does eventually read in the
whole set. No test for obvious reasons.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git is an enormously flexible and powerful piece of software. However,
it can be intimidating for many users and there are a set of common
questions that users often ask. While we already have some new user
documentation, it's worth adding a FAQ to address common questions that
users often have. Even though some of this is addressed elsewhere in
the documentation, experience has shown that it is difficult for users
to find, so a centralized location is helpful.
Add such a FAQ and fill it with some common questions and answers.
While there are few entries now, we can expand it in the future to cover
more things as we find new questions that users have. Let's also add
section markers so that people answering questions can directly link
users to the proper answer.
The FAQ also addresses common configuration questions that apply not
only to Git as an independent piece of software but also the ecosystem
of CI tools and hosting providers that people use, since these are the
source of common questions. An attempt has been made to avoid
mentioning any particular provider or tool, but to nevertheless cover
common configurations that apply to a wide variety of such tools.
Note that the long lines for certain questions are required, since
Asciidoctor does not permit broken lines there.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the strbuf interface already provides functions to read a line
into it that completely replaces its current contents, we do not have an
interface that allows for appending lines without discarding current
contents.
Add a new function `strbuf_appendwholeline` that reads a line including
its terminating character into a strbuf non-destructively. This is a
preparatory step for git-update-ref(1) reading standard input line-wise
instead of as a block.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description for the "verify" command is lacking a single word "is",
which this commit corrects.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>