get_worktrees() accepts a 'flags' argument; however, there are no
existing flags (the lone flag GWT_SORT_LINKED was recently retired) and
no behavior which can be tweaked. Therefore, drop the 'flags' argument.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Of all the clients of get_worktrees(), only "git worktree list" wants
the list sorted in a very specific way; other clients simply don't care
about the order. Rather than imbuing get_worktrees() with special
knowledge about how various clients -- now and in the future -- may want
the list sorted, drop the sorting capability altogether and make it the
client's responsibility to sort the list if needed.
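
As a rough illustration of what sorting on the client side could look
like, here is a minimal sketch. The struct and function names are
hypothetical stand-ins, not the real worktree API, and the ordering
(main worktree first, then linked worktrees by path) is an assumption
made for the example:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a worktree record. */
struct wt_entry {
	const char *path;
	int is_main; /* non-zero for the main worktree */
};

/* Assumed ordering: main worktree first, then linked ones by path. */
static int wt_entry_cmp(const void *a, const void *b)
{
	const struct wt_entry *x = *(const struct wt_entry *const *)a;
	const struct wt_entry *y = *(const struct wt_entry *const *)b;

	if (x->is_main != y->is_main)
		return x->is_main ? -1 : 1;
	return strcmp(x->path, y->path);
}

/* The caller, not the function producing the list, decides to sort. */
static void sort_wt_entries(struct wt_entry **list, size_t nr)
{
	qsort(list, nr, sizeof(*list), wt_entry_cmp);
}
```

Clients that do not care about the order simply never call the sort
helper.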
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git index-pack is usually run in a repository, but need not be. Since
packs don't contain information on the algorithm in use, instead
relying on context, add an option to index-pack to tell it which one
we're using in case someone runs it outside of a repository. Since
using --stdin necessarily implies a repository, don't allow specifying
an object format together with --stdin, to prevent users from passing an
option that won't work. Add documentation for this option.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_pull() builds a commit_list to pass a single potential ancestor to
is_descendant_of(). The latter leaves the list intact. Release the
allocated memory after the call.
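
The pattern can be sketched as follows. The types and helpers here are
simplified, hypothetical stand-ins for the commit list and for a callee
that only reads the list; they are not the real API:

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for a commit and a commit list. */
struct commit_stub {
	int id;
};

struct commit_node {
	struct commit_stub *item;
	struct commit_node *next;
};

/* Prepend one item; on allocation failure the list is left unchanged. */
static struct commit_node *node_insert(struct commit_stub *item,
				       struct commit_node *head)
{
	struct commit_node *n = malloc(sizeof(*n));

	if (!n)
		return head;
	n->item = item;
	n->next = head;
	return n;
}

/* A callee that only reads the list and leaves it intact. */
static int list_contains(const struct commit_node *head, int id)
{
	for (; head; head = head->next)
		if (head->item->id == id)
			return 1;
	return 0;
}

/* Free the nodes only; the items are owned elsewhere. */
static void node_list_free(struct commit_node *head)
{
	while (head) {
		struct commit_node *next = head->next;

		free(head);
		head = next;
	}
}

/* Build a one-element list, query it, then release it. */
static int check_and_release(struct commit_stub *candidate, int wanted_id)
{
	struct commit_node *list = node_insert(candidate, NULL);
	int found = list_contains(list, wanted_id);

	node_list_free(list);
	return found;
}
```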
Leaking in cmd_*() isn't a big deal, but sets a bad example for other
users of is_descendant_of().
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A comment in cmd_diff() states that if one tree-ish and no blobs are
provided, (the "N=1, M=0" case), it will provide a diff between the tree
and the cache. This is incorrect because a diff happens between the
tree-ish and the working tree. Remove the `--cached` in the comment so
that the correct behavior is shown. Add a new section describing the
"N=1, M=0, --cached" behavior.
Next, describe the "N=0, M=0, --cached" case, similar to the above since
it is undocumented.
Finally, fix some spacing issues. Add spaces between each section for
consistency and readability. Also, change tabs within the comment into
spaces.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The behaviour of "sparse-checkout" in the state "git clone
--no-checkout" left was changed accidentally in 2.27, which has
been corrected.
* en/sparse-checkout:
  sparse-checkout: avoid staging deletions of all files
The reflog entries for "git clone" and "git fetch" did not
anonymize the URL they operated on.
* js/reflog-anonymize-for-clone-and-fetch:
  clone/fetch: anonymize URLs in the reflog
14ba97f8 (alloc: allow arbitrary repositories for alloc functions,
2018-05-15) introduced parsed_object_pool->commit_count to keep a count
of commits per repository; that count was used to assign commit->index.
However, commit-slab code requires commit->index values to be unique,
so a global count is needed rather than a per-repo count.
Let's introduce a static counter variable, `parsed_commits_count`, to
keep track of the commits parsed so far.
As commit_count has no use anymore, let's also drop it from the struct.
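
To illustrate why a per-repository counter cannot yield unique
indexes, here is a minimal sketch. The pool struct and both helper
functions are hypothetical simplifications; only the name
`parsed_commits_count` comes from this change:

```c
/* Hypothetical stand-in for a per-repository object pool. */
struct pool_stub {
	unsigned int commit_count;
};

/* Per-pool numbering: two pools both hand out 0, 1, 2, ... */
static unsigned int next_index_per_pool(struct pool_stub *pool)
{
	return pool->commit_count++;
}

/* Process-wide numbering: every call yields a fresh value. */
static unsigned int parsed_commits_count;

static unsigned int next_index_global(void)
{
	return parsed_commits_count++;
}
```

With two pools, the first commit parsed in each gets the same index
under the per-pool scheme, which is what commit-slab cannot tolerate.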
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git branch` accepts `--edit-description` in conjunction with other
arguments. However, `--edit-description` is its own mode, similar to
`--set-upstream-to`, which is also made mutually exclusive with other
modes. Prevent `--edit-description` from being mixed with other modes.
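
One way such a check could be expressed is sketched below. The struct
and its field names are hypothetical and do not mirror the actual
option variables in builtin/branch.c:

```c
/* Hypothetical set of mode switches; names are illustrative only. */
struct branch_modes {
	int delete_mode;
	int rename_mode;
	int list_mode;
	int set_upstream;
	int edit_description;
};

/* Count how many distinct modes were requested. */
static int count_modes(const struct branch_modes *m)
{
	return !!m->delete_mode + !!m->rename_mode + !!m->list_mode +
	       !!m->set_upstream + !!m->edit_description;
}

/* At most one mode may be active at a time. */
static int modes_are_valid(const struct branch_modes *m)
{
	return count_modes(m) <= 1;
}
```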
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 6b1db43109 ("clean: teach clean -d to preserve ignored paths",
2017-05-23) added the following code block (among others) to git-clean:
    if (remove_directories)
        dir.flags |= DIR_SHOW_IGNORED_TOO | DIR_KEEP_UNTRACKED_CONTENTS;
The reason for these flags is well documented in the commit message, but
isn't obvious just from looking at the code. Add some explanations to
the code to make it clearer.
Further, it appears git-2.26 did not correctly handle this combination
of flags from git-clean. With both these flags and without
DIR_SHOW_IGNORED_TOO_MODE_MATCHING set, git is supposed to recurse into
all untracked AND ignored directories. git-2.26.0 clearly was not doing
that. I don't know the full reasons for that or whether git < 2.27.0
had additional unknown bugs because of that misbehavior, because I don't
feel it's worth digging into. As per the huge changes and craziness
documented in commit 8d92fb2927 ("dir: replace exponential algorithm
with a linear one", 2020-04-01), the old algorithm was a mess and was
thrown out. What I can say is that git-2.27.0 correctly recurses into
untracked AND ignored directories with that combination.
However, in clean's case we don't need to recurse into ignored
directories; that is just a waste of time. Thus, when git-2.27.0
started correctly handling those flags, we got a performance regression
report. Rather than relying on other bugs in fill_directory()'s former
logic to provide the behavior of skipping ignored directories, make use
of the DIR_SHOW_IGNORED_TOO_MODE_MATCHING value specifically added in
commit eec0f7f2b7 ("status: add option to show ignored files
differently", 2017-10-30) for this purpose.
Reported-by: Brian Malehorn <bmalehorn@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I spent a long time trying to figure out how and whether the code worked
with different values of ignore, ignore_only, and remove_directories.
After lots of time setting up lots of testcases, sifting through lots of
print statements, and walking through the debugger, I finally realized
that one piece of code related to how it was all set up was found in
clean.c rather than dir.c. Make a change that would have made it easier
for me to do the extra testing by putting this handling in one spot.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dir.h documented quite clearly that DIR_SHOW_IGNORED and
DIR_SHOW_IGNORED_TOO are mutually exclusive, with a big comment to this
effect by the definition of both enum values. However, a command like
    git clean -fx $DIR
would set both values for dir.flags. I _think_ it happened to work
because:
  * As dir.h points out, DIR_KEEP_UNTRACKED_CONTENTS only takes effect
    if DIR_SHOW_IGNORED_TOO is set.
  * As coded, I believe DIR_SHOW_IGNORED would just happen to take
    precedence over DIR_SHOW_IGNORED_TOO in the code as currently
    constructed.
Which is a long way of saying "we just got lucky".
Fix clean.c to avoid setting these mutually exclusive values at the same
time, and add a check to dir.c that will throw a BUG() to prevent anyone
else from making this mistake.
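
The kind of check meant here can be sketched as below. The enum values
are illustrative placeholders (the real bit values live in dir.h), and
the helper name is hypothetical; a real implementation would call
BUG() rather than return a status:

```c
/* Illustrative bit values only; the real ones are defined in dir.h. */
enum {
	SHOW_IGNORED = 1 << 0,
	SHOW_IGNORED_TOO = 1 << 1,
	KEEP_UNTRACKED_CONTENTS = 1 << 2
};

/* Non-zero when both mutually exclusive bits are set at once. */
static int flags_conflict(unsigned int flags)
{
	const unsigned int both = SHOW_IGNORED | SHOW_IGNORED_TOO;

	return (flags & both) == both;
}
```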
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document the usage for producing combined commits with "git diff".
This includes updating the synopsis section.
While here, add the three-dot notation to the synopsis.
Make "git diff -h" print the same usage summary as the manual
page synopsis, minus the "A..B" form, which is now discouraged.
Signed-off-by: Chris Torek <chris.torek@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When git diff is given a symmetric difference A...B, it chooses
some merge base from the two specified commits (as documented).
This fails, however, if there is *no* merge base: instead, you
see the differences between A and B, which is certainly not what
is expected.
Moreover, if additional revisions are specified on the command
line ("git diff A...B C"), the results get a bit weird:
  * If there is a symmetric difference merge base, this is used
    as the left side of the diff. The final ref is used as
    the right side.
  * If there is no merge base, the symmetric status is completely
    lost. We will produce a combined diff instead.
Similar weirdness occurs if you use, e.g., "git diff C A...B D".
Likewise, using multiple two-dot ranges, or tossing extra
revision specifiers into the command line with two-dot ranges,
or mixing two and three dot ranges, all produce nonsense.
To avoid all this, add a routine to catch the range cases and
verify that the arguments make sense. As a side effect,
produce a warning showing *which* merge base is being used when
there are multiple choices; die if there is no merge base.
Signed-off-by: Chris Torek <chris.torek@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach upload-pack to send part of its packfile response as URIs.
An administrator may configure a repository with one or more
"uploadpack.blobpackfileuri" lines, each line containing an OID, a pack
hash, and a URI. A client may configure fetch.uriprotocols to be a
comma-separated list of protocols that it is willing to use to fetch
additional packfiles - this list will be sent to the server. Whenever an
object with one of those OIDs would appear in the packfile transmitted
by upload-pack, the server may exclude that object, and instead send the
URI. The client will then download the packs referred to by those URIs
before performing the connectivity check.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Whenever a fetch results in a packfile being downloaded, a .keep file is
generated, so that the packfile can be preserved (from, say, a running
"git repack") until refs are written referring to the contents of the
packfile.
In a subsequent patch, a successful fetch using protocol v2 may result
in more than one .keep file being generated. Therefore, teach
fetch_pack() and the transport mechanism to support multiple .keep
files.
Implementation notes:
 - builtin/fetch-pack.c normally does not generate .keep files, and thus
   is unaffected by this or future changes. However, it has an
   undocumented "--lock-pack" feature, used by remote-curl.c when
   implementing the "fetch" remote helper command. In keeping with the
   remote helper protocol, only one "lock" line will ever be written;
   the rest will result in warnings to stderr. However, in practice,
   warnings will never be written because the remote-curl.c "fetch" is
   only used for protocol v0/v1 (which will not generate multiple .keep
   files). (Protocol v2 uses the "stateless-connect" command, not the
   "fetch" command.)
 - connected.c has an optimization in that connectivity checks on a ref
   need not be done if the target object is in a pack known to be
   self-contained and connected. If there are multiple packfiles, this
   optimization can no longer be done.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree add" takes special care to avoid creating a new worktree
at a location already registered to an existing worktree even if that
worktree is missing (which can happen, for instance, if the worktree
resides on removable media). "git worktree move", however, is not so
careful when validating the destination location and will happily move
the source worktree atop the location of a missing worktree. This leads
to the anomalous situation of multiple worktrees being associated with
the same path, which is expressly forbidden by design. For example:
    $ git clone foo.git
    $ cd foo
    $ git worktree add ../bar
    $ git worktree add ../baz
    $ rm -rf ../bar
    $ git worktree move ../baz ../bar
    $ git worktree list
    .../foo beefd00f [master]
    .../bar beefd00f [bar]
    .../bar beefd00f [baz]
    $ git worktree remove ../bar
    fatal: validation failed, cannot remove working tree:
        '.../bar' does not point back to '.git/worktrees/bar'
Fix this shortcoming by enhancing "git worktree move" to perform the
same additional validation of the destination directory as done by "git
worktree add".
While at it, add a test to verify that "git worktree move" won't move a
worktree atop an existing (non-worktree) path -- a restriction which has
always been in place but was never tested.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree add" checks that the specified path is a valid location
for a new worktree by ensuring that the path does not already exist and
is not already registered to another worktree (a path can be registered
but missing, for instance, if it resides on removable media). Since "git
worktree add" is not the only command which should perform such
validation ("git worktree move" ought to also), generalize the
validation function for use by other callers, as well.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree prune" detects when multiple entries are associated with
the same path and prunes the duplicates; however, it does not detect
when a linked worktree points at the path of the main worktree.
Although "git worktree add" disallows creating a new worktree with the
same path as the main worktree, such a case can arise outside the
control of Git even without the user mucking with .git/worktrees/<id>/
administrative files. For instance:
    $ git clone foo.git
    $ git -C foo worktree add ../bar
    $ rm -rf bar
    $ mv foo bar
    $ git -C bar worktree list
    .../bar deadfeeb [master]
    .../bar deadfeeb [bar]
Help the user recover from such corruption by extending "git worktree
prune" to also detect when a linked worktree is associated with the path
of the main worktree.
Reported-by: Jonathan Müller <jonathanmueller.dev@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A fundamental restriction of linked working trees is that there must
only ever be a single worktree associated with a particular path, thus
"git worktree add" explicitly disallows creation of a new worktree at
the same location as an existing registered worktree. Nevertheless,
users can still "shoot themselves in the foot" by mucking with
administrative files in .git/worktrees/<id>/. Worse, "git worktree move"
is careless[1] and allows a worktree to be moved atop a registered but
missing worktree (which can happen, for instance, if the worktree is on
removable media). For instance:
    $ git clone foo.git
    $ cd foo
    $ git worktree add ../bar
    $ git worktree add ../baz
    $ rm -rf ../bar
    $ git worktree move ../baz ../bar
    $ git worktree list
    .../foo beefd00f [master]
    .../bar beefd00f [bar]
    .../bar beefd00f [baz]
Help users recover from this form of corruption by teaching "git
worktree prune" to detect when multiple worktrees are associated with
the same path.
[1]: A subsequent commit will fix "git worktree move" validation to be
more strict.
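
Detecting entries that share a path can be sketched as follows. This
is a simplified illustration over plain strings, not the actual prune
code; the function names are hypothetical, and real paths would likely
need normalizing before comparison:

```c
#include <stdlib.h>
#include <string.h>

static int path_cmp(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/*
 * Sort the paths in place, then count each entry that repeats the
 * path of the entry before it. Those are the prune candidates.
 */
static size_t count_duplicate_paths(const char **paths, size_t nr)
{
	size_t i, dups = 0;

	if (nr < 2)
		return 0;
	qsort(paths, nr, sizeof(*paths), path_cmp);
	for (i = 1; i < nr; i++)
		if (!strcmp(paths[i - 1], paths[i]))
			dups++;
	return dups;
}
```

For the listing above, the two entries at .../bar would yield one
duplicate.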
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The low-level logic for removing a worktree is well encapsulated in
delete_git_dir(). However, high-level details related to pruning a
worktree -- such as dealing with verbosity and dry-run mode -- are not
encapsulated. Factor out this high-level logic into its own function so
it can be re-used as new worktree corruption detectors are added.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Readers of the name prune_worktree() are likely to expect the function
to actually prune a worktree; however, it only answers the question
"should this worktree be pruned?". Give it a name more reflective of its
true purpose to avoid such confusion.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to parse "git bisect start" command line was lax in
validating the arguments.
* cb/bisect-helper-parser-fix:
  bisect--helper: avoid segfault with bad syntax in `start --term-*`
On-the-wire protocol v2 easily falls into a deadlock between the
remote-curl helper and the fetch-pack process when the server side
prematurely throws an error and disconnects. The communication has
been updated to make it more robust.
* dl/remote-curl-deadlock-fix:
  stateless-connect: send response end packet
  pkt-line: define PACKET_READ_RESPONSE_END
  remote-curl: error on incomplete packet
  pkt-line: extern packet_length()
  transport: extract common fetch_pack() call
  remote-curl: remove label indentation
  remote-curl: fix typo
Code simplification and test coverage enhancement.
* bc/filter-process:
  t2060: add a test for switch with --orphan and --discard-changes
  builtin/checkout: simplify metadata initialization
For each worktree removed by "git worktree prune", it reports the reason
for the removal. All reasons share the common prefix "Removing
worktrees/%s:". As new removal reasons are added, this prefix needs to
be duplicated, which is error-prone and potentially cumbersome.
Therefore, factor out the common prefix.
Although this change seems to increase the "sentence lego quotient", it
should be reasonably safe, as the reason for removal is a distinct
clause, not strictly related to the prefix. Moreover, the "worktrees" in
"Removing worktrees/%s:" is a path literal which ought not be localized,
so by factoring it out, we can more easily avoid exposing that path
fragment to translators.
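
A minimal sketch of the factoring, assuming a hypothetical helper
name and a made-up reason string; translation markers are left out
for brevity:

```c
#include <stdio.h>

/*
 * Build the report line from a shared prefix plus a per-case reason.
 * The "worktrees" path literal is supplied in one place, so callers
 * only pass the worktree id and the reason clause.
 */
static int format_prune_message(char *buf, size_t len,
				const char *id, const char *reason)
{
	return snprintf(buf, len, "Removing %s/%s: %s",
			"worktrees", id, reason);
}
```

A new removal reason then needs only a new reason string, not another
copy of the prefix.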
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'extensions' configuration variable gets special meaning in the new
repository version, so when enabling the extension we should upgrade the
repository to version 1.
Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Retroactively adding a filter can be useful for existing shallow clones as
they allow users to see earlier change histories without downloading all
git objects in a regular --unshallow fetch.
Without this patch, users can make a clone partial by editing the
repository configuration to convert the remote into a promisor, like:
    git config core.repositoryFormatVersion 1
    git config extensions.partialClone origin
    git fetch --unshallow --filter=blob:none origin
Since the hard part of making this work is already in place and such
edits can be error-prone, teach Git to perform the required configuration
change automatically instead.
Note that this change does not modify the existing git behavior which
recognizes setting extensions.partialClone without changing
repositoryFormatVersion.
Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
sparse-checkout's purpose is to update the working tree to have it
reflect a subset of the tracked files. As such, it shouldn't be
switching branches, making commits, downloading or uploading data, or
staging or unstaging changes. Other than updating the worktree, the
only thing sparse-checkout should touch is the SKIP_WORKTREE bit of the
index. In particular, this sets up a nice invariant: running
sparse-checkout will never change the status of any file in `git status`
(reflecting the fact that we only set the SKIP_WORKTREE bit if the file
is safe to delete, i.e. if the file is unmodified).
Traditionally, we did a _really_ bad job with this goal. The
predecessor to sparse-checkout involved manual editing of
.git/info/sparse-checkout and running `git read-tree -mu HEAD`. That
command would stage and unstage changes and overwrite dirty changes in
the working tree.
The initial implementation of the sparse-checkout command was no better;
it simply invoked `git read-tree -mu HEAD` as a subprocess and had the
same caveats, though this issue came up repeatedly in review comments
and workarounds for the problems were put in place before the feature
was merged[1, 2, 3, 4, 5, 6; especially see 4 & 6].
[1] https://lore.kernel.org/git/CABPp-BFT9A5n=_bx5LsjCvbogqwSjiwgr5amcjgbU1iAk4KLJg@mail.gmail.com/
[2] https://lore.kernel.org/git/CABPp-BEmwSwg4tgJg6nVG8a3Hpn_g-=ZjApZF4EiJO+qVgu4uw@mail.gmail.com/
[3] https://lore.kernel.org/git/CABPp-BFV7TA0qwZCQpHCqx9N+JifyRyuBQ-pZ_oGfe-NOgyh7A@mail.gmail.com/
[4] https://lore.kernel.org/git/CABPp-BHYCCD+Vx5fq35jH82eHc1-P53Lz_aGNpHJNcx9kg2K-A@mail.gmail.com/
[5] https://lore.kernel.org/git/CABPp-BF+JWYZfDqp2Tn4AEKVp4b0YMA=Mbz4Nz62D-gGgiduYQ@mail.gmail.com/
[6] https://lore.kernel.org/git/20191121163706.GV23183@szeder.dev/
However, these workarounds, in addition to disabling the feature in a
number of important cases, also missed one special case. I'll get back
to it later.
In the 2.27.0 cycle, the disabling of the feature was lifted by finally
replacing the internal equivalent of `git read-tree -mu HEAD` with
something that did what we wanted: the new update_sparsity() function in
unpack-trees.c that only ever updates SKIP_WORKTREE bits in the index
and updates the working tree to match. This new function handles all
the cases that were problematic for the old implementation, except that
it breaks the same special case that slipped past the workarounds of the
old implementation, though it breaks it in a different way.
So...that brings us to the special case: a git clone performed with
--no-checkout. As per the meaning of the flag, --no-checkout does not
check out any branch, with the implication that you aren't on one and
need to switch to one after the clone. Implementationally, HEAD is
still set (so in some sense you are partially on a branch), but
  * the index is "unborn" (non-existent)
  * there are no files in the working tree (other than .git/)
  * the next time git switch (or git checkout) is run it will run
    unpack_trees with `initial_checkout` flag set to true.
It is not until you run, e.g. `git switch <somebranch>` that the index
will be written and files in the working tree populated.
With this special --no-checkout case, the traditional `read-tree -mu
HEAD` behavior would have done the equivalent of acting like checkout --
switch to the default branch (HEAD), write out an index that matches
HEAD, and update the working tree to match. This special case slipped
through the avoid-making-changes checks in the original sparse-checkout
command and thus continued there.
After update_sparsity() was introduced and used (see commit f56f31af03
("sparse-checkout: use new update_sparsity() function", 2020-03-27)),
the behavior for the --no-checkout case changed: Due to git's
auto-vivification of an empty in-memory index (see do_read_index() and
note that `must_exist` is false), and due to sparse-checkout's
update_working_directory() code to always write out the index after it
was done, we got a new bug. That made it so that sparse-checkout would
switch the repository from a clone with an "unborn" index (i.e. still
needing an initial_checkout), to one that had a recorded index with no
entries. Thus, instead of all the files appearing deleted in `git
status` being known to git as a special artifact of not yet being on a
branch, our recording of an empty index made it suddenly look to git as
though it was definitely on a branch with ALL files staged for deletion!
A subsequent checkout or switch then had to contend with the fact that
it wasn't on an initial_checkout but had a bunch of staged deletions.
Make sure that sparse-checkout changes nothing in the index other than
the SKIP_WORKTREE bit; in particular, when the index is unborn we do not
have any branch checked out so there is no sparsification or
de-sparsification work to do. Simply return from
update_working_directory() early.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even if we strongly discourage putting credentials into the URLs passed
via the command-line, there _is_ support for that, and users _do_ do
that.
Let's scrub them before writing them to the reflog.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The error message from "git checkout -b foo -t bar baz" was
confusing.
* rs/checkout-b-track-error:
  checkout: improve error messages for -b with extra argument
  checkout: add tests for -b and --track
Convert submodule subcommand 'set-branch' to a builtin and call it via
'git-submodule.sh'.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Helped-by: Denton Liu <liu.denton@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ls-remote may or may not operate within a repository, and as such will
not have been initialized with the repository's hash algorithm. Even if
it were, the remote side could be using a different algorithm and we
would still want to display those refs properly. Find the hash
algorithm used by the remote side by querying the transport object and
set our hash algorithm accordingly.
Without this change, if the remote side is using SHA-256, we truncate
the refs to 40 hex characters, since that's the length of the default
hash algorithm (SHA-1).
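
The arithmetic behind the truncation can be sketched as below. The
enum and helper names are hypothetical; the sizes are the standard
digest lengths of 20 bytes for SHA-1 and 32 bytes for SHA-256:

```c
#include <stddef.h>

/* Hypothetical identifiers for the two algorithms discussed. */
enum hash_kind {
	HASH_SHA1,
	HASH_SHA256
};

/* Raw digest size in bytes. */
static size_t hash_rawsz(enum hash_kind kind)
{
	return kind == HASH_SHA256 ? 32 : 20;
}

/* Each raw byte prints as two hex characters. */
static size_t hash_hexsz(enum hash_kind kind)
{
	return 2 * hash_rawsz(kind);
}
```

Printing a SHA-256 object ID with the SHA-1 hex length drops its last
24 hex characters.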
Note that technically this is not a correct setting of the repository
hash algorithm since, if we are in a repository, it might be one of a
different hash algorithm from the remote side. However, our current
code paths don't handle multiple algorithms and won't for some time, so
this is the best we can do. We rely on the fact that ls-remote never
modifies the current repository, which is a reasonable assumption to
make.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
show-index is capable of reading any possible index file whether or not
the index is inside a repository. However, because our index files lack
metadata about the hash algorithm in use, it's not possible to
autodetect the algorithm that a particular index file is using.
In order to allow us to read index files of any algorithm, let's set up
the .git directory gently so that we default to the algorithm for the
current repository, and add an --object-format option to allow users to
override this setting and continue to run show-index outside of a
repository altogether. Let's also document this new option so that
people can find it and use it.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both v2 pack index files and the v3 format specified as part of the
NewHash work have similar data starting at the CRC table. Much of the
existing code wants to read either this table or the offset entries
following it, and in doing so computes the offset each time.
In order to share as much code between v2 and v3, compute the offset of
the CRC table and store it when the pack is opened. Use this value to
compute offsets to not only the CRC table, but to the offset entries
beyond it.
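
For the v2 layout, the computation can be sketched as follows. The
struct and function names are hypothetical, and the sketch assumes the
v2 layout only: an 8-byte header, a 256-entry fanout table of 4-byte
entries, the sorted object names, then the 4-byte CRC entries followed
by the 4-byte offset entries:

```c
#include <stddef.h>
#include <stdint.h>

#define IDX_HEADER_SIZE 8          /* magic plus version */
#define IDX_FANOUT_SIZE (256 * 4)  /* 256 four-byte counts */

/* Hypothetical holder for values computed once at open time. */
struct idx_layout {
	size_t crc_offset;
	uint32_t num_objects;
};

/* Compute and store the CRC table offset when the index is opened. */
static void idx_layout_init(struct idx_layout *l, uint32_t num_objects,
			    size_t hash_rawsz)
{
	l->num_objects = num_objects;
	l->crc_offset = IDX_HEADER_SIZE + IDX_FANOUT_SIZE +
			(size_t)num_objects * hash_rawsz;
}

/* Byte position of the n-th CRC entry. */
static size_t idx_crc_entry(const struct idx_layout *l, uint32_t n)
{
	return l->crc_offset + (size_t)n * 4;
}

/* Byte position of the n-th offset entry, located after the CRC table. */
static size_t idx_offset_entry(const struct idx_layout *l, uint32_t n)
{
	return l->crc_offset + (size_t)l->num_objects * 4 + (size_t)n * 4;
}
```

Readers of either table then work from the stored offset instead of
recomputing the header, fanout and name-table sizes on each lookup.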
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When performing a clone, we don't know what hash algorithm the other end
will support. Currently, we don't support fetching data belonging to a
different algorithm, so we must know what algorithm the remote side is
using in order to properly initialize the repository. We can know that
only after fetching the refs, so if the remote side has any references,
use that information to reinitialize the repository with the correct
hash algorithm information.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Detect when the server doesn't support our hash algorithm and abort.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Advertise the current hash algorithm in use by using the object-format
capability as part of the ref advertisement.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, remote-curl acts as a proxy and blindly forwards packets
between an HTTP server and fetch-pack. In the case of a stateless RPC
connection where the connection is terminated before the transaction is
complete, remote-curl will blindly forward the packets before waiting on
more input from fetch-pack. Meanwhile, fetch-pack will read the
transaction and continue reading, expecting more input to continue the
transaction. This results in a deadlock between the two processes.
This can be seen in the following command which does not terminate:
    $ git -c protocol.version=2 clone https://github.com/git/git.git --shallow-since=20151012
    Cloning into 'git'...
whereas the v1 version does terminate as expected:
    $ git -c protocol.version=1 clone https://github.com/git/git.git --shallow-since=20151012
    Cloning into 'git'...
    fatal: the remote end hung up unexpectedly
Instead of blindly forwarding packets, make remote-curl insert a
response end packet after proxying the responses from the remote server
when using stateless_connect(). On the RPC client side, ensure that each
response ends as described.
A separate control packet is chosen because we need to be able to
differentiate between what the remote server sends and remote-curl's
control packets. By ensuring in the remote-curl code that a server
cannot send response end packets, we prevent a malicious server from
being able to perform a denial of service attack in which they spoof a
response end packet and cause the described deadlock to happen.
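
The distinction between server data and the proxy's own control packet
can be sketched as below. This is a simplified illustration, not the
pkt-line.c implementation; the enum and function names are
hypothetical. It assumes the pkt-line convention of a four-hex-digit
length header where "0000" is a flush packet, "0001" a delimiter and
"0002" the response end packet:

```c
enum pkt_kind {
	PKT_INVALID,
	PKT_FLUSH,
	PKT_DELIM,
	PKT_RESPONSE_END,
	PKT_DATA
};

static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Decode the four hex digits of a header; -1 if any is not hex. */
static int pkt_header_length(const char *hdr)
{
	int i, len = 0;

	for (i = 0; i < 4; i++) {
		int v = hexval(hdr[i]);

		if (v < 0)
			return -1;
		len = (len << 4) | v;
	}
	return len;
}

/* Lengths 0, 1 and 2 are special; data lengths include the header. */
static enum pkt_kind pkt_classify(const char *hdr)
{
	int len = pkt_header_length(hdr);

	switch (len) {
	case 0:
		return PKT_FLUSH;
	case 1:
		return PKT_DELIM;
	case 2:
		return PKT_RESPONSE_END;
	}
	return len >= 4 ? PKT_DATA : PKT_INVALID;
}

/* A proxy refuses to relay a response end packet sent by the server. */
static int proxy_may_forward(enum pkt_kind kind)
{
	return kind != PKT_RESPONSE_END && kind != PKT_INVALID;
}
```

Only the proxy itself emits the response end packet, after it has
relayed everything the server sent.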
Reported-by: Force Charlie <charlieio@outlook.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we try to create a branch "foo" based on "origin/master" and give
git checkout -b an extra unsupported argument "bar", it confusingly
reports:
    $ git checkout -b foo origin/master bar
    fatal: 'bar' is not a commit and a branch 'foo' cannot be created from it
    $ git checkout --track -b foo origin/master bar
    fatal: 'bar' is not a commit and a branch 'foo' cannot be created from it
That's wrong, because it very well understands that "origin/master" is
supposed to be the start point for the new branch and not "bar". Check
if we got a commit and show more fitting messages in that case instead:
    $ git checkout -b foo origin/master bar
    fatal: Cannot update paths and switch to branch 'foo' at the same time.
    $ git checkout --track -b foo origin/master bar
    fatal: '--track' cannot be used with updating paths
Original-patch-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
06f5608c14 (bisect--helper: `bisect_start` shell function partially in C,
2019-01-02) adds a lax parser for `git bisect start` which could result
in a segfault under a bad syntax call for start with custom terms.
Detect if there are enough arguments left in the command line to use for
--term-{old,good,new,bad} and abort with the same syntax error the original
implementation will show if not.
While at it, remove an unnecessary (and incomplete) check for unknown
arguments and make sure to add a test to avoid regressions.
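
The bounds check can be sketched as follows. This is a simplified,
hypothetical parser rather than the bisect--helper code; it handles
only the form where the value is a separate argument, and it returns
-1 where the real command would report a syntax error:

```c
#include <string.h>

static int is_old_term_opt(const char *arg)
{
	return !strcmp(arg, "--term-old") || !strcmp(arg, "--term-good");
}

static int is_new_term_opt(const char *arg)
{
	return !strcmp(arg, "--term-new") || !strcmp(arg, "--term-bad");
}

/*
 * Scan the arguments for term options. Before consuming a value,
 * verify that an argument actually follows the option, so argv is
 * never read past its end.
 */
static int parse_terms(int argc, const char **argv,
		       const char **old_term, const char **new_term)
{
	int i;

	for (i = 0; i < argc; i++) {
		if (is_old_term_opt(argv[i])) {
			if (i + 1 >= argc)
				return -1;
			*old_term = argv[++i];
		} else if (is_new_term_opt(argv[i])) {
			if (i + 1 >= argc)
				return -1;
			*new_term = argv[++i];
		}
	}
	return 0;
}
```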
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Acked-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we call init_checkout_metadata in reset_tree, we want to pass the
object ID of the commit in question so that it can be passed to filters,
or if there is no commit, the tree. We anticipated this latter case,
which can occur elsewhere in the checkout code, but it cannot occur
here. The only case in which we do not have a commit object is when
invoking git switch with --orphan. Moreover, we can only hit this code
path without a commit object additionally with either --force or
--discard-changes.
In such a case, there is no point initializing the checkout metadata
with a commit or tree because (a) there is no commit, only the empty
tree, and (b) we will never use the data, since no files will be smudged
when checking out a branch with no files. Pass the all-zeros object ID
in this case, since we just need some value which is a valid pointer.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The multi-pack-index was added to the data verified by git-fsck in
ea5ae6c3 "fsck: verify multi-pack-index". This implementation was
based on the implementation for verifying the commit-graph, and a
copy-paste error kept the ERROR_COMMIT_GRAPH flag as the bit set
when an error appears in the multi-pack-index.
Add a new flag, ERROR_MULTI_PACK_INDEX, and use that instead.
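A minimal sketch of why a separate bit matters. The bit values and
the combine_errors() helper are assumptions for illustration; only
the two flag names come from the message above.

```c
#include <assert.h>

/* Illustrative bit values; the real constants in fsck may differ. */
#define ERROR_COMMIT_GRAPH     (1u << 0)
#define ERROR_MULTI_PACK_INDEX (1u << 1)

/* Hypothetical helper: fold two verification results into one mask. */
unsigned combine_errors(int commit_graph_failed, int midx_failed)
{
	unsigned errors_found = 0;

	/* The flags must not share a bit, or the two failures blur together. */
	assert((ERROR_COMMIT_GRAPH & ERROR_MULTI_PACK_INDEX) == 0);
	if (commit_graph_failed)
		errors_found |= ERROR_COMMIT_GRAPH;
	if (midx_failed)
		errors_found |= ERROR_MULTI_PACK_INDEX;
	return errors_found;
}
```

With distinct bits, a caller inspecting the mask can tell a
multi-pack-index failure apart from a commit-graph failure.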
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For a merge with a single strategy, the result of evaluate_result() is
effectively not used and therefore is not needed, so avoid calling it
altogether.
On Windows, this optimization can halve the time required to perform a
recursive merge of a single commit with the LLVM repo.
Signed-off-by: Andrew Ng <andrew.ng@sony.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 7c5c9b9c57 (commit-graph: error out on invalid commit oids in
'write --stdin-commits', 2019-08-05), the commit-graph builtin dies on
receiving non-commit OIDs as input to '--stdin-commits'.
This behavior can be cumbersome to work around in, say, the case of
piping 'git for-each-ref' to 'git commit-graph write --stdin-commits' if
the caller does not want to cull out non-commits themselves. In this
situation, it would be ideal if 'git commit-graph write' wrote the graph
containing the inputs that did pertain to commits, and silently ignored
the remainder of the input.
Some options have been proposed to the effect of '--[no-]check-oids'
which would allow callers to have the commit-graph builtin do just that.
After some discussion, it is difficult to imagine a caller who wouldn't
want to pass '--no-check-oids', suggesting that we should get rid of the
behavior of complaining about non-commit inputs altogether.
If callers do wish to retain this behavior, they can easily work around
this change by doing the following:
    git for-each-ref --format='%(objectname) %(objecttype) %(*objecttype)' |
    awk '
      !/commit/ { print "not-a-commit:"$1 }
       /commit/ { print $1 }
    ' |
    git commit-graph write --stdin-commits
To make it so that valid OIDs that refer to non-existent objects are
indeed an error after loosening the error handling, perform an extra
lookup to make sure that object indeed exists before sending it to the
commit-graph internals.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When given a list of commits, the commit-graph machinery calls
'lookup_commit_reference_gently()' on each element in the set and treats
the resulting set of OIDs as the base over which to close for
reachability.
In an earlier collection of commits, the 'git commit-graph write
--reachable' case made the inner-most call to
'lookup_commit_reference_gently()' by peeling references before they
were passed over to the commit-graph internals.
Do the analog for 'git commit-graph write --stdin-commits' by calling
'lookup_commit_reference_gently()' outside of the commit-graph
machinery, making the inner-most call a noop.
Since this may incur additional processing time, surround
'read_one_commit' with a progress meter to provide output to the caller.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With either '--stdin-commits' or '--stdin-packs', the commit-graph
builtin will read line-delimited input, and interpret it either as a
series of commit OIDs, or pack names.
In a subsequent commit, we will begin handling '--stdin-commits'
differently by processing each line as it comes in, instead of in one
shot at the end. To make adequate room for this additional logic, split
the '--stdin-commits' case from '--stdin-packs' by only storing the
input when '--stdin-packs' is given.
In the case of '--stdin-commits', feed each line to a new
'read_one_commit' helper, which (for now) will merely call
'parse_oid_hex'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach "am", "commit", "merge" and "rebase", when they are run with
the "--quiet" option, to pass "--quiet" down to "gc --auto".
* jc/auto-gc-quiet:
auto-gc: pass --quiet down from am, commit, merge and rebase
auto-gc: extract a reusable helper from "git fetch"
The <stdlib.h> header on NetBSD brings in its own definition of
hmac() function (eek), which conflicts with our own and unrelated
function with the same name. Our function has been renamed to work
around the issue.
* cb/avoid-colliding-with-netbsd-hmac:
builtin/receive-pack: avoid generic function name hmac()
"git restore --staged --worktree" now defaults to take the contents
out of "HEAD", instead of erring out.
* es/restore-staged-from-head-by-default:
restore: default to HEAD when combining --staged and --worktree
"git branch" and other "for-each-ref" variants accepted multiple
--sort=<key> options in the increasing order of precedence, but it
had a few breakages around "--ignore-case" handling, and tie-breaking
with the refname, which have been fixed.
* jk/for-each-ref-multi-key-sort-fix:
ref-filter: apply fallback refname sort only after all user sorts
ref-filter: apply --ignore-case to all sorting keys
In error messages where "git switch" mentions its options to create a
new branch, "-b/-B" options were shown where "-c/-C" options
should be, which has been corrected.
* dl/switch-c-option-in-error-message:
switch: fix errors and comments related to -c and -C
Convert submodule subcommand 'set-url' to a builtin. Port 'set-url' to
'submodule--helper.c' and call the latter via 'git-submodule.sh'.
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These commands take the --quiet option for their own operation, but
they forget to pass the option down when they invoke "git gc --auto"
internally.
Teach them to do so using the run_auto_gc() helper we added in the
previous step.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Back in 1991006c (fetch: convert argv_gc_auto to struct argv_array,
2014-08-16), we taught "git fetch --quiet" to pass the "--quiet"
option down to "gc --auto". This issue, however, is not limited to
"fetch":
$ git grep -e 'gc.*--auto' \*.c
finds hits in "am", "commit", "merge", and "rebase" and these
commands do not pass "--quiet" down to "gc --auto" when they
themselves are told to be quiet.
As a preparatory step, let's introduce a helper function
run_auto_gc(), to which the caller can pass a boolean "quiet",
and redo the fix to "git fetch" using the helper.
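The shape of such a helper might look like the sketch below. It only
builds the argument list; build_auto_gc_argv() is a hypothetical name,
and the real run_auto_gc() also has to run the command.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in: fill `argv` with the command line for
 * "gc --auto", adding "--quiet" only when asked. Returns the number
 * of arguments written; argv[n] is set to NULL as a terminator.
 */
size_t build_auto_gc_argv(const char **argv, size_t cap, int quiet)
{
	size_t n = 0;

	assert(argv && cap >= 4); /* up to three arguments plus NULL */
	argv[n++] = "gc";
	argv[n++] = "--auto";
	if (quiet)
		argv[n++] = "--quiet";
	argv[n] = NULL;
	return n;
}
```

Keeping the "--quiet" decision in one place means each caller only
has to forward its own quiet flag.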
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, files are restored from the index for --worktree, and from
HEAD for --staged. When --worktree and --staged are combined, --source
must be specified to disambiguate the restore source[1], thus making it
cumbersome to restore a file in both the worktree and the index.
However, HEAD is also a reasonable default for --worktree when combined
with --staged, so make it the default anytime --staged is used (whether
combined with --worktree or not).
[1]: Due to an oversight, the --source requirement, though documented,
is not actually enforced.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fabec2c5c3 (builtin/receive-pack: switch to use the_hash_algo, 2019-08-18)
renames hmac_sha1 to hmac, as it was updated to use the hash function used
by git (which won't be sha1 in the future).
hmac() is provided by NetBSD >= 8 libc and therefore conflicts, as shown by:
builtin/receive-pack.c:421:13: error: conflicting types for 'hmac'
static void hmac(unsigned char *out,
^~~~
In file included from ./git-compat-util.h:172:0,
from ./builtin.h:4,
from builtin/receive-pack.c:1:
/usr/include/stdlib.h:305:10: note: previous declaration of 'hmac' was here
ssize_t hmac(const char *, const void *, size_t, const void *, size_t, void *,
^~~~
Rename it again to hmac_hash to reflect that it will use Git's defined
hash function and to avoid the conflict. While at it, update a comment
to better describe the HMAC function that was used.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the ref-filter users (for-each-ref, branch, and tag) take an
--ignore-case option which makes filtering and sorting case-insensitive.
However, this option was applied only to the first element of the
ref_sorting list. So:
git for-each-ref --ignore-case --sort=refname
would do what you expect, but:
git for-each-ref --ignore-case --sort=refname --sort=taggername
would sort the primary key (taggername) case-insensitively, but sort the
refname case-sensitively. We have two options here:
- teach callers to set ignore_case on the whole list
- replace the ref_sorting list with a struct that contains both the
list of sorting keys, as well as options that apply to _all_
keys
I went with the first one here, as it gives more flexibility if we later
want to let the users set the flag per-key (presumably through some
special syntax when defining the key; for now it's all or nothing
through --ignore-case).
The new test covers this by sorting on both tagger and subject
case-insensitively, which should compare "a" and "A" identically, but
still sort them before "b" and "B". We'll break ties by sorting on the
refname to give ourselves a stable output (this is actually supposed to
be done automatically, but there's another bug which will be fixed in
the next commit).
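The first option can be sketched as below. The struct and function
names are hypothetical and much simpler than ref-filter's; the point
is only that the flag is set on every key in the list rather than on
the head alone.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified sort key: one node per --sort option. */
struct sort_key {
	struct sort_key *next;
	int ignore_case;
	size_t field; /* index of the record field this key compares */
};

/* Case-insensitive comparison without relying on POSIX strcasecmp(). */
static int cmp_nocase(const char *a, const char *b)
{
	for (;; a++, b++) {
		int ca = tolower((unsigned char)*a);
		int cb = tolower((unsigned char)*b);

		if (ca != cb || !ca)
			return ca - cb;
	}
}

/* Set the flag on the whole list, not only on the first key. */
void sorting_set_ignore_case(struct sort_key *keys, int flag)
{
	for (; keys; keys = keys->next)
		keys->ignore_case = flag;
}

/* Compare two records (arrays of strings) key by key. */
int compare_records(const struct sort_key *keys,
		    const char *const *a, const char *const *b)
{
	assert(a && b);
	for (; keys; keys = keys->next) {
		int cmp = keys->ignore_case
			? cmp_nocase(a[keys->field], b[keys->field])
			: strcmp(a[keys->field], b[keys->field]);

		if (cmp)
			return cmp;
	}
	return 0;
}
```

If only the head of the list carries the flag, two records that
differ solely by case in a later key still compare as different,
which is the behavior described above.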
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Error and verbose trace messages from "git push" did not redact
credential material embedded in URLs.
* js/anonymise-push-url-in-errors:
push: anonymize URLs in error messages and warnings
The "bugreport" tool.
* es/bugreport:
bugreport: drop extraneous includes
bugreport: add compiler info
bugreport: add uname info
bugreport: gather git version and build info
bugreport: add tool to generate debugging info
help: move list_config_help to builtin/help
Incompatible options "--root" and "--fork-point" of "git rebase"
have been marked and documented as being incompatible.
* en/rebase-root-and-fork-point-are-incompatible:
rebase: display an error if --root and --fork-point are both provided
"git blame" learns to take advantage of the "changed-paths" Bloom
filter stored in the commit-graph file.
* ds/blame-on-bloom:
test-bloom: check that we have expected arguments
test-bloom: fix some whitespace issues
blame: drop unused parameter from maybe_changed_path
blame: use changed-path Bloom filters
tests: write commit-graph with Bloom filters
revision: complicated pathspecs disable filters
Introduce an extension to the commit-graph to make it efficient to
check for the paths that were modified at each commit using Bloom
filters.
* gs/commit-graph-path-filter:
bloom: ignore renames when computing changed paths
commit-graph: add GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS test flag
t4216: add end to end tests for git log with Bloom filters
revision.c: add trace2 stats around Bloom filter usage
revision.c: use Bloom filters to speed up path based revision walks
commit-graph: add --changed-paths option to write subcommand
commit-graph: reuse existing Bloom filters during write
commit-graph: write Bloom filters to commit graph file
commit-graph: examine commits by generation number
commit-graph: examine changed-path objects in pack order
commit-graph: compute Bloom filters for changed paths
diff: halt tree-diff early after max_changes
bloom.c: core Bloom filter implementation for changed paths.
bloom.c: introduce core Bloom filter constructs
bloom.c: add the murmur3 hash implementation
commit-graph: define and use MAX_NUM_CHUNKS
Fix in-core inconsistency after fetching into a shallow repository
that broke the code to write out commit-graph.
* tb/reset-shallow:
shallow.c: use '{commit,rollback}_shallow_file'
t5537: use test_write_lines and indented heredocs for readability
In previous patches, the functions 'commit_shallow_file' and
'rollback_shallow_file' were introduced to reset the shallowness
validity checks on a repository after potentially modifying
'.git/shallow'.
These functions can be made safer by wrapping the 'struct lockfile *' in
a new type, 'shallow_lock', so that they cannot be called with a raw
lock (and potentially misused by other code that happens to possess a
lockfile, but has nothing to do with shallowness).
This patch introduces that type as a thin wrapper around 'struct
lockfile', and updates the two aforementioned functions and their
callers to use it.
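A minimal sketch of the wrapper-type idea follows. The 'struct
lockfile' here is a stand-in so that the example is self-contained,
and the function signature is simplified compared to what the patch
uses.

```c
#include <assert.h>

/* Stand-in for Git's lock file type, only to keep the sketch self-contained. */
struct lockfile {
	int committed;
};

/* Thin wrapper: a distinct type the compiler can tell apart from a raw lock. */
struct shallow_lock {
	struct lockfile lock;
};

/* Hypothetical stand-in for the cached shallow state that must be reset. */
static int shallow_cache_valid = 1;

/* Stand-in for the generic lockfile operation. */
static int commit_lock_file(struct lockfile *lk)
{
	lk->committed = 1;
	return 0;
}

/* Accepts only a shallow_lock; a plain 'struct lockfile *' is a type mismatch. */
int commit_shallow_file(struct shallow_lock *lk)
{
	int ret;

	assert(lk);
	ret = commit_lock_file(&lk->lock);
	shallow_cache_valid = 0; /* reset the shallow validity checks */
	return ret;
}
```

Because the wrapper has its own type, code holding an unrelated
lockfile cannot pass it to commit_shallow_file() without the compiler
complaining.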
Suggested-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are many functions in commit.h that are more related to shallow
repositories than they are to any sort of generic commit machinery.
Likely this began when there were only a few shallow-related functions,
and commit.h seemed a reasonable enough place to put them.
But, now there are a good number of shallow-related functions, and
placing them all in 'commit.h' doesn't make sense.
This patch extracts a 'shallow.h', which takes all of the declarations
from 'commit.h' for functions which already exist in 'shallow.c'. We
will bring the remaining shallow-related functions defined in 'commit.c'
in a subsequent patch.
For now, move only the ones that already are implemented in 'shallow.c',
and update the necessary includes.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In d787d311db (checkout: split part of it to new command 'switch',
2019-03-29), the `git switch` command was created by extracting the
common functionality of cmd_checkout() in checkout_main(). However, in
b7b5fce270 (switch: better names for -b and -B, 2019-03-29), the branch
creation and force creation options for 'switch' were changed to -c and
-C, respectively. As a result of this, error messages and comments that
previously referred to `-b` and `-B` became invalid for `git switch`.
For error messages that refer to `-b` and `-B`, use a format string
instead so that `-c` and `-C` can be printed when `git switch` is
invoked.
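The format-string approach can be sketched like this. The helper name
is hypothetical and the message wording is illustrative rather than
quoted from the source.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical helper: pick the option letter for the command being
 * run and substitute it into a shared message template.
 */
int format_exclusive_error(char *buf, size_t len, int is_switch)
{
	char opt = is_switch ? 'c' : 'b';

	assert(buf && len > 0);
	return snprintf(buf, len,
			"-%c, -%c and --orphan are mutually exclusive",
			opt, toupper((unsigned char)opt));
}
```

A single template then serves both commands, and only the option
letter depends on whether `git switch` or `git checkout` was invoked.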
Reported-by: Robert Simpson
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git update-ref --stdin" learned a handful of new verbs to let the
user control ref update transactions more explicitly, which helps
as an ingredient to implement two-phase commit-style atomic
ref-updates across multiple repositories.
* ps/transactional-update-ref-stdin:
update-ref: implement interactive transaction handling
update-ref: read commands in a line-wise fashion
update-ref: move transaction handling into `update_refs_stdin()`
update-ref: pass end pointer instead of strbuf
update-ref: drop unused argument for `parse_refname`
update-ref: organize commands in an array
strbuf: provide function to append whole lines
git-update-ref.txt: add missing word
refs: fix segfault when aborting empty transaction
The directory traversal code had redundant recursive calls which
made its performance characteristics exponential with respect to
the depth of the tree, which was corrected.
* en/fill-directory-exponential:
completion: fix 'git add' on paths under an untracked directory
Fix error-prone fill_directory() API; make it only return matches
dir: replace double pathspec matching with single in treat_directory()
dir: include DIR_KEEP_UNTRACKED_CONTENTS handling in treat_directory()
dir: replace exponential algorithm with a linear one
dir: refactor treat_directory to clarify control flow
dir: fix confusion based on variable tense
dir: fix broken comment
dir: consolidate treat_path() and treat_one_path()
dir: fix simple typo in comment
t3000: add more testcases testing a variety of ls-files issues
t7063: more thorough status checking
"sparse-checkout" UI improvements.
* en/sparse-checkout:
sparse-checkout: provide a new reapply subcommand
unpack-trees: failure to set SKIP_WORKTREE bits always just a warning
unpack-trees: provide warnings on sparse updates for unmerged paths too
unpack-trees: make sparse path messages sound like warnings
unpack-trees: split display_error_msgs() into two
unpack-trees: rename ERROR_* fields meant for warnings to WARNING_*
unpack-trees: move ERROR_WOULD_LOSE_SUBMODULE earlier
sparse-checkout: use improved unpack_trees porcelain messages
sparse-checkout: use new update_sparsity() function
unpack-trees: add a new update_sparsity() function
unpack-trees: pull sparse-checkout pattern reading into a new function
unpack-trees: do not mark a dirty path with SKIP_WORKTREE
unpack-trees: allow check_updates() to work on a different index
t1091: make some tests a little more defensive against failures
unpack-trees: simplify pattern_list freeing
unpack-trees: simplify verify_absent_sparse()
unpack-trees: remove unused error type
unpack-trees: fix minor typo in comment
The stash entry created by "git rebase --autostash" to keep the
initial dirty state was discarded by mistake upon "git rebase
--quit", which has been corrected.
* dl/merge-autostash-rebase-quit-fix:
rebase: save autostash entry into stash reflog on --quit
"git grep" did not quote a path with unusual character like other
commands (like "git diff", "git status") do, but did quote when run
from a subdirectory, both of which have been corrected.
* mt/grep-cquote-path:
grep: follow conventions for printing paths w/ unusual chars
The "--decorate-refs" and "--decorate-refs-exclude" options "git
log" takes have learned a companion configuration variable
log.excludeDecoration that sits at the lowest priority in the
family.
* ds/log-exclude-decoration-config:
log: add log.excludeDecoration config option
log-tree: make ref_filter_match() a helper method
"git diff-tree --pretty --notes" used to hit an assertion failure,
as it forgot to initialize the notes subsystem.
* tb/diff-tree-with-notes:
diff-tree.c: load notes machinery when required
Allowing the user to split a patch hunk while "git stash -p" does
not work well; a band-aid has been added to make this (partially)
work better.
* js/stash-p-fix:
stash -p: (partially) fix bug concerning split hunks
t3904: fix incorrect demonstration of a bug
Code in builtin/*, i.e. code that can only be called from within
built-in subcommands, that implements the bulk of a couple of
subcommands has been moved to libgit.a so that it can be used
by others.
* dl/libify-a-few:
Lib-ify prune-packed
Lib-ify fmt-merge-msg
"git diff" in a partial clone learned to avoid lazy loading blob
objects in more cases when they are not needed.
* jt/avoid-prefetch-when-able-in-diff:
diff: restrict when prefetching occurs
diff: refactor object read
diff: make diff_populate_filespec_options struct
promisor-remote: accept 0 as oid_nr in function
"git commit-graph write --expire-time=<timestamp>" did not use the
given timestamp correctly, which has been corrected.
* ds/commit-graph-expiry-fix:
commit-graph: fix buggy --expire-time option
"git log" learns "--[no-]mailmap" as a synonym to "--[no-]use-mailmap"
* jc/log-no-mailmap:
log: give --[no-]use-mailmap a more sensible synonym --[no-]mailmap
clone: reorder --recursive/--recurse-submodules
parse-options: teach "git cmd -h" to show alias as alias
The config API made mixed uses of int and size_t types to represent
length of various pieces of text it parsed, which has been updated
to use the correct type (i.e. size_t) throughout.
* jk/config-use-size-t:
config: reject parsing of files over INT_MAX
config: use size_t to store parsed variable baselen
git_config_parse_key(): return baselen as size_t
config: drop useless length variable in write_pair()
parse_config_key(): return subsection len as size_t
remote: drop auto-strlen behavior of make_branch() and make_rewrite()
Validation of push certificate has been made more robust against
timing attacks.
* bc/constant-memequal:
receive-pack: compilation fix
builtin/receive-pack: use constant-time comparison for HMAC value
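A constant-time comparison of the kind this topic refers to can be
sketched as below. The function name and return convention are
assumptions, not the ones used in builtin/receive-pack.c, and an
aggressive optimizer may still need extra care in real code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical name. Returns 1 when the two buffers hold the same n
 * bytes and 0 otherwise. The loop always visits all n bytes, so its
 * running time does not depend on where the first mismatch is.
 */
int constant_time_equal(const unsigned char *a, const unsigned char *b,
			size_t n)
{
	unsigned char diff = 0;
	size_t i;

	assert(a && b);
	for (i = 0; i < n; i++)
		diff |= a[i] ^ b[i]; /* accumulate differences, never exit early */
	return diff == 0;
}
```

Unlike memcmp(), which may return at the first differing byte, this
leaks no timing hint about how many leading bytes of a guess were
right.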
Just like 47abd85ba0 (fetch: Strip usernames from url's before storing
them, 2009-04-17) and later 882d49ca5c (push: anonymize URL in status
output, 2016-07-13), and even later c1284b21f2 (curl: anonymize URLs
in error messages and warnings, 2019-03-04) this change anonymizes URLs
(read: strips them of user names and especially passwords) in
user-facing error messages and warnings.
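What "anonymizing" a URL means here can be sketched as follows. This
is a simplified, hypothetical helper, not Git's own implementation;
it assumes the credentials sit between "://" and the first '/'.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy `url` into `out`, dropping any
 * "user[:password]@" part of the authority. Returns 0 on success,
 * or -1 if `out` is too small.
 */
int anonymize_url(const char *url, char *out, size_t len)
{
	const char *sep, *authority, *end, *p, *at = NULL;
	int n;

	assert(url && out);
	sep = strstr(url, "://");
	if (sep) {
		authority = sep + 3;
		end = authority + strcspn(authority, "/");
		for (p = authority; p < end; p++)
			if (*p == '@')
				at = p; /* keep the last '@' before the path */
	}
	if (!at)
		n = snprintf(out, len, "%s", url);
	else
		n = snprintf(out, len, "%.*s%s",
			     (int)(authority - url), url, at + 1);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For "https://user:secret@example.com/repo.git" this yields
"https://example.com/repo.git"; URLs without credentials pass through
unchanged.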
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a03b55530a (merge: teach --autostash option, 2020-04-07), the
--autostash option was introduced for `git merge`. Notably, when
`git merge --quit` is run with an autostash entry present, it is saved
into the stash reflog. This is contrasted with the current behaviour of
`git rebase --quit` where the autostash entry is simply just dropped out
of existence.
Adopt the behaviour of `git merge --quit` in `git rebase --quit` and
save the autostash entry into the stash reflog instead of just deleting
it.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the usage for `git push` is shown, it includes the following
lines
--recurse-submodules[=(check|on-demand|no)]
control recursive pushing of submodules
which seem to indicate that the argument for --recurse-submodules is
optional. However, we cannot actually run that option without an
argument:
$ git push --recurse-submodules
fatal: recurse-submodules missing parameter
Unset PARSE_OPT_OPTARG so that it is clear that this option requires an
argument. Since the parse-options machinery guarantees that an argument
is present now, assume that `arg` is set in the else of
option_parse_recurse_submodules().
Reported-by: Andrew White <andrew.white@audinate.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the codebase, there are many options which use OPTION_CALLBACK in a
plain ol' struct definition. However, we have the OPT_CALLBACK and
OPT_CALLBACK_F macros which are meant to abstract these plain struct
definitions away. These macros are useful as they semantically signal to
developers that these are just normal callback options with nothing fancy
happening.
Replace plain struct definitions of OPTION_CALLBACK with OPT_CALLBACK or
OPT_CALLBACK_F where applicable. The heavy lifting was done using the
following (disgusting) shell script:
    #!/bin/sh
    do_replacement () {
        tr '\n' '\r' |
            sed -e 's/{\s*OPTION_CALLBACK,\s*\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\s*0,\(\s*[^[:space:]}]*\)\s*}/OPT_CALLBACK(\1,\2,\3,\4,\5,\6)/g' |
            sed -e 's/{\s*OPTION_CALLBACK,\s*\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\(\s*[^[:space:]}]*\)\s*}/OPT_CALLBACK_F(\1,\2,\3,\4,\5,\6,\7)/g' |
            tr '\r' '\n'
    }
    for f in $(git ls-files \*.c)
    do
        do_replacement <"$f" >"$f.tmp"
        mv "$f.tmp" "$f"
    done
The result was manually inspected and then reformatted to match the
style of the surrounding code. Finally, using
`git grep OPTION_CALLBACK \*.c`, leftover results which were not handled
by the script were manually transformed.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
--root implies we want to rebase all commits since the beginning of
history. --fork-point means we want to use the reflog of the specified
upstream to find the best common ancestor between <upstream> and
<branch> and only rebase commits since that common ancestor. These
options are clearly contradictory, so throw an error (instead of
segfaulting on a NULL pointer) if both are specified.
Reported-by: Alexander Berg <alexander.berg@atos.net>
Documentation-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In bd0b42aed3 (fetch-pack: do not take shallow lock unnecessarily,
2019-01-10), the author noted that 'is_repository_shallow' produces
visible side-effect(s) by setting 'is_shallow' and 'shallow_stat'.
This is a problem for e.g., fetching with '--update-shallow' in a
shallow repository with 'fetch.writeCommitGraph' enabled, since the
update to '.git/shallow' will cause Git to think that the repository
isn't shallow when it is, thereby circumventing the commit-graph
compatibility check.
This causes problems in shallow repositories with at least one shallow
ref that has at least one ancestor (since the client won't have those
objects, and therefore can't take the reachability closure over commits
when writing a commit-graph).
Address this by introducing thin wrappers over 'commit_lock_file' and
'rollback_lock_file' for use specifically when the lock is held over
'.git/shallow'. These wrappers (appropriately called
'commit_shallow_file' and 'rollback_shallow_file') call into their
respective functions in 'lockfile.h', but additionally reset validity
checks used by the shallow machinery.
Replace each instance of 'commit_lock_file' and 'rollback_lock_file'
with 'commit_shallow_file' and 'rollback_shallow_file' when the lock
being held is over the '.git/shallow' file.
As a result, 'prune_shallow' can now only be called once (since
'check_shallow_file_for_update' will die after calling
'reset_repository_shallow'). But, this is OK since we only call
'prune_shallow' at most once per process.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow "git rebase" to reapply all local commits, even if they may be
already in the upstream, without checking first.
* jt/rebase-allow-duplicate:
rebase --merge: optionally skip upstreamed commits
"git rebase" (again) learns to honor "--no-keep-empty", which lets
the user discard commits that are empty from the beginning (as
opposed to the ones that become empty because of rebasing). The
interactive rebase also marks commits that are empty in the todo.
* en/rebase-no-keep-empty:
rebase: fix an incompatible-options error message
rebase: reinstate --no-keep-empty
rebase -i: mark commits that begin empty in todo editor
The code that reads interactive input in various codepaths has been
consolidated, and any prompt possibly issued earlier is fflush()ed
before we read.
* js/flush-prompt-before-interative-input:
interactive: explicitly `fflush` stdout before expecting input
interactive: refactor code asking the user for interactive input
The output from "git format-patch" uses RFC 2047 encoding for
non-ASCII letters on From: and Subject: headers, so that it can
directly be fed to e-mail programs. A new option has been added
to produce these headers in raw.
* eb/format-patch-no-encode-headers:
format-patch: teach --no-encode-email-headers
"git rebase" learned the "--no-gpg-sign" option to countermand
commit.gpgSign the user may have.
* dd/no-gpg-sign:
Documentation: document merge option --no-gpg-sign
Documentation: merge commit-tree --[no-]gpg-sign
Documentation: reword commit --no-gpg-sign
Documentation: document am --no-gpg-sign
cherry-pick/revert: honour --no-gpg-sign in all case
rebase.c: honour --no-gpg-sign
The logic to auto-follow tags by "git clone --single-branch" was
not careful to avoid lazy-fetching unnecessary tags, which has been
corrected.
* jk/use-quick-lookup-in-clone-for-tag-following:
clone: use "quick" lookup while following tags
Code cleanup.
* jk/oid-array-cleanups:
oidset: stop referring to sha1-array
ref-filter: stop referring to "sha1 array"
bisect: stop referring to sha1_array
test-tool: rename sha1-array to oid-array
oid_array: rename source file from sha1-array
oid_array: use size_t for iteration
oid_array: use size_t for count and allocation
"git pull --rebase" tried to run a rebase even after noticing that
the pull results in a fast-forward and no rebase is needed nor
sensible, for the past few years due to a mistake nobody noticed.
* en/pull-do-not-rebase-after-fast-forwarding:
pull: avoid running both merge and rebase
"git pull" shares many options with underlying "git fetch", but
some of them were not documented and some of those that would make
sense to pass down were not passed down.
* rs/pull-options-sync-code-and-doc:
pull: pass documented fetch options on
pull: remove --update-head-ok from documentation
Simplify the commit ancestry connectedness check in a partial clone
repository in which "promised" objects are assumed to be obtainable
lazily on-demand from promisor remote repositories.
* jt/connectivity-check-optim-in-partial-clone:
connected: always use partial clone optimization
We do not use C99 "for loop initial declaration" in our codebase
(yet), but one snuck in.
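For reference, the two forms look like this. The function is a
made-up example; only the placement of the loop counter's declaration
matters.

```c
#include <assert.h>
#include <stddef.h>

/*
 * The C99 form that the codebase avoids declares the counter inside
 * the for statement:
 *
 *     for (size_t i = 0; i < n; i++)
 *
 * The C89-compatible form below declares it at the top of the block.
 */
int sum_ints(const int *values, size_t n)
{
	size_t i;
	int total = 0;

	assert(values || !n);
	for (i = 0; i < n; i++)
		total += values[i];
	return total;
}
```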
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since its introduction in 7249e91 (revision.c: support --notes
command-line option, 2011-03-29), combining '--notes' with any option
that causes us to format notes (e.g., '--pretty', '--format="%N"', etc)
results in a failed assertion at runtime.
$ git rev-list HEAD | git diff-tree --stdin --pretty=medium --notes
commit 8f3d9f354286745c751374f5f1fcafee6b3f3136
git: notes.c:1308: format_display_notes: Assertion `display_notes_trees' failed.
Aborted
This failure is due to diff-tree not calling 'load_display_notes' to
initialize the notes machinery.
Ordinarily, this failure isn't triggered, because it requires passing
both '--notes' and another of the above mentioned options. In the case
of '--pretty', for example, we set 'opt->verbose_header', causing
'show_log()' to eventually call 'format_display_notes()', which expects
a non-NULL 'display_notes_trees'.
Without initializing the notes machinery, 'display_notes_trees' remains
NULL, and thus triggers an assertion failure.
Fix this by initializing the notes machinery after parsing our options,
and harden this behavior against regression with a test in t4013. (Note
that the added ref in this test requires updating two unrelated tests
which use 'log --all', and thus need to learn about the new refs).
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
grep does not follow the conventions used by other Git commands when
printing paths that contain unusual characters (such as double-quotes or
newlines). Commands such as ls-files, commit, status and diff will:
- Quote and escape unusual pathnames, by default.
- Print names verbatim and unquoted when "-z" is used.
But grep *never* quotes/escapes absolute paths with unusual chars and
*always* quotes/escapes relative ones, even with "-z". Besides being
inconsistent in its own output, the deviation from other Git commands
can be confusing. So let's make it follow the two rules above and add
some tests for this new behavior. Note that making grep quote/escape
all unusual paths by default also makes it fully compliant with the
core.quotePath configuration, which is currently ignored for absolute
paths.
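The two rules can be sketched as below. This is a simplified,
hypothetical helper: it treats only control characters, double quotes
and backslashes as unusual, and it does not model core.quotePath.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Does the path contain a byte this simplified sketch treats as unusual? */
static int needs_quoting(const char *path)
{
	for (; *path; path++) {
		unsigned char c = (unsigned char)*path;

		if (c < 0x20 || c == 0x7f || c == '"' || c == '\\')
			return 1;
	}
	return 0;
}

/*
 * Hypothetical helper. Writes `path` to `out` verbatim when
 * `nul_terminated` is set (the "-z" case) or nothing needs escaping;
 * otherwise wraps it in double quotes with C-style escapes.
 * Returns 0, or -1 if `out` is too small.
 */
int format_path(const char *path, int nul_terminated, char *out, size_t len)
{
	size_t n = 0;

	assert(path && out);
	if (nul_terminated || !needs_quoting(path)) {
		if (strlen(path) >= len)
			return -1;
		strcpy(out, path);
		return 0;
	}
	if (len < 3)
		return -1;
	out[n++] = '"';
	for (; *path; path++) {
		unsigned char c = (unsigned char)*path;
		char esc[5];
		int w;

		if (c == '"' || c == '\\')
			w = snprintf(esc, sizeof(esc), "\\%c", c);
		else if (c == '\n')
			w = snprintf(esc, sizeof(esc), "\\n");
		else if (c == '\t')
			w = snprintf(esc, sizeof(esc), "\\t");
		else if (c < 0x20 || c == 0x7f)
			w = snprintf(esc, sizeof(esc), "\\%03o", (unsigned)c);
		else
			w = snprintf(esc, sizeof(esc), "%c", c);
		if (n + (size_t)w + 2 > len) /* keep room for '"' and NUL */
			return -1;
		memcpy(out + n, esc, (size_t)w);
		n += (size_t)w;
	}
	out[n++] = '"';
	out[n] = '\0';
	return 0;
}
```

The same function serves absolute and relative paths, which is the
consistency the change above is after.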
Reported-by: Greg Hurrell <greg@hurrell.net>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The changed-path Bloom filters help reduce the amount of tree
parsing required during history queries. Before calculating a
diff, we can ask the filter if a path changed between a commit
and its first parent. If the filter says "no" then we can move
on without parsing trees. If the filter says "maybe" then we
parse trees to discover if the answer is actually "yes" or "no".
When computing a blame, there is a section in find_origin() that
computes a diff between a commit and one of its parents. When this
is the first parent, we can check the Bloom filters before calling
diff_tree_oid().
In order to make this work with the blame machinery, we need to
initialize a struct bloom_key with the initial path. But also, we
need to add more keys to a list if a rename is detected. We then
check to see if _any_ of these keys answer "maybe" in the diff.
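The "no"/"maybe" check over several keys can be illustrated with a toy
filter. This is a minimal sketch and not Git's implementation: the
filter size, the two hash probes, and the names toy_bloom, toy_add(),
toy_maybe() and any_key_maybe() are all assumptions made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy filter: 256 bits, two hash probes per key (both values are assumptions). */
#define TOY_BITS 256

struct toy_bloom {
	uint8_t bits[TOY_BITS / 8];
};

/* FNV-1a with a caller-supplied seed, used only as a simple stand-in hash. */
static uint32_t toy_hash(const char *s, uint32_t seed)
{
	uint32_t h = 2166136261u ^ seed;

	for (; *s; s++) {
		h ^= (uint8_t)*s;
		h *= 16777619u;
	}
	return h % TOY_BITS;
}

/* Record a path as "changed" by setting both of its probe bits. */
static void toy_add(struct toy_bloom *f, const char *path)
{
	uint32_t seed;

	assert(f && path);
	for (seed = 0; seed < 2; seed++) {
		uint32_t bit = toy_hash(path, seed);
		f->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
	}
}

/* Returns 0 for a definite "no", 1 for "maybe". */
static int toy_maybe(const struct toy_bloom *f, const char *path)
{
	uint32_t seed;

	for (seed = 0; seed < 2; seed++) {
		uint32_t bit = toy_hash(path, seed);
		if (!(f->bits[bit / 8] & (1u << (bit % 8))))
			return 0;
	}
	return 1;
}

/* A diff is needed if any tracked path (original or renamed) answers "maybe". */
static int any_key_maybe(const struct toy_bloom *f, const char **paths, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (toy_maybe(f, paths[i]))
			return 1;
	return 0;
}
```

Only when any_key_maybe() returns 0 for a commit's filter can the tree
diff against the first parent be skipped; forgetting to add the key for
a renamed path would turn a "maybe" into a wrong "no".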
During development, I purposefully left out this "add a new key
when a rename is detected" to see if the test suite would catch
my error. That is how I discovered the issues with
GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS from the previous change.
With that change, we can feel some confidence in the coverage of
this change.
If a user requests copy detection using "git blame -C", then there
are more places where the set of "important" files can expand. I
do not know enough about how this happens in the blame machinery.
Thus, the Bloom filter integration is explicitly disabled in this
mode. A later change could expand the bloom_key data with an
appropriate call (or calls) to add_bloom_key().
If we did not disable this mode, then the following tests would
fail:
t8003-blame-corner-cases.sh
t8011-blame-split-file.sh
Generally, this is a performance enhancement and should not
change the behavior of 'git blame' in any way. If a repo has a
commit-graph file with computed changed-path Bloom filters, then
they should notice improved performance for their 'git blame'
commands.
Here are some example timings that I found by blaming some paths
in the Linux kernel repository:
git blame arch/x86/kernel/topology.c >/dev/null
Before: 0.83s
After: 0.24s
git blame kernel/time/time.c >/dev/null
Before: 0.72s
After: 0.24s
git blame tools/perf/ui/stdio/hist.c >/dev/null
Before: 0.27s
After: 0.11s
I specifically looked for "deep" paths that were also edited many
times. As a counterpoint, the MAINTAINERS file was edited many
times but is located in the root tree. This means that the cost of
computing a diff relative to the pathspec is very small. Here are
the timings for that command:
git blame MAINTAINERS >/dev/null
Before: 20.1s
After: 18.0s
These timings are the best of five. The worst-case runs were on the
order of 2.5 minutes for both cases. Note that the MAINTAINERS file
has 18,740 lines across 17,000+ commits. This happens to be one of
the cases where this change provides the least improvement.
The lack of improvement for the MAINTAINERS file and the relatively
modest improvement for the other examples can be easily explained.
The blame machinery needs to compute line-level diffs to determine
which lines were changed by each commit. That makes up a large
proportion of the computation time, and this change does not
attempt to improve on that section of the algorithm. The
MAINTAINERS file is large and changed often, so it takes time to
determine which lines were updated by which commit. In contrast,
the code files are much smaller, and it takes longer to compute
the line-by-line diff for a single patch on the Linux mailing
lists.
Outside of the "-C" integration, I believe there is little more to
gain from the changed-path Bloom filters for 'git blame' after this
patch.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The GIT_TEST_COMMIT_GRAPH environment variable updates the commit-
graph file whenever "git commit" is run, ensuring that we always
have an updated commit-graph throughout the test suite. The
GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS environment variable was
introduced to write the changed-path Bloom filters whenever "git
commit-graph write" is run. However, the GIT_TEST_COMMIT_GRAPH
trick doesn't launch a separate process and instead writes it
directly.
To expand the number of tests that have commits in the commit-graph
file, add a helper method that computes the commit-graph and place
that helper inside "git commit" and "git merge".
In the helper method, check GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS
to ensure we are writing changed-path Bloom filters whenever
possible.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Starting in 3ac68a93fd, help.o began to depend on builtin/branch.o,
builtin/clean.o, and builtin/config.o. This meant that help.o was
unusable outside of the context of the main Git executable.
To make help.o usable by other commands again, move list_config_help()
into builtin/help.c (where it makes sense to assume other builtin libraries
are present).
When command-list.h is included but a member is not used, we start to
hear a compiler warning. Since the config list is generated in a fairly
different way than the command list, and since commands and config
options are semantically different, move the config list into its own
header and move the generator into its own script and build rule.
For reasons explained in 976aaedc (msvc: add a Makefile target to
pre-generate the Visual Studio solution, 2019-07-29), some build
artifacts we consider non-source files cannot be generated in the
Visual Studio environment, and we already have some Makefile tweaks
to help Visual Studio use the generated command-list.h header file.
Do the same to a new generated file, config-list.h, introduced by
this change.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
In 'git log', the --decorate-refs-exclude option appends a pattern
to a string_list. This list is used to prevent showing some refs
in the decoration output, or even by --simplify-by-decoration.
Users may want to use their refs space to store utility refs that
should not appear in the decoration output. For example, Scalar [1]
runs a background fetch but places the "new" refs inside the
refs/scalar/hidden/<remote>/* refspace instead of refs/<remote>/*
to avoid updating remote refs when the user is not looking. However,
these "hidden" refs appear during regular 'git log' queries.
A similar idea to use "hidden" refs is under consideration for core
Git [2].
Add the 'log.excludeDecoration' config option so users can exclude
some refs from decorations by default instead of needing to use
--decorate-refs-exclude manually. The config value is multi-valued
much like the command-line option. The documentation is careful to
point out that the config value can be overridden by the
--decorate-refs option, even though --decorate-refs-exclude would
always "win" over --decorate-refs.
Since 'log.excludeDecoration' takes lower precedence than
--decorate-refs, and --decorate-refs-exclude takes higher
precedence, the struct decoration_filter needed another field.
This led also to new logic in load_ref_decorations() and
ref_filter_match().
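The precedence described above can be sketched as follows. This is an
illustration under stated assumptions rather than the code in this
patch: the struct layout, the prefix-only matching, and every toy_*
name are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the three pattern lists; NULL-terminated arrays. */
struct toy_decoration_filter {
	const char **include;        /* --decorate-refs */
	const char **exclude;        /* --decorate-refs-exclude */
	const char **exclude_config; /* log.excludeDecoration */
};

/* Simplification: a "pattern" matches when it is a prefix of the ref name. */
static int toy_match_any(const char **patterns, const char *refname)
{
	if (!patterns)
		return 0;
	for (; *patterns; patterns++)
		if (!strncmp(refname, *patterns, strlen(*patterns)))
			return 1;
	return 0;
}

/* Returns 1 if the ref should be decorated, 0 otherwise. */
static int toy_ref_filter_match(const struct toy_decoration_filter *f,
				const char *refname)
{
	assert(f && refname);
	/* The command-line exclude always wins. */
	if (toy_match_any(f->exclude, refname))
		return 0;
	/* A non-empty command-line include list overrides the config excludes. */
	if (f->include && f->include[0])
		return toy_match_any(f->include, refname);
	/* Otherwise the config excludes apply. */
	return !toy_match_any(f->exclude_config, refname);
}
```

With only the config set to hide "refs/scalar/hidden/", such refs are
not decorated; adding a matching --decorate-refs pattern shows them
again, and a matching --decorate-refs-exclude hides them regardless.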
There are several tests in t4202-log.sh that test the
--decorate-refs-(include|exclude) options, so these are extended.
Since the expected output is already stored as a file, most tests
could simply replace a "--decorate-refs-exclude" option with an
in-line config setting. Other tests involve the precedence of
the config option compared to command-line options and needed more
modification.
[1] https://github.com/microsoft/scalar
[2] https://lore.kernel.org/git/77b1da5d3063a2404cd750adfe3bb8be9b6c497d.1585946894.git.gitgitgadget@gmail.com/
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When operating on a stream of commit OIDs on stdin, 'git commit-graph
write' checks that each OID refers to an object that is indeed a commit.
This is convenient to make sure that the given input is well-formed, but
can sometimes be undesirable.
For example, server operators may wish to feed the refnames that were
updated during a push to 'git commit-graph write --input=stdin-commits',
and silently discard refs that don't point at commits. This can be done
by combining the output of 'git for-each-ref' with '--format
%(*objecttype)', but this requires opening up a potentially large number
of objects. Instead, it is more convenient to feed the updated refs to
the commit-graph machinery, and let it throw out refs that don't point
to commits.
Introduce '--[no-]check-oids' to make such a behavior possible. With
'--check-oids' (the default behavior to retain backwards compatibility),
'git commit-graph write' will barf on a non-commit line in its input.
With '--no-check-oids', such lines will be silently ignored, making the
above possible by specifying this option.
No matter which is supplied, 'git commit-graph write' retains the
behavior from the previous commit of rejecting non-OID inputs like
"HEAD" and "refs/heads/foo" as before.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'write_commit_graph()' function takes in either a string list of
pack indices, or a string list of hexadecimal commit OIDs. These
correspond to the '--stdin-packs' and '--stdin-commits' mode(s) from
'git commit-graph write'.
Using a string_list of hexadecimal commit IDs is not the most efficient
use of memory, since we can instead use the 'struct oidset', which is
more well-suited for this case.
This has another benefit which will become apparent in the following
commit. This is that we are about to disambiguate the kinds of errors we
produce with '--stdin-commits' into "non-hex input" and "hex-input, but
referring to a non-commit object". By having 'write_commit_graph' take
in a 'struct oidset *' of commits, we place the burden on the caller (in
this case, the builtin) to handle the first case, and the commit-graph
machinery can handle the second case.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using split commit-graphs, it is sometimes useful to completely
replace the commit-graph chain with a new base.
For example, consider a scenario in which a repository builds a new
commit-graph incremental for each push. Occasionally (say, after some
fixed number of pushes), they may wish to rebuild the commit-graph chain
with all reachable commits.
They can do so with
$ git commit-graph write --reachable
but this removes the chain entirely and replaces it with a single
commit-graph in 'objects/info/commit-graph'. Unfortunately, this means
that the next push will have to move this commit-graph into the first
layer of a new chain, and then write its new commits on top.
Avoid such copying entirely by allowing the caller to specify that they
wish to replace the entirety of their commit-graph chain, while also
specifying that the new commit-graph should become the basis of a fresh,
length-one chain.
This addresses the above situation by making it possible for the caller
to instead write:
$ git commit-graph write --reachable --split=replace
which writes a new length-one chain to 'objects/info/commit-graphs',
making the commit-graph incremental generated by the subsequent push
relatively cheap by avoiding the aforementioned copy.
In order to do this, remove an assumption in 'write_commit_graph_file'
that chains are always at least two incrementals long.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, we laid the groundwork for supporting different
splitting strategies. In this commit, we introduce the first splitting
strategy: 'no-merge'.
Passing '--split=no-merge' is useful for callers which wish to write a
new incremental commit-graph, but do not want to spend effort condensing
the incremental chain [1]. Previously, this was possible by passing
'--size-multiple=0', but this is no longer the case following 63020f175f
(commit-graph: prefer default size_mult when given zero, 2020-01-02).
When '--split=no-merge' is given, the commit-graph machinery will never
condense an existing chain, and it will always write a new incremental.
[1]: This might occur when, for example, a server administrator running
some program after each push may want to ensure that each job runs
proportional in time to the size of the push, and does not "jump" when
the commit-graph machinery decides to trigger a merge.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With '--split', the commit-graph machinery writes new commits in another
incremental commit-graph which is part of the existing chain, and
optionally decides to condense the chain into a single commit-graph.
This is done to ensure that the asymptotic behavior of looking up a
commit in an incremental chain is not dominated by the number of
incrementals in that chain. It can be controlled by the '--max-commits'
and '--size-multiple' options.
In the next two commits, we will introduce additional splitting
strategies that can exert additional control over:
- when a split commit-graph is and isn't written, and
- when the existing commit-graph chain is discarded completely and
replaced with another graph
To prepare for this, make '--split' take an optional strategy (as in
'--split[=<strategy>]'), and add a new enum to describe which strategy
is being used. For now, no strategies are given, and the only enumerated
value is 'COMMIT_GRAPH_SPLIT_UNSPECIFIED', indicating the absence of a
strategy.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using `starts_with()`, the magic number 7, `strlen()` and a
fair number of additions to verify the three parts of the config key
"branch.<branch>.mergeoptions", use `skip_prefix()` to jump through them
more explicitly.
We need to introduce a new variable for this (we certainly can't modify
`k` just because we see "branch."!). With `skip_prefix()` we often use
quite bland names like `p` or `str`. Let's do the same. If and when this
function needs to do more prefix-skipping, we'll have a generic variable
ready for this.
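The shape of the new check can be sketched like this. It is a
self-contained illustration, not the patched function: toy_skip_prefix()
is a local stand-in written to mirror how skip_prefix() is used here,
and toy_is_mergeoptions_key() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in: on a match, *out is set to point just past the prefix. */
static int toy_skip_prefix(const char *str, const char *prefix, const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/* Returns 1 if key is exactly "branch.<branch>.mergeoptions", 0 otherwise. */
static int toy_is_mergeoptions_key(const char *key, const char *branch)
{
	const char *p; /* generic cursor; `key` itself is left untouched */

	assert(key && branch);
	return toy_skip_prefix(key, "branch.", &p) &&
	       toy_skip_prefix(p, branch, &p) &&
	       !strcmp(p, ".mergeoptions");
}
```

Each part of the key is consumed in turn, so no magic offsets or
strlen() sums are needed to find where the next part begins.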
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When rebasing against an upstream that has had many commits since the
original branch was created:
O -- O -- ... -- O -- O (upstream)
\
-- O (my-dev-branch)
it must read the contents of every novel upstream commit, in addition to
the tip of the upstream and the merge base, because "git rebase"
attempts to exclude commits that are duplicates of upstream ones. This
can be a significant performance hit, especially in a partial clone,
wherein a read of an object may end up being a fetch.
Add a flag to "git rebase" to allow suppression of this feature. This
flag only works when using the "merge" backend.
This flag changes the behavior of sequencer_make_script(), called from
do_interactive_rebase() <- run_rebase_interactive() <-
run_specific_rebase() <- cmd_rebase(). With this flag, limit_list()
(indirectly called from sequencer_make_script() through
prepare_revision_walk()) will no longer call cherry_pick_list(), and
thus PATCHSAME is no longer set. Refraining from setting PATCHSAME both
means that the intermediate commits in upstream are no longer read (as
shown by the test) and means that no PATCHSAME-caused skipping of
commits is done by sequencer_make_script(), either directly or through
make_script_with_merges().
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the user specifies the apply backend with options that only work
with the merge backend, such as
git rebase --apply --exec /bin/true HEAD~3
the error message has always been
fatal: --exec requires an interactive rebase
This error message is misleading and was one of the reasons we renamed
the interactive backend to the merge backend. Update the error message
to state that these options merely require use of the merge backend.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit d48e5e21da ("rebase (interactive-backend): make --keep-empty the
default", 2020-02-15) turned --keep-empty (for keeping commits which
start empty) into the default. The logic underpinning that commit was:
1) 'git commit' errors out on the creation of empty commits without an
override flag
2) Once someone determines that the override is worthwhile, it's
annoying and/or harmful to require them to take extra steps in
order to keep such commits around (and to repeat such steps with
every rebase).
While the logic on which the decision was made is sound, the result was
a bit of an overcorrection. Instead of jumping to having --keep-empty
being the default, it jumped to making --keep-empty the only available
behavior. There was a simple workaround, though, which was thought to
be good enough at the time. People could still drop commits which
started empty the same way they could drop any commits: by firing up an
interactive rebase and picking out the commits they didn't want from the
list. However, there are cases where external tools might create enough
empty commits that picking all of them out is painful. As such, having
a flag to automatically remove start-empty commits may be beneficial.
Provide users a way to drop commits which start empty using a flag that
has existed for years: --no-keep-empty. Interpret --keep-empty as
countermanding any previous --no-keep-empty, but otherwise leaving
--keep-empty as the default.
This might lead to some slight weirdness since commands like
git rebase --empty=drop --keep-empty
git rebase --empty=keep --no-keep-empty
look really weird despite making perfect sense (the first will drop
commits which become empty, but keep commits that started empty; the
second will keep commits which become empty, but drop commits which
started empty). However, --no-keep-empty was named years ago and we are
predominantly keeping it for backward compatibility; also we suspect it
will only be used rarely since folks already have a simple way to drop
commits they don't want with an interactive rebase.
Reported-by: Bryan Turner <bturner@atlassian.com>
Reported-by: Sami Boukortt <sami@boukortt.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We return the length to a subset of a string using an "int *"
out-parameter. This is fine most of the time, as we'd expect config keys
to be relatively short, but it could behave oddly if we had a gigantic
config key. A more appropriate type is size_t.
Let's switch over, which lets our callers use size_t as appropriate
(they are bound by our type because they must pass the out-parameter as
a pointer). This is mostly just a cleanup to make it clear this code
handles long strings correctly. In practice, our config parser already
chokes on long key names (because of a similar int/size_t mixup!).
When doing an int/size_t conversion, we have to be careful that nobody
was trying to assign a negative value to the variable. I manually
confirmed that for each case here. They tend to just feed the result to
xmemdupz() or similar; in a few cases I adjusted the parameter types for
helper functions to make sure the size_t is preserved.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are quite a few code locations (e.g. `git clean --interactive`)
where Git asks the user for an answer. In preparation for fixing a bug
shared by all of them, and also to DRY up the code, let's refactor it.
Please note that most of these callers trimmed white-space both at the
beginning and at the end of the answer, instead of trimming only the
end (as the caller in `add-patch.c` does).
Therefore, technically speaking, we change behavior in this patch. At
the same time, it can be argued that this is actually a bug fix.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, `--autostash` only worked with `git pull --rebase`. However, in
the last patch, merge learned `--autostash` as well so there's no reason
why we should have this restriction anymore. Teach pull to pass
`--autostash` to merge, just like it did for rebase.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In rebase, one can pass the `--autostash` option to cause the worktree
to be automatically stashed before continuing with the rebase. This
option is missing in merge, however.
Implement the `--autostash` option and corresponding `merge.autoStash`
option in merge which stashes before merging and then pops after.
This option is useful when a developer has some local changes on a topic
branch but they realize that their work depends on another branch.
Previously, they had to run something like
git fetch ...
git stash push
git merge FETCH_HEAD
git stash pop
but now, that is reduced to
git fetch ...
git merge --autostash FETCH_HEAD
When an autostash is generated, it is automatically reapplied to the
worktree only in three explicit situations:
1. An incomplete merge is committed using `git commit`.
2. A merge completes successfully.
3. A merge is aborted using `git merge --abort`.
In all other situations where the merge state is removed using
remove_merge_branch_state() such as aborting a merge via
`git reset --hard`, the autostash is saved into the stash reflog
instead, keeping the worktree clean.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Suggested-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lib-ify the autostash code by extracting perform_autostash() from rebase
into sequencer. In a future commit, this will be used to implement
`--autostash` in other builtins.
This patch is best viewed with `--color-moved`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we plan on lib-ifying create_autostash() so we need it to
be more generic. Make it more generic by making it accept a
`struct repository` argument instead of implicitly using the non-repo
functions and `the_repository`. Also, make it accept a `path` argument
so that we no longer have to rely on `struct rebase_options`.
Finally, make it accept a `default_reflog_action` argument so we no
longer have to rely on `DEFAULT_REFLOG_ACTION`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future commit, we will lib-ify this code. In preparation for
this, extract the code into the create_autostash() function so that it
can be cleaned up before it is finally lib-ified.
This patch is best viewed with `--color-moved` and
`--color-moved-ws=allow-indentation-change`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Continue the process of lib-ifying the autostash code. In a future
commit, this will be used to implement `--autostash` in other builtins.
This patch is best viewed with `--color-moved`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we plan on lib-ifying reset_head() so we need it to
be more generic. Make it more generic by making it accept a
`struct repository` argument instead of implicitly using the non-repo
functions. Also, make it accept a `const char *default_reflog_action`
argument so that the default action of "rebase" isn't hardcoded in.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The apply_autostash() function in builtin/rebase.c is similar enough to
the apply_autostash() function in sequencer.c that they are almost
interchangeable, except for the type of arg they accept. Make the
sequencer.c version extern and use it in rebase.
The rebase version was introduced in 6defce2b02 (builtin rebase: support
`--autostash` option, 2018-09-04) as part of the shell to C conversion.
It opted to duplicate the function because, at the time, there was
another in-progress project converting interactive rebase from shell to
C as well and they did not want to clash with them by refactoring
the sequencer.c version of apply_autostash(). Since both efforts are long
done, we can freely combine them together now.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since in sequencer.c, read_one() basically duplicates the functionality
of read_oneliner(), reduce code duplication by replacing read_one() with
read_oneliner().
This was done with the following Coccinelle script
@@
expression a, b;
@@
- read_one(a, b)
+ !read_oneliner(b, a, READ_ONELINER_WARN_MISSING)
and long lines were manually broken up.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're comparing a push cert nonce, we currently do so using strcmp.
Most implementations of strcmp short-circuit and exit as soon as they
know whether two values are equal. This, however, is a problem when
we're comparing the output of HMAC, as it leaks information in the time
taken about how much of the two values match if they do indeed differ.
In our case, the nonce is used to prevent replay attacks against our
server via the embedded timestamp and replay attacks using requests from
a different server via the HMAC. Push certs, which contain the nonces,
are signed, so an attacker cannot tamper with the nonces without
breaking validation of the signature. They can, of course, create their
own signatures with invalid nonces, but they can also create their own
signatures with valid nonces, so there's nothing to be gained. Thus,
there is no security problem.
Even though it doesn't appear that there are any negative consequences
from the current technique, for safety and to encourage good practices,
let's use a constant time comparison function for nonce verification.
POSIX does not provide one, but they are easy to write.
The technique we use here is also used in NaCl and the Go standard
library and relies on the fact that bitwise or and xor are constant time
on all known architectures.
We need not be concerned about exiting early if the actual and expected
lengths differ, since the standard cryptographic assumption is that
everyone, including an attacker, knows the format of and algorithm used
in our nonces (and in any event, they have the source code and can
determine it easily). As a result, we assume everyone knows how long
our nonces should be. This philosophy is also taken by the Go standard
library and other cryptographic libraries when performing constant time
comparisons on HMAC values.
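A minimal sketch of the or/xor technique described above follows. The
function name is a hypothetical stand-in, and the sketch assumes the
compiler does not rewrite the loop into one that exits early.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Compare n bytes of a and b without an early exit.
 * Returns 0 if all n bytes are equal and nonzero otherwise.
 */
static int toy_constant_memequal(const char *a, const char *b, size_t n)
{
	unsigned char res = 0;
	size_t i;

	assert(a && b);
	/* Every byte is visited; differences accumulate in res via OR. */
	for (i = 0; i < n; i++)
		res |= (unsigned char)a[i] ^ (unsigned char)b[i];
	return res;
}
```

The loop does the same amount of work whether the buffers differ in the
first byte or the last, which is the property strcmp() does not
guarantee. The lengths are compared separately beforehand, since the
nonce length is assumed to be public.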
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When trying to stash part of the worktree changes by splitting a hunk
and then only partially accepting the split bits and pieces, the user
is presented with a rather cryptic error:
error: patch failed: <file>:<line>
error: test: patch does not apply
Cannot remove worktree changes
and the command would fail to stash the desired parts of the worktree
changes (even if the `stash` ref was actually updated correctly).
We even have a test case demonstrating that failure, carrying it for
four years already.
The explanation: when splitting a hunk, the changed lines are no longer
separated by more than 3 lines (which is the amount of context lines
Git's diffs use by default), but less than that. So when staging only
part of the diff hunk for stashing, the resulting diff that we want to
apply to the worktree in reverse will contain those changes to be
dropped surrounded by three context lines, but since the diff is
relative to HEAD rather than to the worktree, these context lines will
not match.
Example time. Let's assume that the file README contains these lines:
We
the
people
and the worktree added some lines so that it contains these lines
instead:
We
are
the
kind
people
and the user tries to stash the line containing "are", then the command
will internally stage this line to a temporary index file and try to
revert the diff between HEAD and that index file. The diff hunk that
`git stash` tries to revert will look somewhat like this:
@@ -1776,3 +1776,4
We
+are
the
people
It is obvious, now, that the trailing context lines overlap with the
part of the original diff hunk that the user did *not* want to stash.
Keeping in mind that context lines in diffs serve the primary purpose of
finding the exact location when the diff does not apply precisely (i.e.
when the exact line number in the file to be patched differs from the
line number indicated in the diff), we work around this by reducing the
amount of context lines: the diff was just generated, so its line
numbers are still accurate.
Note: this is not a *full* fix for the issue. Just as demonstrated in
t3701's 'add -p works with pathological context lines' test case, there
are ambiguities in the diff format. It is very rare in practice, of
course, to encounter such repeated lines.
The full solution for such cases would be to replace the approach of
generating a diff from the stash and then applying it in reverse by
emulating `git revert` (i.e. doing a 3-way merge). However, in `git
stash -p` it would not apply to `HEAD` but instead to the worktree,
which makes this non-trivial to implement as long as we also maintain a
scripted version of `add -i`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When commit subjects or authors have non-ASCII characters, git
format-patch Q-encodes them so they can be safely sent over email.
However, if the patch transfer method is something other than email (web
review tools, sneakernet), this only serves to make the patch metadata
harder to read without first applying it (unless you can decode RFC 2047
in your head). git am as well as some email software supports
non-Q-encoded mail as described in RFC 6531.
Add --[no-]encode-email-headers and format.encodeEmailHeaders to let the
user control this behavior.
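For readers unfamiliar with what Q-encoding does to a header, here is a
rough sketch of an RFC 2047 "Q" encoder. It is not the encoder used by
format-patch: the function name is hypothetical, the input is assumed
to be UTF-8, and line folding and the encoded-word length limit are
ignored.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Q-encode src as a single RFC 2047 encoded-word, assuming UTF-8 input.
 * ASCII letters and digits are copied, a space becomes '_', and every
 * other byte becomes "=XX". Returns 0 on success, -1 if dst is too small.
 */
static int toy_q_encode(const char *src, char *dst, size_t dstlen)
{
	static const char hex[] = "0123456789ABCDEF";
	const char *prefix = "=?UTF-8?q?";
	size_t n = strlen(prefix);

	assert(src && dst);
	if (dstlen < n + 3) /* prefix plus "?=" plus NUL */
		return -1;
	memcpy(dst, prefix, n);
	for (; *src; src++) {
		unsigned char c = (unsigned char)*src;

		if (n + 6 > dstlen) /* room for "=XX", "?=" and NUL */
			return -1;
		if (c < 128 && isalnum(c)) {
			dst[n++] = (char)c;
		} else if (c == ' ') {
			dst[n++] = '_';
		} else {
			dst[n++] = '=';
			dst[n++] = hex[c >> 4];
			dst[n++] = hex[c & 15];
		}
	}
	dst[n++] = '?';
	dst[n++] = '=';
	dst[n] = '\0';
	return 0;
}
```

A two-byte UTF-8 character such as "ë" (0xC3 0xAB) comes out as
"=C3=AB", which is what makes encoded headers hard to read by eye.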
Signed-off-by: Emma Brooks <me@pluvano.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS test flag to the test setup suite
in order to toggle writing Bloom filters when running any of the git tests.
If set to true, we will compute and write Bloom filters every time a test
calls `git commit-graph write`, as if the `--changed-paths` option was
passed in.
The test suite passes when GIT_TEST_COMMIT_GRAPH and
GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS are enabled.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add --changed-paths option to git commit-graph write. This option will
allow users to compute information about the paths that have changed
between a commit and its first parent, and write it into the commit graph
file. If the option is passed to the write subcommand we set the
COMMIT_GRAPH_WRITE_BLOOM_FILTERS flag and pass it down to the
commit-graph logic.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are 3 callers to promisor_remote_get_direct() that first check if
the number of objects to be fetched is equal to 0. Fold that check into
promisor_remote_get_direct(), and in doing so, be explicit as to what
promisor_remote_get_direct() does if oid_nr is 0 (it returns 0, success,
immediately).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-update-ref(1) command can only handle queueing transactions
right now via its "--stdin" parameter, but there is no way for users to
handle the transaction itself in a more explicit way. E.g. in a
replicated scenario, one may imagine a coordinator that spawns
git-update-ref(1) for multiple repositories and only if all agree that
an update is possible will the coordinator send a commit. Such a
transactional session could look like
> start
< start: ok
> update refs/heads/master $NEW $OLD
> prepare
< prepare: ok
# All nodes have returned "ok"
> commit
< commit: ok
or
> start
< start: ok
> create refs/heads/master $NEW
> prepare
< fatal: cannot lock ref 'refs/heads/master': reference already exists
# On all other nodes:
> abort
< abort: ok
In order to allow for such transactional sessions, this commit
introduces four new commands for git-update-ref(1), which match those
we have internally already with the exception of "start":
- start: start a new transaction
- prepare: prepare the transaction, that is try to lock all
references and verify their current value matches the
expected one
- commit: explicitly commit a session, that is update references to
match their new expected state
- abort: abort a session and roll back all changes
By design, git-update-ref(1) will commit as soon as standard input is
closed. While fine in a non-transactional world, it is definitely
unexpected in a transactional world. Because of this, as soon as any of
the new transactional commands is used, the default will change to
aborting without an explicit "commit". To avoid a race between queueing
updates and the first "prepare" that starts a transaction, the "start"
command has been added to start an explicit transaction.
Add some tests to exercise this new functionality.
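The end-of-input rule described above can be modelled with a small
state machine. The sketch below is a simplified illustration under
stated assumptions, not the code added by this commit: `struct session`,
`session_command()` and `session_eof()` are hypothetical names, and the
model only tracks whether the queued updates end up committed.

```c
#include <string.h>

enum txn_state { TXN_OPEN, TXN_STARTED, TXN_PREPARED, TXN_CLOSED };

struct session {
	enum txn_state state;
	int explicit_txn; /* set once any transactional command is seen */
	int committed;    /* 1 if the queued updates were applied */
};

static void session_init(struct session *s)
{
	s->state = TXN_OPEN;
	s->explicit_txn = 0;
	s->committed = 0;
}

/* Feed one command word; returns 0 on success, -1 if already closed. */
static int session_command(struct session *s, const char *cmd)
{
	if (s->state == TXN_CLOSED)
		return -1;
	if (!strcmp(cmd, "start")) {
		s->explicit_txn = 1;
		s->state = TXN_STARTED;
	} else if (!strcmp(cmd, "prepare")) {
		s->explicit_txn = 1;
		s->state = TXN_PREPARED;
	} else if (!strcmp(cmd, "commit")) {
		s->explicit_txn = 1;
		s->committed = 1;
		s->state = TXN_CLOSED;
	} else if (!strcmp(cmd, "abort")) {
		s->explicit_txn = 1;
		s->state = TXN_CLOSED;
	}
	/* Any other command (update, create, ...) only queues an update. */
	return 0;
}

/* Standard input was closed: commit only if no transaction was requested. */
static void session_eof(struct session *s)
{
	if (s->state != TXN_CLOSED && !s->explicit_txn)
		s->committed = 1;
	s->state = TXN_CLOSED;
}
```

In this model a plain list of updates is committed at end of input, as
before, while a session that used "start" or "prepare" without a final
"commit" is rolled back.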
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-update-ref(1) command supports a `--stdin` mode that allows it to read
all reference updates from standard input. This is mainly used to allow
for atomic reference updates that are all or nothing, so that either all
references will get updated or none.
Currently, git-update-ref(1) reads all commands as a single block of up
to 1000 characters and only starts processing after stdin gets closed.
This is less flexible than one might wish for, as it doesn't really
allow for longer-lived transactions and doesn't allow any verification
without committing everything. E.g. one may imagine the following
exchange:
> start
< start: ok
> update refs/heads/master $NEWOID1 $OLDOID1
> update refs/heads/branch $NEWOID2 $OLDOID2
> prepare
< prepare: ok
> commit
< commit: ok
When reading all input as a whole block, the above interactive protocol
is obviously impossible to achieve. But by converting the command to
read commands linewise, we can make it more interactive than before.
Obviously, the linewise interface is only a first step in making
git-update-ref(1) work in a more transaction-oriented way. Most
importantly, support for transactional commands that manage the
current transaction is still missing.
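The difference between block-wise and line-wise reading can be sketched
as below. This is a minimal illustration using stdio rather than git's
own strbuf helpers; `process_linewise()`, `struct seen` and
`remember_line()` are hypothetical names, and the fixed 1024-byte buffer
is an assumption of the sketch (a longer line would be delivered in
pieces).

```c
#include <stdio.h>
#include <string.h>

typedef void (*line_handler)(const char *line, void *data);

/*
 * Hand each input line to the handler as soon as it has been read,
 * instead of waiting for end of input. Returns the number of lines.
 */
static int process_linewise(FILE *in, line_handler handle, void *data)
{
	char buf[1024];
	int n = 0;

	while (fgets(buf, sizeof(buf), in)) {
		buf[strcspn(buf, "\n")] = '\0'; /* drop the trailing newline */
		handle(buf, data);
		n++;
	}
	return n;
}

/* Example handler: counts lines and remembers the most recent one. */
struct seen {
	int count;
	char last[1024];
};

static void remember_line(const char *line, void *data)
{
	struct seen *s = data;

	s->count++;
	strncpy(s->last, line, sizeof(s->last) - 1);
	s->last[sizeof(s->last) - 1] = '\0';
}
```

Because the handler runs per line, a reply such as "prepare: ok" could
be written before the peer sends its next command, which is what the
exchange above requires.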
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the actual logic to update the transaction is handled in
`update_refs_stdin()`, the transaction itself is started and committed
in `cmd_update_ref()` itself. This makes it hard to handle transaction
abortion and commits as part of `update_refs_stdin()` itself, which is
required in order to introduce transaction handling features to `git
update-ref --stdin`.
Refactor the code to move all transaction handling into
`update_refs_stdin()` to prepare for transaction handling features.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently pass both an `strbuf` containing the current command line
as well as the `next` pointer pointing to the first argument to
commands. This is confusing and makes the code more intertwined.
Convert this to use a simple pointer as well as a pointer pointing to
the end of the input as a preparatory step to line-wise reading of
stdin.
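Parsing with a current pointer plus an end pointer might look roughly
like this. The helper `next_token()` is hypothetical and only splits on
single spaces; it is meant to show the calling convention, not the
actual argument parsing in update-ref.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the next space-delimited token from [*next, end) into buf and
 * advance *next past it. Returns the token length (0 if none is left),
 * or -1 if the token does not fit into buf.
 */
static int next_token(const char **next, const char *end,
		      char *buf, size_t bufsz)
{
	const char *p = *next;
	const char *start;
	size_t len;

	while (p < end && *p == ' ')
		p++;
	start = p;
	while (p < end && *p != ' ')
		p++;
	len = (size_t)(p - start);
	if (len + 1 > bufsz)
		return -1;
	memcpy(buf, start, len);
	buf[len] = '\0';
	*next = p;
	return (int)len;
}
```

The parser never needs to know where the input came from: a single line
and a whole buffer are both described by the same pair of pointers.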
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `parse_refname` function accepts a `struct strbuf *input` argument
that isn't used at all. As we're about to convert commands to not use a
strbuf anymore but instead an end pointer, let's drop this argument now
to make the converting commit easier to review.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently manually wire up all commands known to `git-update-ref
--stdin`, making it harder than necessary to preprocess arguments after
the command is determined. To make this more extensible, let's refactor
the code to use an array of known commands instead. While this doesn't
add a lot of value now, it is a preparatory step to implement line-wise
reading of commands.
As we're going to introduce commands without trailing spaces, this
commit also moves whitespace parsing into the respective commands.
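A table-driven dispatcher in this spirit could be sketched as follows.
The names `struct cmd_entry`, `handle_update()`, `handle_prepare()` and
`dispatch()` are hypothetical, and the two handlers are reduced to the
whitespace check that each command now owns.

```c
#include <stddef.h>
#include <string.h>

struct cmd_entry {
	const char *prefix;
	int (*fn)(const char *rest); /* rest points just past the prefix */
};

/* A command with arguments: expects a space, then something after it. */
static int handle_update(const char *rest)
{
	if (*rest != ' ')
		return -1;
	return rest[1] ? 0 : -1;
}

/* A command without arguments: nothing may follow the name. */
static int handle_prepare(const char *rest)
{
	return *rest ? -1 : 0;
}

static const struct cmd_entry commands[] = {
	{ "update", handle_update },
	{ "prepare", handle_prepare },
};

/* Look up the command by prefix and let it parse its own arguments. */
static int dispatch(const char *line)
{
	size_t i;

	for (i = 0; i < sizeof(commands) / sizeof(commands[0]); i++) {
		size_t len = strlen(commands[i].prefix);

		if (!strncmp(line, commands[i].prefix, len))
			return commands[i].fn(line + len);
	}
	return -1; /* unknown command */
}
```

Adding a command then means adding one table entry, and any step that
has to happen between lookup and execution has a single place to live.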
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph builtin has an --expire-time option that takes a
datetime using OPT_EXPIRY_DATE(). However, the implementation inside
expire_commit_graphs() was treating a non-zero value as a number of
seconds to subtract from "now".
Update t5323-split-commit-graph.sh to demonstrate the correct value
of the --expire-time option by actually creating a cruft .graph file
with mtime earlier than the expire time. Instead of using a super-
early time (1980) we use an explicit, and recent, time. Using
test-tool chmtime to create two files on either end of an exact
second, we create a test that catches this failure no matter the
current time. Using a fixed date is more portable than trying to
format a relative date string into the --expire-time input.
I noticed this when inspecting some Scalar repos that had an excess
number of commit-graph files. In Scalar, we were using this second
interpretation by using "--expire-time=3600" to mean "delete graphs
older than one hour ago" to avoid deleting a commit-graph that a
foreground process may be trying to load.
Also I noticed that the help text was copied from the --max-commits
option. Fix that help text.
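The two interpretations of the option value can be compared with a
small sketch. The helper names are hypothetical, and the strict
"older than" comparison in `should_expire()` is an assumption of this
illustration rather than a statement about the exact check in
expire_commit_graphs().

```c
#include <time.h>

/* Old reading: treat the value as seconds to subtract from "now". */
static time_t cutoff_relative(time_t now, time_t expire_time)
{
	return now - expire_time;
}

/* Intended reading: the value already is the absolute cutoff time. */
static time_t cutoff_absolute(time_t now, time_t expire_time)
{
	(void)now;
	return expire_time;
}

/* A file is expired when it was last modified before the cutoff. */
static int should_expire(time_t mtime, time_t cutoff)
{
	return mtime < cutoff;
}
```

With a datetime one hour in the past, the relative reading yields a
cutoff of 3600, that is, one hour after the epoch, so a file modified
just before the requested time is wrongly kept.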
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Traditionally, the expected calling convention for the dir.c API was:
    fill_directory(&dir, ..., pathspec)
    foreach entry in dir->entries:
        if (dir_path_match(entry, pathspec))
            process_or_display(entry)
This may have made sense once upon a time, because the fill_directory() call
could use cheap checks to avoid doing full pathspec matching, and an external
caller may have wanted to do other post-processing of the results anyway.
However:
* this structure makes it easy for users of the API to get it wrong
* this structure actually makes it harder to understand
fill_directory() and the functions it uses internally. It has
tripped me up several times while trying to fix bugs and
restructure things.
* relying on post-filtering was already found to produce wrong
results; pathspec matching had to be added internally for multiple
cases in order to get the right results (see commits 404ebceda0
(dir: also check directories for matching pathspecs, 2019-09-17)
and 89a1f4aaf7 (dir: if our pathspec might match files under a
dir, recurse into it, 2019-09-17))
* it's bad for performance: fill_directory() already has to do lots
of checks and knows the subset of cases where it still needs to do
more checks. Forcing external callers to do full pathspec
matching means they must re-check _every_ path.
So, add the pathspec matching within the fill_directory() internals, and
remove it from external callers.
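The resulting calling convention can be sketched as below. This is a
toy model only: `matches_prefix()` and `fill_matching()` are
hypothetical helpers, and a plain prefix comparison stands in for real
pathspec matching, which is considerably more involved.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for pathspec matching: a plain prefix comparison. */
static int matches_prefix(const char *path, const char *prefix)
{
	return !strncmp(path, prefix, strlen(prefix));
}

/*
 * Filter while filling: only matching paths are stored in out, so the
 * caller can use every returned entry without re-checking it.
 * Returns the number of entries written to out.
 */
static size_t fill_matching(const char **all, size_t nr,
			    const char *prefix, const char **out)
{
	size_t i, n = 0;

	for (i = 0; i < nr; i++)
		if (matches_prefix(all[i], prefix))
			out[n++] = all[i];
	return n;
}
```

A caller then simply loops over the returned entries and processes each
one, with no second matching pass over every path.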
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>