Provide commit metadata for checkout code paths that use unpack_trees
and friends. When we're checking out a commit, use the commit
information, but don't provide commit information if we're checking out
from the index, since there need not be any particular commit associated
with the index, and even if there is one, we can't know what it is.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we have the codebase wired up to pass any additional metadata
to filters, let's collect the additional metadata that we'd like to
pass.
The two main places we pass this metadata are checkouts and archives.
In these two situations, reading HEAD isn't a valid option, since HEAD
isn't updated for checkouts until after the working tree is written and
archives can accept an arbitrary tree. In other situations, HEAD will
usually reflect the refname of the branch in current use.
We pass a smaller amount of data in other cases, such as git cat-file,
where we can really only logically know about the blob.
This commit updates only the parts of the checkout code where we don't
use unpack_trees. That function and callers of it will be handled in a
future commit.
In the archive code, we leak a small amount of memory, since nothing we
pass in the archiver argument structure is freed.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a variety of situations where a filter process can make use of
some additional metadata. For example, some people find the ident
filter too limiting and would like to include the commit or the branch
in their smudged files. This information isn't available during
checkout as HEAD hasn't been updated at that point, and it wouldn't be
available in archives either.
Let's add a way to pass this metadata down to the filter. We pass the
blob we're operating on, the treeish (preferring the commit over the
tree if one exists), and the ref we're operating on. Note that we won't
pass this information in all cases, such as when renormalizing or when
we're performing diffs, since it doesn't make sense in those cases.
The data we currently get from the filter process looks like the
following:
command=smudge
pathname=git.c
0000
With this change, we'll get data more like this:
command=smudge
pathname=git.c
refname=refs/tags/v2.25.1
treeish=c522f061d551c9bb8684a7c3859b2ece4499b56b
blob=7be7ad34bd053884ec48923706e70c81719a8660
0000
There are a couple things to note about this approach. For operations
like checkout, treeish will always be a commit, since we cannot check
out individual trees, but for other operations, like archive, we can end
up operating on only a particular tree, so we'll provide only a tree as
the treeish. Similar comments apply for refname, since there are a
variety of cases in which we won't have a ref.
This commit wires up the code to print this information, but doesn't
pass any of it at this point. In a future commit, we'll have various
code paths pass the actual useful data down.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit refactors the use of verify_signed_buffer() outside of
gpg-interface.c to use check_signature() instead. It also turns
verify_signed_buffer() into a file-local function since it's now only
invoked internally by check_signature().
There were previously two globally scoped functions used in different
parts of Git to perform GPG signature verification:
verify_signed_buffer() and check_signature(). Now only
check_signature() is used.
The verify_signed_buffer() function doesn't guard against duplicate
signatures as described by Michał Górny [1]. Instead it only ensures a
non-erroneous exit code from GPG and the presence of at least one
GOODSIG status field. This stands in contrast with check_signature()
that returns an error if more than one signature is encountered.
The lower degree of verification makes the use of verify_signed_buffer()
problematic if callers don't parse and validate the various parts of the
GPG status message themselves. And processing these messages seems like
a task that should be reserved to gpg-interface.c with the function
check_signature().
Furthermore, the use of verify_signed_buffer() makes it difficult to
introduce new functionality that relies on the content of the GPG status
lines.
Now all operations that does signature verification share a single entry
point to gpg-interface.c. This makes it easier to propagate changed or
additional functionality in GPG signature verification to all parts of
Git, without having odd edge-cases that don't perform the same degree of
verification.
[1] https://dev.gentoo.org/~mgorny/articles/attack-on-git-signature-verification.html
Signed-off-by: Hans Jerry Illikainen <hji@dyntopia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Band-aid fixes for two fallouts from switching the default "rebase"
backend.
* en/rebase-backend:
git-rebase.txt: highlight backend differences with commit rewording
sequencer: clear state upon dropping a become-empty commit
i18n: unmark a message in rebase.c
In the future, we're going to want to use the branch info in
checkout_worktree, so let's pass the whole struct branch_info down, not
just the revision name. We hoist the definition of struct branch_info
so it's in scope.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit v2.25.0-4-ge98c4269c8 (rebase (interactive-backend): fix handling
of commits that become empty, 2020-02-15) marked "{drop,keep,ask}" for
translation, but this message should not be changed.
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Often novice Git users forget to say "pull --rebase" and end up with an
unnecessary merge from upstream. What they usually want is either "pull
--rebase" in the simpler cases, or "pull --ff-only" to update the copy
of main integration branches, and rebase their work separately. The
pull.rebase configuration variable exists to help them in the simpler
cases, but there is no mechanism to make these users aware of it.
Issue a warning message when no --[no-]rebase option from the command
line and no pull.rebase configuration variable is given. This will
inconvenience those who never want to "pull --rebase", who haven't had
to do anything special, but the cost of the inconvenience is paid only
once per user, which should be a reasonable cost to help a number of new
users.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Together with the previous commits, this commit fully fixes the problem
of using shared buffer for `real_path()` in `get_superproject_working_tree()`.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Returning a shared buffer invites very subtle bugs due to reentrancy or
multi-threading, as demonstrated by the previous patch.
There was an unfinished effort to abolish this [1].
Let's finally rid of `real_path()`, using `strbuf_realpath()` instead.
This patch uses a local `strbuf` for most places where `real_path()` was
previously called.
However, two places return the value of `real_path()` to the caller. For
them, a `static` local `strbuf` was added, effectively pushing the
problem one level higher:
read_gitfile_gently()
get_superproject_working_tree()
[1] https://lore.kernel.org/git/1480964316-99305-1-git-send-email-bmwill@google.com/
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git am --short-current-patch" is a way to show the piece of e-mail
for the stopped step, which is not suitable to directly feed "git
apply" (it is designed to be a good "git am" input). It learned a
new option to show only the patch part.
* pb/am-show-current-patch:
am: support --show-current-patch=diff to retrieve .git/rebase-apply/patch
am: support --show-current-patch=raw as a synonym for--show-current-patch
am: convert "resume" variable to a struct
parse-options: convert "command mode" to a flag
parse-options: add testcases for OPT_CMDMODE()
`real_path()` returns result from a shared buffer, inviting subtle
reentrance bugs. One of these bugs occur when invoked this way:
set_git_dir(real_path(git_dir))
In this case, `real_path()` has reentrance:
real_path
read_gitfile_gently
repo_set_gitdir
setup_git_env
set_git_dir_1
set_git_dir
Later, `set_git_dir()` uses its now-dead parameter:
!is_absolute_path(path)
Fix this by using a dedicated `strbuf` to hold `strbuf_realpath()`.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the stash.useBuiltin setting which was added as an escape hatch
to disable the builtin version of stash first released with Git 2.22.
Carrying the legacy version is a maintenance burden, and has in fact
become out of date failing a test since the 2.23 release, without
anyone noticing until now. So users would be getting a hint to fall
back to a potentially buggy version of the tool.
We used to shell out to git config to get the useBuiltin configuration
to avoid changing any global state before spawning legacy-stash.
However that is no longer necessary, so just use the 'git_config'
function to get the setting instead.
Similar to what we've done in d03ebd411c ("rebase: remove the
rebase.useBuiltin setting", 2019-03-18), where we remove the
corresponding setting for rebase, we leave the documentation in place,
so people can refer back to it when searching for it online, and so we
can refer to it in the commit message.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git merge signed-tag" while lacking the public key started to say
"No signature", which was utterly wrong. This regression has been
reverted.
* hi/gpg-use-check-signature:
Revert "gpg-interface: prefer check_signature() for GPG verification"
"git describe" in a repository with multiple root commits sometimes
gave up looking for the best tag to describe a given commit with
too early, which has been adjusted.
* be/describe-multiroot:
describe: don't abort too early when searching tags
"git clone --recurse-submodules --single-branch" now uses the same
single-branch option when cloning the submodules.
* es/recursive-single-branch-clone:
clone: pass --single-branch during --recurse-submodules
submodule--helper: use C99 named initializer
Code cleanup to use "struct object_id" more by replacing use of
"char *sha1"
* jk/nth-packed-object-id:
packfile: drop nth_packed_object_sha1()
packed_object_info(): use object_id internally for delta base
packed_object_info(): use object_id for returning delta base
pack-check: push oid lookup into loop
pack-check: convert "internal error" die to a BUG()
pack-bitmap: use object_id when loading on-disk bitmaps
pack-objects: use object_id struct in pack-reuse code
pack-objects: convert oe_set_delta_ext() to use object_id
pack-objects: read delta base oid into object_id struct
nth_packed_object_oid(): use customary integer return
"git rebase BASE BRANCH" rebased/updated the tip of BRANCH and
checked it out, even when the BRANCH is checked out in a different
worktree. This has been corrected.
* es/do-not-let-rebase-switch-to-protected-branch:
rebase: refuse to switch to branch already checked out elsewhere
t3400: make test clean up after itself
"git push" should stop from updating a branch that is checked out
when receive.denyCurrentBranch configuration is set, but it failed
to pay attention to checkouts in secondary worktrees. This has
been corrected.
* hv/receive-denycurrent-everywhere:
t2402: test worktree path when called in .git directory
receive.denyCurrentBranch: respect all worktrees
t5509: use a bare repository for test push target
get_main_worktree(): allow it to be called in the Git directory
In rare cases "git worktree add <path>" could think that <path>
was already a registered worktree even when it wasn't and refuse
to add the new worktree. This has been corrected.
* es/worktree-avoid-duplication-fix:
worktree: don't allow "add" validation to be fooled by suffix matching
worktree: add utility to find worktree by pathname
worktree: improve find_worktree() documentation
Underlying machinery of "git bisect--helper" is being refactored
into pieces that are more easily reused.
* mr/bisect-in-c-1:
bisect: libify `bisect_next_all`
bisect: libify `handle_bad_merge_base` and its dependents
bisect: libify `check_good_are_ancestors_of_bad` and its dependents
bisect: libify `check_merge_bases` and its dependents
bisect: libify `bisect_checkout`
bisect: libify `exit_if_skipped_commits` to `error_if_skipped*` and its dependents
bisect--helper: return error codes from `cmd_bisect__helper()`
bisect: add enum to represent bisect returning codes
bisect--helper: introduce new `decide_next()` function
bisect: use the standard 'if (!var)' way to check for 0
bisect--helper: change `retval` to `res`
bisect--helper: convert `vocab_*` char pointers to char arrays
"git sparse-checkout" learned a new "add" subcommand.
* ds/sparse-add:
sparse-checkout: allow one-character directories in cone mode
sparse-checkout: work with Windows paths
sparse-checkout: create 'add' subcommand
sparse-checkout: extract pattern update from 'set' subcommand
sparse-checkout: extract add_patterns_from_input()
change the advise call in tag library from advise() to
advise_if_enabled() to construct an example of the usage of
the new API.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next commit we're adding another config variable to be read
from 'git_stash_config', that is valid for the top level command
instead of just a subset. Move the 'git_config' invocation for
'git_stash_config' to the top-level to prepare for that.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix for a bug revealed by a recent change to make the protocol v2
the default.
* ds/partial-clone-fixes:
partial-clone: avoid fetching when looking for objects
partial-clone: demonstrate bugs in partial fetch
"git rebase" has learned to use the merge backend (i.e. the
machinery that drives "rebase -i") by default, while allowing
"--apply" option to use the "apply" backend (e.g. the moral
equivalent of "format-patch piped to am"). The rebase.backend
configuration variable can be set to customize.
* en/rebase-backend:
rebase: rename the two primary rebase backends
rebase: change the default backend from "am" to "merge"
rebase: make the backend configurable via config setting
rebase tests: repeat some tests using the merge backend instead of am
rebase tests: mark tests specific to the am-backend with --am
rebase: drop '-i' from the reflog for interactive-based rebases
git-prompt: change the prompt for interactive-based rebases
rebase: add an --am option
rebase: move incompatibility checks between backend options a bit earlier
git-rebase.txt: add more details about behavioral differences of backends
rebase: allow more types of rebases to fast-forward
t3432: make these tests work with either am or merge backends
rebase: fix handling of restrict_revision
rebase: make sure to pass along the quiet flag to the sequencer
rebase, sequencer: remove the broken GIT_QUIET handling
t3406: simplify an already simple test
rebase (interactive-backend): fix handling of commits that become empty
rebase (interactive-backend): make --keep-empty the default
t3404: directly test the behavior of interest
git-rebase.txt: update description of --allow-empty-message
"git check-ignore" did not work when the given path is explicitly
marked as not ignored with a negative entry in the .gitignore file.
* en/check-ignore:
check-ignore: fix documentation and implementation to match
The object reachability bitmap machinery and the partial cloning
machinery were not prepared to work well together, because some
object-filtering criteria that partial clones use inherently rely
on object traversal, but the bitmap machinery is an optimization
to bypass that object traversal. There however are some cases
where they can work together, and they were taught about them.
* jk/object-filter-with-bitmap:
rev-list --count: comment on the use of count_right++
pack-objects: support filters with bitmaps
pack-bitmap: implement BLOB_LIMIT filtering
pack-bitmap: implement BLOB_NONE filtering
bitmap: add bitmap_unset() function
rev-list: use bitmap filters for traversal
pack-bitmap: basic noop bitmap filter infrastructure
rev-list: allow commit-only bitmap traversals
t5310: factor out bitmap traversal comparison
rev-list: allow bitmaps when counting objects
rev-list: make --count work with --objects
rev-list: factor out bitmap-optimized routines
pack-bitmap: refuse to do a bitmap traversal with pathspecs
rev-list: fallback to non-bitmap traversal when filtering
pack-bitmap: fix leak of haves/wants object lists
pack-bitmap: factor out type iterator initialization
This reverts commit 72b006f4bf, which
breaks the end-user experience when merging a signed tag without
having the public key. We should report "can't check because we
have no public key", but the code with this change claimed that
there was no signature.
When searching the commit graph for tag candidates, `git-describe`
will stop as soon as there is only one active branch left and
it already found an annotated tag as a candidate.
This works well as long as all branches eventually connect back
to a common root, but if the tags are found across branches
with no common ancestor
B
o----.
\
o-----o---o----x
A
it can happen that the search on one branch terminates prematurely
because a tag was found on another, independent branch. This scenario
isn't quite as obscure as it sounds, since cloning with a limited
depth often introduces many independent "dead ends" into the commit
graph.
The help text of `git-describe` states pretty clearly that when
describing a commit D, the number appended to the emitted tag X should
correspond to the number of commits found by `git log X..D`.
Thus, this commit modifies the stopping condition to only abort
the search when only one branch is left to search *and* all current
best candidates are descendants from that branch.
For repositories with a single root, this condition is always
true: When the search is reduced to a single active branch, the
current commit must be an ancestor of *all* tag candidates. This
means that in the common case, this change will have no negative
performance impact since the same number of commits as before will
be traversed.
Signed-off-by: Benno Evers <benno@bmevers.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `options.switch_to' is set, `options.orig_head' is populated right
after with the object name the ref/commit argument points at.
Therefore, there is no need to parse `switch_to' again.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git remote rename X Y" needs to adjust configuration variables
(e.g. branch.<name>.remote) whose value used to be X to Y.
branch.<name>.pushRemote is now also updated.
* bw/remote-rename-update-config:
remote rename/remove: gently handle remote.pushDefault config
config: provide access to the current line number
remote rename/remove: handle branch.<name>.pushRemote config values
remote: clean-up config callback
remote: clean-up by returning early to avoid one indentation
pull --rebase/remote rename: document and honor single-letter abbreviations rebase types
Previously, performing "git clone --recurse-submodules --single-branch"
resulted in submodules cloning all branches even though the superproject
cloned only one branch. Pipe --single-branch through the submodule
helper framework to make it to 'clone' later on.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Start using a named initializer list for SUBMODULE_UPDATE_CLONE_INIT, as
the struct is becoming cumbersome for a typical struct initializer list.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree add <path>" performs various checks before approving
<path> as a valid location for the new worktree. Aside from ensuring
that <path> does not already exist, one of the questions it asks is
whether <path> is already a registered worktree. To perform this check,
it queries find_worktree() and disallows the "add" operation if
find_worktree() finds a match for <path>. As a convenience, however,
find_worktree() casts an overly wide net to allow users to identify
worktrees by shorthand in order to keep typing to a minimum. For
instance, it performs suffix matching which, given subtrees "foo/bar"
and "foo/baz", can correctly select the latter when asked only for
"baz".
"add" validation knows the exact path it is interrogating, so this sort
of heuristic-based matching is, at best, questionable for this use-case
and, at worst, may may accidentally interpret <path> as matching an
existing worktree and incorrectly report it as already registered even
when it isn't. (In fact, validate_worktree_add() already contains a
special case to avoid accidentally matching against the main worktree,
precisely due to this problem.)
Avoid the problem of potential accidental matching against an existing
worktree by instead taking advantage of find_worktree_by_path() which
matches paths deterministically, without applying any sort of magic
shorthand matching performed by find_worktree().
Reported-by: Cameron Gunnin <cameron.gunnin@synopsys.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a caller sets the object_info.delta_base_sha1 to a non-NULL pointer,
we'll write the oid of the object's delta base to it. But we can
increase our type safety by switching this to a real object_id struct.
All of our callers are just pointing into the hash member of an
object_id anyway, so there's no inconvenience.
Note that we do still keep it as a pointer-to-struct, because the NULL
sentinel value tells us whether the caller is even interested in the
information.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the pack-reuse code is dumping an OFS_DELTA entry to a client that
doesn't support it, we re-write it as a REF_DELTA. To do so, we use
nth_packed_object_sha1() to get the oid, but that function is soon going
away in favor of the more type-safe nth_packed_object_id(). Let's switch
now in preparation.
Note that this does incur an extra hash copy (from the pack idx mmap to
the object_id and then to the output, rather than straight from mmap to
the output). But this is not worth worrying about. It's probably not
measurable even when it triggers, and this is fallback code that we
expect to trigger very rarely (since everybody supports OFS_DELTA these
days anyway).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already store an object_id internally, and now our sole caller also
has one. Let's stop passing around the internal hash array, which adds a
bit of type safety.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're considering reusing an on-disk delta, we get the oid of the
base as a pointer to unsigned char bytes of the hash, either into the
packfile itself (for REF_DELTA) or into the pack idx (using the revindex
to convert the offset into an index entry).
Instead, we'd prefer to use a more type-safe object_id as much as
possible. We can get the pack idx using nth_packed_object_id() instead.
For the packfile bytes, we can copy them out using oidread().
This doesn't even incur an extra copy overall, since the next thing we'd
always do with that pointer is pass it to can_reuse_delta(), which needs
an object_id anyway (and called oidread() itself). So this patch also
converts that function to take the object_id directly.
Note that we did previously use NULL as a sentinel value when the object
isn't a delta. We could probably get away with using the null oid for
this, but instead we'll use an explicit boolean flag, which should make
things more obvious for people reading the code later.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our nth_packed_object_sha1() function returns NULL for error. So when we
wrapped it with nth_packed_object_oid(), we kept the same semantics. But
it's a bit funny, because the caller actually passes in an out
parameter, and the pointer we return is just that same struct they
passed to us (or NULL).
It's not too terrible, but it does make the interface a little
non-idiomatic. Let's switch to our usual "0 for success, negative for
error" return value. Most callers either don't check it, or are
trivially converted. The one that requires the biggest change is
actually improved, as we can ditch an extra aliased pointer variable.
Since we are changing the interface in a subtle way that the compiler
wouldn't catch, let's also change the name to catch any topics in
flight. We can drop the 'o' and make it nth_packed_object_id(). That's
slightly shorter, but also less redundant since the 'o' stands for
"object" already.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The invocation "git rebase <upstream> <branch>" switches to <branch>
before performing the rebase operation. However, unlike git-switch,
git-checkout, and git-worktree which all refuse to switch to a branch
that is already checked out in some other worktree, git-rebase switches
to <branch> unconditionally. Curb this careless behavior by making
git-rebase also refuse to switch to a branch checked out elsewhere.
Reported-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The receive.denyCurrentBranch config option controls what happens if
you push to a branch that is checked out into a non-bare repository.
By default, it rejects it. It can be disabled via `ignore` or `warn`.
Another yet trickier option is `updateInstead`.
However, this setting was forgotten when the git worktree command was
introduced: only the main worktree's current branch is respected.
With this change, all worktrees are respected.
That change also leads to revealing another bug,
i.e. `receive.denyCurrentBranch = true` was ignored when pushing into a
non-bare repository's unborn current branch using ref namespaces. As
`is_ref_checked_out()` returns 0 which means `receive-pack` does not get
into conditional statement to switch `deny_current_branch` accordingly
(ignore, warn, refuse, unconfigured, updateInstead).
receive.denyCurrentBranch uses the function `refs_resolve_ref_unsafe()`
(called via `resolve_refdup()`) to resolve the symbolic ref HEAD, but
that function fails when HEAD does not point at a valid commit.
As we replace the call to `refs_resolve_ref_unsafe()` with
`find_shared_symref()`, which has no problem finding the worktree for a
given branch even if it is unborn yet, this bug is fixed at the same
time: receive.denyCurrentBranch now also handles worktrees with unborn
branches as intended even while using ref namespaces.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The transition plan anticipates that we will allow signatures using
multiple algorithms in a single commit. In order to do so, we need to
use a different header per algorithm so that it will be obvious over
which data to compute the signature.
The transition plan specifies that we should use "gpgsig-sha256", so
wire up the commit code such that it can write and parse the current
algorithm, and it can remove the headers for any algorithm when creating
a new commit. Add tests to ensure that we write using the right header
and that git fsck doesn't reject these commits.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we perform a clone, we won't know the remote side's hash algorithm
until we've read the heads. Consequently, we'll need to rewrite the
repository format version and hash algorithm once we know what the
remote side has. Move the code that does this into its own function so
that we can call it from clone in the future.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For the foreseeable future, SHA-1 will be the default algorithm for Git.
However, when running the testsuite, we want to be able to test an
arbitrary algorithm. It would be quite burdensome and very untidy to
have to specify the algorithm we'd like to test every time we
initialized a new repository somewhere in the testsuite, so add an
environment variable to allow us to specify the default hash algorithm
for Git.
This has the benefit that we can set it once for the entire testsuite
and not have to think about it. In the future, users can also use it to
set the default for their repositories if they would like to do so.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow the user to specify the hash algorithm on the command line by
using the --object-format option to git init. Validate that the user is
not attempting to reinitialize a repository with a different hash
algorithm. Ensure that if we are writing a non-SHA-1 repository that we
set the repository version to 1 and write the objectFormat extension.
Restrict this option to work only when ENABLE_SHA256 is set until the
codebase is in a situation to fully support this.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some cases, we will want to not only check the repository format, but
extract the information that we've gained. To do so, allow
check_repository_format to take a pointer to struct repository_format.
Allow passing NULL for this argument if we're not interested in the
information, and pass NULL for all existing callers. A future patch
will make use of this information.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid hard-coding a hash size, instead preferring to use the_hash_algo.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We can check if certain characters are present in a string by calling
strchr(3) on each of them, or we can pass them all to a single
strpbrk(3) call. The latter is shorter, less repetitive and slightly
more efficient, so let's do that instead.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using partial clone, find_non_local_tags() in builtin/fetch.c
checks each remote tag to see if its object also exists locally. There
is no expectation that the object exist locally, but this function
nevertheless triggers a lazy fetch if the object does not exist. This
can be extremely expensive when asking for a commit, as we are
completely removed from the context of the non-existent object and
thus supply no "haves" in the request.
6462d5eb9a (fetch: remove fetch_if_missing=0, 2019-11-05) removed a
global variable that prevented these fetches in favor of a bitflag.
However, some object existence checks were not updated to use this flag.
Update find_non_local_tags() to use OBJECT_INFO_SKIP_FETCH_OBJECT in
addition to OBJECT_INFO_QUICK. The _QUICK option only prevents
repreparing the pack-file structures. We need to be extremely careful
about supplying _SKIP_FETCH_OBJECT when we expect an object to not exist
due to updated refs.
This resolves a broken test in t5616-partial-clone.sh.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An annotated tag has two names: where it sits in the refs/tags
hierarchy and the tagname recorded in the "tag" field in the object
itself. They usually should match.
Since 212945d4 ("Teach git-describe to verify annotated tag names
before output", 2008-02-28), a commit described using an annotated
tag bases its name on the tagname from the object. While this was a
deliberate design decision to make it easier to converse about tags
with others, even if the tags happen to be fetched to a different
name than it was given by its creator, it had one downside.
The output from "git describe", at least in the modern Git, should
be usable as an object name to name the exact commit given to the
"git describe" command. Using the tagname, when two names differ,
breaks this property, when describing a commit that is directly
pointed at by such a tag. An annotated tag Bob made as "v1.0" may
sit at "refs/tags/v1.0-bob" in the ref hierarchy, and output from
"git describe v1.0-bob^0" would say "v1.0", but there may not be
any tag at "refs/tags/v1.0" locally or there may be another tag that
points at a different object.
Note that this won't be a problem if a commit being described is not
directly pointed at by such a mislocated tag. In the example in the
previous paragraph, describing a commit whose parent is v1.0-bob
would result in "v1.0" (i.e. the tagname taken from the tag object)
followed by "-1-gXXXXX" where XXXXX is the abbreviated object name,
and a string that ends with "-g" followed by a hexadecimal string is
an object name for the object whose name begins with hexadecimal
string (as long as it is unique), so it does not matter if the
leading part is "v1.0" or "v1.0-bob".
Show the name in the long format, i.e. with "-0-gXXXXX" suffix, when
the name we give is based on a mislocated annotated tag to ensure
that the output can be used as the object name for the object
originally given to the command to fix the issue.
While at it, remove an overly cautious dead code to protect against
an annotated tag object without the tagname. Such a tag is filtered
out much earlier in the codeflow, and will not reach this part of
the code.
Helped-by: Matheus Tavares <matheus.bernardino@usp.br>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git am --show-current-patch" was added in commit 984913a210 ("am:
add --show-current-patch", 2018-02-12), "git am" started recommending it
as a replacement for .git/rebase-merge/patch. Unfortunately the suggestion
is somewhat misguided; for example, the output of "git am --show-current-patch"
cannot be passed to "git apply" if it is encoded as quoted-printable
or base64. Add a new mode to "git am --show-current-patch" in order to
straighten the suggestion.
Reported-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git am --show-current-patch" was added in commit 984913a210 ("am:
add --show-current-patch", 2018-02-12), "git am" started recommending it
as a replacement for .git/rebase-merge/patch. Unfortunately the suggestion
is somewhat misguided; for example, the output "git am --show-current-patch"
cannot be passed to "git apply" if it is encoded as quoted-printable or
base64. To simplify worktree operations and to avoid that users poke into
.git, it would be better if "git am" also provided a mode that copies
.git/rebase-merge/patch to stdout.
One possibility could be to have completely separate options, introducing
for example --show-current-message (for .git/rebase-apply/NNNN)
and --show-current-diff (for .git/rebase-apply/patch), while possibly
deprecating --show-current-patch.
That would even remove the need for the first two patches in the series.
However, the long common prefix would have prevented using an abbreviated
option such as "--show". Therefore, I chose instead to add a string
argument to --show-current-patch. The new argument is optional, so that
"git am --show-current-patch"'s behavior remains backwards-compatible.
The next choice to make is how to handle multiple --show-current-patch
options. Right now, something like "git am --abort --show-current-patch"
is rejected, and the previous suggestion would likewise have naturally
rejected a command line like
git am --show-current-message --show-current-diff
Therefore, I decided to also reject for example
git am --show-current-patch=diff --show-current-patch=raw
In other words the whole of --show-current-patch=xxx (including the
optional argument) is treated as the command mode. I found this to be
more consistent and intuitive, even though it differs from the usual
"last one wins" semantics of the git command line.
Add the code to parse submodes based on the above design, where for now
"raw" is the only valid submode. "raw" prints the full e-mail message
just like "git am --show-current-patch".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This will allow stashing the submode of --show-current-patch from a
callback function. Using a struct will allow accessing both fields from
outside cmd_am (through container_of).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Decisions taken for simplicity:
1) For now, `--pathspec-from-file` is declared incompatible with
`--patch`, even when <file> is not `-`. Such use case is not
really expected.
2) It is not allowed to pass pathspec in both args and file.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eliminate crude option parsing and rely on real parsing instead, because
1) Crude parsing is crude, for example it's not capable of
handling things like `git stash -m Message`
2) Adding options in two places is inconvenient and prone to bugs
As a side result, the case of `git stash -m Message` gets fixed.
Also give a good error message instead of just throwing usage at user.
----
Some review of what's been happening to this code:
Before [1], `git-stash.sh` only verified that all args begin with `-` :
# The default command is "push" if nothing but options are given
seen_non_option=
for opt
do
case "$opt" in
--) break ;;
-*) ;;
*) seen_non_option=t; break ;;
esac
done
Later, [1] introduced the duplicate code I'm now removing, also making
the previous test more strict by white-listing options.
----
[1] Commit 40af1468 ("stash: convert `stash--helper.c` into `stash.c`" 2019-02-26)
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Decisions taken for simplicity:
1) It is not allowed to pass pathspec in both args and file.
Adjustments were needed for `if (!argc)` block:
This code actually means "pathspec is not present". Previously, pathspec
could only come from commandline arguments, so testing for `argc` was a
valid way of testing for the presence of pathspec. But this is no longer
true with `--pathspec-from-file`.
During the entire `--pathspec-from-file` story, I tried to keep its
behavior very close to giving pathspec on commandline, so that switching
from one to another doesn't involve any surprises.
However, throwing usage at user in the case of empty
`--pathspec-from-file` would puzzle because there's nothing wrong with
"usage" (that is, argc/argv array).
On the other hand, throwing usage in the old case also feels bad to me.
While it's less of a puzzle, I (as user) never liked the experience of
comparing my commandline to "usage", trying to spot a difference. Since
it's already known what the error is, it feels a lot better to give that
specific error to user.
Judging from [1] it doesn't seem that showing usage in this case was
important (the patch was to avoid segfault), and it doesn't fit into how
other commands react to empty pathspec (see for example `git add` with a
custom message).
Therefore, I decided to show new error text in both cases. In order to
continue testing for error early, I moved `parse_pathspec()` higher. Now
it happens before `read_cache()` / `hold_locked_index()` /
`setup_work_tree()`, which shouldn't cause any issues.
[1] Commit 7612a1ef ("git-rm: honor -n flag" 2006-06-09)
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we want to get rid of git-bisect.sh, it would be necessary to
convert those exit() calls to return statements so that errors can be
reported.
Emulate try catch in C by converting `exit(<positive-value>)` to
`return <negative-value>`. Follow POSIX conventions to return
<negative-value> to indicate error.
Code that turns BISECT_INTERNAL_SUCCESS_MERGE_BASE (-11)
to BISECT_OK (0) from `check_good_are_ancestors_of_bad()` has been moved to
`cmd_bisect__helper()`.
Update all callers to handle the error returns.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we want to get rid of git-bisect.sh, it would be necessary
to convert bisect.c exit() calls to return statements so
that errors can be reported. Let's prepare for that by making
it possible to return different error codes than just 0 or 1.
Different error codes might enable a bisecting script calling the
bisect command that uses this function to do different things
depending on the exit status of the bisect command.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's refactor code from bisect_next_check() into a new
decide_next() helper function.
This removes some goto statements and makes the code simpler,
clearer and easier to understand.
While at it `bad_ref` and `good_glob` are not const any more
to void casting them inside `free()`.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's rename variable retval to res, so that variable names
in bisect--helper.c are more consistent.
After this change, there are 110 occurrences of res in the file
and zero of retval, while there were 26 instances of retval before.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using a pointer that points at a constant string,
just give name directly to the constant string; this way, we
do not have to allocate a pointer variable in addition to
the string we want to use.
Let's convert `vocab_bad` and `vocab_good` char pointers to char arrays.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
check-ignore has two different modes, and neither of these modes has an
implementation that matches the documentation. These modes differ in
whether they just print paths or whether they also print the final
pattern matched by the path. The fix is different for both modes, so
I'll discuss both separately.
=== First (default) mode ===
The first mode is documented as:
For each pathname given via the command-line or from a file via
--stdin, check whether the file is excluded by .gitignore (or other
input files to the exclude mechanism) and output the path if it is
excluded.
However, it fails to do this because it did not account for negated
patterns. Commands other than check-ignore verify exclusion rules via
calling
... -> treat_one_path() -> is_excluded() -> last_matching_pattern()
while check-ignore has a call path of the form:
... -> check_ignore() -> last_matching_pattern()
The fact that the latter does not include the call to is_excluded()
means that it is susceptible to to messing up negated patterns (since
that is the only significant thing is_excluded() adds over
last_matching_pattern()). Unfortunately, we can't make it just call
is_excluded(), because the same codepath is used by the verbose mode
which needs to know the matched pattern in question. This brings us
to...
=== Second (verbose) mode ===
The second mode, known as verbose mode, references the first in the
documentation and says:
Also output details about the matching pattern (if any) for each
given pathname. For precedence rules within and between exclude
sources, see gitignore(5).
The "Also" means it will print patterns that match the exclude rules as
noted for the first mode, and also print which pattern matches. Unless
more information is printed than just pathname and pattern (which is not
done), this definition is somewhat ill-defined and perhaps even
self-contradictory for negated patterns: A path which matches a negated
exclude pattern is NOT excluded and thus shouldn't be printed by the
former logic, while it certainly does match one of the explicit patterns
and thus should be printed by the latter logic.
=== Resolution ==
Since the second mode exists to find out which pattern matches given
paths, and showing the user a pattern that begins with a '!' is
sufficient for them to figure out whether the pattern is excluded, the
existing behavior is desirable -- we just need to update the
documentation to match the implementation (i.e. it is about printing
which pattern is matched by paths, not about showing which paths are
excluded).
For the first or default mode, users just want to know whether a pattern
is excluded. As such, the existing documentation is desirable; change
the implementation to match the documented behavior.
Finally, also adjust a few tests in t0008 that were caught up by this
discrepancy in how negated paths were handled.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git config" learned to show in which "scope", in addition to in
which file, each config setting comes from.
* mr/show-config-scope:
config: add '--show-scope' to print the scope of a config value
submodule-config: add subomdule config scope
config: teach git_config_source to remember its scope
config: preserve scope in do_git_config_sequence
config: clarify meaning of command line scoping
config: split repo scope to local and worktree
config: make scope_name non-static and rename it
t1300: create custom config file without special characters
t1300: fix over-indented HERE-DOCs
config: fix typo in variable name
Memory footprint and performance of "git name-rev" has been
improved.
* rs/name-rev-memsave:
name-rev: sort tip names before applying
name-rev: release unused name strings
name-rev: generate name strings only if they are better
name-rev: pre-size buffer in get_parent_name()
name-rev: factor out get_parent_name()
name-rev: put struct rev_name into commit slab
name-rev: don't _peek() in create_or_update_name()
name-rev: don't leak path copy in name_ref()
name-rev: respect const qualifier
name-rev: remove unused typedef
name-rev: rewrite create_or_update_name()
Two related changes, with separate rationale for each:
Rename the 'interactive' backend to 'merge' because:
* 'interactive' as a name caused confusion; this backend has been used
for many kinds of non-interactive rebases, and will probably be used
in the future for more non-interactive rebases than interactive ones
given that we are making it the default.
* 'interactive' is not the underlying strategy; merging is.
* the directory where state is stored is not called
.git/rebase-interactive but .git/rebase-merge.
Rename the 'am' backend to 'apply' because:
* Few users are familiar with git-am as a reference point.
* Related to the above, the name 'am' makes sentences in the
documentation harder for users to read and comprehend (they may read
it as the verb from "I am"); avoiding this difficult places a large
burden on anyone writing documentation about this backend to be very
careful with quoting and sentence structure and often forces
annoying redundancy to try to avoid such problems.
* Users stumble over pronunciation ("am" as in "I am a person not a
backend" or "am" as in "the first and thirteenth letters in the
alphabet in order are "A-M"); this may drive confusion when one user
tries to explain to another what they are doing.
* While "am" is the tool driving this backend, the tool driving git-am
is git-apply, and since we are driving towards lower-level tools
for the naming of the merge backend we may as well do so here too.
* The directory where state is stored has never been called
.git/rebase-am, it was always called .git/rebase-apply.
For all the reasons listed above:
* Modify the documentation to refer to the backends with the new names
* Provide a brief note in the documentation connecting the new names
to the old names in case users run across the old names anywhere
(e.g. in old release notes or older versions of the documentation)
* Change the (new) --am command line flag to --apply
* Rename some enums, variables, and functions to reinforce the new
backend names for us as well.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A large variety of rebase types are supported by the interactive
machinery, not just the explicitly interactive ones. These all share
the same code and write the same reflog messages, but the "-i" moniker
in those messages doesn't really have much meaning. It also becomes
somewhat distracting once we switch the default from the am-backend to
the interactive one. Just remove the "-i" from these messages.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, this option doesn't do anything except error out if any
options requiring the interactive-backend are also passed. However,
when we make the default backend configurable later in this series, this
flag will provide a way to override the config setting.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the past, we dis-allowed rebases using the interactive backend from
performing a fast-forward to short-circuit the rebase operation. This
made sense for explicitly interactive rebases and some implicitly
interactive rebases, but certainly became overly stringent when the
merge backend was re-implemented via the interactive backend.
Just as the am-based rebase has always had to disable the fast-forward
based on a variety of conditions or flags (e.g. --signoff, --whitespace,
etc.), we need to do the same but now with a few more options. However,
continuing to use REBASE_FORCE for tracking this is problematic because
the interactive backend used it for a different purpose. (When
REBASE_FORCE wasn't set, the interactive backend would not fast-forward
the whole series but would fast-forward individual "pick" commits at the
beginning of the todo list, and then a squash or something would cause
it to start generating new commits.) So, introduce a new
allow_preemptive_ff flag contained within cmd_rebase() and use it to
track whether we are going to allow a pre-emptive fast-forward that
short-circuits the whole rebase.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
restrict_revision in the original shell script was an excluded revision
range. It is also treated that way by the am-backend. In the
conversion from shell to C (see commit 6ab54d17be ("rebase -i:
implement the logic to initialize $revisions in C", 2018-08-28)), the
interactive-backend accidentally treated it as a positive revision
rather than a negated one.
This was missed as there were no tests in the testsuite that tested an
interactive rebase with fork-point behavior.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The GIT_QUIET environment variable was used to signal the non-am
backends that the rebase should perform quietly. The preserve-merges
backend does not make use of the quiet flag anywhere (other than to
write out its state whenever it writes state), and this mechanism was
broken in the conversion from shell to C. Since this environment
variable was specifically designed for scripts and the only backend that
would still use it is no longer a script, just gut this code.
A subsequent commit will fix --quiet for the interactive/merge backend
in a different way.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As established in the previous commit and commit b00bf1c9a8
(git-rebase: make --allow-empty-message the default, 2018-06-27), the
behavior for rebase with different backends in various edge or corner
cases is often more happenstance than design. This commit addresses
another such corner case: commits which "become empty".
A careful reader may note that there are two types of commits which would
become empty due to a rebase:
* [clean cherry-pick] Commits which are clean cherry-picks of upstream
commits, as determined by `git log --cherry-mark ...`. Re-applying
these commits would result in an empty set of changes and a
duplicative commit message; i.e. these are commits that have
"already been applied" upstream.
* [become empty] Commits which are not empty to start, are not clean
cherry-picks of upstream commits, but which still become empty after
being rebased. This happens e.g. when a commit has changes which
are a strict subset of the changes in an upstream commit, or when
the changes of a commit can be found spread across or among several
upstream commits.
Clearly, in both cases the changes in the commit in question are found
upstream already, but the commit message may not be in the latter case.
When cherry-mark can determine a commit is already upstream, then
because of how cherry-mark works this means the upstream commit message
was about the *exact* same set of changes. Thus, the commit messages
can be assumed to be fully interchangeable (and are in fact likely to be
completely identical). As such, the clean cherry-pick case represents a
case when there is no information to be gained by keeping the extra
commit around. All rebase types have always dropped these commits, and
no one to my knowledge has ever requested that we do otherwise.
For many of the become empty cases (and likely even most), we will also
be able to drop the commit without loss of information -- but this isn't
quite always the case. Since these commits represent cases that were
not clean cherry-picks, there is no upstream commit message explaining
the same set of changes. Projects with good commit message hygiene will
likely have the explanation from our commit message contained within or
spread among the relevant upstream commits, but not all projects run
that way. As such, the commit message of the commit being rebased may
have reasoning that suggests additional changes that should be made to
adapt to the new base, or it may have information that someone wants to
add as a note to another commit, or perhaps someone even wants to create
an empty commit with the commit message as-is.
Junio commented on the "become-empty" types of commits as follows[1]:
WRT a change that ends up being empty (as opposed to a change that
is empty from the beginning), I'd think that the current behaviour
is desireable one. "am" based rebase is solely to transplant an
existing history and want to stop much less than "interactive" one
whose purpose is to polish a series before making it publishable,
and asking for confirmation ("this has become empty--do you want to
drop it?") is more appropriate from the workflow point of view.
[1] https://lore.kernel.org/git/xmqqfu1fswdh.fsf@gitster-ct.c.googlers.com/
I would simply add that his arguments for "am"-based rebases actually
apply to all non-explicitly-interactive rebases. Also, since we are
stating that different cases should have different defaults, it may be
worth providing a flag to allow users to select which behavior they want
for these commits.
Introduce a new command line flag for selecting the desired behavior:
--empty={drop,keep,ask}
with the definitions:
drop: drop commits which become empty
keep: keep commits which become empty
ask: provide the user a chance to interact and pick what to do with
commits which become empty on a case-by-case basis
In line with Junio's suggestion, if the --empty flag is not specified,
pick defaults as follows:
explicitly interactive: ask
otherwise: drop
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Different rebase backends have different treatment for commits which
start empty (i.e. have no changes relative to their parent), and the
--keep-empty option was added at some point to allow adjusting behavior.
The handling of commits which start empty is actually quite similar to
commit b00bf1c9a8 (git-rebase: make --allow-empty-message the default,
2018-06-27), which pointed out that the behavior for various backends is
often more happenstance than design. The specific change made in that
commit is actually quite relevant as well and much of the logic there
directly applies here.
It makes a lot of sense in 'git commit' to error out on the creation of
empty commits, unless an override flag is provided. However, once
someone determines that there is a rare case that merits using the
manual override to create such a commit, it is somewhere between
annoying and harmful to have to take extra steps to keep such
intentional commits around. Granted, empty commits are quite rare,
which is why handling of them doesn't get considered much and folks tend
to defer to existing (accidental) behavior and assume there was a reason
for it, leading them to just add flags (--keep-empty in this case) that
allow them to override the bad defaults. Fix the interactive backend so
that --keep-empty is the default, much like we did with
--allow-empty-message. The am backend should also be fixed to have
--keep-empty semantics for commits that start empty, but that is not
included in this patch other than a testcase documenting the failure.
Note that there was one test in t3421 which appears to have been written
expecting --keep-empty to not be the default as correct behavior. This
test was introduced in commit 00b8be5a4d ("add tests for rebasing of
empty commits", 2013-06-06), which was part of a series focusing on
rebase topology and which had an interesting original cover letter at
https://lore.kernel.org/git/1347949878-12578-1-git-send-email-martinvonz@gmail.com/
which noted
Your input especially appreciated on whether you agree with the
intent of the test cases.
and then went into a long example about how one of the many tests added
had several questions about whether it was correct. As such, I believe
most the tests in that series were about testing rebase topology with as
many different flags as possible and were not trying to state in general
how those flags should behave otherwise.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to compute the commit-graph has been taught to use a more
robust way to tell if two object directories refer to the same
thing.
* tb/commit-graph-object-dir:
commit-graph.h: use odb in 'load_commit_graph_one_fd_st'
commit-graph.c: remove path normalization, comparison
commit-graph.h: store object directory in 'struct commit_graph'
commit-graph.h: store an odb in 'struct write_commit_graph_context'
t5318: don't pass non-object directory to '--object-dir'
The index-pack code now diagnoses a bad input packstream that
records the same object twice when it is used as delta base; the
code used to declare a software bug when encountering such an
input, but it is an input error.
* jk/index-pack-dupfix:
index-pack: downgrade twice-resolved REF_DELTA to die()
The way "git submodule status" reports an initialized but not yet
populated submodule has not been reimplemented correctly when a
part of the "git submodule" command was rewritten in C, which has
been corrected.
* pk/status-of-uncloned-submodule:
t7400: testcase for submodule status on unregistered inner git repos
submodule: fix status of initialized but not cloned submodules
t7400: add a testcase for submodule status on empty dirs
Some codepaths were given a repository instance as a parameter to
work in the repository, but passed the_repository instance to its
callees, which has been cleaned up (somewhat).
* mt/use-passed-repo-more-in-funcs:
sha1-file: allow check_object_signature() to handle any repo
sha1-file: pass git_hash_algo to hash_object_file()
sha1-file: pass git_hash_algo to write_object_file_prepare()
streaming: allow open_istream() to handle any repo
pack-check: use given repo's hash_algo at verify_packfile()
cache-tree: use given repo's hash_algo at verify_one()
diff: make diff_populate_filespec() honor its repo argument
Some rough edges in the sparse-checkout feature, especially around
the cone mode, have been cleaned up.
* ds/sparse-checkout-harden:
sparse-checkout: fix cone mode behavior mismatch
sparse-checkout: improve docs around 'set' in cone mode
sparse-checkout: escape all glob characters on write
sparse-checkout: use C-style quotes in 'list' subcommand
sparse-checkout: unquote C-style strings over --stdin
sparse-checkout: write escaped patterns in cone mode
sparse-checkout: properly match escaped characters
sparse-checkout: warn on globs in cone patterns
sparse-checkout: detect short patterns
sparse-checkout: cone mode does not recognize "**"
sparse-checkout: fix documentation typo for core.sparseCheckoutCone
clone: fix --sparse option with URLs
sparse-checkout: create leading directories
t1091: improve here-docs
t1091: use check_files to reduce boilerplate
Unneeded connectivity check is now disabled in a partial clone when
fetching into it.
* jt/connectivity-check-optim-in-partial-clone:
fetch: forgo full connectivity check if --filter
connected: verify promisor-ness of partial clone
"git rebase -i" (and friends) used to unnecessarily check out the
tip of the branch to be rebased, which has been corrected.
* ag/rebase-avoid-unneeded-checkout:
rebase -i: stop checking out the tip of the branch to rebase
Traditionally, we avoided threaded grep while searching in objects
(as opposed to files in the working tree) as accesses to the object
layer is not thread-safe. This limitation is getting lifted.
* mt/threaded-grep-in-object-store:
grep: use no. of cores as the default no. of threads
grep: move driver pre-load out of critical section
grep: re-enable threads in non-worktree case
grep: protect packed_git [re-]initialization
grep: allow submodule functions to run in parallel
submodule-config: add skip_if_read option to repo_read_gitmodules()
grep: replace grep_read_mutex by internal obj read lock
object-store: allow threaded access to object reading
replace-object: make replace operations thread-safe
grep: fix racy calls in grep_objects()
grep: fix race conditions at grep_submodule()
grep: fix race conditions on userdiff calls
Two help messages given when "git add" notices the user gave it
nothing to add have been updated to use advise() API.
* hw/advice-add-nothing:
add: change advice config variables used by the add API
add: use advise function to display hints
"git grep --no-index" should not get affected by the contents of
the .gitmodules file but when "--recurse-submodules" is given or
the "submodule.recurse" variable is set, it did. Now these
settings are ignored in the "--no-index" mode.
* pb/do-not-recurse-grep-no-index:
grep: ignore --recurse-submodules if --no-index is given
"git restore --staged" did not correctly update the cache-tree
structure, resulting in bogus trees to be written afterwards, which
has been corrected.
* nd/switch-and-restore:
restore: invalidate cache-tree when removing entries with --staged
"git commit" gives output similar to "git status" when there is
nothing to commit, but without honoring the advise.statusHints
configuration variable, which has been corrected.
* hw/commit-advise-while-rejecting:
commit: honor advice.statusHints when rejecting an empty commit
Just as rev-list recently learned to combine filters and bitmaps, let's
do the same for pack-objects. The infrastructure is all there; we just
need to pass along our filter options, and the pack-bitmap code will
decide to use bitmaps or not.
This unsurprisingly makes things faster for partial clones of large
repositories (here we're cloning linux.git):
Test HEAD^ HEAD
------------------------------------------------------------------------------
5310.11: simulated partial clone 38.94(37.28+5.87) 11.06(11.27+4.07) -71.6%
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This just passes the filter-options struct to prepare_bitmap_walk().
Since the bitmap code doesn't actually support any filters yet, it will
fallback to the non-bitmap code if any --filter is specified. But this
lets us exercise that rejection code path, as well as getting us ready
to test filters via rev-list when we _do_ support them.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently you can't use object filters with bitmaps, but we plan to
support at least some filters with bitmaps. Let's introduce some
infrastructure that will help us do that:
- prepare_bitmap_walk() now accepts a list_objects_filter_options
parameter (which can be NULL for no filtering; all the current
callers pass this)
- we'll bail early if the filter is incompatible with bitmaps (just as
we would if there were no bitmaps at all). Currently all filters are
incompatible.
- we'll filter the resulting bitmap; since there are no supported
filters yet, this is always a noop.
There should be no behavior change yet, but we'll support some actual
filters in a future patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since we added reachability bitmap support, we've been able to use
it with rev-list to get the full list of objects, like:
git rev-list --objects --use-bitmap-index --all
But you can't do so without --objects, since we weren't ready to just
show the commits. However, the internals of the bitmap code are mostly
ready for this: they avoid opening up trees when walking to fill in the
bitmaps. We just need to actually pass in the rev_info to
traverse_bitmap_commit_list() so it knows which types to bother
triggering our callback for.
For completeness, the perf test now covers both the existing --objects
case, as well as the new commits-only behavior (the objects one got way
faster when we introduced bitmaps, but obviously isn't improved now).
Here are numbers for linux.git:
Test HEAD^ HEAD
------------------------------------------------------------------------
5310.7: rev-list (commits) 8.29(8.10+0.19) 1.76(1.72+0.04) -78.8%
5310.8: rev-list (objects) 8.06(7.94+0.12) 8.14(7.94+0.13) +1.0%
That run was cheating a little, as I didn't have any commit-graph in the
repository, and we'd built it by default these days when running git-gc.
Here are numbers with a commit-graph:
Test HEAD^ HEAD
------------------------------------------------------------------------
5310.7: rev-list (commits) 0.70(0.58+0.12) 0.51(0.46+0.04) -27.1%
5310.8: rev-list (objects) 6.20(6.09+0.10) 6.27(6.16+0.11) +1.1%
Still an improvement, but a lot less impressive.
We could have the perf script remove any commit-graph to show the
out-sized effect, but it probably makes sense to leave it in what would
be a more typical setup.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The prior commit taught "--count --objects" to work without bitmaps. We
should be able to get the same answer much more quickly with bitmaps.
Note that we punt on the max_count case here. This perhaps _could_ be
made to work if we find all of the boundary commits and treat them as
UNINTERESTING, subtracting them (and their reachable objects) from the
set we return. That implies an actual commit traversal, but we'd still
be faster due to avoiding opening up any trees. Given the complexity and
the fact that anyone is unlikely to want this, it makes sense to just
fall back to the non-bitmap case for now.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>