The patch-id code which powers "log --cherry-pick" doesn't
look at whether each commit is a merge or not. It just feeds
the commit's first parent to the diff, and ignores any
additional parents.
In theory, this might be useful if you wanted to find
equivalence between, say, a merge commit and a squash-merge
that does the same thing. But it also promotes a false
equivalence between distinct merges. For example, every
"merge -s ours" would look identical to an empty commit
(which is true in a sense, but presumably there was a value
in merging in the discarded history). Since patch-ids are
meant for throwing away duplicates, we should err on the
side of _not_ matching such merges.
Moreover, we may spend a lot of extra time computing these
merge diffs. In the case that inspired this patch, a "git
format-patch --cherry-pick" dropped from over 3 minutes to
less than 3 seconds.
This seems pretty drastic, but is easily explained. The
command was invoked by a "git rebase" of an older topic
branch; there had been tens of thousands of commits on the
upstream branch in the meantime. In addition, this project
used a topic-branch workflow with occasional "back-merges"
from "master" to each topic (to resolve conflicts on the
topics rather than in the merge commits). So there were not
only extra merges, but the diffs for these back-merges were
generally quite large (because they represented _everything_
that had been merged to master since the topic branched).
This patch treats a merge fed to commit_patch_id() or
add_commit_patch_id() as an error, and a lookup for such a
merge via has_commit_patch_id() will always return NULL.
An earlier version of the patch tried to distinguish between
"error" and "patch id for merges not defined", but that
becomes unnecessarily complicated. The only callers are:
1. revision traversals which want to do --cherry-pick;
they call add_commit_patch_id(), but do not care if it
fails. They only want to add what we can, look it up
later with has_commit_patch_id(), and err on the side
of not-matching.
2. format-patch --base, which calls commit_patch_id().
This _does_ notice errors, but should never feed a
merge in the first place (and if it were to do so
accidentally, then this patch is a strict improvement;
we notice the bug rather than generating a bogus
patch-id).
So in both cases, this does the right thing.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We may remove elements from the list while we are iterating,
which requires using a second temporary pointer. Otherwise
stepping to the next element of the list might involve
looking at freed memory (which generally works in practice,
as we _just_ freed it, but of course is wrong to rely on;
valgrind notices it).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As of v2.9.0, `git commit-tree` no longer heeds the `commit.gpgsign`
config setting. This broke committing with GPG signature in Git GUI.
This fixes https://github.com/git-for-windows/git/issues/850
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With this patch, --batch can be combined with --textconv or --filters.
For this to work, the input needs to have the form
<object name><single white space><path>
so that the filters can be chosen appropriately.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are circumstances when it is relatively easy to figure out the
object name for a given path, but not the name of the containing tree.
For example, when looking at a diff generated by Git, the object names
are recorded, but not the revision. As a matter of fact, the revisions
from which the diff was generated may not even exist locally.
In such a case, the user would have to generate a fake revision just to
be able to use --textconv or --filters.
Let's simplify this dramatically, because we do not really need that
revision at all: all we care about is that we know the path. In the
scenario described above, we do know the path, and we just want to
specify it separately from the object name.
Example usage:
git cat-file --textconv --path=main.c 0f1937fd
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --filters option applies the convert_to_working_tree() filter for
the path when showing the contents of a regular file blob object;
the contents are written out as-is for other types of objects.
This feature comes in handy when a 3rd-party tool wants to work with
the contents of files from past revisions as if they had been checked
out, but without detouring via temporary files.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Alternate refs backends might still use files to store per-worktree
refs. So provide a way to iterate over only the per-worktree references
in a ref_store. The other backend can set up a files ref_store and
iterate using the new DO_FOR_EACH_PER_WORKTREE_ONLY flag when iterating.
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of including a files-backend-specific struct ref_lock, change
the generic ref_update struct to include a void pointer that backends
can use for their own arbitrary data.
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This removes the last caller of function get_files_ref_store(), so
remove it.
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Alternate refs backends might not need the refs/heads directory and so
on, so we make ref db initialization part of the backend.
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the file-based backend, delete_refs has some special optimization
to deal with packed refs. In other backends, we might be able to make
ref deletion faster by putting all deletions into a single
transaction. So we need a special backend function for this.
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ronnie Sahlberg <rsahlberg@google.com>
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the file-based backend, the reflog piggybacks on the ref lock.
Since other backends won't have the same sort of ref lock, ref backends
must also handle reflogs.
Signed-off-by: Ronnie Sahlberg <rsahlberg@google.com>
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For now it only supports the main reference store.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reference backends will be able to customize this function to implement
reference reading.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we don't have to strip trailing '/' from the submodule path, then
don't allocate and copy the submodule name.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
resolve_ref_recursively() can handle references in arbitrary files
reference stores, so use it to resolve "gitlink" (i.e., submodule)
references. Aside from removing redundant code, this allows submodule
lookups to benefit from the much more robust code that we use for
reading non-submodule references. And, since the code is now agnostic
about reference backends, it will work for any future references
backend (so move its definition to refs.c).
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new function, resolve_ref_recursively(), which is basically like
the old resolve_ref_unsafe() except that it takes a (ref_store *)
argument and also works for submodules.
Re-implement resolve_ref_unsafe() as a thin wrapper around
resolve_ref_recursively().
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that resolve_packed_ref() can work with an arbitrary
files_ref_store, there is no need to have a separate
resolve_gitlink_packed_ref() function.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move resolve_gitlink_ref() and related functions lower in the file to
avoid the need for forward declarations in the next step.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These functions currently only work in the main repository, so add an
assert_main_repository() check to each function.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want ref_stores to be polymorphic, so invent a base class of which
files_ref_store is a derived class. For now there is exactly one
ref_store for the main repository and one for any submodules whose
references have been accessed.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a `struct ref_storage_be` to represent types of reference stores. In
OO notation, this is the class, and will soon hold some class
methods (e.g., a factory to create new ref_store instances) and will
also serve as the vtable for ref_store instances of that type.
As yet, the backends cannot do anything.
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The greater goal of this patch series is to develop the concept of a
reference store, which is a place that references, their values, and
their reflogs are stored, and to virtualize the reference interface so
that different types of ref_stores can be implemented. We will then, for
example, use ref_store instances to access submodule references and
worktree references.
Currently, we keep a ref_cache for each submodule that has had its
references iterated over. It is a far cry from a ref_store, but they are
stored the way we will want to store ref_stores, and ref_stores will
eventually have to hold the reference caches. So let's treat ref_caches
as embryo ref_stores, and build them out from there.
As the first step, simply rename `ref_cache` to `files_ref_store`, and
rename some functions and attributes correspondingly.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
And improve the internal variable names.
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The patch-id code may be running inside another porcelain
like "git log" or "git format-patch", and therefore may have
set diff_detect_rename_default, either via the diff-ui
config, or by default since 5404c11 (diff: activate
diff.renames by default, 2016-02-25). This is the case even
if a command is run with `--no-renames`, as that is applied
only to the diff-options used by the command itself.
Rename detection doesn't help the patch-id results. It
_may_ actually hurt, as minor differences in the files that
would be overlooked by patch-id's canonicalization might
result in different renames (though I'd doubt that it ever
comes up in practice).
But mostly it is just a waste of CPU to compute these
renames.
Note that this does have one user-visible impact: the
prerequisite patches listed by "format-patch --base". There
may be some confusion between different versions of git as
older ones will enable renames, but newer ones will not.
However, this was already a problem, as people with
different settings for the "diff.renames" config would get
different results. After this patch, everyone should get the
same results, regardless of their config.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cloning an empty repository served by standard git, "git clone" produces
the following reassuring message:
$ git clone git://localhost/tmp/empty
Cloning into 'empty'...
warning: You appear to have cloned an empty repository.
Checking connectivity... done.
Meanwhile when cloning an empty repository served by JGit, the output is more
haphazard:
$ git clone git://localhost/tmp/empty
Cloning into 'empty'...
Checking connectivity... done.
warning: remote HEAD refers to nonexistent ref, unable to checkout.
This is a common command to run immediately after creating a remote repository
as preparation for adding content to populate it and pushing. The warning is
confusing and needlessly worrying.
The cause is that, since v3.1.0.201309270735-rc1~22 (Advertise capabilities
with no refs in upload service., 2013-08-08), JGit's ref advertisement includes
a ref named capabilities^{} to advertise its capabilities on, while git's ref
advertisement is empty in this case. This allows the client to learn about the
server's capabilities and is needed, for example, for fetch-by-sha1 to work
when no refs are advertised.
This also affects "ls-remote". For example, against an empty repository served
by JGit:
$ git ls-remote git://localhost/tmp/empty
0000000000000000000000000000000000000000 capabilities^{}
Git advertises the same capabilities^{} ref in its ref advertisement for push
but since it never did so for fetch, the client didn't need to handle this
case. Handle it.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A server hanging up immediately to mark access being denied does not
send any .have refs, shallow lines, or anything else before hanging
up. If the server has sent anything, then the hangup is unexpected.
That is, if the server hangs up after a shallow line but before sending
any refs, then git should tell me so:
fatal: The remote end hung up upon initial contact
instead of suggesting an access control problem:
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
Noticed while examining this code. This case isn't likely to come up
in practice but tightening the check makes the code easier to read and
manipulate.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>