The pathspec structure has a few bits of data to drive various operation
modes after we unified the pathspec matching logic in various codepaths.
For example, max_depth field is there so that "git grep" can limit the
output for files found in limited depth of tree traversal. Also in order
to show just the surface level differences in "git diff-tree", recursive
field stops us from descending into deeper level of the tree structure
when it is set to false, and this also affects pathspec matching when
we have wildcards in the pathspec.
The diff-index has always wanted the recursive behaviour, and wanted to
match pathspecs without any depth limit. But we forgot to do so when we
updated tree_entry_interesting() logic to unify the pathspec matching
logic.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When creating a pack using objects that reside in existing packs, we try
to avoid recomputing futile delta between an object (trg) and a candidate
for its base object (src) if they are stored in the same packfile, and trg
is not recorded as a delta already. This heuristics makes sense because it
is likely that we tried to express trg as a delta based on src but it did
not produce a good delta when we created the existing pack.
As the pack heuristics prefer producing delta to remove data, and Linus's
law dictates that the size of a file grows over time, we tend to record
the newest version of the file as inflated, and older ones as delta
against it.
When creating a thin-pack to transfer recent history, it is likely that we
will try to send an object that is recorded in full, as it is newer. But
the heuristics to avoid recomputing futile delta effectively forbids us
from attempting to express such an object as a delta based on another
object. Sending an object in full is often more expensive than sending a
suboptimal delta based on other objects, and it is even more so if we
could use an object we know the receiving end already has (i.e. preferred
base object) as the delta base.
Tweak the recomputation avoidance logic, so that we do not punt on
computing delta against a preferred base object.
The effect of this change can be seen on two simulated upload-pack
workloads. The first is based on 44 reflog entries from my git.git
origin/master reflog, and represents the packs that kernel.org sent me git
updates for the past month or two. The second workload represents much
larger fetches, going from git's v1.0.0 tag to v1.1.0, then v1.1.0 to
v1.2.0, and so on.
The table below shows the average generated pack size and the average CPU
time consumed for each dataset, both before and after the patch:
dataset
| reflog | tags
---------------------------------
before | 53358 | 2750977
size after | 32398 | 2668479
change | -39% | -3%
---------------------------------
before | 0.18 | 1.12
CPU after | 0.18 | 1.15
change | +0% | +3%
This patch makes a much bigger difference for packs with a shorter slice
of history (since its effect is seen at the boundaries of the pack) though
it has some benefit even for larger packs.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function frees the individual "struct match_attr"s we
have allocated, but forgot to free the array holding their
pointers, leading to a minor memory leak (but it can add up
after checking attributes for paths in many directories).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add commit message to avoid commit's aborting due to the lack of
commit message, not because there are INTENT_TO_ADD entries in index.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description of rerere.enabled left the user in the dark as to who
might create an rr-cache directory. Add a note that simply invoking
rerere does this.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Thas would de-dent the body of a function that has grown rather large over
time, making it a bit easier to read.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In prepare_attr_stack, we pop the old elements of the stack
(which were left from a previous lookup and may or may not
be useful to us). Our loop to do so checks that we never
reach the top of the stack. However, the code immediately
afterwards will segfault if we did actually reach the top of
the stack.
Fortunately, this is not an actual bug, since we will never
pop all of the stack elements (we will always keep the root
gitattributes, as well as the builtin ones). So the extra
check in the loop condition simply clutters the code and
makes the intent less clear. Let's get rid of it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we prepare the attribute stack for a lookup on a path,
we start with the cached stack from the previous lookup
(because it is common to do several lookups in the same
directory hierarchy). So the first thing we must do in
preparing the stack is to pop any entries that point to
directories we are no longer interested in.
For example, if our stack contains gitattributes for:
foo/bar/baz
foo/bar
foo
but we want to do a lookup in "foo/bar/bleep", then we want
to pop the top element, but retain the others.
To do this we walk down the stack from the top, popping
elements that do not match our lookup directory. However,
the test do this simply checked strncmp, meaning we would
mistake "foo/bar/baz" as a leading directory of
"foo/bar/baz_plus". We must also check that the character
after our match is '/', meaning we matched the whole path
component.
There are two special cases to consider:
1. The top of our attr stack has the empty path. So we
must not check for '/', but rather special-case the
empty path, which always matches.
2. Typically when matching paths in this way, you would
also need to check for a full string match (i.e., the
character after is '\0'). We don't need to do so in
this case, though, because our path string is actually
just the directory component of the path to a file
(i.e., we know that it terminates with "/", because the
filename comes after that).
Helped-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The wording seems to suggest that creating the directory is needed and the
setting of rerere.enabled is only for disabling the feature by setting it
to 'false'. But the configuration is meant to be the primary control and
setting it to 'true' will enable it; the rr-cache directory will be
created as necessary and the user does not have to create it.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the relative submodule URLs have been introduced in f31a522a2d, they
do not conform to the rules for resolving relative URIs but rather to
those of relative directories.
Document that behavior.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 34110cd4 (Make 'unpack_trees()' have a separate source and
destination index) it is no longer true that a subdirectory with
the same prefix must not exist.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ExtUtils::MakeMaker generates MYMETA.json in addition to MYMETA.yml
since version 6.57_07. As it suggests, it is just meta information about
the build and is cleaned up with 'make clean', so it should be ignored.
Signed-off-by: Jack Nagel <jacknagel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When showing the raw timestamp, we format the numeric
seconds-since-epoch into a buffer, followed by the timezone
string. This string has come straight from the commit
object. A well-formed object should have a timezone string
of only a few bytes, but we could be operating on data
pushed by a malicious user.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we fetch from a remote, we print a status table like:
From url
* [new branch] foo -> origin/foo
We create this table in a static buffer using sprintf. If
the remote refnames are long, they can overflow this buffer
and smash the stack.
Instead, let's use a strbuf to build the string.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The keepcr flag is only used in the split_patches function, which is
only called before a patch application has to stopped for user input,
not after resuming. It is therefore unnecessary to persist the
flag. This seems to have been the case since it was introduced in
ad2c928 (git-am: Add command line parameter `--keep-cr` passing it to
git-mailsplit, 2010-02-27).
Signed-off-by: Martin von Zweigbergk <martin.von.zweigbergk@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
POSIX says that last parameter to waitpid should be 'int',
so let's make it so.
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The old "git symbolic-ref" manpage seemed to imply in one place that
symlinks are still the default way to represent symbolic references
and in another that symlinks are deprecated. Fix the text and shorten
the justification for the change of implementation.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The comment on top of stripspace() claims that the buffer
will no longer be NUL-terminated. However, this has not been
the case at least since the move to using strbuf in 2007.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This file is auto-generated by newer versions of ExtUtils::MakeMaker
(presumably starting with the version shipping with Perl 5.14). It just
contains extra information about the environment and arguments to the
Makefile-building process, and should be ignored.
Signed-off-by: Sebastian Morr <sebastian@morr.cc>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Back in 1127148 (Loosen "working file will be lost" check in
Porcelain-ish - 2006-12-04), git-checkout.sh learned to quietly
overwrite ignored files. Howver the code only took .gitignore files
into account.
Standard ignored files include all specified in .gitignore files in
working directory _and_ $GIT_DIR/info/exclude. This patch makes sure
ignored files in info/exclude can also be overwritten automatically in
the spirit of the original patch.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let the documentation for -t list valid *diff* tools,
not valid *merge* tools.
Signed-off-by: Thomas Hochstein <thh@inter.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the streaming filter API was introduced in v1.7.7-rc0~60^2~7
(2011-05-20), we forgot to add its header to LIB_H. Most translation
units depend on streaming.h via cache.h.
v1.7.5-rc0~48 (Fix sparse warnings, 2011-03-22) introduced undeclared
dependencies by url.o on url.h and thread-utils.o on thread-utils.h.
Noticed by make CHECK_HEADER_DEPENDENCIES=1.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The macro is variadic, which breaks support for pre-C99 compilers,
and it hides an "if", which can make code hard to understand on
first reading if some arguments have side-effects.
The OUTPUT macro seems to have been inspired by the "output" function
from merge-recursive. But that function in merge-recursive exists to
indent output based on the level of recursion and there is no similar
justification for such a function in "notes merge".
Noticed with 'make CC="gcc -std=c89 -pedantic"':
notes-merge.c:24:22: warning: anonymous variadic macros were introduced in C99 [-Wvariadic-macros]
Encouraged-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is to avoid reaching free of uninitialized members.
With an invalid .mailmap (and perhaps in other cases), it can reach
free(mi->name) with garbage for example.
Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This even dates back to the very beginning of "git name-rev";
it does not make much sense to dump all objects in the repository
and label non-commits as "undefined".
Signed-off-by: Junio C Hamano <gitster@pobox.com>