Reverse the order of delta applying so that by the time a delta is
applied, its base is either non-delta or already inflated.
get_base_data() is still recursive, but because base's data is always
ready, the inner get_base_data() call never has any chance to call
itself again.
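As a rough illustration of the "base first" ordering (not the actual
index-pack code), the toy sketch below resolves a delta chain by always
inflating the topmost not-yet-inflated ancestor first. struct obj, its
fields and resolve() are hypothetical names made up for this example,
and a "delta" is modelled as a number added to the base's data.

```c
#include <stddef.h>

/* Hypothetical stand-in for a packed object; not git's data structure. */
struct obj {
	struct obj *base; /* NULL for a non-delta object */
	int payload;      /* full content if non-delta, else amount to add */
	int data;         /* inflated result, valid once 'ready' is set */
	int ready;
};

/*
 * Inflate 'o' without recursion: repeatedly find the highest ancestor
 * that is not inflated yet and inflate that one, so every delta is
 * applied on top of a base whose data is already available.
 */
static int resolve(struct obj *o)
{
	while (!o->ready) {
		struct obj *c = o;

		/* climb until the base is missing (non-delta) or ready */
		while (c->base && !c->base->ready)
			c = c->base;
		c->data = (c->base ? c->base->data : 0) + c->payload;
		c->ready = 1;
	}
	return o->data;
}
```

The repeated climb makes this quadratic in the chain depth; it is only
meant to show the ordering, not to be efficient.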
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current find_unresolved_deltas() links all bases together in the
form of a tree, using struct base_data, with the prev_base pointer
pointing to the parent node. It then traverses down from parent to
children recursively, with all base_data allocated on the stack.
To eliminate recursion, we simply need to put all base_data on the heap
(parse_pack_objects and fix_unresolved_deltas). After that, it's a
simple non-recursive depth-first traversal loop. Each node also
maintains its own state (ofs and ref indices) to iterate over all
children nodes.
So we process one node:
- if it returns a new (child) node (a parent base), we link it to our
tree, then process the new node.
- if it returns nothing, the node is done, free it. We go back to
parent node and resume whatever it's doing.
and do it until we have no nodes to process.
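The loop described above can be sketched as follows. This is a minimal
illustration under assumptions, not the patch itself: struct item,
struct frame, push() and walk() are hypothetical names, and a single
next_child index stands in for the ofs and ref indices.

```c
#include <stdlib.h>

/* Hypothetical tree node; stands in for an object with children. */
struct item {
	int id;
	int nr_children;
	struct item **children;
};

/* One heap-allocated frame per node on the current root-to-node path. */
struct frame {
	struct frame *parent;
	const struct item *item;
	int next_child; /* per-node iteration state */
};

static struct frame *push(struct frame *parent, const struct item *item)
{
	struct frame *f = malloc(sizeof(*f));

	if (!f)
		abort();
	f->parent = parent;
	f->item = item;
	f->next_child = 0;
	return f;
}

/* Visit all nodes depth-first; store ids in out[] and return the count. */
static int walk(const struct item *root, int *out)
{
	int n = 0;
	struct frame *cur = push(NULL, root);

	out[n++] = root->id;
	while (cur) {
		if (cur->next_child < cur->item->nr_children) {
			/* a new child: link it to the tree, process it next */
			const struct item *child =
				cur->item->children[cur->next_child++];
			cur = push(cur, child);
			out[n++] = child->id;
		} else {
			/* node is done: free it and resume the parent */
			struct frame *done = cur;
			cur = cur->parent;
			free(done);
		}
	}
	return n;
}
```

Because each frame remembers how far it got through its children, the
loop can return to a parent and carry on where it left off, which is
what the call stack did implicitly in the recursive version.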
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recursion in a DAG is generally a bad idea because it could be very
deep. Be defensive and avoid recursion in mark_parents_uninteresting()
and clear_commit_marks().
mark_parents_uninteresting() learns a trick from clear_commit_marks()
to avoid malloc() in the (dominant) single-parent case.
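A sketch of the idea, assuming simplified stand-ins for the commit and
commit-list structures (struct node, struct node_list and
mark_ancestors() are hypothetical names): the first unmarked parent is
followed directly, and only the additional parents of a merge are
pushed onto a heap-allocated pending list.

```c
#include <stdlib.h>

#define UNINTERESTING 1u

struct node;
struct node_list {
	struct node *item;
	struct node_list *next;
};
struct node {
	unsigned flags;
	struct node_list *parents;
};

static void push(struct node_list **stack, struct node *n)
{
	struct node_list *e = malloc(sizeof(*e));

	if (!e)
		abort();
	e->item = n;
	e->next = *stack;
	*stack = e;
}

static struct node *pop(struct node_list **stack)
{
	struct node_list *e = *stack;
	struct node *n;

	if (!e)
		return NULL;
	n = e->item;
	*stack = e->next;
	free(e);
	return n;
}

/* Mark every ancestor of 'start' (but not 'start' itself). */
static void mark_ancestors(struct node *start)
{
	struct node_list *pending = NULL;
	struct node *n = start;

	while (n) {
		struct node *next = NULL;
		struct node_list *p;

		for (p = n->parents; p; p = p->next) {
			if (p->item->flags & UNINTERESTING)
				continue; /* already handled */
			p->item->flags |= UNINTERESTING;
			if (!next)
				next = p->item;          /* no malloc() needed */
			else
				push(&pending, p->item); /* extra parents of a merge */
		}
		n = next ? next : pop(&pending);
	}
}
```

On a linear history the pending list stays empty, so the walk never
allocates.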
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pathspec structure has a few bits of data to drive various operation
modes after we unified the pathspec matching logic in various codepaths.
For example, max_depth field is there so that "git grep" can limit the
output for files found in limited depth of tree traversal. Also in order
to show just the surface level differences in "git diff-tree", recursive
field stops us from descending into deeper level of the tree structure
when it is set to false, and this also affects pathspec matching when
we have wildcards in the pathspec.
The diff-index has always wanted the recursive behaviour, and wanted to
match pathspecs without any depth limit. But we forgot to do so when we
updated tree_entry_interesting() logic to unify the pathspec matching
logic.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's actually unlimited recursion if wildcards are active, regardless
of --max-depth.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Two "^" characters were incorrectly being interpreted as markup for
superscripting. Fix them by writing them as attribute references
"{caret}".
Although a single "^" character in a paragraph cannot be
misinterpreted in this way, also write other "^" characters as
"{caret}" in the interest of good hygiene (unless they are in literal
paragraphs, of course, in which context attribute references are not
recognized).
Spell "{}" consistently, namely *not* quoted as "\{\}". Since the
braces are empty, they cannot be interpreted as an attribute
reference, and either spelling is OK. So arbitrarily choose one
variation and use it consistently.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
  Update draft release notes to 1.7.8.4
  Update draft release notes to 1.7.7.6
  Update draft release notes to 1.7.6.6
  thin-pack: try harder to use preferred base objects as base
When creating a pack using objects that reside in existing packs, we try
to avoid recomputing futile delta between an object (trg) and a candidate
for its base object (src) if they are stored in the same packfile, and trg
is not recorded as a delta already. This heuristic makes sense because it
is likely that we tried to express trg as a delta based on src but it did
not produce a good delta when we created the existing pack.
As the pack heuristics prefer producing delta to remove data, and Linus's
law dictates that the size of a file grows over time, we tend to record
the newest version of the file as inflated, and older ones as delta
against it.
When creating a thin-pack to transfer recent history, it is likely that we
will try to send an object that is recorded in full, as it is newer. But
the heuristic to avoid recomputing futile delta effectively forbids us
from attempting to express such an object as a delta based on another
object. Sending an object in full is often more expensive than sending a
suboptimal delta based on other objects, and it is even more so if we
could use an object we know the receiving end already has (i.e. preferred
base object) as the delta base.
Tweak the recomputation avoidance logic, so that we do not punt on
computing delta against a preferred base object.
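The tweaked condition can be pictured with the sketch below. It is an
illustration under assumptions, not the code in pack-objects: struct
entry, its fields and skip_delta_attempt() are hypothetical names.

```c
/* Hypothetical summary of what we know about an object to be packed. */
struct entry {
	const void *in_pack;  /* existing pack holding it, or NULL */
	int stored_as_delta;  /* recorded as a delta in that pack? */
	int preferred_base;   /* receiver is known to have it already */
};

/*
 * Return 1 if trying to express 'trg' as a delta against 'src' is
 * presumed futile: both sit in the same existing pack and 'trg' was
 * left undeltified there. A preferred base is exempt from this
 * shortcut, so a delta against it is still attempted.
 */
static int skip_delta_attempt(const struct entry *trg,
			      const struct entry *src)
{
	return trg->in_pack &&
	       trg->in_pack == src->in_pack &&
	       !src->preferred_base &&
	       !trg->stored_as_delta;
}
```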
The effect of this change can be seen on two simulated upload-pack
workloads. The first is based on 44 reflog entries from my git.git
origin/master reflog, and represents the packs that kernel.org sent me git
updates for the past month or two. The second workload represents much
larger fetches, going from git's v1.0.0 tag to v1.1.0, then v1.1.0 to
v1.2.0, and so on.
The table below shows the average generated pack size and the average CPU
time consumed for each dataset, both before and after the patch:
                      dataset
            | reflog | tags
---------------------------------
     before |  53358 | 2750977
size  after |  32398 | 2668479
     change |   -39% |     -3%
---------------------------------
     before |   0.18 |    1.12
CPU   after |   0.18 |    1.15
     change |    +0% |     +3%
This patch makes a much bigger difference for packs with a shorter slice
of history (since its effect is seen at the boundaries of the pack) though
it has some benefit even for larger packs.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The word-diff logic accumulates + and - lines until another line type
appears (normally [ @\]), at which point it generates the word diff.
This is usually correct, but it breaks when the preimage does not have
a newline at EOF:
    $ printf "%s" "a a a" >a
    $ printf "%s\n" "a ab a" >b
    $ git diff --no-index --word-diff a b
    diff --git 1/a 2/b
    index 9f68e94..6a7c02f 100644
    --- 1/a
    +++ 2/b
    @@ -1 +1 @@
    [-a a a-]
    No newline at end of file
    {+a ab a+}
Because of the order of the lines in a unified diff
    @@ -1 +1 @@
    -a a a
    \ No newline at end of file
    +a ab a
the '\' line flushed the buffers, and the - and + lines were never
matched with each other.
A proper fix would defer such markers until the end of the hunk.
However, word-diff is inherently whitespace-ignoring, so as a cheap
fix simply ignore the marker (and hide it from the output).
We use a prefix match for '\ ' to parallel the logic in
apply.c:parse_fragment(). We currently do not localize this string
(just accept other variants of it in git-apply), but this should be
future-proof.
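The prefix match amounts to something like the following sketch, where
is_no_newline_marker() is a hypothetical helper name and the line is
assumed to be NUL-terminated:

```c
/*
 * Return 1 if 'line' starts with a backslash followed by a space,
 * whatever text comes after it. Matching only the two-byte prefix
 * keeps the check independent of the wording of the message.
 */
static int is_no_newline_marker(const char *line)
{
	return line[0] == '\\' && line[1] == ' ';
}
```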
Noticed-by: Ivan Shirokoff <shirokoff@yandex-team.ru>
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tightening done in (ee27ca4a: archive: don't let remote clients
get unreachable commits, 2011-11-17) went too far and disallowed
HEAD:Documentation as it would try to find "HEAD:Documentation" as a
ref.
Only DWIM the "HEAD" part to see if it exists as a ref. Once we're
sure that we've been given a valid ref, we follow the normal code
path. This still disallows attempts to access commits which are not
branch tips.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function frees the individual "struct match_attr"s we
have allocated, but forgot to free the array holding their
pointers, leading to a minor memory leak (but it can add up
after checking attributes for paths in many directories).
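The shape of the leak and its fix can be shown with a generic sketch;
struct rule, struct rule_set and free_rule_set() are hypothetical names
and do not mirror the layout used in attr.c.

```c
#include <stdlib.h>

struct rule {
	int value;
};

struct rule_set {
	struct rule **rules; /* array of pointers, itself heap-allocated */
	int nr;
};

static void free_rule_set(struct rule_set *set)
{
	int i;

	for (i = 0; i < set->nr; i++)
		free(set->rules[i]); /* the individual elements */
	free(set->rules);            /* the array holding their pointers */
	set->rules = NULL;
	set->nr = 0;
}
```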
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Expose the cherry-picking machinery through a public
sequencer_pick_revisions() (renamed from pick_revisions() in
builtin/revert.c), so that cherry-picking and reverting are special
cases of a general sequencer operation. The cherry-pick builtin is
now a thin wrapper that does command-line argument parsing before
calling into sequencer_pick_revisions(). In the future, we can write
a new "foo" builtin that calls into the sequencer like:
	memset(&opts, 0, sizeof(opts));
	opts.action = REPLAY_FOO;
	opts.revs = xmalloc(sizeof(*opts.revs));
	parse_args_populate_opts(argc, argv, &opts);
	init_revisions(opts.revs, NULL);
	sequencer_pick_revisions(&opts);
This patch does not intend to make any functional changes. Check
with:
	$ git blame -s -C HEAD^..HEAD -- sequencer.c | grep -C3 '^[^^]'
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
REVERT and CHERRY_PICK are unsuitable names for enumerators in a
public interface, because they are generic enough to be likely to
clash with identifiers with other meanings. Rename to REPLAY_REVERT
and REPLAY_PICK as preparation for exposing them.
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Due to MSYS path mangling, GIT_DIR contains a Windows-style path when
checked inside a Perl script even if GIT_DIR was previously set to an
MSYS-style path in a shell script. So explicitly convert to an MSYS-style
path before calling Perl's rel2abs() to make it work.
This fix was inspired by a very similar patch in WebKit:
http://trac.webkit.org/changeset/76255/trunk/Tools/Scripts/commit-log-editor
Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
Tested-by: Pat Thoyts <patthoyts@users.sourceforge.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For details, see the commit message of 4114156ae9. Note that while using
$PWD as part of GIT_DIR is not required here, it does no harm and it is
more consistent. In addition, on MSYS using an environment variable should
be slightly faster than spawning an external executable.
Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
unix_stream_connect and unix_stream_listen return -1 on error, with
errno set by the failing underlying call to allow the caller to write
a useful diagnosis.
Unfortunately the error path involves a few system calls itself, such
as close(), that can themselves touch errno.
This is not as worrisome as it might sound. If close() fails, this
just means substituting one meaningful error message for another,
which is perfectly fine. However, when the call _succeeds_, it is
allowed to (and sometimes might) clobber errno along the way with some
undefined value, so it is good hygiene to save errno and restore it
immediately before returning to the caller. Do so.
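The pattern looks roughly like this; fail_and_close() is a hypothetical
helper used only for illustration, and it restores errno
unconditionally, whether or not close() itself failed.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Error path: release the descriptor, but hand the caller the errno
 * of the call that originally failed, not whatever close() left behind.
 */
static int fail_and_close(int fd)
{
	int saved_errno = errno;

	close(fd);
	errno = saved_errno;
	return -1;
}
```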
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since its very first description of -k, the documentation for
git-mailinfo claimed that (in the case without -k) after cleaning up
bracketed strings [blah], it would insert [PATCH].
It doesn't; on the contrary, one of the important jobs of mailinfo is
to remove those strings.
Since we're already there, rewrite the paragraph to give a complete
enumeration of all the transformations. Specifically, it was missing
the whitespace normalization (run of isspace(c) -> ' ') and the
removal of leading ':'.
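A rough sketch of the transformations named above, and only those, not
the full set mailinfo performs: clean_subject() is a hypothetical
function that edits a NUL-terminated string in place.

```c
#include <ctype.h>
#include <string.h>

static void clean_subject(char *s)
{
	char *src = s, *dst = s;

	/* strip leading whitespace, ':' and bracketed strings */
	for (;;) {
		if (isspace((unsigned char)*src) || *src == ':') {
			src++;
			continue;
		}
		if (*src == '[') {
			char *end = strchr(src, ']');
			if (end) {
				src = end + 1;
				continue;
			}
		}
		break;
	}

	/* turn each run of isspace() characters into a single ' ' */
	while (*src) {
		if (isspace((unsigned char)*src)) {
			while (isspace((unsigned char)*src))
				src++;
			*dst++ = ' ';
		} else {
			*dst++ = *src++;
		}
	}

	/* drop the space a trailing run would leave behind */
	if (dst > s && dst[-1] == ' ')
		dst--;
	*dst = '\0';
}
```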
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce new tests that look more closely at overlay situations
when there are conflicting files. Five of these are broken.
Document the brokenness.
This is a fundamental problem with how git-p4 only "borrows" a
client spec. At some sync operation, a new change can contain
a file which is already in the repo or explicitly deleted through
another mapping. To sort this out would involve listing all the
files in the client spec to find one with a higher priority.
While this is not too hard for the initial import, subsequent
sync operations would be very costly.
Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test relied on what now is seen as broken behavior
in --use-client-spec. Change it to make sure it works
according to the new behavior as described in
ecb7cf9 (git-p4: rewrite view handling, 2012-01-02) and
c700b68 (git-p4: test client view handling, 2012-01-02).
Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Catch the case where a ... exists at the end, and also elsewhere.
Reported-by: Gary Gibbons <ggibbons@perforce.com>
Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reason why the trailing slash is needed is obvious. refs/stash and
HEAD are not namespaces, but complete refs. Do a full string comparison
on them.
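To make the distinction concrete, here is a small sketch;
is_known_ref() and the list of namespaces in it are assumptions made
for illustration, not the code touched by this patch.

```c
#include <string.h>

/* Return 1 if 'ref' lives in a known namespace or is a known full ref. */
static int is_known_ref(const char *ref)
{
	/* namespaces: prefix match, trailing slash included */
	static const char *namespaces[] = {
		"refs/heads/", "refs/tags/", "refs/remotes/"
	};
	/* complete refs: the whole string must match */
	static const char *full_refs[] = { "refs/stash", "HEAD" };
	size_t i;

	for (i = 0; i < sizeof(namespaces) / sizeof(*namespaces); i++)
		if (!strncmp(ref, namespaces[i], strlen(namespaces[i])))
			return 1;
	for (i = 0; i < sizeof(full_refs) / sizeof(*full_refs); i++)
		if (!strcmp(ref, full_refs[i]))
			return 1;
	return 0;
}
```

Without the trailing slash, "refs/headsup" would be taken for a branch;
with a prefix match on "refs/stash", "refs/stashed" would be taken for
the stash.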
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Supply a commit message so that the commit aborts because there are
INTENT_TO_ADD entries in the index, not because the commit message is
missing.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The command takes the "start" argument and computes the merge base
between it and the commit to be pulled so that we can show the diffstat,
but uses the "start" argument as-is when composing the message
    The following changes since commit $X are available
to tell the integrator which commit the work is based on. Giving "origin"
(most of the time it resolves to refs/remotes/origin/master) as the start
argument is often convenient, but it is usually not the fork point, and
does not help the integrator at all.
Use the real fork point, which is the merge base we already compute, when
composing that part of the message.
Suggested-by: Linus Torvalds
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description of rerere.enabled left the user in the dark as to who
might create an rr-cache directory. Add a note that simply invoking
rerere does this.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This would de-dent the body of a function that has grown rather large over
time, making it a bit easier to read.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In prepare_attr_stack, we pop the old elements of the stack
(which were left from a previous lookup and may or may not
be useful to us). Our loop to do so checks that we never
reach the top of the stack. However, the code immediately
afterwards will segfault if we did actually reach the top of
the stack.
Fortunately, this is not an actual bug, since we will never
pop all of the stack elements (we will always keep the root
gitattributes, as well as the builtin ones). So the extra
check in the loop condition simply clutters the code and
makes the intent less clear. Let's get rid of it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>