Commit Graph

48496 Commits

Author SHA1 Message Date
Stefan Beller
201c14e375 attr.c: drop hashmap_cmp_fn cast
MAke the code more readable and less error prone by avoiding the cast
of the compare function pointer in hashmap_init, but instead have the
correctly named void pointers to casted to the specific data structure.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-05 13:53:12 -07:00
Junio C Hamano
50ff9ea4a0 Fourteenth batch for 2.14
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-05 13:33:51 -07:00
Junio C Hamano
00b7cf2379 Merge branch 'jt/unify-object-info'
Code clean-ups.

* jt/unify-object-info:
  sha1_file: refactor has_sha1_file_with_flags
  sha1_file: do not access pack if unneeded
  sha1_file: teach sha1_object_info_extended more flags
  sha1_file: refactor read_object
  sha1_file: move delta base cache code up
  sha1_file: rename LOOKUP_REPLACE_OBJECT
  sha1_file: rename LOOKUP_UNKNOWN_OBJECT
  sha1_file: teach packed_object_info about typename
2017-07-05 13:32:57 -07:00
Junio C Hamano
8e90578ffb Merge branch 'cc/shared-index-permfix'
The split index code did not honor core.sharedrepository setting
correctly.

* cc/shared-index-permfix:
  t1700: make sure split-index respects core.sharedrepository
  t1301: move modebits() to test-lib-functions.sh
  read-cache: use shared perms when writing shared index
2017-07-05 13:32:57 -07:00
Junio C Hamano
5ab148dda0 Merge branch 'rs/sha1-name-readdir-optim'
Optimize "what are the object names already taken in an alternate
object database?" query that is used to derive the length of prefix
an object name is uniquely abbreviated to.

* rs/sha1-name-readdir-optim:
  sha1_file: guard against invalid loose subdirectory numbers
  sha1_file: let for_each_file_in_obj_subdir() handle subdir names
  p4205: add perf test script for pretty log formats
  sha1_name: cache readdir(3) results in find_short_object_filename()
2017-07-05 13:32:56 -07:00
Junio C Hamano
85ce4a6828 Merge branch 'bw/repo-object'
Introduce a "repository" object to eventually make it easier to
work in multiple repositories (the primary focus is to work with
the superproject and its submodules) in a single process.

* bw/repo-object:
  ls-files: use repository object
  repository: enable initialization of submodules
  submodule: convert is_submodule_initialized to work on a repository
  submodule: add repo_read_gitmodules
  submodule-config: store the_submodule_cache in the_repository
  repository: add index_state to struct repo
  config: read config from a repository object
  path: add repo_worktree_path and strbuf_repo_worktree_path
  path: add repo_git_path and strbuf_repo_git_path
  path: worktree_git_path() should not use file relocation
  path: convert do_git_path to take a 'struct repository'
  path: convert strbuf_git_common_path to take a 'struct repository'
  path: always pass in commondir to update_common_dir
  path: create path.h
  environment: store worktree in the_repository
  environment: place key repository state in the_repository
  repository: introduce the repository object
  environment: remove namespace_len variable
  setup: add comment indicating a hack
  setup: don't perform lazy initialization of repository state
2017-07-05 13:32:56 -07:00
Jeff King
2272d3e542 reflog-walk: skip over double-null oid due to HEAD rename
Since 39ee4c6c2f (branch: record creation of renamed branch
in HEAD's log, 2017-02-20), a rename on the currently
checked out branch will create two entries in the HEAD
reflog: one where the branch goes away (switching to the
null oid), and one where it comes back (switching away from
the null oid).

This confuses the reflog-walk code. When walking backwards,
it first sees the null oid in the "old" field of the second
entry. Thanks to the "root commit" logic added by 71abeb753f
(reflog: continue walking the reflog past root commits,
2016-06-03), we keep looking for the next entry by scanning
the "new" field from the previous entry. But that field is
also null! We need to go just a tiny bit further, and look
at its "old" field. But with the current code, we decide the
reflog has nothing else to show and just give up. To the
user this looks like the reflog was truncated by the rename
operation, when in fact those entries are still there.

This patch does the absolute minimal fix, which is to look
back that one extra level and keep traversing.

The resulting behavior may not be the _best_ thing to do in
the long run (for example, we show both reflog entries each
with the same commit id), but it's a simple way to fix the
problem without risking further regressions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-05 10:34:00 -07:00
Johannes Schindelin
8722947e5c t5534: fix misleading grep invocation
It seems to be a little-known feature of `grep` (and it certainly came
as a surprise to this here developer who believed to know the Unix tools
pretty well) that multiple patterns can be passed in the same
command-line argument simply by separating them by newlines. Watch, and
learn:

	$ printf '1\n2\n3\n' | grep "$(printf '1\n3\n')"
	1
	3

That behavior also extends to patterns passed via `-e`, and it is not
modified by passing the option `-E` (but trying this with -P issues the
error "grep: the -P option only supports a single pattern").

It seems that there are more old Unix hands who are surprised by this
behavior, as grep invocations of the form

	grep "$(git rev-parse A B) C" file

were introduced in a85b377d04 (push: the beginning of "git push
--signed", 2014-09-12), and later faithfully copy-edited in b9459019bb
(push: heed user.signingkey for signed pushes, 2014-10-22).

Please note that the output of `git rev-parse A B` separates the object
IDs via *newlines*, not via spaces, and those newlines are preserved
because the interpolation is enclosed in double quotes.

As a consequence, these tests try to validate that the file contains
either A's object ID, or B's object ID followed by C, or both. Clearly,
however, what the test wanted to see is that there is a line that
contains all of them.

This is clearly unintended, and the grep invocations in question really
match too many lines.

Fix the test by avoiding the newlines in the patterns.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-05 09:26:52 -07:00
xiaoqiang zhao
5453b83bdf send-email: --batch-size to work around some SMTP server limit
Some email servers (e.g. smtp.163.com) limit the number emails to be
sent per session (connection) and this will lead to a faliure when
sending many messages.

Teach send-email to disconnect after sending a number of messages
(configurable via the --batch-size=<num> option), wait for a few
seconds (configurable via the --relogin-delay=<seconds> option) and
reconnect, to work around such a limit.

Also add two configuration variables to give these options the default.

Note:

  We will use this as a band-aid for now, but in the longer term, we
  should look at and react to the SMTP error code from the server;
  Xianqiang reports that 450 and 451 are returned by problematic
  servers.

  cf. https://public-inbox.org/git/7993e188.d18d.15c3560bcaf.Coremail.zxq_yx_007@163.com/

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-05 09:09:45 -07:00
Junio C Hamano
cac87dc01d sha1collisiondetection: automatically enable when submodule is populated
If a user wants to experiment with the version of collision
detecting sha1 from the submodule, the user needed to not just
populate the submodule but also needed to turn the knob.

A Makefile trick is easy enough to do so, so let's do this.  When
somebody with a copy of the submodule populated wants not to use it,
that can be done by overriding it in config.mak or from the command
line.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-03 10:09:37 -07:00
Ævar Arnfjörð Bjarmason
86cfd61e6b sha1dc: optionally use sha1collisiondetection as a submodule
Add an option to use the sha1collisiondetection library from the
submodule in sha1collisiondetection/ instead of in the copy in the
sha1dc/ directory.

This allows us to try out the submodule in sha1collisiondetection
without breaking the build for anyone who's not expecting them as we
work out any kinks.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-03 10:09:34 -07:00
Ævar Arnfjörð Bjarmason
9936c1b52a sha1dc: update from upstream
Update sha1dc from the latest version by the upstream maintainer[1].

See commit 6b851e536b ("sha1dc: update from upstream", 2017-06-06) for
the last update.

This solves the Big Endian detection on Solaris reported against
v2.13.2[2], hopefully without any regressions. A version of this has
been tested on two Solaris SPARC installations, Cygwin (by jturney on
cygwin@Freenode), and on numerous more boring systems (mainly
linux/x86_64). See [3] for a discussion of the implementation and
platform-specific issues.

See commit a0103914c2 ("sha1dc: update from upstream", 2017-05-20) and
6b851e536b ("sha1dc: update from upstream", 2017-06-06) for previous
attempts in the 2.13 series to address various compile-time feature
detection in this library.

1. 19d97bf5af
2. <CAKKM46tHq13XiW5C8sux3=PZ1VHSu_npG8ExfWwcPD7rkZkyRQ@mail.gmail.com>
   (https://public-inbox.org/git/CAKKM46tHq13XiW5C8sux3=PZ1VHSu_npG8ExfWwcPD7rkZkyRQ@mail.gmail.com/)
3. https://github.com/cr-marcstevens/sha1collisiondetection/pull/34

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-03 10:09:22 -07:00
Michael Haggerty
9308b7f3ca read_packed_refs(): die if packed-refs contains bogus data
The old code ignored any lines that it didn't understand, including
unterminated lines. This is dangerous. Instead, `die()` if the
`packed-refs` file contains any unterminated lines or lines that we
don't know how to handle.

This fixes the tests added in the last commit.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-03 10:01:57 -07:00
Michael Haggerty
02a1a42056 t3210: add some tests of bogus packed-refs file contents
If `packed-refs` contains indecipherable lines, we should emit an
error and quit rather than just skipping the lines. Unfortunately, we
currently do the latter. Add some failing tests demonstrating the
problem.

This will be fixed in the next commit.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-03 10:01:57 -07:00
Michael Haggerty
e5cc7d7d2b repack_without_refs(): don't lock or unlock the packed refs
Change `repack_without_refs()` to expect the packed-refs lock to be
held already, and not to release the lock before returning. Change the
callers to deal with lock management.

This change makes it possible for callers to hold the packed-refs lock
for a longer span of time, a possibility that will eventually make it
possible to fix some longstanding races.

The only semantic change here is that `repack_without_refs()` used to
forget to release the lock in the `if (!removed)` exit path. That
omission is now fixed.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-03 10:01:56 -07:00
Ævar Arnfjörð Bjarmason
3b702239d6 strbuf: change an always NULL/"" strbuf_addftime() param to bool
strbuf_addftime() allows callers to pass a time zone name for
expanding %Z. The only current caller either passes the empty string
or NULL, in which case %Z is handed over verbatim to strftime(3).
Replace that string parameter with a flag controlling whether to
remove %Z from the format specification. This simplifies the code.

Commit-message-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-01 10:47:05 -07:00
René Scharfe
8bc172e5f2 apply: use starts_with() in gitdiff_verify_name()
Avoid running over the end of line -- a C string whose length is not
known to this function -- by using starts_with() instead of memcmp(3)
for checking if it starts with "/dev/null".  Also simply include the
newline in the string constant to compare against.  Drop a comment that
just states the obvious.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-01 10:39:51 -07:00
Stefan Beller
61e89eaae8 diff: document the new --color-moved setting
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:59:42 -07:00
Stefan Beller
86b452e276 diff.c: add dimming to moved line detection
Any lines inside a moved block of code are not interesting. Boundaries
of blocks are only interesting if they are next to another block of moved
code.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:59:42 -07:00
Stefan Beller
176841f0c9 diff.c: color moved lines differently, plain mode
Add the 'plain' mode for move detection of code. This omits the checking
for adjacent blocks, so it is not as useful. If you have a lot of the
same blocks moved in the same patch, the 'Zebra' would end up slow as it
is O(n^2) (n is number of same blocks). So this may be useful there and
is generally easy to add. Instead be very literal at the move detection,
do not skip over short blocks here.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:59:42 -07:00
Stefan Beller
2e2d5ac184 diff.c: color moved lines differently
When a patch consists mostly of moving blocks of code around, it can
be quite tedious to ensure that the blocks are moved verbatim, and not
undesirably modified in the move. To that end, color blocks that are
moved within the same patch differently. For example (OM, del, add,
and NM are different colors):

    [OM]  -void sensitive_stuff(void)
    [OM]  -{
    [OM]  -        if (!is_authorized_user())
    [OM]  -                die("unauthorized");
    [OM]  -        sensitive_stuff(spanning,
    [OM]  -                        multiple,
    [OM]  -                        lines);
    [OM]  -}

           void another_function()
           {
    [del] -        printf("foo");
    [add] +        printf("bar");
           }

    [NM]  +void sensitive_stuff(void)
    [NM]  +{
    [NM]  +        if (!is_authorized_user())
    [NM]  +                die("unauthorized");
    [NM]  +        sensitive_stuff(spanning,
    [NM]  +                        multiple,
    [NM]  +                        lines);
    [NM]  +}

However adjacent blocks may be problematic. For example, in this
potentially malicious patch, the swapping of blocks can be spotted:

    [OM]  -void sensitive_stuff(void)
    [OM]  -{
    [OMA] -        if (!is_authorized_user())
    [OMA] -                die("unauthorized");
    [OM]  -        sensitive_stuff(spanning,
    [OM]  -                        multiple,
    [OM]  -                        lines);
    [OMA] -}

           void another_function()
           {
    [del] -        printf("foo");
    [add] +        printf("bar");
           }

    [NM]  +void sensitive_stuff(void)
    [NM]  +{
    [NMA] +        sensitive_stuff(spanning,
    [NMA] +                        multiple,
    [NMA] +                        lines);
    [NM]  +        if (!is_authorized_user())
    [NM]  +                die("unauthorized");
    [NMA] +}

If the moved code is larger, it is easier to hide some permutation in the
code, which is why some alternative coloring is needed.

This patch implements the first mode:
* basic alternating 'Zebra' mode
  This conveys all information needed to the user.  Defer customization to
  later patches.

First I implemented an alternative design, which would try to fingerprint
a line by its neighbors to detect if we are in a block or at the boundary.
This idea iss error prone as it inspected each line and its neighboring
lines to determine if the line was (a) moved and (b) if was deep inside
a hunk by having matching neighboring lines. This is unreliable as the
we can construct hunks which have equal neighbors that just exceed the
number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter
as a line, that is permutated to AXYZCXYZBXYZD..').

Instead this provides a dynamic programming greedy algorithm that finds
the largest moved hunk and then has several modes on highlighting bounds.

A note on the options '--submodule=diff' and '--color-words/--word-diff':
In the conversion to use emit_line in the prior patches both submodules
as well as word diff output carefully chose to call emit_line with sign=0.
All output with sign=0 is ignored for move detection purposes in this
patch, such that no weird looking output will be generated for these
cases. This leads to another thought: We could pass on '--color-moved' to
submodules such that they color up moved lines for themselves. If we'd do
so only line moves within a repository boundary are marked up.

It is useful to have moved lines colored, but there are annoying corner
cases, such as a single line moved, that is very common. For example
in a typical patch of C code, we have closing braces that end statement
blocks or functions.

While it is technically true that these lines are moved as they show up
elsewhere, it is harmful for the review as the reviewers attention is
drawn to such a minor side annoyance.

For now let's have a simple solution of hardcoding the number of
moved lines to be at least 3 before coloring them. Note, that the
length is applied across all blocks to find the 'lonely' blocks
that pollute new code, but do not interfere with a permutated
block where each permutation has less lines than 3.

Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:59:42 -07:00
Lars Schneider
2841e8f81c convert: add "status=delayed" to filter process protocol
Some `clean` / `smudge` filters may require a significant amount of
time to process a single blob (e.g. the Git LFS smudge filter might
perform network requests). During this process the Git checkout
operation is blocked and Git needs to wait until the filter is done to
continue with the checkout.

Teach the filter process protocol, introduced in edcc8581 ("convert: add
filter.<driver>.process option", 2016-10-16), to accept the status
"delayed" as response to a filter request. Upon this response Git
continues with the checkout operation. After the checkout operation Git
calls "finish_delayed_checkout" which queries the filter for remaining
blobs. If the filter is still working on the completion, then the filter
is expected to block. If the filter has completed all remaining blobs
then an empty response is expected.

Git has a multiple code paths that checkout a blob. Support delayed
checkouts only in `clone` (in unpack-trees.c) and `checkout` operations
for now. The optimization is most effective in these code paths as all
files of the tree are processed.

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:50:41 -07:00
Lars Schneider
1514c8edd6 convert: refactor capabilities negotiation
The code to negotiate long running filter capabilities was very
repetitive for new capabilities. Replace the repetitive conditional
statements with a table-driven approach. This is useful for the
subsequent patch 'convert: add "status=delayed" to filter process
protocol'.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:50:21 -07:00
Junio C Hamano
5116f791c1 Thirteenth batch for 2.14
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:47:49 -07:00
Junio C Hamano
748cffc22b Merge branch 'vs/typofixes'
Many typofixes.

* vs/typofixes:
  Spelling fixes
2017-06-30 13:45:25 -07:00
Junio C Hamano
53ee6b8f1a Merge branch 'rs/apply-validate-input'
Tighten error checks for invalid "git apply" input.

* rs/apply-validate-input:
  apply: check git diffs for mutually exclusive header lines
  apply: check git diffs for invalid file modes
  apply: check git diffs for missing old filenames
2017-06-30 13:45:24 -07:00
Junio C Hamano
ca069a3c5c Merge branch 'jc/pack-bitmap-unaligned'
An unaligned 32-bit access in pack-bitmap code ahs been corrected.

* jc/pack-bitmap-unaligned:
  pack-bitmap: don't perform unaligned memory access
2017-06-30 13:45:24 -07:00
Junio C Hamano
9bab852f65 Merge branch 'ah/doc-pretty-color-auto-prefix'
Doc update.

* ah/doc-pretty-color-auto-prefix:
  doc: clarify syntax for %C(auto,...) in pretty formats
2017-06-30 13:45:23 -07:00
Junio C Hamano
d5d6a44099 Merge branch 'ks/submodule-add-doc'
Doc update.

* ks/submodule-add-doc:
  Documentation/git-submodule: cleanup "add" section
2017-06-30 13:45:22 -07:00
Junio C Hamano
7e46f19a10 Merge branch 'ks/status-initial-commit'
"git status" has long shown essentially the same message as "git
commit"; the message it gives while preparing for the root commit,
i.e. "Initial commit", was hard to understand for some new users.
Now it says "No commits yet" to stress more on the current status
(rather than the commit the user is preparing for, which is more in
line with the focus of "git commit").

* ks/status-initial-commit:
  status: contextually notify user about an initial commit
2017-06-30 13:45:22 -07:00
Junio C Hamano
c7ee0baae7 Merge branch 'ab/die-errors-in-threaded'
Traditionally, the default die() routine had a code to prevent it
from getting called multiple times, which interacted badly when a
threaded program used it (one downside is that the real error may
be hidden and instead the only error message given to the user may
end up being "die recursion detected", which is not very useful).

* ab/die-errors-in-threaded:
  die(): stop hiding errors due to overzealous recursion guard
2017-06-30 13:45:21 -07:00
Junio C Hamano
5452224710 Merge branch 'pw/rebase-i-regression-fix-tests'
Fix a recent regression to "git rebase -i" and add tests that would
have caught it and others.

* pw/rebase-i-regression-fix-tests:
  t3420: fix under GETTEXT_POISON build
  rebase: add more regression tests for console output
  rebase: add regression tests for console output
  rebase -i: add test for reflog message
  sequencer: print autostash messages to stderr
2017-06-30 13:45:21 -07:00
Stefan Beller
e6e045f803 diff.c: buffer all output if asked to
Introduce a new option 'emitted_symbols' in the struct diff_options which
controls whether all output is buffered up until all output is available.
It is set internally in diff.c when necessary.

We'll have a new struct 'emitted_string' in diff.c which will be used to
buffer each line.  The emitted_string will duplicate the memory of the
line to buffer as that is easiest to reason about for now. In a future
patch we may want to decrease the memory usage by not duplicating all
output for buffering but rather we may want to store offsets into the
file or in case of hunk descriptions such as the similarity score, we
could just store the relevant number and reproduce the text later on.

This approach was chosen as a first step because it is quite simple
compared to the alternative with less memory footprint.

emit_diff_symbol factors out the emission part and depending on the
diff_options->emitted_symbols the emission will be performed directly
when calling emit_diff_symbol or after the whole process is done, i.e.
by buffering we have add the possibility for a second pass over the
whole output before doing the actual output.

In 6440d34 (2012-03-14, diff: tweak a _copy_ of diff_options with
word-diff) we introduced a duplicate diff options struct for word
emissions as we may have different regex settings in there.
When buffering the output, we need to operate on just one buffer,
so we have to copy back the emissions of the word buffer into the
main buffer.

Unconditionally enable output via buffer in this patch as it yields
a great opportunity for testing, i.e. all the diff tests from the
test suite pass without having reordering issues (i.e. only parts
of the output got buffered, and we forgot to buffer other parts).
The test suite passes, which gives confidence that we converted all
functions to use emit_string for output.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:13:02 -07:00
Stefan Beller
146fdb0dfe diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:13:02 -07:00
Stefan Beller
30b7e1e7ef diff.c: emit_diff_symbol learns about DIFF_SYMBOL_STAT_SEP
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:13:02 -07:00
Stefan Beller
bd033291d5 diff.c: convert word diffing to use emit_diff_symbol
The word diffing is not line oriented and would need some serious
effort to be transformed into a line oriented approach, so
just go with a symbol DIFF_SYMBOL_WORD_DIFF that is a partial line.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:13:02 -07:00
Stefan Beller
0911c475c8 diff.c: convert show_stats to use emit_diff_symbol
We call print_stat_summary from builtin/apply, so we still
need the version with a file pointer, so introduce
print_stat_summary_0 that uses emit_string machinery and
keep print_stat_summary with the same arguments around.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:13:02 -07:00
Stefan Beller
4eed0ebd4d diff.c: convert emit_binary_diff_body to use emit_diff_symbol
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:13:02 -07:00
Stefan Beller
f3597138df submodule.c: migrate diff output to use emit_diff_symbol
As the submodule process is no longer attached to the same file pointer
'o->file' as the superprojects process, there is a different result in
color.c::check_auto_color. That is why we need to pass coloring explicitly,
such that the submodule coloring decision will be made by the child process
processing the submodule. Only DIFF_SYMBOL_SUBMODULE_PIPETHROUGH contains
color, the other symbols are for embedding the submodule output into the
superprojects output.

Remove the colors from the function signatures, as all the coloring
decisions will be made either inside the child process or the final
emit_diff_symbol, but not in the functions driving the submodule diff.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:13:02 -07:00
Stefan Beller
5af6ea957c diff.c: emit_diff_symbol learns DIFF_SYMBOL_REWRITE_DIFF
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:13:02 -07:00
Stefan Beller
4acaaa7af6 diff.c: emit_diff_symbol learns about DIFF_SYMBOL_BINARY_FILES
we could save a little bit of memory when buffering in a later mode
by just passing the inner part ("%s and %s", file1, file 2), but
those a just a few bytes, so instead let's reuse the implementation from
DIFF_SYMBOL_HEADER and keep the whole line around.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:13:01 -07:00
Stefan Beller
a29b0a13bd diff.c: emit_diff_symbol learns DIFF_SYMBOL_HEADER
The header is constructed lazily including line breaks, so just emit
the raw string as is.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:13:01 -07:00
Stefan Beller
3ee8b7bfe4 diff.c: emit_diff_symbol learns DIFF_SYMBOL_FILEPAIR_{PLUS, MINUS}
We have to use fprintf instead of emit_line, because we want to emit the
tab after the color. This is important for ancient versions of gnu patch
AFAICT, although we probably do not want to feed colored output to the
patch utility, such that it would not matter if the trailing tab is
colored. Keep the corner case as-is though.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:13:01 -07:00
Stefan Beller
f2bb1218f1 diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_INCOMPLETE
The context marker use the exact same output pattern, so reuse it.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:13:01 -07:00
Stefan Beller
ff958679cd diff.c: emit_diff_symbol learns DIFF_SYMBOL_WORDS[_PORCELAIN]
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:13:01 -07:00
Stefan Beller
091f8e28b4 diff.c: migrate emit_line_checked to use emit_diff_symbol
Add a new flags field to emit_diff_symbol, that will be used by
context lines for:
* white space rules that are applicable (The first 12 bits)
  Take a note in cahe.c as well, when this ws rules are extended we have
  to fix the bits in the flags field.
* how the rules are evaluated (actually this double encodes the sign
  of the line, but the code is easier to keep this way, bits 13,14,15)
* if the line a blank line at EOF (bit 16)

The check if new lines need to be marked up as extra lines at the end of
file, is now done unconditionally. That should be ok, as
'new_blank_line_at_eof' has a quick early return.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:13:01 -07:00
Stefan Beller
b9cbfde6b1 diff.c: emit_diff_symbol learns DIFF_SYMBOL_NO_LF_EOF
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:13:01 -07:00
Stefan Beller
68abc6f1c7 diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_FRAGINFO
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:13:01 -07:00
Stefan Beller
c64b420b4c diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_MARKER
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:13:01 -07:00
Stefan Beller
36a4cefdf4 diff.c: introduce emit_diff_symbol
In a later patch we want to buffer all output before emitting it as a
new feature ("markup moved lines") conceptually cannot be implemented
in a single pass over the output.

There are different approaches to buffer all output such as:
* Buffering on the char level, i.e. we'd have a char[] which would
  grow at approximately 80 characters a line. This would keep the
  output completely unstructured, but might be very easy to implement,
  such as redirecting all output to a temporary file and working off
  that. The later passes over the buffer are quite complicated though,
  because we have to parse back any output and then decide if it should
  be modified.

* Buffer on a line level. As the output is mostly line oriented already,
  this would make sense, but it still is a bit awkward as we'd have to
  make sense of it again by looking at the first characters of a line
  to decide what part of a diff a line is.

* Buffer semantically. Imagine there is a formal grammar for the diff
  output and we'd keep the symbols of this grammar around. This keeps
  the highest level of structure in the buffered data, such that the
  actual memory requirements are less than say the first option. Instead
  of buffering the characters of the line, we'll buffer what we intend
  to do plus additional information for the specifics. An output of

    diff --git a/new.txt b/new.txt
    index fa69b07..412428c 100644
    Binary files a/new.txt and b/new.txt differ

  could be buffered as
     DIFF_SYMBOL_DIFF_START + new.txt
     DIFF_SYMBOL_INDEX_MODE + fa69b07 412428c "non-executable" flag
     DIFF_SYMBOL_BINARY_FILES + new.txt

This and the following patches introduce the third option of buffering
by first moving any output to emit_diff_symbol, and then introducing the
buffering in this function.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:13:01 -07:00