"git shortlog" used to accumulate various pieces of information
regardless of what was asked to be shown in the final output. It
has been optimized by noticing what need not to be collected
(e.g. there is no need to collect the log messages when showing
only the number of changes).
* jk/shortlog:
shortlog: don't warn on empty author
shortlog: optimize out useless string list
shortlog: optimize out useless "<none>" normalization
shortlog: optimize "--summary" mode
shortlog: replace hand-parsing of author with pretty-printer
shortlog: use strbufs to read from stdin
shortlog: match both "Author:" and "author" on stdin
The preliminary clean-up for jc/peace-with-crlf topic.
* jc/strbuf-getline:
strbuf: give strbuf_getline() to the "most text friendly" variant
checkout-index: there are only two possible line terminations
update-index: there are only two possible line terminations
check-ignore: there are only two possible line terminations
check-attr: there are only two possible line terminations
mktree: there are only two possible line terminations
strbuf: introduce strbuf_getline_{lf,nul}()
strbuf: make strbuf_getline_crlf() global
strbuf: miniscule style fix
Beginning of the upstreaming process of Git for Windows effort.
* js/msys2:
mingw: uglify (a, 0) definitions to shut up warnings
mingw: squash another warning about a cast
mingw: avoid warnings when casting HANDLEs to int
mingw: avoid redefining S_* constants
compat/winansi: support compiling with MSys2
compat/mingw: support MSys2-based MinGW build
nedmalloc: allow compiling with MSys2's compiler
config.mak.uname: supporting 64-bit MSys2
config.mak.uname: support MSys2
"interpret-trailers" has been taught to optionally update a file in
place, instead of always writing the result to the standard output.
* tk/interpret-trailers-in-place:
interpret-trailers: add option for in-place editing
trailer: allow to write to files other than stdout
The description for SANITY prerequisite the test suite uses has
been clarified both in the comment and in the implementation.
* jk/sanity:
test-lib: clarify and tighten SANITY
A recent optimization to filter-branch in v2.7.0 introduced a
regression when --prune-empty filter is used, which has been
corrected.
* jk/filter-branch-no-index:
filter-branch: resolve $commit^{tree} in no-index case
The low-level code that is used to create symbolic references has
been updated to share more code with the code that deals with
normal references.
* jk/symbolic-ref:
lock_ref_sha1_basic: handle REF_NODEREF with invalid refs
lock_ref_sha1_basic: always fill old_oid while holding lock
checkout,clone: check return value of create_symref
create_symref: write reflog while holding lock
create_symref: use existing ref-lock code
create_symref: modernize variable names
"git format-patch" learned to notice format.outputDirectory
configuration variable. This allows "-o <dir>" option to be
omitted on the command line if you always use the same directory in
your workflow.
* ak/format-patch-odir-config:
format-patch: introduce format.outputDirectory configuration
Many codepaths that run "gc --auto" before exiting kept packfiles
mapped and left the file descriptors to them open, which was not
friendly to systems that cannot remove files that are open. They
now close the packs before doing so.
* js/close-packs-before-gc:
receive-pack: release pack files before garbage-collecting
merge: release pack files before garbage-collecting
am: release pack files before garbage-collecting
fetch: release pack files before garbage-collecting
"git rebase", unlike all other callers of "gc --auto", did not
ignore the exit code from "gc --auto".
* jk/ok-to-fail-gc-auto-in-rebase:
rebase: ignore failures from "gc --auto"
"git pull --rebase" has been extended to allow invoking
"rebase -i".
* js/pull-rebase-i:
completion: add missing branch.*.rebase values
remote: handle the config setting branch.*.rebase=interactive
pull: allow interactive rebase with --rebase=interactive
A shell script style update to change `command substitution` into
$(command substitution). Coverts contrib/ and much of the t/
directory contents.
* ep/shell-command-substitution-style: (92 commits)
t9901-git-web--browse.sh: use the $( ... ) construct for command substitution
t9501-gitweb-standalone-http-status.sh: use the $( ... ) construct for command substitution
t9350-fast-export.sh: use the $( ... ) construct for command substitution
t9300-fast-import.sh: use the $( ... ) construct for command substitution
t9150-svk-mergetickets.sh: use the $( ... ) construct for command substitution
t9145-git-svn-master-branch.sh: use the $( ... ) construct for command substitution
t9138-git-svn-authors-prog.sh: use the $( ... ) construct for command substitution
t9137-git-svn-dcommit-clobber-series.sh: use the $( ... ) construct for command substitution
t9132-git-svn-broken-symlink.sh: use the $( ... ) construct for command substitution
t9130-git-svn-authors-file.sh: use the $( ... ) construct for command substitution
t9129-git-svn-i18n-commitencoding.sh: use the $( ... ) construct for command substitution
t9119-git-svn-info.sh: use the $( ... ) construct for command substitution
t9118-git-svn-funky-branch-names.sh: use the $( ... ) construct for command substitution
t9114-git-svn-dcommit-merge.sh: use the $( ... ) construct for command substitution
t9110-git-svn-use-svm-props.sh: use the $( ... ) construct for command substitution
t9109-git-svn-multi-glob.sh: use the $( ... ) construct for command substitution
t9108-git-svn-glob.sh: use the $( ... ) construct for command substitution
t9107-git-svn-migrate.sh: use the $( ... ) construct for command substitution
t9105-git-svn-commit-diff.sh: use the $( ... ) construct for command substitution
t9104-git-svn-follow-parent.sh: use the $( ... ) construct for command substitution
...
"git subtree" (in contrib/) records the tag object name in the
commit log message when a subtree is added using a tag, without
peeling it down to the underlying commit. The tag needs to be
peeled when "git subtree split" wants to work on the commit, but
the command forgot to do so.
* rm/subtree-unwrap-tags:
contrib/subtree: unwrap tag refs
"git grep" by default does not fall back to its "--no-index"
behaviour outside a directory under Git's control (otherwise the
user may by mistake end up running a huge recursive search); with a
new configuration (set in $HOME/.gitconfig--by definition this
cannot be set in the config file per project), this safety can be
disabled.
* tg/grep-no-index-fallback:
builtin/grep: add grep.fallbackToNoIndex config
t7810: correct --no-index test
Asking gitweb for a nonexistent commit left a warning in the server
log.
Somebody may want to follow this up with a new test, perhaps?
IIRC, we do test that no Perl warnings are given to the server log,
so this should have been caught if our test coverage were good.
* ho/gitweb-squelch-undef-warning:
gitweb: squelch "uninitialized value" warning
Some codepaths used fopen(3) when opening a fixed path in $GIT_DIR
(e.g. COMMIT_EDITMSG) that is meant to be left after the command is
done. This however did not work well if the repository is set to
be shared with core.sharedRepository and the umask of the previous
user is tighter. They have been made to work better by calling
unlink(2) and retrying after fopen(3) fails with EPERM.
* js/fopen-harder:
Handle more file writes correctly in shared repos
commit: allow editing the commit message even in shared repos
Documentation for "git fetch --depth" has been updated for clarity.
* ss/clone-depth-single-doc:
docs: clarify that --depth for git-fetch works with newly initialized repos
docs: say "commits" in the --depth option wording for git-clone
docs: clarify that passing --depth to git-clone implies --single-branch
The ignore mechanism saw a few regressions around untracked file
listing and sparse checkout selection areas in 2.7.0; the change
that is responsible for the regression has been reverted.
* nd/exclusion-regression-fix:
Revert "dir.c: don't exclude whole dir prematurely if neg pattern may match"
"git reflog" incorrectly assumed that all objects that used to be
at the tip of a ref must be commits, which caused it to segfault.
* dk/reflog-walk-with-non-commit:
reflog-walk: don't segfault on non-commit sha1's in the reflog
The documentation has been updated to hint the connection between
the '--signoff' option and DCO.
* dw/signoff-doc:
Expand documentation describing --signoff
A few unportable C construct have been spotted by clang compiler
and have been fixed.
* jk/clang-pedantic:
bswap: add NO_UNALIGNED_LOADS define
avoid shifting signed integers 31 bits
"git send-email" was confused by escaped quotes stored in the alias
files saved by "mutt", which has been corrected.
* ew/send-email-mutt-alias-fix:
git-send-email: do not double-escape quotes from mutt
Drop a few old "todo" items by deciding that the change one of them
suggests is not such a good idea, and doing the change the other
one suggested to do.
* ss/user-manual:
user-manual: add addition gitweb information
user-manual: add section documenting shallow clones
glossary: define the term shallow clone
user-manual: remove temporary branch entry from todo list
d95138e6 (setup: set env $GIT_WORK_TREE when work tree is set, like
$GIT_DIR, 2015-06-26) attempted to work around a glitch in alias
handling by overwriting GIT_WORK_TREE environment variable to
affect subprocesses when set_git_work_tree() gets called, which
resulted in a rather unpleasant regression to "clone" and "init".
Try to address the same issue by always restoring the environment
and respawning the real underlying command when handling alias.
* nd/clear-gitenv-upon-use-of-alias:
run-command: don't warn on SIGPIPE deaths
git.c: make sure we do not leak GIT_* to alias scripts
setup.c: re-fix d95138e (setup: set env $GIT_WORK_TREE when ..
git.c: make it clear save_env() is for alias handling only
Paths that have been told the index about with "add -N" are not
quite yet in the index, but a few commands behaved as if they
already are in a harmful way.
* nd/ita-cleanup:
grep: make it clear i-t-a entries are ignored
add and use a convenience macro ce_intent_to_add()
blame: remove obsolete comment
The "exclude_list" structure has the usual "alloc, nr" pair of
fields to be used by ALLOC_GROW(), but clear_exclude_list() forgot
to reset 'alloc' to 0 when it cleared 'nr'to discard the managed
array.
* nd/dir-exclude-cleanup:
dir.c: clean the entire struct in clear_exclude_list()
In-core storage of the reverse index for .pack files (which lets
you go from a pack offset to an object name) has been streamlined.
* jk/pack-revindex:
pack-revindex: store entries directly in packed_git
pack-revindex: drop hash table
Some "git notes" operations, e.g. "git log --notes=<note>", should
be able to read notes from any tree-ish that is shaped like a notes
tree, but the notes infrastructure required that the argument must
be a ref under refs/notes/. Loosen it to require a valid ref only
when the operation would update the notes (in which case we must
have a place to store the updated notes tree, iow, a ref).
* mh/notes-allow-reading-treeish:
notes: allow treeish expressions as notes ref
Commit 348d4f2 (filter-branch: skip index read/write when
possible, 2015-11-06) taught filter-branch to optimize out
the final "git write-tree" when we know we haven't touched
the tree with any of our filters. It does by simply putting
the literal text "$commit^{tree}" into the "$tree" variable,
avoiding a useless rev-parse call.
However, when we pass this to git_commit_non_empty_tree(),
it gets confused; it resolves "$commit^{tree}" itself, and
compares our string to the 40-hex sha1, which obviously
doesn't match. As a result, "--prune-empty" (or any custom
filter using git_commit_non_empty_tree) will fail to drop
an empty commit (when filter-branch is used without a tree
or index filter).
Let's resolve $tree to the 40-hex ourselves, so that
git_commit_non_empty_tree can work. Unfortunately, this is a
bit slower due to the extra process overhead:
$ cd t/perf && ./run 348d4f2 HEAD p7000-filter-branch.sh
[...]
Test 348d4f2 HEAD
--------------------------------------------------------------
7000.2: noop filter 3.76(0.24+0.26) 4.54(0.28+0.24) +20.7%
We could try to make git_commit_non_empty_tree more clever.
However, the value of $tree here is technically
user-visible. The user can provide arbitrary shell code at
this stage, which could itself have a similar assumption to
what is in git_commit_non_empty_tree. So the conservative
choice to fix this regression is to take the 20% hit and
give the pre-348d4f2 behavior. We still end up much faster
than before the optimization:
$ cd t/perf && ./run 348d4f2^ HEAD p7000-filter-branch.sh
[...]
Test 348d4f2^ HEAD
--------------------------------------------------------------
7000.2: noop filter 9.51(4.32+0.40) 4.51(0.28+0.23) -52.6%
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
f400e51c (test-lib.sh: set prerequisite SANITY by testing what we
really need, 2015-01-27) improved the way SANITY prerequisite was
determined, but made the resulting code (incorrectly) imply that
SANITY is all about effects of permission bits of the containing
directory has on the files contained in it by the comment it added,
its log message and the actual tests.
State what SANITY is about more clearly in the comment, and test
that a file whose permission bits says should be unreadble truly
cannot be read.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git tries to avoid creating a commit with an empty author
name or email. However, commits created by older, less
strict versions of git may still be in the history. There's
not much point in issuing a warning to stderr for an empty
author. The user can't do anything about it now, and we are
better off to simply include it in the shortlog output as an
empty name/email, and let the caller process it however they
see fit.
Older versions of shortlog differentiated between "author
header not present" (which complained) and "author
name/email are blank" (which included the empty ident in the
output). But since switching to format_commit_message, we
complain to stderr about either case (linux.git has a blank
author deep in its history which triggers this).
We could try to restore the older behavior (complaining only
about the missing header), but in retrospect, there's not
much point in differentiating these cases. A missing
author header is bogus, but as for the "blank" case, the
only useful behavior is to add it to the "empty name"
collection.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we are in "--summary" mode, then we do not care about the
actual list of subject onelines associated with each author.
We care only about the number. So rather than store a
string-list for each author full of "<none>", let's just
keep a count.
This drops my best-of-five for "git shortlog -ns HEAD" on
linux.git from:
real 0m5.194s
user 0m5.028s
sys 0m0.168s
to:
real 0m5.057s
user 0m4.916s
sys 0m0.144s
That's about 2.5%.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we are in --summary mode, we will always pass <none> to
insert_one_record, which will then do some normalization
(e.g., cutting out "[PATCH]"). There's no point in doing so
if we aren't going to use the result anyway.
This drops my best-of-five for "git shortlog -ns HEAD" on
linux.git from:
real 0m5.257s
user 0m5.104s
sys 0m0.156s
to:
real 0m5.194s
user 0m5.028s
sys 0m0.168s
That's only 1%, but arguably the result is clearer to read,
as we're able to group our variable declarations inside the
conditional block. It also opens up further optimization
possibilities for future patches.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the user asked us only to show counts for each author,
rather than the individual summary lines, then there is no
point in us generating the summaries only to throw them
away. With this patch, I measured the following speedup for
"git shortlog -ns HEAD" on linux.git (best-of-five):
[before]
real 0m5.644s
user 0m5.472s
sys 0m0.176s
[after]
real 0m5.257s
user 0m5.104s
sys 0m0.156s
That's only ~7%, but it's so easy to do, there's no good
reason not to. We don't have to touch any downstream code,
since we already fill in the magic string "<none>" to handle
commits without a message.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When gathering the author and oneline subject for each
commit, we hand-parse the commit headers to find the
"author" line, and then continue past to the blank line at
the end of the header.
We can replace this tricky hand-parsing by simply asking the
pretty-printer for the relevant items. This also decouples
the author and oneline parsing, opening up some new
optimizations in further commits.
One reason to avoid the pretty-printer is that it might be
less efficient than hand-parsing. However, I measured no
slowdown at all running "git shortlog -ns HEAD" on
linux.git.
As a bonus, we also fix a memory leak in the (uncommon) case
that the author field is blank.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently use fixed-size buffers with fgets(), which
could lead to incorrect results in the unlikely event that a
line had something like "Author:" at exactly its 1024th
character.
But it's easy to convert this to a strbuf, and because we
can reuse the same buffer through the loop, we don't even
pay the extra allocation cost.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The original git-shortlog could read both the normal "git
log" output as well as "git log --format=raw". However, when
it was converted to C by b8ec592 (Build in shortlog,
2006-10-22), the trailing colon became mandatory, and we no
longer matched the raw output.
Given the amount of intervening time without any bug
reports, it's probable that nobody cares. But it's
relatively easy to fix, and the end result is hopefully more
readable than the original.
Note that this no longer matches "author: ", which we did
before, but that has never been a format generated by git.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the result of a (a, 0) expression is not used, MSys2's GCC version
finds it necessary to complain with a warning:
right-hand operand of comma expression has no effect
Let's just pretend to use the 0 value and have a peaceful and quiet life
again.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
MSys2's compiler is correct that casting a "void *" to a "DWORD" loses
precision, but in the case of pthread_exit() we know that the value
fits into a DWORD.
Just like casting handles to DWORDs, let's work around this issue by
casting to "intrptr_t" first, and immediately cast to the final type.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
HANDLE is defined internally as a void *, but in many cases it is
actually guaranteed to be a 32-bit integer. In these cases, GCC should
not warn about a cast of a pointer to an integer of a different type
because we know exactly what we are doing.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When compiling with MSys2's compiler, these constants are already defined.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now there is no direct caller to strbuf_getline(), we can demote it
to file-scope static that is private to strbuf.c and rename it to
strbuf_getdelim(). Rename strbuf_getline_crlf(), which is designed
to be the most "text friendly" variant, and allow it to take over
this simplest name, strbuf_getline(), so we can add more uses of it
without having to type _crlf over and over again in the coming
steps.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The program by default reads LF terminated lines, with an option to
use NUL terminated records. Instead of pretending that there can be
other useful values for line_termination, use a boolean variable,
nul_term_line, to tell if NUL terminated records are used, and
switch between strbuf_getline_{lf,nul} based on it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The program by default reads LF terminated lines, with an option to
use NUL terminated records. Instead of pretending that there can be
other useful values for line_termination, use a boolean variable,
nul_term_line, to tell if NUL terminated records are used, and
switch between strbuf_getline_{lf,nul} based on it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>