In addition to generating a shortlog based on committer, author, or the
identity in one or more specified trailers, it can be useful to generate
a shortlog based on an arbitrary commit format.
This can be used, for example, to generate a distribution of commit
activity over time, like so:
$ git shortlog --group='%cd' --date='format:%Y-%m' -s v2.37.0..
117 2022-06
274 2022-07
324 2022-08
263 2022-09
7 2022-10
Arbitrary commit formats can be used. In fact, `git shortlog`'s default
behavior (to count by commit authors) can be emulated as follows:
$ git shortlog --group='%aN <%aE>' ...
and future patches will make the default behavior (as well as
`--committer`, and `--group=trailer:<trailer>`) special cases of the
more flexible `--group` option.
Note also that the SHORTLOG_GROUP_FORMAT enum value is used only to
designate that `--group:<format>` is in use when in stdin mode to
declare that the combination is invalid.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prepare for a future patch which will introduce arbitrary pretty formats
via the `--group` argument.
To allow additional customizability (for example, to support something
like `git shortlog -s --group='%aD' --date='format:%Y-%m' ...` (which
groups commits by the datestring 'YYYY-mm' according to author date), we
must store off the `--date` parsed from calling `parse_revision_opt()`.
Note that this also affects custom output `--format` strings in `git
shortlog`. Though this is a behavior change, this is arguably fixing a
long-standing bug (ie., that `--format` strings are not affected by
`--date` specifiers as they should be).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Carefully excluding t4013 and t4015, which see independent development
elsewhere at the time of writing, we use `main` as the default branch
name in t4*. This trick was performed via
$ (cd t &&
sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
-e 's/Master/Main/g' -- t4*.sh t4211/*.export &&
git checkout HEAD -- t4013\*)
This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In addition to the manual adjustment to let the `linux-gcc` CI job run
the test suite with `master` and then with `main`, this patch makes sure
that GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME is set in all test scripts
that currently rely on the initial branch name being `master by default.
To determine which test scripts to mark up, the first step was to
force-set the default branch name to `master` in
- all test scripts that contain the keyword `master`,
- t4211, which expects `t/t4211/history.export` with a hard-coded ref to
initialize the default branch,
- t5560 because it sources `t/t556x_common` which uses `master`,
- t8002 and t8012 because both source `t/annotate-tests.sh` which also
uses `master`)
This trick was performed by this command:
$ sed -i '/^ *\. \.\/\(test-lib\|lib-\(bash\|cvs\|git-svn\)\|gitweb-lib\)\.sh$/i\
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\
' $(git grep -l master t/t[0-9]*.sh) \
t/t4211*.sh t/t5560*.sh t/t8002*.sh t/t8012*.sh
After that, careful, manual inspection revealed that some of the test
scripts containing the needle `master` do not actually rely on a
specific default branch name: either they mention `master` only in a
comment, or they initialize that branch specificially, or they do not
actually refer to the current default branch. Therefore, the
aforementioned modification was undone in those test scripts thusly:
$ git checkout HEAD -- \
t/t0027-auto-crlf.sh t/t0060-path-utils.sh \
t/t1011-read-tree-sparse-checkout.sh \
t/t1305-config-include.sh t/t1309-early-config.sh \
t/t1402-check-ref-format.sh t/t1450-fsck.sh \
t/t2024-checkout-dwim.sh \
t/t2106-update-index-assume-unchanged.sh \
t/t3040-subprojects-basic.sh t/t3301-notes.sh \
t/t3308-notes-merge.sh t/t3423-rebase-reword.sh \
t/t3436-rebase-more-options.sh \
t/t4015-diff-whitespace.sh t/t4257-am-interactive.sh \
t/t5323-pack-redundant.sh t/t5401-update-hooks.sh \
t/t5511-refspec.sh t/t5526-fetch-submodules.sh \
t/t5529-push-errors.sh t/t5530-upload-pack-error.sh \
t/t5548-push-porcelain.sh \
t/t5552-skipping-fetch-negotiator.sh \
t/t5572-pull-submodule.sh t/t5608-clone-2gb.sh \
t/t5614-clone-submodules-shallow.sh \
t/t7508-status.sh t/t7606-merge-custom.sh \
t/t9302-fast-import-unpack-limit.sh
We excluded one set of test scripts in these commands, though: the range
of `git p4` tests. The reason? `git p4` stores the (foreign) remote
branch in the branch called `p4/master`, which is obviously not the
default branch. Manual analysis revealed that only five of these tests
actually require a specific default branch name to pass; They were
modified thusly:
$ sed -i '/^ *\. \.\/lib-git-p4\.sh$/i\
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\
' t/t980[0167]*.sh t/t9811*.sh
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that shortlog supports reading from trailers, it can be useful to
combine counts from multiple trailers, or between trailers and authors.
This can be done manually by post-processing the output from multiple
runs, but it's non-trivial to make sure that each name/commit pair is
counted only once.
This patch teaches shortlog to accept multiple --group options on the
command line, and pull data from all of them. That makes it possible to
run:
git shortlog -ns --group=author --group=trailer:co-authored-by
to get a shortlog that counts authors and co-authors equally.
The implementation is mostly straightforward. The "group" enum becomes a
bitfield, and the trailer key becomes a list. I didn't bother
implementing the multi-group semantics for reading from stdin. It would
be possible to do, but the existing matching code makes it awkward, and
I doubt anybody cares.
The duplicate suppression we used for trailers now covers authors and
committers as well (though in non-trailer single-group mode we can skip
the hash insertion and lookup, since we only see one value per commit).
There is one subtlety: we now care about the case when no group bit is
set (in which case we default to showing the author). The caller in
builtin/log.c needs to be adapted to ask explicitly for authors, rather
than relying on shortlog_init(). It would be possible with some
gymnastics to make this keep working as-is, but it's not worth it for a
single caller.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Trailers don't necessarily contain name/email identity values, so
shortlog has so far treated them as opaque strings. However, since many
trailers do contain identities, it's useful to treat them as such when
they can be parsed. That lets "-e" work as usual, as well as mailmap.
When they can't be parsed, we'll continue with the old behavior of
treating them as a single string (there's no new test for that here,
since the existing tests cover a trailer like this).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current documentation is vague about what happens with
--group=trailer:signed-off-by when we see a commit with:
Signed-off-by: One
Signed-off-by: Two
Signed-off-by: One
We clearly should credit both "One" and "Two", but should "One" get
credited twice? The current code does so, but mostly because that was
the easiest thing to do. It's probably more useful to count each commit
at most once. This will become especially important when we allow
values from multiple sources in a future patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a project uses commit trailers, this patch lets you use
shortlog to see who is performing each action. For example,
running:
git shortlog -ns --group=trailer:reviewed-by
in git.git shows who has reviewed. You can even use a custom
format to see things like who has helped whom:
git shortlog --format="...helped %an (%ad)" \
--group=trailer:helped-by
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for adding more grouping types, let's refactor the
committer/author grouping code and add a user-facing option that binds
them together. In particular:
- the main option is now "--group", to make it clear
that the various group types are mutually exclusive. The
"--committer" option is an alias for "--group=committer".
- we keep an enum rather than a binary flag, to prepare
for more values
- we prefer switch statements to ternary assignment, since
other group types will need more custom code
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using 'test_must_be_empty' is preferable to 'test ! -s', because it
gives a helpful error message if the given file is unexpectedly no
empty, while the latter remains completely silent. Furthermore, it
also catches cases when the given file unexpectedly does not exist at
all.
This patch was created by:
sed -i -e 's/test ! -s/test_must_be_empty/' t[0-9]*.sh
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switch all uses of $_x40 to $OID_REGEX so that they work correctly with
larger hashes. This commit was created by using the following sed
command to modify all files in the t directory except t/test-lib.sh:
sed -i 's/\$_x40/$OID_REGEX/g'
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we are outside a repo and have any arguments left after
option-parsing, `setup_revisions()` will try to do its job and
something like this will happen:
$ git shortlog v2.16.0..
BUG: environment.c:183: git environment hasn't been setup
Aborted (core dumped)
The usage is wrong, but we could obviously handle this better. Note that
commit abe549e179 (shortlog: do not require to run from inside a git
repository, 2008-03-14) explicitly enabled `git shortlog` to run from
outside a repo, since we do not need a repo for parsing data from stdin.
Disallow left-over arguments when run from outside a repo.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test for '--abbrev' in t4201-shortlog.sh assumes that the commits
generated in the test can always be uniquely abbreviated to 5 hex digits
but this is not always the case. If you were unlucky and happened to run
the test at (say) Thu Jun 22 03:04:49 2017 +0000, you would find that
the first commit generated would collide with a tree object created
later in the same test.
This can be simulated in the version of t4201-shortlog.sh prior to this
commit by setting GIT_COMMITTER_DATE and GIT_AUTHOR_DATE to 1498100689
after sourcing test-lib.sh.
Change the test to test --abbrev=35 instead of --abbrev=5 to almost
completely avoid the possibility of a partial collision and add a call
to test_tick in the setup to make the test repeatable (the latter alone
is sufficient to make it robust enough).
Signed-off-by: Charles Bailey <cbailey32@bloomberg.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make sure the tests do not depend on the result of the previous
tests. With MINGW prerequisite satisfied, a "reset to original and
rebuild" in an earlier test was skipped, resulting in different
history being tested with this and the next tests.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This puts the final touches on the feature added by
fbfda15fb8 (shortlog: group by committer information,
2016-10-11).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust t4201 to pass on Windows; a couple of test cases need to be
skipped on Windows which leads to a different shortlog than on Linux.
Let's just fix that by limiting the shortlog's commit range to traverse
only one commit: that guarantees that it does not matter how many test
cases were skipped.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Thanks to the diff option parsing, we already know about this option.
We just have to make use of it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git log" shows the log message indented by 4-spaces, the
remainder of a line after a HT does not align in the way the author
originally intended. The command now expands tabs by default in
such a case, and allows the users to override it with a new option,
'--no-expand-tabs'.
* lt/pretty-expand-tabs:
pretty: test --expand-tabs
pretty: allow tweaking tabwidth in --expand-tabs
pretty: enable --expand-tabs by default for selected pretty formats
pretty: expand tabs in indented logs to make things line up properly
"git log --pretty={medium,full,fuller}" and "git log" by default
prepend 4 spaces to the log message, so it makes sense to enable
the new "expand-tabs" facility by default for these formats.
Add --no-expand-tabs option to override the new default.
The change alone breaks a test in t4201 that runs "git shortlog"
on the output from "git log", and expects that the output from
"git log" does not do such a tab expansion. Adjust the test to
explicitly disable expand-tabs with --no-expand-tabs.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git tries to avoid creating a commit with an empty author
name or email. However, commits created by older, less
strict versions of git may still be in the history. There's
not much point in issuing a warning to stderr for an empty
author. The user can't do anything about it now, and we are
better off to simply include it in the shortlog output as an
empty name/email, and let the caller process it however they
see fit.
Older versions of shortlog differentiated between "author
header not present" (which complained) and "author
name/email are blank" (which included the empty ident in the
output). But since switching to format_commit_message, we
complain to stderr about either case (linux.git has a blank
author deep in its history which triggers this).
We could try to restore the older behavior (complaining only
about the missing header), but in retrospect, there's not
much point in differentiating these cases. A missing
author header is bogus, but as for the "blank" case, the
only useful behavior is to add it to the "empty name"
collection.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The original git-shortlog could read both the normal "git
log" output as well as "git log --format=raw". However, when
it was converted to C by b8ec592 (Build in shortlog,
2006-10-22), the trailing colon became mandatory, and we no
longer matched the raw output.
Given the amount of intervening time without any bug
reports, it's probable that nobody cares. But it's
relatively easy to fix, and the end result is hopefully more
readable than the original.
Note that this no longer matches "author: ", which we did
before, but that has never been a format generated by git.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Support for Back when bdccd3c1 (test-lib: allow negation of
prerequisites, 2012-11-14) introduced negated predicates
(e.g. "!MINGW,!CYGWIN"), we already had 5 test files that use
NOT_MINGW (and a few MINGW) as prerequisites.
Let's not add NOT_FOO and rewrite existing ones as !FOO for both
MINGW and CYGWIN.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Windows, all native APIs are Unicode-based. It is impossible to pass
legacy encoded byte arrays to a process via command line or environment
variables. Disable the tests that try to do so.
In t3901, most tests still work if we don't mess up the repository encoding
in setup, so don't switch to ISO-8859-1 on MinGW.
Note that i18n tests that do their encoding tricks via encoded files (such
as t3900) are not affected by this.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Stepan Kasal <kasal@ucw.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"log --exclude=<glob> --all | shortlog" worked as expected, but
"shortlog --exclude=<glob> --all" was not accepted at the command
line argument parser level.
* jc/shortlog-ref-exclude:
shortlog: allow --exclude=<glob> to be passed
These two commands are supposed to be equivalent:
$ git log --exclude=refs/notes/\* --all --no-merges --since=2.days |
git shortlog
$ git shortlog --exclude=refs/notes/\* --all --no-merges --since=2.days
However, the latter does not understand the ref-exclusion command
line option, even though other options understood by "log", such as
"--all" and "--no-merges", are understood.
This was because e7b432c5 (revision: introduce --exclude=<glob> to
tame wildcards, 2013-08-30) did not wire the new option fully to the
machinery. A new option understood by handle_revision_pseudo_opt()
must be told to handle_revision_opt() as well.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of git's traversals are robust against minor breakages
in commit data. For example, "git log" will still output an
entry for a commit that has a broken encoding or missing
author, and will not abort the whole operation.
Shortlog, on the other hand, will die as soon as it sees a
commit without an author, meaning that a repository with
a broken commit cannot get any shortlog output at all.
Let's downgrade this fatal error to a warning, and continue
the operation.
We simply ignore the commit and do not count it in the total
(since we do not have any author under which to file it).
Alternatively, we could output some kind of "<empty>" record
to collect these bogus commits. It is probably not worth it,
though; we have already warned to stderr, so the user is
aware that such bogosities exist, and any placeholder we
came up with would either be syntactically invalid, or would
potentially conflict with real data.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A recent commit [1] fixed a off-by-one wrapping error. As a
side-effect, the conditional in add_wrapped_shortlog_msg() to decide
whether to append a newline needs to be removed. The function
should always append a newline, which was the case before the
off-by-one fix, because strbuf_add_wrapped_text() never returns a
value of wraplen; when it returns wraplen, the string does not end
with a newline, so this caller needs to add one anyway.
[1] 14e1a4e1ff utf8: fix off-by-one
wrapping of text
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Breaks in a test assertion's && chain can potentially hide
failures from earlier commands in the chain.
Commands intended to fail should be marked with !, test_must_fail, or
test_might_fail. The examples in this patch do not require that.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prior to this, the output of git log -1 --format=%h was always 7
characters long, without regard to whether --abbrev had been passed.
Signed-off-by: Will Palmer <wmpalmer@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Do not document the --pretty synonym, since it takes too long to
explain the name to people.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Follow the current prevailing style. This also has the benefit of
capturing any stray output and noticing if any of the setup commands
start failing.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some ancient platforms do not have an extensive list of alternate names for
character encodings. For example, Solaris 7 and IRIX 6.5 do not know that
ISO-8859-1 is the same as ISO8859-1. Modern platforms do know this, so use
the older name.
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Don't take the author name information without re-encoding from the raw
commit object buffer.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We had a handful test updates since we accepted 82ebb0b (add test_cmp
function for test scripts). This fixes them up.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/portable:
t6000lib: re-fix tr portability
t7505: use SHELL_PATH in hook
t9112: add missing #!/bin/sh header
filter-branch: use $SHELL_PATH instead of 'sh'
filter-branch: don't use xargs -0
add NO_EXTERNAL_GREP build option
t6000lib: tr portability fix
t4020: don't use grep -a
add test_cmp function for test scripts
remove use of "tail -n 1" and "tail -1"
grep portability fix: don't use "-e" or "-q"
more tr portability test script fixes
t0050: perl portability fix
tr portability fixes
Once upon a time shortlog could be run from a non-git directory
and still do its job. Fix this regression and add a small test
for it.
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many scripts compare actual and expected output using
"diff -u". This is nicer than "cmp" because the output shows
how the two differ. However, not all versions of diff
understand -u, leading to unnecessary test failure.
This adds a test_cmp function to the test scripts and
switches all "diff -u" invocations to use it. The function
uses the contents of "$GIT_TEST_CMP" to compare its
arguments; the default is "diff -u".
On systems with a less-capable diff, you can do:
GIT_TEST_CMP=cmp make test
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that git-commit got chatty, we have to shut it up again.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Make every test executable. Remove exec-attribute from included shell files,
they can't used standalone anyway.
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
instead of embedded subshell. It actually breaks here (dash as /bin/sh):
t4201-shortlog.sh: 27: Syntax error: Missing '))'
FATAL: Unexpected exit with code 2
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Some oneline descriptions are just too long. In shortlog, it looks much
nicer when they are wrapped. Since print_wrapped_text() is UTF-8 aware,
it also works with those descriptions.
[jc: with minimum fixes]
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>