cat-file: add mailmap support to --batch-check option

Even though the cat-file command with the `--batch-check` option does
not complain when the `--use-mailmap` option is given, the latter
option is ignored. Compute the size of the object after replacing the
idents and report it instead.
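
To illustrate why the reported size has to be recomputed, here is a toy
sketch in C. `replace_ident()` is a hypothetical helper written for this
example; it is not git's `replace_idents_using_mailmap()` and only
handles a single literal substitution. It shows that swapping an ident
for its canonical form changes the length of the buffer by the
difference between the two ident lengths.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not git's API): return a newly allocated copy of
 * `buf` with the first occurrence of `from` replaced by `to`, and store
 * the length of the result in `*len`. Returns NULL if allocation fails.
 */
static char *replace_ident(const char *buf, const char *from,
			   const char *to, size_t *len)
{
	const char *hit = strstr(buf, from);
	size_t buf_len = strlen(buf);
	size_t from_len = strlen(from);
	size_t to_len = strlen(to);
	size_t prefix, tail;
	char *out;

	if (!hit) {
		/* Nothing to map: the size stays the same. */
		out = malloc(buf_len + 1);
		if (!out)
			return NULL;
		memcpy(out, buf, buf_len + 1);
		*len = buf_len;
		return out;
	}

	prefix = (size_t)(hit - buf);
	tail = buf_len - prefix - from_len;
	*len = prefix + to_len + tail;

	out = malloc(*len + 1);
	if (!out)
		return NULL;
	memcpy(out, buf, prefix);
	memcpy(out + prefix, to, to_len);
	/* Copy the rest of the buffer, including the terminating NUL. */
	memcpy(out + prefix + to_len, hit + from_len, tail + 1);
	return out;
}
```

The value stored in `*len` plays the role of the size that
`--batch-check` should report once the idents have been rewritten.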

In order to make the `--batch-check` option honour the mailmap
mechanism, we have to read the contents of the commit/tag object.

There were two ways to do it:

1. Make two calls to `oid_object_info_extended()`. If `--use-mailmap`
   option is given, the first call will get us the type of the object
   and the second call, which gets the contents of the object, will
   only be made if the object type is either a commit or a tag.

2. Make one call to `oid_object_info_extended()` to get the type of the
   object. Then, if the object type is either a commit or a tag, make
   a call to `repo_read_object_file()` to read the contents of the
   object.
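
The control flow of the second approach can be sketched with a toy
object store in C. Every name below (`toy_store`, `toy_object_info()`,
`toy_read_mapped()`, `toy_batch_check_size()`) is an assumption made up
for this illustration and does not exist in git; the "mailmap rewrite"
is simply a precomputed string. The point is only that the cheap type
lookup always runs, while the contents are loaded for commits and tags
alone.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins; none of these names exist in git. */
enum toy_type { TOY_BLOB, TOY_TREE, TOY_COMMIT, TOY_TAG };

struct toy_object {
	enum toy_type type;
	const char *raw;	/* contents as stored */
	const char *mapped;	/* contents after a pretend mailmap rewrite */
};

struct toy_store {
	const struct toy_object *objs;
	size_t nr;
	unsigned long content_reads;	/* how often contents were loaded */
};

/* Cheap lookup: type and stored size only, contents are not loaded. */
static int toy_object_info(const struct toy_store *store, size_t idx,
			   enum toy_type *type, size_t *size)
{
	if (idx >= store->nr)
		return -1;
	*type = store->objs[idx].type;
	*size = strlen(store->objs[idx].raw);
	return 0;
}

/* Expensive path: load the contents and apply the pretend rewrite. */
static const char *toy_read_mapped(struct toy_store *store, size_t idx)
{
	store->content_reads++;
	return store->objs[idx].mapped;
}

/* Size to report for one object; contents are read only when needed. */
static int toy_batch_check_size(struct toy_store *store, size_t idx,
				int use_mailmap, size_t *size)
{
	enum toy_type type;

	if (toy_object_info(store, idx, &type, size) < 0)
		return -1;
	if (use_mailmap && (type == TOY_COMMIT || type == TOY_TAG))
		*size = strlen(toy_read_mapped(store, idx));
	return 0;
}
```

In this model, blobs and trees never increment `content_reads`, which is
the property that keeps the extra cost limited to commit and tag
objects.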

I benchmarked the following command with both the above approaches and
compared against the current implementation where `--use-mailmap`
option is ignored:

`git cat-file --use-mailmap --batch-all-objects --batch-check --buffer
--unordered`

The results can be summarized as follows:

                      Time (mean ± σ)
default               827.7 ms ± 104.8 ms
first approach        6.197 s ± 0.093 s
second approach       1.975 s ± 0.217 s

Since the second approach is faster than the first one, I implemented
it in this patch.

The `git cat-file` command can now use the mailmap mechanism to replace
idents with canonical versions for commit and tag objects. There are
several options like `--batch`, `--batch-check` and `--batch-command`
that can be combined with `--use-mailmap`. But the documentation for
`--batch`, `--batch-check` and `--batch-command` doesn't say so. This
patch fixes that documentation.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Siddharth Asthana 2022-12-20 11:31:13 +05:30 committed by Junio C Hamano
parent 49050a043b
commit a797c0ea04
3 changed files with 87 additions and 13 deletions


@@ -91,26 +91,49 @@ OPTIONS
 --batch::
 --batch=<format>::
 	Print object information and contents for each object provided
 	on stdin. May not be combined with any other options or arguments
-	except `--textconv` or `--filters`, in which case the input lines
-	also need to specify the path, separated by whitespace. See the
-	section `BATCH OUTPUT` below for details.
+	except `--textconv`, `--filters`, or `--use-mailmap`.
+	+
+	* When used with `--textconv` or `--filters`, the input lines
+	  must specify the path, separated by whitespace. See the section
+	  `BATCH OUTPUT` below for details.
+	+
+	* When used with `--use-mailmap`, for commit and tag objects, the
+	  contents part of the output shows the identities replaced using the
+	  mailmap mechanism, while the information part of the output shows
+	  the size of the object as if it actually recorded the replacement
+	  identities.
 --batch-check::
 --batch-check=<format>::
-	Print object information for each object provided on stdin. May
-	not be combined with any other options or arguments except
-	`--textconv` or `--filters`, in which case the input lines also
-	need to specify the path, separated by whitespace. See the
-	section `BATCH OUTPUT` below for details.
+	Print object information for each object provided on stdin. May not be
+	combined with any other options or arguments except `--textconv`, `--filters`
+	or `--use-mailmap`.
+	+
+	* When used with `--textconv` or `--filters`, the input lines must
+	  specify the path, separated by whitespace. See the section
+	  `BATCH OUTPUT` below for details.
+	+
+	* When used with `--use-mailmap`, for commit and tag objects, the
+	  printed object information shows the size of the object as if the
+	  identities recorded in it were replaced by the mailmap mechanism.
 --batch-command::
 --batch-command=<format>::
 	Enter a command mode that reads commands and arguments from stdin. May
-	only be combined with `--buffer`, `--textconv` or `--filters`. In the
-	case of `--textconv` or `--filters`, the input lines also need to specify
-	the path, separated by whitespace. See the section `BATCH OUTPUT` below
-	for details.
+	only be combined with `--buffer`, `--textconv`, `--use-mailmap` or
+	`--filters`.
+	+
+	* When used with `--textconv` or `--filters`, the input lines must
+	  specify the path, separated by whitespace. See the section
+	  `BATCH OUTPUT` below for details.
+	+
+	* When used with `--use-mailmap`, for commit and tag objects, the
+	  `contents` command shows the identities replaced using the
+	  mailmap mechanism, while the `info` command shows the size
+	  of the object as if it actually recorded the replacement
+	  identities.
 +
 `--batch-command` recognizes the following commands:
 +


@@ -444,6 +444,9 @@ static void batch_object_write(const char *obj_name,
 	if (!data->skip_object_info) {
 		int ret;
+		if (use_mailmap)
+			data->info.typep = &data->type;
 		if (pack)
 			ret = packed_object_info(the_repository, pack, offset,
 						 &data->info);
@@ -457,6 +460,18 @@ static void batch_object_write(const char *obj_name,
 			fflush(stdout);
 			return;
 		}
+		if (use_mailmap && (data->type == OBJ_COMMIT || data->type == OBJ_TAG)) {
+			size_t s = data->size;
+			char *buf = NULL;
+			buf = repo_read_object_file(the_repository, &data->oid, &data->type,
+						    &data->size);
+			buf = replace_idents_using_mailmap(buf, &s);
+			data->size = cast_size_t_to_ulong(s);
+			free(buf);
+		}
 	}
 	strbuf_reset(scratch);


@@ -1051,4 +1051,40 @@ test_expect_success 'git cat-file -s returns correct size with --use-mailmap for
 	test_cmp expect actual
 '
+test_expect_success 'git cat-file --batch-check returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	git cat-file commit HEAD >commit.out &&
+	commit_size=$(wc -c <commit.out) &&
+	commit_sha=$(git rev-parse HEAD) &&
+	echo $commit_sha commit $commit_size >expect &&
+	git cat-file --use-mailmap commit HEAD >commit.out &&
+	commit_size=$(wc -c <commit.out) &&
+	echo $commit_sha commit $commit_size >>expect &&
+	echo "HEAD" >in &&
+	git cat-file --batch-check <in >actual &&
+	git cat-file --use-mailmap --batch-check <in >>actual &&
+	test_cmp expect actual
+'
+test_expect_success 'git cat-file --batch-command returns correct size with --use-mailmap' '
+	test_when_finished "rm .mailmap" &&
+	cat >.mailmap <<-\EOF &&
+	C O Mitter <committer@example.com> Orig <orig@example.com>
+	EOF
+	git cat-file commit HEAD >commit.out &&
+	commit_size=$(wc -c <commit.out) &&
+	commit_sha=$(git rev-parse HEAD) &&
+	echo $commit_sha commit $commit_size >expect &&
+	git cat-file --use-mailmap commit HEAD >commit.out &&
+	commit_size=$(wc -c <commit.out) &&
+	echo $commit_sha commit $commit_size >>expect &&
+	echo "info HEAD" >in &&
+	git cat-file --batch-command <in >actual &&
+	git cat-file --use-mailmap --batch-command <in >>actual &&
+	test_cmp expect actual
+'
 test_done