Commit Graph

60039 Commits

Author SHA1 Message Date
Junio C Hamano
efafdca421 Merge branch 'dl/test-must-fail-fixes-5'
The effort to avoid using test_must_fail on non-git command continues.

* dl/test-must-fail-fixes-5:
  lib-submodule-update: pass 'test_must_fail' as an argument
  lib-submodule-update: prepend "git" to $command
  lib-submodule-update: consolidate --recurse-submodules
  lib-submodule-update: add space after function name
2020-07-06 22:09:18 -07:00
Junio C Hamano
0a23331aa6 Merge branch 'jk/fast-export-anonym-alt'
"git fast-export --anonymize" learned to take customized mapping to
allow its users to tweak its output more usable for debugging.

* jk/fast-export-anonym-alt:
  fast-export: use local array to store anonymized oid
  fast-export: anonymize "master" refname
  fast-export: allow seeding the anonymized mapping
  fast-export: add a "data" callback parameter to anonymize_str()
  fast-export: move global "idents" anonymize hashmap into function
  fast-export: use a flex array to store anonymized entries
  fast-export: stop storing lengths in anonymized hashmaps
  fast-export: tighten anonymize_mem() interface to handle only strings
  fast-export: store anonymized oids as hex strings
  fast-export: use xmemdupz() for anonymizing oids
  t9351: derive anonymized tree checks from original repo
2020-07-06 22:09:17 -07:00
Junio C Hamano
0ac0947b14 Merge branch 'js/diff-files-i-t-a-fix-for-difftool'
"git difftool" has trouble dealing with paths added to the index
with the intent-to-add bit.

* js/diff-files-i-t-a-fix-for-difftool:
  difftool -d: ensure that intent-to-add files are handled correctly
  diff-files --raw: show correct post-image of intent-to-add files
2020-07-06 22:09:17 -07:00
Junio C Hamano
11cbda2add Merge branch 'js/default-branch-name'
The name of the primary branch in existing repositories, and the
default name used for the first branch in newly created
repositories, is made configurable, so that we can eventually wean
ourselves off of the hardcoded 'master'.

* js/default-branch-name:
  contrib: subtree: adjust test to change in fmt-merge-msg
  testsvn: respect `init.defaultBranch`
  remote: use the configured default branch name when appropriate
  clone: use configured default branch name when appropriate
  init: allow setting the default for the initial branch name via the config
  init: allow specifying the initial branch name for the new repository
  docs: add missing diamond brackets
  submodule: fall back to remote's HEAD for missing remote.<name>.branch
  send-pack/transport-helper: avoid mentioning a particular branch
  fmt-merge-msg: stop treating `master` specially
2020-07-06 22:09:17 -07:00
Junio C Hamano
480e78595e Merge branch 'rs/pack-bits-in-object-better'
By renumbering object flag bits, "struct object" managed to lose
bloated inter-field padding.

* rs/pack-bits-in-object-better:
  revision: reallocate TOPO_WALK object flags
2020-07-06 22:09:17 -07:00
Junio C Hamano
67d99b82de Merge branch 'bc/http-push-flagsfix'
The code to push changes over "dumb" HTTP had a bad interaction
with the commit reachability code due to incorrect allocation of
object flag bits, which has been corrected.

* bc/http-push-flagsfix:
  http-push: ensure unforced pushes fail when data would be lost
2020-07-06 22:09:17 -07:00
Junio C Hamano
8a78e4d615 Merge branch 'js/pu-to-seen'
The documentation and some tests have been adjusted for the recent
renaming of "pu" branch to "seen".

* js/pu-to-seen:
  tests: reference `seen` wherever `pu` was referenced
  docs: adjust the technical overview for the rename `pu` -> `seen`
  docs: adjust for the recent rename of `pu` to `seen`
2020-07-06 22:09:16 -07:00
Junio C Hamano
0258ed1e08 Merge branch 'cb/is-descendant-of'
Code clean-up.

* cb/is-descendant-of:
  commit-reach: avoid is_descendant_of() shim
2020-07-06 22:09:16 -07:00
Junio C Hamano
5c61d10b16 Merge branch 'mk/pb-pretty-email-without-domain-part-fix'
Docfix.

* mk/pb-pretty-email-without-domain-part-fix:
  doc: fix author vs. committer copy/paste error
2020-07-06 22:09:15 -07:00
Junio C Hamano
65ffaca0e4 Merge branch 'jl/complete-git-prune'
Add "git prune" to the completion (in contrib/), which could be
typed by end-users from the command line.

* jl/complete-git-prune:
  bash-completion: add git-prune into bash completion
2020-07-06 22:09:15 -07:00
Junio C Hamano
645f63111b Merge branch 'es/get-worktrees-unsort'
API cleanup for get_worktrees()

* es/get-worktrees-unsort:
  worktree: drop get_worktrees() unused 'flags' argument
  worktree: drop get_worktrees() special-purpose sorting option
2020-07-06 22:09:15 -07:00
Junio C Hamano
e7e113a1df Merge branch 'bc/sha-256-cvs-svn-updates'
CVS/SVN interface have been prepared for SHA-256 transition

* bc/sha-256-cvs-svn-updates:
  git-cvsexportcommit: port to SHA-256
  git-cvsimport: port to SHA-256
  git-cvsserver: port to SHA-256
  git-svn: set the OID length based on hash algorithm
  perl: make SVN code hash independent
  perl: make Git::IndexInfo work with SHA-256
  perl: create and switch variables for hash constants
  t/lib-git-svn: make hash size independent
  t9101: make hash independent
  t9104: make hash size independent
  t9100: make test work with SHA-256
  t9108: make test hash independent
  t9168: make test hash independent
  t9109: make test hash independent
2020-07-06 22:09:14 -07:00
Junio C Hamano
d80bea479d Merge branch 'ak/commit-graph-to-slab'
A few fields in "struct commit" that do not have to always be
present have been moved to commit slabs.

* ak/commit-graph-to-slab:
  commit-graph: minimize commit_graph_data_slab access
  commit: move members graph_pos, generation to a slab
  commit-graph: introduce commit_graph_data_slab
  object: drop parsed_object_pool->commit_count
2020-07-06 22:09:14 -07:00
Junio C Hamano
0cc4dcacb3 Merge branch 'en/sparse-status'
"git status" learned to report the status of sparse checkout.

* en/sparse-status:
  git-prompt: include sparsity state as well
  git-prompt: document how in-progress operations affect the prompt
  wt-status: show sparse checkout status as well
2020-07-06 22:09:13 -07:00
Junio C Hamano
33a22c1a88 Merge branch 'ps/ref-transaction-hook'
A new hook.

* ps/ref-transaction-hook:
  refs: implement reference transaction hook
2020-07-06 22:09:13 -07:00
Junio C Hamano
12210859da Merge branch 'bc/sha-256-part-2'
SHA-256 migration work continues.

* bc/sha-256-part-2: (44 commits)
  remote-testgit: adapt for object-format
  bundle: detect hash algorithm when reading refs
  t5300: pass --object-format to git index-pack
  t5704: send object-format capability with SHA-256
  t5703: use object-format serve option
  t5702: offer an object-format capability in the test
  t/helper: initialize the repository for test-sha1-array
  remote-curl: avoid truncating refs with ls-remote
  t1050: pass algorithm to index-pack when outside repo
  builtin/index-pack: add option to specify hash algorithm
  remote-curl: detect algorithm for dumb HTTP by size
  builtin/ls-remote: initialize repository based on fetch
  t5500: make hash independent
  serve: advertise object-format capability for protocol v2
  connect: parse v2 refs with correct hash algorithm
  connect: pass full packet reader when parsing v2 refs
  Documentation/technical: document object-format for protocol v2
  t1302: expect repo format version 1 for SHA-256
  builtin/show-index: provide options to determine hash algo
  t5302: modernize test formatting
  ...
2020-07-06 22:09:13 -07:00
Han-Wen Nienhuys
9e35a6a986 lib-t6000.sh: write tag using git-update-ref
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-06 21:38:32 -07:00
René Scharfe
01faa91cb7 revision: disable min_age optimization with line-log
If one of the options --before, --min-age or --until is given,
limit_list() filters out younger commits early on.  Line-log needs all
those commits to trace the movement of line ranges, though.  Skip this
optimization if both are used together.

Reported-by: Мария Долгополова <dolgopolovamariia@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-06 18:38:03 -07:00
Johannes Schindelin
3080c50980 difftool -d: ensure that intent-to-add files are handled correctly
In https://github.com/git-for-windows/git/issues/2677, a `git difftool
-d` problem was reported. The underlying cause was a bug in `git
diff-files --raw` that we just fixed: it reported intent-to-add files
with the empty _tree_ as the post-image OID, when we need to show
an all-zero (or, "null") OID instead, to indicate to the caller that
they have to look at the worktree file.

The symptom of that problem shown by `git difftool` was this:

	error: unable to read sha1 file of <path> (<empty-tree-OID>)
	error: could not write '<filename>'

Make sure that the reported `difftool` problem stays fixed.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01 16:15:45 -07:00
Johannes Schindelin
85953a3187 diff-files --raw: show correct post-image of intent-to-add files
The documented behavior of `git diff-files --raw` is to display

	[...] 0{40} if creation, unmerged or "look at work tree".

on the right hand (i.e. postimage) side. This happens for files that
have unstaged modifications, and for files that are unmodified but
stat-dirty.

For intent-to-add files, we used to show the empty blob's hash instead.
In c26022ea8f (diff: convert diff_addremove to struct object_id,
2017-05-30), we made that worse by inadvertently changing that to the
hash of the empty tree.

Let's make the behavior consistent with files that have unstaged
modifications (which applies to intent-to-add files, too) by showing
all-zero values also for intent-to-add files.

Accordingly, this patch adjusts the expectations set by the regression
test introduced in feea6946a5 (diff-files: treat "i-t-a" files as
"not-in-index", 2020-06-20).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01 16:15:43 -07:00
Rafael Aquini
f9f60d7066 send-email: restore --in-reply-to superseding behavior
git send-email --in-reply-to= fails to override In-Reply-To email headers,
if they're present in the output of format-patch, even when explicitly
told to do so by the option --no-thread, which breaks the contract of the
command line switch option, per its man page.

"
   --in-reply-to=<identifier>
       Make the first mail (or all the mails with --no-thread) appear as
       a reply to the given Message-Id, which avoids breaking threads to
       provide a new patch series.
"

This patch fixes the aformentioned issue, by bringing --in-reply-to's old
overriding behavior back.

The test was donated by Carlo Marcelo Arenas Belón.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01 16:12:21 -07:00
Christian Couder
0172f7834a cat-file: add missing [=<format>] to usage/synopsis
When displaying cat-file usage, the fact that a <format> can
be specified is only visible when lookling at the --batch and
--batch-check options which are shown like this:

    --batch[=<format>]    show info and content of objects fed from the standard input
    --batch-check[=<format>]
                          show info about objects fed from the standard input

It seems more coherent and improves discovery to also show it
on the usage line.

In the documentation the DESCRIPTION tells us that "The output
format can be overridden using the optional <format> argument",
but we can't see the <format> argument in the SYNOPSIS above
the description which is confusing.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01 15:54:05 -07:00
Ville Skyttä
c2dbcd206d completion: nounset mode fixes
Accessing unset variables results an errors when the shell is in
nounset/-u mode. This fixes the cases I've come across while using git
completion in a shell running in that mode for a while. It's hard to
tell if this is the complete set, but at least it improves things.

Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01 14:55:30 -07:00
SZEDER Gábor
c525ce95b4 commit-graph: check all leading directories in changed path Bloom filters
The file 'dir/subdir/file' can only be modified if its leading
directories 'dir' and 'dir/subdir' are modified as well.

So when checking modified path Bloom filters looking for commits
modifying a path with multiple path components, then check not only
the full path in the Bloom filters, but all its leading directories as
well.  Take care to check these paths in "deepest first" order,
because it's the full path that is least likely to be modified, and
the Bloom filter queries can short circuit sooner.

This can significantly reduce the average false positive rate, by
about an order of magnitude or three(!), and can further speed up
pathspec-limited revision walks.  The table below compares the average
false positive rate and runtime of

  git rev-list HEAD -- "$path"

before and after this change for 5000+ randomly* selected paths from
each repository:

                    Average false           Average        Average
                    positive rate           runtime        runtime
                  before     after     before     after   difference
  ------------------------------------------------------------------
  git             3.220%   0.7853%     0.0558s   0.0387s   -30.6%
  linux           2.453%   0.0296%     0.1046s   0.0766s   -26.8%
  tensorflow      2.536%   0.6977%     0.0594s   0.0420s   -29.2%

*Path selection was done with the following pipeline:

	git ls-tree -r --name-only HEAD | sort -R | head -n 5000

The improvements in runtime are much smaller than the improvements in
average false positive rate, as we are clearly reaching diminishing
returns here.  However, all these timings depend on that accessing
tree objects is reasonably fast (warm caches).  If we had a partial
clone and the tree objects had to be fetched from a promisor remote,
e.g.:

  $ git clone --filter=tree:0 --bare file://.../webkit.git webkit.notrees.git
  $ git -C webkit.git -c core.modifiedPathBloomFilters=1 \
        commit-graph write --reachable
  $ cp webkit.git/objects/info/commit-graph webkit.notrees.git/objects/info/
  $ git -C webkit.notrees.git -c core.modifiedPathBloomFilters=1 \
        rev-list HEAD -- "$path"

then checking all leading path component can reduce the runtime from
over an hour to a few seconds (and this is with the clone and the
promisor on the same machine).

This adjusts the tracing values in t4216-log-bloom.sh, which provides a
concrete way to notice the improvement.

Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01 14:17:43 -07:00
Taylor Blau
f3c2a36810 revision: empty pathspecs should not use Bloom filters
The prepare_to_use_bloom_filter() method was not intended to be called
on an empty pathspec. However, 'git log -- .' and 'git log' are subtly
different: the latter reports all commits while the former will simplify
commits that do not change the root tree.

This means that the path used to construct the bloom_key might be empty,
and that value is not added to the Bloom filter during construction.
That means that the results are likely incorrect!

To resolve the issue, be careful about the length of the path and stop
filling Bloom filters. To be completely sure we do not use them, drop
the pointer to the bloom_filter_settings from the commit-graph. That
allows our test to look at the trace2 logs to verify no Bloom filter
statistics are reported.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01 14:17:43 -07:00
Derrick Stolee
dc8e95ba7c revision.c: fix whitespace
Here, four spaces were used instead of tab characters.

Reported-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01 14:17:43 -07:00
SZEDER Gábor
2dd4fed927 commit-graph: check chunk sizes after writing
In my experience while experimenting with new commit-graph chunks,
early versions of the corresponding new write_commit_graph_my_chunk()
functions are, sadly but not surprisingly, often buggy, and write more
or less data than they are supposed to, especially if the chunk size
is not directly proportional to the number of commits.  This then
causes all kinds of issues when reading such a bogus commit-graph
file, raising the question of whether the writing or the reading part
happens to be buggy this time.

Let's catch such issues early, already when writing the commit-graph
file, and check that each write_graph_chunk_*() function wrote the
amount of data that it was expected to, and what has been encoded in
the Chunk Lookup table.  Now that all commit-graph chunks are written
in a loop we can do this check in a single place for all chunks, and
any chunks added in the future will get checked as well.

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01 14:17:43 -07:00
SZEDER Gábor
17e6275fc9 commit-graph: simplify chunk writes into loop
In write_commit_graph_file() we now have one block of code filling the
array of 'struct chunk_info' with the IDs and sizes of chunks to be
written, and an other block of code calling the functions responsible
for writing individual chunks.  In case of optional chunks like Extra
Edge List an Base Graphs List there is also a condition checking
whether that chunk is necessary/desired, and that same condition is
repeated in both blocks of code. Other, newer chunks have similar
optional conditions.

Eliminate these repeated conditions by storing the function pointers
responsible for writing individual chunks in the 'struct chunk_info'
array as well, and calling them in a loop to write the commit-graph
file.  This will open up the possibility for a bit of foolproofing in
the following patch.

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01 14:17:43 -07:00
SZEDER Gábor
9bab081dfa commit-graph: unify the signatures of all write_graph_chunk_*() functions
Update the write_graph_chunk_*() helper functions to have the same
signature:

  - Return an int error code from all these functions.
    write_graph_chunk_base() already has an int error code, now the
    others will have one, too, but since they don't indicate any
    error, they will always return 0.

  - Drop the hash size parameter of write_graph_chunk_oids() and
    write_graph_chunk_data(); its value can be read directly from
    'the_hash_algo' inside these functions as well.

This opens up the possibility for further cleanups and foolproofing in
the following two patches.

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01 14:17:43 -07:00
Derrick Stolee
0087a87ba8 commit-graph: persist existence of changed-paths
The changed-path Bloom filters were released in v2.27.0, but have a
significant drawback. A user can opt-in to writing the changed-path
filters using the "--changed-paths" option to "git commit-graph write"
but the next write will drop the filters unless that option is
specified.

This becomes even more important when considering the interaction with
gc.writeCommitGraph (on by default) or fetch.writeCommitGraph (part of
features.experimental). These config options trigger commit-graph writes
that the user did not signal, and hence there is no --changed-paths
option available.

Allow a user that opts-in to the changed-path filters to persist the
property of "my commit-graph has changed-path filters" automatically. A
user can drop filters using the --no-changed-paths option.

In the process, we need to be extremely careful to match the Bloom
filter settings as specified by the commit-graph. This will allow future
versions of Git to customize these settings, and the version with this
change will persist those settings as commit-graphs are rewritten on
top.

Use the trace2 API to signal the settings used during the write, and
check that output in a test after manually adjusting the correct bytes
in the commit-graph file.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01 14:17:43 -07:00
Derrick Stolee
949197420e bloom: fix logic in get_bloom_filter()
The get_bloom_filter() method is a bit complicated in some parts where
it does not need to be. In particular, it needs to return a NULL filter
only when compute_if_not_present is zero AND the filter data cannot be
loaded from a commit-graph file. This currently happens by accident
because the commit-graph does not load changed-path Bloom filters from
an existing commit-graph when writing a new one. This will change in a
later patch.

Also clean up some style issues while we are here.

One side-effect of returning a NULL filter is that the filters that are
reported as "too large" will now be reported as NULL insead of length
zero. This case was not properly covered before, so add a test. Further,
remote the counting of the zero-length filters from revision.c and the
trace2 logs.

Helped-by: René Scharfe <l.s.r@web.de>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01 14:17:43 -07:00
Đoàn Trần Công Danh
508fd8e8ba contrib: subtree: adjust test to change in fmt-merge-msg
We're starting to stop treating `master' specially in fmt-merge-msg.
Adjust the test to reflect that change.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-30 08:41:15 -07:00
Junio C Hamano
a08a83db2b The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-29 14:17:27 -07:00
Junio C Hamano
298d704e70 Merge branch 'sk/diff-files-show-i-t-a-as-new'
"git diff-files" has been taught to say paths that are marked as
intent-to-add are new files, not modified from an empty blob.

* sk/diff-files-show-i-t-a-as-new:
  diff-files: treat "i-t-a" files as "not-in-index"
2020-06-29 14:17:27 -07:00
Junio C Hamano
fa2c57d562 Merge branch 'rs/commit-reach-leakfix'
Leakfix.

* rs/commit-reach-leakfix:
  commit-reach: plug minor memory leak after using is_descendant_of()
2020-06-29 14:17:27 -07:00
Junio C Hamano
b381c98891 Merge branch 'rs/pull-leakfix'
Leakfix.

* rs/pull-leakfix:
  pull: plug minor memory leak after using is_descendant_of()
2020-06-29 14:17:26 -07:00
Junio C Hamano
610486749a Merge branch 'rs/retire-strbuf-write-fd'
A misdesigned strbuf_write_fd() function has been retired.

* rs/retire-strbuf-write-fd:
  strbuf: remove unreferenced strbuf_write_fd method.
  bugreport.c: replace strbuf_write_fd with write_in_full
2020-06-29 14:17:26 -07:00
Junio C Hamano
1ea1f93fd9 Merge branch 'dl/diff-usage-comment-update'
An in-code comment in "git diff" has been updated.

* dl/diff-usage-comment-update:
  builtin/diff: fix botched update of usage comment
  builtin/diff: update usage comment
2020-06-29 14:17:25 -07:00
Junio C Hamano
1033b98291 Merge branch 'xl/upgrade-repo-format'
Allow runtime upgrade of the repository format version, which needs
to be done carefully.

There is a rather unpleasant backward compatibility worry with the
last step of this series, but it is the right thing to do in the
longer term.

* xl/upgrade-repo-format:
  check_repository_format_gently(): refuse extensions for old repositories
  sparse-checkout: upgrade repository to version 1 when enabling extension
  fetch: allow adding a filter after initial clone
  repository: add a helper function to perform repository format upgrade
2020-06-29 14:17:24 -07:00
Jeff King
f39ad38410 fast-export: use local array to store anonymized oid
Some older versions of gcc complain about this line:

  builtin/fast-export.c:412:2: error: dereferencing type-punned pointer
       will break strict-aliasing rules [-Werror=strict-aliasing]
    put_be32(oid.hash + hashsz - 4, counter++);
    ^

This seems to be a false positive, as there's no type-punning at all
here. oid.hash is an array of unsigned char; when we pass it to a
function it decays to a pointer to unsigned char. We do take a void
pointer in put_be32(), but it's immediately aliased with another pointer
to unsigned char (and clearly the compiler is looking inside the inlined
put_be32(), since the warning doesn't happen with -O0).

This happens on gcc 4.8 and 4.9, but not later versions (I tested gcc 6,
7, 8, and 9).

We can work around it by using a local array instead of an object_id
struct. This is a little more intimate with the details of object_id,
but for whatever reason doesn't seem to trigger the compiler warning.
We can revert this patch once we decide that those gcc versions are too
old to care about for a warning like this (gcc 4.8 is the default
compiler for Ubuntu Trusty, which is out-of-support but not fully
end-of-life'd until April 2022).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-25 14:19:23 -07:00
Jeff King
8a49495583 fast-export: anonymize "master" refname
Running "fast-export --anonymize" will leave "refs/heads/master"
untouched in the output, for two reasons:

  - it helped to have some known reference point between the original
    and anonymized repository

  - since it's historically the default branch name, it doesn't leak any
    information

Now that we can ask fast-export to retain particular tokens, we have a
much better tool for the first one (because it works for any ref, not
just master).

For the second, the notion of "default branch name" is likely to become
configurable soon, at which point the name _does_ leak information.
Let's drop this special case in preparation.

Note that we have to adjust the test a bit, since it relied on using the
name "master" in the anonymized repos. We could just use
--anonymize-map=master to keep the same output, but then we wouldn't
know if it works because of our hard-coded master or because of the
explicit map.

So let's flip the test a bit, and confirm that we anonymize "master",
but keep "other" in the output.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-25 14:19:23 -07:00
Jeff King
65b5d9fae7 fast-export: allow seeding the anonymized mapping
After you anonymize a repository, it can be hard to find which commits
correspond between the original and the result, and thus hard to
reproduce commands that triggered bugs in the original.

Let's make it possible to seed the anonymization map. This lets users
either:

  - mark names to be retained as-is, if they don't consider them secret
    (in which case their original commands would just work)

  - map names to new values, which lets them adapt the reproduction
    recipe to the new names without revealing the originals

The implementation is fairly straight-forward. We already store each
anonymized token in a hashmap (so that the same token appearing twice is
converted to the same result). We can just introduce a new "seed"
hashmap which is consulted first.

This does make a few more promises to the user about how we'll anonymize
things (e.g., token-splitting pathnames). But it's unlikely that we'd
want to change those rules, even if the actual anonymization of a single
token changes. And it makes things much easier for the user, who can
unblind only a directory name without having to specify each path within
it.

One alternative to this approach would be to anonymize as we see fit,
and then dump the whole refname and pathname mappings to a file. This
does work, but it's a bit awkward to use (you have to manually dig the
items you care about out of the mapping).

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-25 14:19:23 -07:00
Junio C Hamano
f402ea6816 The fifth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-25 12:36:26 -07:00
Junio C Hamano
f33b5bddaf Merge branch 'pb/t4014-unslave'
A branch name used in a test has been clarified to match what is
going on.

* pb/t4014-unslave:
  t4014: do not use "slave branch" nomenclature
2020-06-25 12:27:48 -07:00
Junio C Hamano
34e849b05a Merge branch 'jt/cdn-offload'
The "fetch/clone" protocol has been updated to allow the server to
instruct the clients to grab pre-packaged packfile(s) in addition
to the packed object data coming over the wire.

* jt/cdn-offload:
  upload-pack: fix a sparse '0 as NULL pointer' warning
  upload-pack: send part of packfile response as uri
  fetch-pack: support more than one pack lockfile
  upload-pack: refactor reading of pack-objects out
  Documentation: add Packfile URIs design doc
  Documentation: order protocol v2 sections
  http-fetch: support fetching packfiles by URL
  http-fetch: refactor into function
  http: refactor finish_http_pack_request()
  http: use --stdin when indexing dumb HTTP pack
2020-06-25 12:27:47 -07:00
Junio C Hamano
10462829e3 Merge branch 'ss/submodule-set-branch-in-c'
Rewrite of parts of the scripted "git submodule" Porcelain command
continues; this time it is "git submodule set-branch" subcommand's
turn.

* ss/submodule-set-branch-in-c:
  submodule: port subcommand 'set-branch' from shell to C
2020-06-25 12:27:47 -07:00
Junio C Hamano
dc4b3cfb92 Merge branch 'ds/merge-base-is-ancestor-optim'
"git merge-base --is-ancestor" is taught to take advantage of the
commit graph.

* ds/merge-base-is-ancestor-optim:
  commit-reach: use fast logic in repo_in_merge_base
  commit-reach: create repo_is_descendant_of()
2020-06-25 12:27:47 -07:00
Junio C Hamano
7b2685ef2d Merge branch 'dl/branch-cleanup'
Code clean-up around "git branch" with a minor bugfix.

* dl/branch-cleanup:
  branch: don't mix --edit-description
  t3200: test for specific errors
  t3200: rename "expected" to "expect"
2020-06-25 12:27:47 -07:00
Junio C Hamano
eb52351a1c Merge branch 'cc/upload-pack-data-3'
Code clean-up in the codepath that serves "git fetch" continues.

* cc/upload-pack-data-3:
  upload-pack: refactor common code into do_got_oid()
  upload-pack: move oldest_have to upload_pack_data
  upload-pack: pass upload_pack_data to got_oid()
  upload-pack: pass upload_pack_data to ok_to_give_up()
  upload-pack: pass upload_pack_data to send_acks()
  upload-pack: pass upload_pack_data to process_haves()
  upload-pack: change allow_unadvertised_object_request to an enum
  upload-pack: move allow_unadvertised_object_request to upload_pack_data
  upload-pack: move extra_edge_obj to upload_pack_data
  upload-pack: move shallow_nr to upload_pack_data
  upload-pack: pass upload_pack_data to send_unshallow()
  upload-pack: pass upload_pack_data to deepen_by_rev_list()
  upload-pack: pass upload_pack_data to deepen()
  upload-pack: pass upload_pack_data to send_shallow_list()
2020-06-25 12:27:46 -07:00
Junio C Hamano
1457886ce2 Merge branch 'ct/diff-with-merge-base-clarification'
"git diff" used to take arguments in random and nonsense range
notation, e.g. "git diff A..B C", "git diff A..B C...D", etc.,
which has been cleaned up.

* ct/diff-with-merge-base-clarification:
  Documentation: usage for diff combined commits
  git diff: improve range handling
  t/t3430: avoid undefined git diff behavior
2020-06-25 12:27:46 -07:00