Instead of using a hard-coded all-zeros object ID, use $ZERO_OID.
Compute the length of the object IDs in use and use this instead of
hard-coding the constant 40.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes. In addition, use cut to filter out the object
IDs and verify only the information that we're really interested in.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow lines which start with either a 40- or 64-character hex object ID,
to allow for both SHA-1 and SHA-256.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One assertion in this test invokes git with core.abbrev set to "40".
Since we're expecting the full hash length, use test_oid to look up the
full hash length for the hash in use.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Compute the length of an object ID instead of hard-coding 40-based
values.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the ZERO_OID variable to abbreviate the all-zeros object ID for
maintainability and to avoid depending on a specific size for the hash.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test to sanitize the diffs and strip out object IDs from
them, as it does for other object IDs, since we are not interested in
the particular values used.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use $ZERO_OID instead of hard-coding a fixed size all-zeros object ID.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using cut with hard-coded hash sizes, use cut with fields, or
where that's not possible, sed with $OID_REGEX, so that the tests are
independent of hash size.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the test so that it computes variables for object IDs instead of
using hard-coded hashes.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use test_oid instead of hard-coding algorithm-specific constants and
all-zero values.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using a specific invalid hard-coded object ID, look one
up from the translation table.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test contains hard-coded invalid object IDs. Make it hash size
independent by generating invalid object IDs using the translation
tables. Add a setup target to ensure the output of test_oid_init is
checked properly.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this test, we want to produce several blobs whose first two hex
characters are "17", since we look at this object directory as a proxy
for how many loose objects there are before we need to GC. Use
test_oid_cache to specify strings that will hash to the right values
when turned into blobs.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of hard-coding a fixed length example object ID in the test,
compute one using the translation tables. Move a variable into the
setup block so that we can ensure the exit status of test_oid is
checked.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use $OID_REGEX instead of a hard-coded regular expression.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of hard-coding a constant 40, split the output of rev-list by
field.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The idea of the magic value "ac4f2ee" in this test is to make the
reworded commit `collide2` have the same shortened ID as the commit
`collide3`.
To port the same idea to the SHA-256 version of Git, we therefore need
another magic value that causes the same collision, but this time with
the SHA-256 version of the commit IDs.
In this patch, we add code guarded by `GIT_TEST_FIND_COLLIDER` to do
exactly that. Essentially, a large number of integers is appended to the
commit message "collide2" to find such a collision. To make it easier to
find such a collision, we reduce the number of digits to 4.
As the tests are no longer dependent on SHA-1, we also rename their
titles to talk about "commit IDs" instead of "SHA-1s".
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When computing the fanout length, let's use test_oid to look up the
hexadecimal size of the hash in question instead of hard-coding a value.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use $ZERO_OID to make the test hash independent.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The bloom filter code relies on reading object IDs using parse_oid_hex.
In order to make that work with an appropriate size, we need to have
initialized the repository's hash algorithm. Since the values we're
processing depend on the repository in use, let's set up the repository
when we run the test helper.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 2.28-rc0, we corrected a bug that some repository extensions are
honored by mistake even in a version 0 repositories (these
configuration variables in extensions.* namespace were supposed to
have special meaning in repositories whose version numbers are 1 or
higher), but this was a bit too big a change.
* jn/v0-with-extensions-fix:
repository: allow repository format upgrade with extensions
Revert "check_repository_format_gently(): refuse extensions for old repositories"
Now that we officially permit repository extensions in repository
format v0, permit upgrading a repository with extensions from v0 to v1
as well.
For example, this means a repository where the user has set
"extensions.preciousObjects" can use "git fetch --filter=blob:none
origin" to upgrade the repository to use v1 and the partial clone
extension.
To avoid mistakes, continue to forbid repository format upgrades in v0
repositories with an unrecognized extension. This way, a v0 user
using a misspelled extension field gets a chance to correct the
mistake before updating to the less forgiving v1 format.
While we're here, make the error message for failure to upgrade the
repository format a bit shorter, and present it as an error, not a
warning.
Reported-by: Huan Huan Chen <huanhuanchen@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 14c7fa269e.
The core.repositoryFormatVersion field was introduced in ab9cb76f66
(Repository format version check., 2005-11-25), providing a welcome
bit of forward compatibility, thanks to some welcome analysis by
Martin Atukunda. The semantics are simple: a repository with
core.repositoryFormatVersion set to 0 should be comprehensible by all
Git implementations in active use; and Git implementations should
error out early instead of trying to act on Git repositories with
higher core.repositoryFormatVersion values representing new formats
that they do not understand.
A new repository format did not need to be defined until 00a09d57eb
(introduce "extensions" form of core.repositoryformatversion,
2015-06-23). This provided a finer-grained extension mechanism for
Git repositories. In a repository with core.repositoryFormatVersion
set to 1, Git implementations can act on "extensions.*" settings that
modify how a repository is interpreted. In repository format version
1, unrecognized extensions settings cause Git to error out.
What happens if a user sets an extension setting but forgets to
increase the repository format version to 1? The extension settings
were still recognized in that case; worse, unrecognized extensions
settings do *not* cause Git to error out. So combining repository
format version 0 with extensions settings produces in some sense the
worst of both worlds.
To improve that situation, since 14c7fa269e
(check_repository_format_gently(): refuse extensions for old
repositories, 2020-06-05) Git instead ignores extensions in v0 mode.
This way, v0 repositories get the historical (pre-2015) behavior and
maintain compatibility with Git implementations that do not know about
the v1 format. Unfortunately, users had been using this sort of
configuration and this behavior change came to many as a surprise:
- users of "git config --worktree" that had followed its advice
to enable extensions.worktreeConfig (without also increasing the
repository format version) would find their worktree configuration
no longer taking effect
- tools such as copybara[*] that had set extensions.partialClone in
existing repositories (without also increasing the repository format
version) would find that setting no longer taking effect
The behavior introduced in 14c7fa269e might be a good behavior if we
were traveling back in time to 2015, but we're far too late. For some
reason I thought that it was what had been originally implemented and
that it had regressed. Apologies for not doing my research when
14c7fa269e was under development.
Let's return to the behavior we've had since 2015: always act on
extensions.* settings, regardless of repository format version. While
we're here, include some tests to describe the effect on the "upgrade
repository version" code path.
[*] ca76c0b1e1
Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "fetch.writeCommitGraph" configuration is set in a shallow
repository and a fetch moves the shallow boundary, we wrote out
broken commit-graph files that do not match the reality, which has
been corrected.
* tb/fix-persistent-shallow:
commit.c: don't persist substituted parents when unshallowing
"git log -Lx,y:path --before=date" lost track of where the range
should be because it didn't take the changes made by the youngest
commits that are omitted from the output into account.
* rs/line-log-until:
revision: disable min_age optimization with line-log
"git send-email --in-reply-to=<msg>" did not use the In-Reply-To:
header with the value given from the command line, and let it be
overridden by the value on In-Reply-To: header in the messages
being sent out (if exists).
* ra/send-email-in-reply-to-from-command-line-wins:
send-email: restore --in-reply-to superseding behavior
Since 37b9dcabfc (shallow.c: use '{commit,rollback}_shallow_file',
2020-04-22), Git knows how to reset stat-validity checks for the
$GIT_DIR/shallow file, allowing it to change between a shallow and
non-shallow state in the same process (e.g., in the case of 'git fetch
--unshallow').
However, when $GIT_DIR/shallow changes, Git does not alter or remove any
grafts (nor substituted parents) in memory.
This comes up in a "git fetch --unshallow" with fetch.writeCommitGraph
set to true. Ordinarily in a shallow repository (and before 37b9dcabfc,
even in this case), commit_graph_compatible() would return false,
indicating that the repository should not be used to write a
commit-graphs (since commit-graph files cannot represent a shallow
history). But since 37b9dcabfc, in an --unshallow operation that check
succeeds.
Thus even though the repository isn't shallow any longer (that is, we
have all of the objects), the in-core representation of those objects
still has munged parents at the shallow boundaries. When the
commit-graph write proceeds, we use the incorrect parentage, producing
wrong results.
There are two ways for a user to work around this: either (1) set
'fetch.writeCommitGraph' to 'false', or (2) drop the commit-graph after
unshallowing.
One way to fix this would be to reset the parsed object pool entirely
(flushing the cache and thus preventing subsequent reads from modifying
their parents) after unshallowing. That would produce a problem when
callers have a now-stale reference to the old pool, and so this patch
implements a different approach. Instead, attach a new bit to the pool,
'substituted_parent', which indicates if the repository *ever* stored a
commit which had its parents modified (i.e., the shallow boundary
prior to unshallowing).
This bit needs to be sticky because all reads subsequent to modifying a
commit's parents are unreliable when unshallowing. Modify the check in
'commit_graph_compatible' to take this bit into account, and correctly
avoid generating commit-graphs in this case, thus solving the bug.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Reported-by: Jay Conrod <jayconrod@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The effort to avoid using test_must_fail on non-git command continues.
* dl/test-must-fail-fixes-5:
lib-submodule-update: pass 'test_must_fail' as an argument
lib-submodule-update: prepend "git" to $command
lib-submodule-update: consolidate --recurse-submodules
lib-submodule-update: add space after function name
"git fast-export --anonymize" learned to take customized mapping to
allow its users to tweak its output more usable for debugging.
* jk/fast-export-anonym-alt:
fast-export: use local array to store anonymized oid
fast-export: anonymize "master" refname
fast-export: allow seeding the anonymized mapping
fast-export: add a "data" callback parameter to anonymize_str()
fast-export: move global "idents" anonymize hashmap into function
fast-export: use a flex array to store anonymized entries
fast-export: stop storing lengths in anonymized hashmaps
fast-export: tighten anonymize_mem() interface to handle only strings
fast-export: store anonymized oids as hex strings
fast-export: use xmemdupz() for anonymizing oids
t9351: derive anonymized tree checks from original repo
"git difftool" has trouble dealing with paths added to the index
with the intent-to-add bit.
* js/diff-files-i-t-a-fix-for-difftool:
difftool -d: ensure that intent-to-add files are handled correctly
diff-files --raw: show correct post-image of intent-to-add files
The name of the primary branch in existing repositories, and the
default name used for the first branch in newly created
repositories, is made configurable, so that we can eventually wean
ourselves off of the hardcoded 'master'.
* js/default-branch-name:
contrib: subtree: adjust test to change in fmt-merge-msg
testsvn: respect `init.defaultBranch`
remote: use the configured default branch name when appropriate
clone: use configured default branch name when appropriate
init: allow setting the default for the initial branch name via the config
init: allow specifying the initial branch name for the new repository
docs: add missing diamond brackets
submodule: fall back to remote's HEAD for missing remote.<name>.branch
send-pack/transport-helper: avoid mentioning a particular branch
fmt-merge-msg: stop treating `master` specially
The code to push changes over "dumb" HTTP had a bad interaction
with the commit reachability code due to incorrect allocation of
object flag bits, which has been corrected.
* bc/http-push-flagsfix:
http-push: ensure unforced pushes fail when data would be lost
The documentation and some tests have been adjusted for the recent
renaming of "pu" branch to "seen".
* js/pu-to-seen:
tests: reference `seen` wherever `pu` was referenced
docs: adjust the technical overview for the rename `pu` -> `seen`
docs: adjust for the recent rename of `pu` to `seen`
API cleanup for get_worktrees()
* es/get-worktrees-unsort:
worktree: drop get_worktrees() unused 'flags' argument
worktree: drop get_worktrees() special-purpose sorting option
CVS/SVN interface have been prepared for SHA-256 transition
* bc/sha-256-cvs-svn-updates:
git-cvsexportcommit: port to SHA-256
git-cvsimport: port to SHA-256
git-cvsserver: port to SHA-256
git-svn: set the OID length based on hash algorithm
perl: make SVN code hash independent
perl: make Git::IndexInfo work with SHA-256
perl: create and switch variables for hash constants
t/lib-git-svn: make hash size independent
t9101: make hash independent
t9104: make hash size independent
t9100: make test work with SHA-256
t9108: make test hash independent
t9168: make test hash independent
t9109: make test hash independent
A few fields in "struct commit" that do not have to always be
present have been moved to commit slabs.
* ak/commit-graph-to-slab:
commit-graph: minimize commit_graph_data_slab access
commit: move members graph_pos, generation to a slab
commit-graph: introduce commit_graph_data_slab
object: drop parsed_object_pool->commit_count
SHA-256 migration work continues.
* bc/sha-256-part-2: (44 commits)
remote-testgit: adapt for object-format
bundle: detect hash algorithm when reading refs
t5300: pass --object-format to git index-pack
t5704: send object-format capability with SHA-256
t5703: use object-format serve option
t5702: offer an object-format capability in the test
t/helper: initialize the repository for test-sha1-array
remote-curl: avoid truncating refs with ls-remote
t1050: pass algorithm to index-pack when outside repo
builtin/index-pack: add option to specify hash algorithm
remote-curl: detect algorithm for dumb HTTP by size
builtin/ls-remote: initialize repository based on fetch
t5500: make hash independent
serve: advertise object-format capability for protocol v2
connect: parse v2 refs with correct hash algorithm
connect: pass full packet reader when parsing v2 refs
Documentation/technical: document object-format for protocol v2
t1302: expect repo format version 1 for SHA-256
builtin/show-index: provide options to determine hash algo
t5302: modernize test formatting
...
If one of the options --before, --min-age or --until is given,
limit_list() filters out younger commits early on. Line-log needs all
those commits to trace the movement of line ranges, though. Skip this
optimization if both are used together.
Reported-by: Мария Долгополова <dolgopolovamariia@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In https://github.com/git-for-windows/git/issues/2677, a `git difftool
-d` problem was reported. The underlying cause was a bug in `git
diff-files --raw` that we just fixed: it reported intent-to-add files
with the empty _tree_ as the post-image OID, when we need to show
an all-zero (or, "null") OID instead, to indicate to the caller that
they have to look at the worktree file.
The symptom of that problem shown by `git difftool` was this:
error: unable to read sha1 file of <path> (<empty-tree-OID>)
error: could not write '<filename>'
Make sure that the reported `difftool` problem stays fixed.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documented behavior of `git diff-files --raw` is to display
[...] 0{40} if creation, unmerged or "look at work tree".
on the right hand (i.e. postimage) side. This happens for files that
have unstaged modifications, and for files that are unmodified but
stat-dirty.
For intent-to-add files, we used to show the empty blob's hash instead.
In c26022ea8f (diff: convert diff_addremove to struct object_id,
2017-05-30), we made that worse by inadvertently changing that to the
hash of the empty tree.
Let's make the behavior consistent with files that have unstaged
modifications (which applies to intent-to-add files, too) by showing
all-zero values also for intent-to-add files.
Accordingly, this patch adjusts the expectations set by the regression
test introduced in feea6946a5 (diff-files: treat "i-t-a" files as
"not-in-index", 2020-06-20).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git send-email --in-reply-to= fails to override In-Reply-To email headers,
if they're present in the output of format-patch, even when explicitly
told to do so by the option --no-thread, which breaks the contract of the
command line switch option, per its man page.
"
--in-reply-to=<identifier>
Make the first mail (or all the mails with --no-thread) appear as
a reply to the given Message-Id, which avoids breaking threads to
provide a new patch series.
"
This patch fixes the aformentioned issue, by bringing --in-reply-to's old
overriding behavior back.
The test was donated by Carlo Marcelo Arenas Belón.
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff-files" has been taught to say paths that are marked as
intent-to-add are new files, not modified from an empty blob.
* sk/diff-files-show-i-t-a-as-new:
diff-files: treat "i-t-a" files as "not-in-index"
Allow runtime upgrade of the repository format version, which needs
to be done carefully.
There is a rather unpleasant backward compatibility worry with the
last step of this series, but it is the right thing to do in the
longer term.
* xl/upgrade-repo-format:
check_repository_format_gently(): refuse extensions for old repositories
sparse-checkout: upgrade repository to version 1 when enabling extension
fetch: allow adding a filter after initial clone
repository: add a helper function to perform repository format upgrade
Running "fast-export --anonymize" will leave "refs/heads/master"
untouched in the output, for two reasons:
- it helped to have some known reference point between the original
and anonymized repository
- since it's historically the default branch name, it doesn't leak any
information
Now that we can ask fast-export to retain particular tokens, we have a
much better tool for the first one (because it works for any ref, not
just master).
For the second, the notion of "default branch name" is likely to become
configurable soon, at which point the name _does_ leak information.
Let's drop this special case in preparation.
Note that we have to adjust the test a bit, since it relied on using the
name "master" in the anonymized repos. We could just use
--anonymize-map=master to keep the same output, but then we wouldn't
know if it works because of our hard-coded master or because of the
explicit map.
So let's flip the test a bit, and confirm that we anonymize "master",
but keep "other" in the output.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After you anonymize a repository, it can be hard to find which commits
correspond between the original and the result, and thus hard to
reproduce commands that triggered bugs in the original.
Let's make it possible to seed the anonymization map. This lets users
either:
- mark names to be retained as-is, if they don't consider them secret
(in which case their original commands would just work)
- map names to new values, which lets them adapt the reproduction
recipe to the new names without revealing the originals
The implementation is fairly straight-forward. We already store each
anonymized token in a hashmap (so that the same token appearing twice is
converted to the same result). We can just introduce a new "seed"
hashmap which is consulted first.
This does make a few more promises to the user about how we'll anonymize
things (e.g., token-splitting pathnames). But it's unlikely that we'd
want to change those rules, even if the actual anonymization of a single
token changes. And it makes things much easier for the user, who can
unblind only a directory name without having to specify each path within
it.
One alternative to this approach would be to anonymize as we see fit,
and then dump the whole refname and pathname mappings to a file. This
does work, but it's a bit awkward to use (you have to manually dig the
items you care about out of the mapping).
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>