The commit-graph format reserved a byte among the header of the file to
store a "hash version". During the SHA-256 work, this was not modified
because file formats are not necessarily intended to work across hash
versions. If a repository has SHA-256 as its hash algorithm, it
automatically up-shifts the lengths of object names in all necessary
formats.
However, since we have this byte available for adjusting the version, we
can make the file formats more obviously incompatible instead of relying
on other context from the repository.
Update the oid_version() method in commit-graph.c to add a new value, 2,
for sha-256. This automatically writes the new value in a SHA-256
repository _and_ verifies the value is correct. This is a breaking
change relative to the current 'master' branch since 092b677 (Merge
branch 'bc/sha-256-cvs-svn-updates', 2020-08-13) but it is not breaking
relative to any released version of Git.
The test impact is relatively minor: the output of 'test-tool
read-graph' lists the header information, so those instances of '1' need
to be replaced with a variable determined by GIT_TEST_DEFAULT_HASH. A
more careful test is added that specifically creates a repository of
each type then swaps the commit-graph files. The important value here is
that the "git log" command succeeds while writing a message to stderr.
Helped-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "git blame --first-parent" option was not documented, but now
it is.
* rp/blame-first-parent-doc:
blame-options.txt: document --first-parent option
A new helper function has_object() has been introduced to make it
easier to mark object existence checks that do and don't want to
trigger lazy fetches, and a few such checks are converted using it.
* jt/has_object:
fsck: do not lazy fetch known non-promisor object
pack-objects: no fetch when allow-{any,promisor}
apply: do not lazy fetch when applying binary
sha1-file: introduce no-lazy-fetch has_object()
CMake support to build with MSVC for Windows bypassing the Makefile.
* ss/cmake-build:
ci: modification of main.yml to use cmake for vs-build job
cmake: support for building git on windows with msvc and clang.
cmake: support for building git on windows with mingw
cmake: support for testing git when building out of the source tree
cmake: support for testing git with ctest
cmake: installation support for git
cmake: generate the shell/perl/python scripts and templates, translations
Introduce CMake support for configuring Git
The component to respond to "git fetch" request is made more
configurable to selectively allow or reject object filtering
specification used for partial cloning.
* tb/upload-pack-filters:
t5616: use test_i18ngrep for upload-pack errors
upload-pack.c: introduce 'uploadpackfilter.tree.maxDepth'
upload-pack.c: allow banning certain object filter(s)
list_objects_filter_options: introduce 'list_object_filter_config_name'
Doc cleanup around "worktree".
* es/worktree-doc-cleanups:
git-worktree.txt: link to man pages when citing other Git commands
git-worktree.txt: make start of new sentence more obvious
git-worktree.txt: fix minor grammatical issues
git-worktree.txt: consistently use term "working tree"
git-worktree.txt: employ fixed-width typeface consistently
The final leg of SHA-256 transition.
* bc/sha-256-part-3: (39 commits)
t: remove test_oid_init in tests
docs: add documentation for extensions.objectFormat
ci: run tests with SHA-256
t: make SHA1 prerequisite depend on default hash
t: allow testing different hash algorithms via environment
t: add test_oid option to select hash algorithm
repository: enable SHA-256 support by default
setup: add support for reading extensions.objectformat
bundle: add new version for use with SHA-256
builtin/verify-pack: implement an --object-format option
http-fetch: set up git directory before parsing pack hashes
t0410: mark test with SHA1 prerequisite
t5308: make test work with SHA-256
t9700: make hash size independent
t9500: ensure that algorithm info is preserved in config
t9350: make hash size independent
t9301: make hash size independent
t9300: use $ZERO_OID instead of hard-coded object ID
t9300: abstract away SHA-1-specific constants
t8011: make hash size independent
...
Update "git help guides" documentation organization.
* pb/guide-docs:
git.txt: add list of guides
Documentation: don't hardcode command categories twice
help: drop usage of 'common' and 'useful' for guides
command-list.txt: add missing 'gitcredentials' and 'gitremote-helpers'
All "mergy" operations that internally use the merge-recursive
machinery should honor the merge.renormalize configuration, but
many of them didn't.
* en/eol-attrs-gotchas:
checkout: support renormalization with checkout -m <paths>
merge: make merge.renormalize work for all uses of merge machinery
t6038: remove problematic test
t6038: make tests fail for the right reason
Small fixes and workarounds.
* jk/compiler-fixes-and-workarounds:
revision: avoid leak when preparing bloom filter for "/"
revision: avoid out-of-bounds read/write on empty pathspec
config: work around gcc-10 -Wstringop-overflow warning
Adjust tests in contrib/ to the recent change to fmt-merge-msg.
* es/adjust-subtree-test-for-merge-msg-update:
Revert "contrib: subtree: adjust test to change in fmt-merge-msg"
Code cleanup around "worktree" API implementation.
* es/worktree-cleanup:
worktree: retire special-case normalization of main worktree path
worktree: drop bogus and unnecessary path munging
worktree: drop unused code from get_linked_worktree()
worktree: drop pointless strbuf_release()
The argv_array API is useful for not just managing argv but any
"vector" (NULL-terminated array) of strings, and has seen adoption
to a certain degree. It has been renamed to "strvec" to reduce the
barrier to adoption.
* jk/strvec:
strvec: rename struct fields
strvec: drop argv_array compatibility layer
strvec: update documention to avoid argv_array
strvec: fix indentation in renamed calls
strvec: convert remaining callers away from argv_array name
strvec: convert more callers away from argv_array name
strvec: convert builtin/ callers away from argv_array name
quote: rename sq_dequote_to_argv_array to mention strvec
strvec: rename files from argv-array to strvec
argv-array: rename to strvec
argv-array: use size_t for count and alloc
Drop whitespace in the value of `$test_description` and in a test body
and use `test_write_lines`.
Stop defining `$u` with a trailing space just so that we can tuck it in
like `git foo $u$more...` and get minimal whitespace in the command:
`git foo $u $more...` is more readable at the "cost" of an empty `$u`
yielding `git foo something...`.
Finally, avoid using single quotes within the test scripts to repeatedly
close and reopen the quotes that wrap the test scripts (see the previous
commit). This "unnecessary" quoting does mean that the verbose test
output shows the interpolated values, i.e., the shell code we're
running. But the downside is that the source of the script does *not*
show the shell code we're eventually executing, leaving the reader to
reason about what we really do and whether there are any quoting issues.
(There aren't.)
Where we run through loops to generate several "identical but different"
tests, the test message contains the interpolated variables we're
looping on, meaning one can always identify exactly which instance has
failed, even if the verbose test output shows the exact same test body
several times.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the test scripts, the recommended style is, e.g.:
test_expect_success 'name' '
do-something somehow &&
do-some-more testing
'
When using this style, any single quote in the multi-line test section
is actually closing the lone single quotes that surround it.
It can be a non-issue in practice:
test_expect_success 'sed a little' '
sed -e 's/hi/lo/' in >out # "ok": no whitespace in s/hi/lo/
'
Or it can be a bug in the test, e.g., because variable interpolation
happens before the test even begins executing:
v=abc
test_expect_success 'variable interpolation' '
v=def &&
echo '"$v"' # abc
'
Change several such in-test single quotes to use double quotes instead
or, in a few cases, drop them altogether. These were identified using
some crude grepping. We're not fixing any test bugs here, but we're
hopefully making these tests slightly easier to grok and to maintain.
There are legitimate use cases for closing a quote and opening a new
one, e.g., both '\'' and '"'"' can be used to produce a literal single
quote. I'm not touching any of those here.
In t9401, tuck the redirecting ">" to the filename while we're touching
those lines.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
blame/annotate have supported --first-parent since commit 95a4fb0eac
("blame: handle --first-parent"). This adds a blurb on that option to
the documentation.
Signed-off-by: Raymond E. Pasco <ray@ameretat.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a call to has_object_file(), which lazily fetches missing
objects in a partial clone, when the object is known to not be
a promisor object. Change that call to has_object(), which does not do
any lazy fetching.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The options --missing=allow-{any,promisor} were introduced in caf3827e2f
("rev-list: add list-objects filtering support", 2017-11-22) with the
following note in the commit message:
This patch introduces handling of missing objects to help
debugging and development of the "partial clone" mechanism,
and once the mechanism is implemented, for a power user to
perform operations that are missing-object aware without
incurring the cost of checking if a missing link is expected.
The idea that these options are missing-object aware (and thus do not
need to lazily fetch objects, unlike unaware commands that assume that
all objects are present) are assumed in later commits such as 07ef3c6604
("fetch test: use more robust test for filtered objects", 2020-01-15).
However, the current implementations of these options use
has_object_file(), which indeed lazily fetches missing objects. Teach
these implementations not to do so. Also, update the documentation of
these options to be clearer.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When applying a binary patch, as an optimization, "apply" checks if the
postimage is already present. During this fetch, it is perfectly
expected for the postimage not to be present, so there is no need to
lazy-fetch missing objects. Teach "apply" not to lazy-fetch in this
case.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There have been a few bugs wherein Git fetches missing objects whenever
the existence of an object is checked, even though it does not need to
perform such a fetch. To resolve these bugs, we could look at all the
places that has_object_file() (or a similar function) is used. As a
first step, introduce a new function has_object() that checks for the
existence of an object, with a default behavior of not fetching if the
object is missing and the repository is a partial clone. As we verify
each has_object_file() (or similar) usage, we can replace it with
has_object(), and we will know that we are done when we can delete
has_object_file() (and the other similar functions).
Also, the new function has_object() has more appropriate defaults:
besides not fetching, it also does not recheck packed storage.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The change in 6e9c4d408d ("git-cvsexportcommit: port to SHA-256",
2020-06-22) added the use of a temporary directory for the index.
However, the form we used doesn't work in versions of Perl before
5.10.1. For example, version 5.10.0 contains a version of File::Temp
from 2007 that doesn't contain "newdir".
In order to make the code work with 5.8.8, which we support, let's
change to use the static method "tempdir" with the argument "CLEANUP",
which provides the same behavior.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tests added to t5616 in 6dd3456a8c (upload-pack.c: allow banning
certain object filter(s), 2020-08-03) can fail racily, but only with
GETTEXT_POISON enabled.
The tests in question look something like this:
test_must_fail ok=sigpipe git clone --filter=blob:none ... 2>err &&
grep "filter blob:none not supported' err
The remote upload-pack process writes that error message both as an ERR
packet, but also via a die() message. In theory we should see the
message twice in the "err" file. The client relays the message from the
packet to its stderr (with a "remote error:" prefix), and because this
is a local-system clone, upload-pack's stderr goes to the same place.
But because clone may be writing to the pipe when upload-pack calls
die(), it may get SIGPIPE and fail to relay the message. That's why we
need our "ok=sigpipe" trick. But our grep should still work reliably in
that case. Either:
- we got SIGPIPE on the client, which means upload-pack completed its
die(), and we'll see that version of the message.
- the client didn't get SIGPIPE, and so it successfully relays the
message.
In theory we'd see both copies of the message in the second case. But
now always! As soon as the client sees ERR, it exits and we run grep.
But we have no guarantee that the upload-pack process has exited at this
point, or even written its die() message. We might only see the client
version of the message.
Normally that's OK. We only need to see one or the other to pass the
test. But now consider GETTEXT_POISON. upload-pack doesn't translate the
die() message nor the ERR packet. But once the client receives it, it
calls:
die(_("remote error: %s"), buffer + 4);
That message _is_ marked for translation. Normally we'd just replace the
"remote error:" portion of it, but in GETTEXT_POISON mode, we replace
the whole thing with "# GETTEXT POISON #" and don't include the "%s"
part at all. So the whole text from the ERR packet is dropped, and so we
may racily see a test failure if upload-pack's die() call wasn't yet
written.
We can fix it by using test_i18ngrep, which just makes this grep a noop
in the poison mode.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Not all man5/man7 guides are mentioned in the 'git(1)' documentation,
which makes the missing ones somewhat hard to find.
Add a list of the guides to git(1) by leveraging the existing
`Documentation/cmd-list.perl` script to generate a file `cmds-guide.txt`
which gets included in git.txt.
Also, do not hard-code the manual section '1'. Instead, use a regex so
that the manual section is discovered from the first line of each
`git*.txt` file.
This addition was hinted at in 1b81d8cb19 (help: use command-list.txt
for the source of guides, 2018-05-20).
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of hard-coding the list of command categories in both
`Documentation/Makefile` and `Documentation/cmd-list.perl`, make the
Makefile the authoritative source and tweak `cmd-list.perl` so that it
receives the list of command categories as argument.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 1b81d8cb19 (help: use command-list.txt for the source of guides,
2018-05-20), all man5/man7 guides listed in command-list.txt appear in
the output of 'git help -g'.
However, 'git help -g' still prefixes this list with "The common Git
guides are:", which makes one wonder if there are others!
In the same spirit, the man page for 'git help' describes the '--guides'
option as listing 'useful' guides, which is not false per se but can
also be taken to mean that there are other guides that exist but are not
useful.
Instead of 'common' and 'useful', use 'Git concept guides' in both
places. To keep the code in line with this change, rename
help.c::list_common_guides_help to list_guides_help.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The guides 'gitcredentials' and 'gitremote-helpers' do not currently
appear in command-list.txt.
'gitcredentials' was forgotten back when guides were added to
command-list.txt in 1b81d8cb19 (help: use command-list.txt for the
source of guides, 2018-05-20).
'gitremote-helpers' was moved to section 7 in 439cc74632 (docs: move
gitremote-helpers into section 7, 2019-03-25), but command-list.txt was
not updated at the time.
Add these two guides to the list of guides in 'command-list.txt', so
that they appear in the output of 'git help --guides', and capitalize
the first word of the description of 'gitcredentials', as was done in
1b81d8c (help: use command-list.txt for the source of guides,
2018-05-20) for the other guides.
While at it, add a comment in Documentation/Makefile to remind developers
to update command-list.txt if they add a new guide.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Get rid of the trailing dot and mark for translation.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pretend-object mechanism checks if the given object already
exists in the object store before deciding to keep the data
in-core, but the check would have triggered lazy fetching of such
an object from a promissor remote.
* jt/pretend-object-never-come-from-elsewhere:
sha1-file: make pretend_object_file() not prefetch
While packing many objects in a repository with a promissor remote,
lazily fetching missing objects from the promissor remote one by
one may be inefficient---the code now attempts to fetch all the
missing objects in batch (obviously this won't work for a lazy
clone that lazily fetches tree objects as you cannot even enumerate
what blobs are missing until you learn which trees are missing).
* jt/pack-objects-prefetch-in-batch:
pack-objects: prefetch objects to be packed
pack-objects: refactor to oid_object_info_extended
If we're given an empty pathspec, we refuse to set up bloom filters, as
described in f3c2a36810 (revision: empty pathspecs should not use Bloom
filters, 2020-07-01).
But before the empty string check, we drop any trailing slash by
allocating a new string without it. So a pathspec consisting only of "/"
will allocate that string, but then still cause us to bail, leaking the
new string. Let's make sure to free it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running t4216 with ASan results in it complaining of an out-of-bounds
read in prepare_to_use_bloom_filter(). The issue is this code to strip a
trailing slash:
last_index = pi->len - 1;
if (pi->match[last_index] == '/') {
because we have no guarantee that pi->len isn't zero. This can happen if
the pathspec is ".", as we translate that to an empty string. And if
that read of random memory does trigger the conditional, we'd then do an
out-of-bounds write:
path_alloc = xstrdup(pi->match);
path_alloc[last_index] = '\0';
Let's make sure to check the length before subtracting. Note that for an
empty pathspec, we'd end up bailing from the function a few lines later,
which makes it tempting to just:
if (!pi->len)
return;
early here. But our code here is stripping a trailing slash, and we need
to check for emptiness after stripping that slash, too. So we'd have two
blocks, which would require repeating some cleanup code.
Instead, just skip the trailing-slash for an empty string. Setting
last_index at all in the case is awkward since it will have a nonsense
value (and it uses an "int", which is a too-small type for a string
anyway). So while we're here, let's:
- drop last_index entirely; it's only used in two spots right next to
each other and writing out "pi->len - 1" in both is actually easier
to follow
- use xmemdupz() to duplicate the string. This is slightly more
efficient, but more importantly makes the intent more clear by
allocating the correct-sized substring in the first place. It also
eliminates any question of whether path_alloc is as long as
pi->match (which it would not be if pi->match has any embedded NULs,
though in practice this is probably impossible).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Compiling with gcc-10, -O2, and -fsanitize=undefined results in a
compiler warning:
config.c: In function ‘git_config_copy_or_rename_section_in_file’:
config.c:3170:17: error: writing 1 byte into a region of size 0 [-Werror=stringop-overflow=]
3170 | output[0] = '\t';
| ~~~~~~~~~~^~~~~~
config.c:3076:7: note: at offset -1 to object ‘buf’ with size 1024 declared here
3076 | char buf[1024];
| ^~~
This is a false positive. The interesting lines of code are:
int i;
char *output = buf;
...
for (i = 0; buf[i] && isspace(buf[i]); i++)
; /* do nothing */
...
int offset;
offset = section_name_match(&buf[i], old_name);
if (offset > 0) {
...
output += offset + i;
if (strlen(output) > 0) {
/*
* More content means there's
* a declaration to put on the
* next line; indent with a
* tab
*/
output -= 1;
output[0] = '\t';
}
}
So we do assign output to buf initially. Later we increment it based on
"offset" and "i" and then subtract "1" from it. That latter step is what
the compiler is complaining about; it could lead to going off the left
side of the array if "output == buf" at the moment of the subtraction.
For that to be the case, then "offset + i" would have to be 0. But that
can't happen:
- we know that "offset" is at least 1, since we're in a conditional
block that checks that
- we know that "i" is not negative, since it started at 0 and only
incremented over whitespace
So the sum must be at least 1, and therefore it's OK to subtract one
from "output".
But that's not quite the whole story. Since "i" is an int, it could in
theory be possible to overflow to negative (when counting whitespace on
a very large string). But we know that's impossible because we're
counting the 1024-byte buffer we just fed to fgets(), so it can never be
larger than that.
Switching the type of "i" to "unsigned" makes the warning go away, so
let's do that.
Arguably size_t is an even better type (for this and for the other
length fields), but switching to it produces a similar but distinct
warning:
config.c: In function ‘git_config_copy_or_rename_section_in_file’:
config.c:3170:13: error: array subscript -1 is outside array bounds of ‘char[1024]’ [-Werror=array-bounds]
3170 | output[0] = '\t';
| ~~~~~~^~~
config.c:3076:7: note: while referencing ‘buf’
3076 | char buf[1024];
| ^~~
If we were to ever switch off of fgets() to strbuf_getline() or similar,
we'd probably need to use size_t to avoid other overflow problems. But
for now we know we're safe because of the small fixed size of our
buffer.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When citing other Git commands, rather than merely formatting them with
a fixed-width typeface, improve the reader experience by linking to them
directly via `linkgit:`.
Suggested-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When reading the rendered description of `add`, it's easy to trip over
and miss the end of one sentence and the start of the next, making it
seem as if they are part of the same statement, separated only by a
dash:
... specific files such as HEAD, index, etc. - may also be
specified as <commit-ish>; it is synonymous with...
This can be particularly confusing since the thoughts expressed by the
two sentences are unrelated. Reduce the likelihood of confusion by
making it obvious that the two sentences are distinct.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a few grammatical problems to improve the reading experience.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As originally composed, git-worktree.txt employed a mix of "worktree"
and "working tree" which was inconsistent and potentially confusing to
readers. bc483285b7 (Documentation/git-worktree: consistently use term
"linked working tree", 2015-07-20) undertook the task of employing the
term "working tree" consistently throughout the document and avoiding
"worktree" altogether for descriptive text. Since that time, some
instances of "worktree" have crept back in. Continue the work of
bc483285b7 by transforming these to "working tree", as well.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-worktree documentation generally does a good job of formatting
literal text using a fixed-width typeface, however, some instances of
unformatted literal text have crept in over time. Fix these.
While at it, also fix a few incorrect typefaces resulting from wrong
choice of Asciidoc quotes.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>