In "fetch", get_ref_map() iterates over all refs to populate
"existing_refs" in order to populate peer_ref->old_oid in the returned
refmap, even if the refmap has no peer_ref set - which is the case when
only literal hashes (i.e. no refs by name) are fetched.
Iterating over refs causes the targets of those refs to be checked for
existence. Avoiding this is especially important when we use "git fetch"
to perform lazy fetches in a partial clone because a target of such a
ref may need to be itself lazy-fetched (and otherwise causing an
infinite loop).
Therefore, avoid populating "existing_refs" until necessary. With this
patch, because Git lazy-fetches objects by literal hashes (to be done in
a subsequent commit), it will then be able to guarantee avoiding reading
targets of refs.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In "fetch", there are two parameters submodule_fetch_jobs_config and
recurse_submodules that can be set in a variety of ways: through
.gitmodules, through .git/config, and through the command line.
Currently "fetch" handles this by first reading .gitmodules, then
reading .git/config (allowing it to overwrite existing values), then
reading the command line (allowing it to overwrite existing values).
Notice that we can avoid reading .gitmodules if .git/config and/or the
command line already provides us with what we need. In addition, if
recurse_submodules is found to be "no", we do not need the value of
submodule_fetch_jobs_config.
Avoiding reading .gitmodules is especially important when we use "git
fetch" to perform lazy fetches in a partial clone because the
.gitmodules file itself might need to be lazy fetched (and otherwise
causing an infinite loop).
In light of all this, avoid reading .gitmodules until necessary. When
reading it, we may only need one of the two parameters it provides, so
teach fetch_config_from_gitmodules() to support NULL arguments. With
this patch, users (including Git itself when invoking "git fetch" to
lazy-fetch) will be able to guarantee avoiding reading .gitmodules by
passing --recurse-submodules=no.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a subsequent patch, partial clones will be taught to fetch missing
objects using a "git fetch" subprocess. Because the number of objects
fetched may be too numerous to fit on the command line, teach "fetch" to
accept refspecs passed through stdin.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you run fetch but record the result in remote-tracking branches,
and either if you do nothing with the fetched refs (e.g. you are
merely mirroring) or if you always work from the remote-tracking
refs (e.g. you fetch and then merge origin/branchname separately),
you can get away with having no FETCH_HEAD at all.
Teach "git fetch" a command line option "--[no-]write-fetch-head".
The default is to write FETCH_HEAD, and the option is primarily
meant to be used with the "--no-" prefix to override this default,
because there is no matching fetch.writeFetchHEAD configuration
variable to flip the default to off (in which case, the positive
form may become necessary to defeat it).
Note that under "--dry-run" mode, FETCH_HEAD is never written;
otherwise you'd see list of objects in the file that you do not
actually have. Passing `--write-fetch-head` does not force `git
fetch` to write the file.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git log --first-parent -p" showed patches only for single-parent
commits on the first-parent chain; the "--first-parent" option has
been made to imply "-m". Use "--no-diff-merges" to restore the
previous behaviour to omit patches for merge commits.
* jk/log-fp-implies-m:
doc/git-log: clarify handling of merge commit diffs
doc/git-log: move "-t" into diff-options list
doc/git-log: drop "-r" diff option
doc/git-log: move "Diff Formatting" from rev-list-options
log: enable "-m" automatically with "--first-parent"
revision: add "--no-diff-merges" option to counteract "-m"
log: drop "--cc implies -m" logic
"git bisect" learns the "--first-parent" option to find the first
breakage along the first-parent chain.
* al/bisect-first-parent:
bisect: combine args passed to find_bisection()
bisect: introduce first-parent flag
cmd_bisect__helper: defer parsing no-checkout flag
rev-list: allow bisect and first-parent flags
t6030: modernize "git bisect run" tests
In the ensure_core_worktree() function, we load the core.worktree value
of the submodule repository using repo_config_get_string(). This
function copies the string, but we never free it, leaking the memory.
We can instead use the "tmp" version of that function to avoid the
allocation at all. We don't have to worry about lifetime issues, since
we never even look at the value (we just want to know if it's set).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rebase is implemented with two different backends - 'apply' and
'merge' each of which support a different set of options. In
particular the apply backend supports a number of options implemented
by 'git am' that are not implemented in the merge backend. This means
that the available options are different depending on which backend is
used which is confusing. This patch adds support for the
--committer-date-is-author-date option to the merge backend. This
option uses the author date of the commit that is being rewritten as
the committer date when the new commit is created.
Original-patch-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The implementation of --committer-date-is-author-date exports
GIT_COMMITTER_DATE to override the default committer date but does not
reset GIT_COMMITTER_DATE in the environment after creating the commit
so it is set in the environment of any hooks that get run. We're about
to add the same functionality to the sequencer and do not want to have
GIT_COMMITTER_DATE set when running hooks or exec commands so lets
update commit_tree_extended() to take an explicit committer so we
override the default date without setting GIT_COMMITTER_DATE in the
environment.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A couple of functions that used struct refspec_item did not zero out the
structure memory. This can result in unexpected behavior, especially if
additional parameters are ever added to refspec_item in the future. Use
memset to ensure that unset structure members are zero.
It may make sense to convert most of these uses of struct refspec_item
to use either struct initializers or refspec_item_init_or_die. However,
other similar code uses memset. Converting all of these uses has been
left as a future exercise.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two functions to get a single config string:
- git_config_get_string()
- git_config_get_string_const()
One might naively think that the first one allocates a new string and
the second one just points us to the internal configset storage. But
in fact they both allocate a new copy; the second one exists only to
avoid having to cast when using it with a const global which we never
intend to free.
The documentation for the function explains that clearly, but it seems
I'm not alone in being surprised by this. Of 17 calls to the function,
13 of them leak the resulting value.
We could obviously fix these by adding the appropriate free(). But it
would be simpler still if we actually had a non-allocating way to get
the string. There's git_config_get_value() but that doesn't quite do
what we want. If the config key is present but is a boolean with no
value (e.g., "[foo]bar" in the file), then we'll get NULL (whereas the
string versions will print an error and die).
So let's introduce a new variant, git_config_get_string_tmp(), that
behaves as these callers expect. We need a new name because we have new
semantics but the same function signature (so even if we converted the
four remaining callers, topics in flight might be surprised). The "tmp"
is because this value should only be held onto for a short time. In
practice it's rare for us to clear and refresh the configset,
invalidating the pointer, but hopefully the "tmp" makes callers think
about the lifetime. In each of the converted cases here the value only
needs to last within the local function or its immediate caller.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We unconditionally write a branch name into a newly allocated buffer in
new_branch_info->path, via setup_branch_path(). We then check to see if
the branch exists; if not, we set that field to NULL, leaking the
memory. We should take care to free() it when doing so.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The prepare_to_clone_next_submodule() function has a few local-variable
strbufs. We use strbuf_reset() throughout the function to reuse the
buffers over and over. But at the end of the function we also use
strbuf_reset() as they go out of scope, which means we end up leaking
their heap buffers. This should be strbuf_release() instead.
These were introduced by 48308681b0 (git submodule update: have a
dedicated helper for cloning, 2016-02-29), but it doesn't seem to have
the same mistake elsewhere. Likewise, I looked for other instances of
the pattern in the submodule--helper file but couldn't find any.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are still a handful mentions of SHA-1 when we meant the
(hexadecimal) object names in end-user facing messages. Rewrite
them.
I was hoping that this can mostly be s/SHA-1/object name/, but
a few messages needed rephrasing to keep the result readable.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new helper function has_object() has been introduced to make it
easier to mark object existence checks that do and don't want to
trigger lazy fetches, and a few such checks are converted using it.
* jt/has_object:
fsck: do not lazy fetch known non-promisor object
pack-objects: no fetch when allow-{any,promisor}
apply: do not lazy fetch when applying binary
sha1-file: introduce no-lazy-fetch has_object()
We UNLEAK() the "sorting" list created by parsing command-line options
(which is essentially used until the program exits). But we do so right
before leaving the cmd_ls_remote() function, which means we have to hit
all of the exits. But the point of UNLEAK() is that it's an annotation
which doesn't impact the variable itself. We can mark it as soon as
we're done writing its value, and then we only have to do so once.
This gives us a minor code reduction, and serves as a better example of
how UNLEAK() can be used.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no reason that git-fast-import benefits from being a separate
binary. And as it links against libgit.a, it has a non-trivial disk
footprint. Let's make it a builtin, which reduces the size of a stripped
installation from 22MB to 21MB.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no reason that bugreport has to be a separate binary. And since
it links against libgit.a, it has a rather large disk footprint. Let's
make it a builtin, which reduces the size of a stripped installation
from 24MB to 22MB.
This also simplifies our Makefile a bit. And we can take advantage of
builtin niceties like RUN_SETUP_GENTLY.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no real reason for credential helpers to be separate binaries. I
did them this way originally under the notion that helper don't _need_
to be part of Git, and so can be built totally separately (and indeed,
the ones in contrib/credential are). But the ones in our main Makefile
build on libgit.a, and the resulting binaries are reasonably large.
We can slim down our total disk footprint by just making them builtins.
This reduces the size of:
make strip install
from 29MB to 24MB on my Debian system.
Note that credential-cache can't operate without support for Unix
sockets. Currently we just don't build it at all when NO_UNIX_SOCKETS is
set. We could continue that with conditionals in the Makefile and our
list of builtins. But instead, let's build a dummy implementation that
dies with an informative message. That has two advantages:
- it's simpler, because the conditional bits are all kept inside
the credential-cache source
- a user who is expecting it to exist will be told _why_ they can't
use it, rather than getting the "credential-cache is not a git
command" error which makes it look like the Git install is broken.
Note that our dummy implementation does still respond to "-h" in order
to appease t0012 (and this may be a little friendlier for users, as
well).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert submodule subcommand 'summary' to a builtin and call it via
'git-submodule.sh'.
The shell version had to call $diff_cmd twice, once to find the modified
modules cared by the user and then again, with that list of modules
to do various operations for computing the summary of those modules.
On the other hand, the C version does not need a second call to
$diff_cmd since it reuses the module list from the first call to do the
aforementioned tasks.
In the C version, we use the combination of setting a child process'
working directory to the submodule path and then calling
'prepare_submodule_repo_env()' which also sets the 'GIT_DIR' to '.git',
so that we can be certain that those spawned processes will not access
the superproject's ODB by mistake.
A behavioural difference between the C and the shell version is that the
shell version outputs two line feeds after the 'git log' output when run
outside of the tests while the C version outputs one line feed in any
case. The reason for this is that the shell version calls log with
'--pretty=format:<fmt>' whose output is followed by two echo
calls; 'format' does not have "terminator" semantics like its 'tformat'
counterpart. So, the log output is terminated by a newline only when
invoked by the user and not when invoked from the scripts. This results
in the one & two line feed differences in the shell version.
On the other hand, the C version calls log with '--pretty=<fmt>'
which is equivalent to '--pretty:tformat:<fmt>' which is then
followed by a 'printf("\n")'. Due to its "terminator" semantics the
log output is always terminated by newline and hence one line feed in
any case.
Also, when we try to pass an option-like argument after a non-option
argument, for instance:
git submodule summary HEAD --foo-bar
(or)
git submodule summary HEAD --cached
That argument would be treated like a path to the submodule for which
the user is requesting a summary. So, the option ends up having no
effect. Though, passing '--quiet' is an exception to this:
git submodule summary HEAD --quiet
While 'summary' doesn't support '--quiet', we don't get an output for
the above command as '--quiet' is treated as a path which means we get
an output only if a submodule whose path is '--quiet' exists.
The error message in case of computing a summary for non-existent
submodules in the C version is different from that of the shell version.
Since the new error message is not marked for translation, change the
'test_i18ngrep' in t7421.4 to 'grep'.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Stefan Beller <stefanbeller@gmail.com>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many `submodule--helper` subcommands follow the convention that a struct
defines their callback data, and the declaration of that struct is
followed immediately by a macro to use in static initializers, without
any separating empty line.
Let's align the `init`, `status` and `sync` subcommands with that convention.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The final leg of SHA-256 transition.
* bc/sha-256-part-3: (39 commits)
t: remove test_oid_init in tests
docs: add documentation for extensions.objectFormat
ci: run tests with SHA-256
t: make SHA1 prerequisite depend on default hash
t: allow testing different hash algorithms via environment
t: add test_oid option to select hash algorithm
repository: enable SHA-256 support by default
setup: add support for reading extensions.objectformat
bundle: add new version for use with SHA-256
builtin/verify-pack: implement an --object-format option
http-fetch: set up git directory before parsing pack hashes
t0410: mark test with SHA1 prerequisite
t5308: make test work with SHA-256
t9700: make hash size independent
t9500: ensure that algorithm info is preserved in config
t9350: make hash size independent
t9301: make hash size independent
t9300: use $ZERO_OID instead of hard-coded object ID
t9300: abstract away SHA-1-specific constants
t8011: make hash size independent
...
Update "git help guides" documentation organization.
* pb/guide-docs:
git.txt: add list of guides
Documentation: don't hardcode command categories twice
help: drop usage of 'common' and 'useful' for guides
command-list.txt: add missing 'gitcredentials' and 'gitremote-helpers'
All "mergy" operations that internally use the merge-recursive
machinery should honor the merge.renormalize configuration, but
many of them didn't.
* en/eol-attrs-gotchas:
checkout: support renormalization with checkout -m <paths>
merge: make merge.renormalize work for all uses of merge machinery
t6038: remove problematic test
t6038: make tests fail for the right reason
The argv_array API is useful for not just managing argv but any
"vector" (NULL-terminated array) of strings, and has seen adoption
to a certain degree. It has been renamed to "strvec" to reduce the
barrier to adoption.
* jk/strvec:
strvec: rename struct fields
strvec: drop argv_array compatibility layer
strvec: update documention to avoid argv_array
strvec: fix indentation in renamed calls
strvec: convert remaining callers away from argv_array name
strvec: convert more callers away from argv_array name
strvec: convert builtin/ callers away from argv_array name
quote: rename sq_dequote_to_argv_array to mention strvec
strvec: rename files from argv-array to strvec
argv-array: rename to strvec
argv-array: use size_t for count and alloc
The purpose of "git init --separate-git-dir" is to separate the
repository from the worktree. This is true even when --separate-git-dir
is used on an existing worktree, in which case, it moves the .git/
subdirectory to a new location outside the worktree.
However, an outright bare repository (such as one created by "git init
--bare"), has no worktree, so using --separate-git-dir to separate it
from its non-existent worktree is nonsensical. Therefore, make it an
error to use --separate-git-dir on a bare repository.
Implementation note: "git init" considers a repository bare if told so
explicitly via --bare or if it guesses it to be so based upon
heuristics. In the explicit --bare case, a conflict with
--separate-git-dir is easy to detect early. In the guessed case,
however, the conflict can only be detected once "bareness" is guessed,
which happens after "git init" has begun creating the repository.
Technically, we can get by with a single late check which would cover
both cases, however, erroring out early, when possible, without leaving
detritus provides a better user experience.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that find_bisection() accepts multiple boolean arguments, these may
be combined into a single unsigned integer in order to declutter some of
the code in bisect.c
Also, rename the existing "flags" bitfield to "commit_flags", to
explicitly differentiate it from the new "bisect_flags" bitfield.
Based-on-patch-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Aaron Lipman <alipman88@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Upon seeing a merge commit when bisecting, this option may be used to
follow only the first parent.
In detecting regressions introduced through the merging of a branch, the
merge commit will be identified as introduction of the bug and its
ancestors will be ignored.
This option is particularly useful in avoiding false positives when a
merged branch contained broken or non-buildable commits, but the merge
itself was OK.
Signed-off-by: Aaron Lipman <alipman88@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_bisect__helper() is intended as a temporary shim layer serving as an
interface for git-bisect.sh. This function and git-bisect.sh should
eventually be replaced by a C implementation, cmd_bisect(), serving as
an entrypoint for all "git bisect ..." shell commands: cmd_bisect() will
only parse the first token following "git bisect", and dispatch the
remaining args to the appropriate function ["bisect_start()",
"bisect_next()", etc.].
Thus, cmd_bisect__helper() should not be responsible for parsing flags
like --no-checkout. Instead, let the --no-checkout flag remain in the
argv array, so it may be evaluated alongside the other options already
parsed by bisect_start().
Signed-off-by: Aaron Lipman <alipman88@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add first_parent_only parameter to find_bisection(), removing the
barrier that prevented combining the --bisect and --first-parent flags
when using git rev-list
Based-on-patch-by: Tiago Botelho <tiagonbotelho@hotmail.com>
Signed-off-by: Aaron Lipman <alipman88@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a call to has_object_file(), which lazily fetches missing
objects in a partial clone, when the object is known to not be
a promisor object. Change that call to has_object(), which does not do
any lazy fetching.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The options --missing=allow-{any,promisor} were introduced in caf3827e2f
("rev-list: add list-objects filtering support", 2017-11-22) with the
following note in the commit message:
This patch introduces handling of missing objects to help
debugging and development of the "partial clone" mechanism,
and once the mechanism is implemented, for a power user to
perform operations that are missing-object aware without
incurring the cost of checking if a missing link is expected.
The idea that these options are missing-object aware (and thus do not
need to lazily fetch objects, unlike unaware commands that assume that
all objects are present) are assumed in later commits such as 07ef3c6604
("fetch test: use more robust test for filtered objects", 2020-01-15).
However, the current implementations of these options use
has_object_file(), which indeed lazily fetches missing objects. Teach
these implementations not to do so. Also, update the documentation of
these options to be clearer.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 1b81d8cb19 (help: use command-list.txt for the source of guides,
2018-05-20), all man5/man7 guides listed in command-list.txt appear in
the output of 'git help -g'.
However, 'git help -g' still prefixes this list with "The common Git
guides are:", which makes one wonder if there are others!
In the same spirit, the man page for 'git help' describes the '--guides'
option as listing 'useful' guides, which is not false per se but can
also be taken to mean that there are other guides that exist but are not
useful.
Instead of 'common' and 'useful', use 'Git concept guides' in both
places. To keep the code in line with this change, rename
help.c::list_common_guides_help to list_guides_help.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While packing many objects in a repository with a promissor remote,
lazily fetching missing objects from the promissor remote one by
one may be inefficient---the code now attempts to fetch all the
missing objects in batch (obviously this won't work for a lazy
clone that lazily fetches tree objects as you cannot even enumerate
what blobs are missing until you learn which trees are missing).
* jt/pack-objects-prefetch-in-batch:
pack-objects: prefetch objects to be packed
pack-objects: refactor to oid_object_info_extended
The 'merge' command is not the only one that does merges; other commands
like checkout -m or rebase do as well. Unfortunately, the only area of
the code that checked for the "merge.renormalize" config setting was in
builtin/merge.c, meaning it could only affect merges performed by the
"merge" command. Move the handling of this config setting to
merge_recursive_config() so that other commands can benefit from it as
well. Fixes a few tests in t6038.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "argc" and "argv" names made sense when the struct was argv_array,
but now they're just confusing. Let's rename them to "nr" (which we use
for counts elsewhere) and "v" (which is rather terse, but reads well
when combined with typical variable names like "args.v").
Note that we have to update all of the callers immediately. Playing
tricks with the preprocessor is hard here, because we wouldn't want to
rewrite unrelated tokens.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git mv src dst", when src is an unmerged path, errored out
correctly but with an incorrect error message to claim that src is
not tracked, which has been clarified.
* ct/mv-unmerged-path-error:
git-mv: improve error message for conflicted file
Preliminary clean-up of the refs API in preparation for adding a
new refs backend "reftable".
* hn/reftable:
reflog: cleanse messages in the refs.c layer
bisect: treat BISECT_HEAD as a pseudo ref
t3432: use git-reflog to inspect the reflog for HEAD
lib-t6000.sh: write tag using git-update-ref
"git clone --separate-git-dir=$elsewhere" used to stomp on the
contents of the existing directory $elsewhere, which has been
taught to fail when $elsewhere is not an empty directory.
* bw/fail-cloning-into-non-empty:
git clone: don't clone into non-empty directory
Updates to the changed-paths bloom filter.
* ds/commit-graph-bloom-updates:
commit-graph: check all leading directories in changed path Bloom filters
revision: empty pathspecs should not use Bloom filters
revision.c: fix whitespace
commit-graph: check chunk sizes after writing
commit-graph: simplify chunk writes into loop
commit-graph: unify the signatures of all write_graph_chunk_*() functions
commit-graph: persist existence of changed-paths
bloom: fix logic in get_bloom_filter()
commit-graph: change test to die on parse, not load
commit-graph: place bloom_settings in context
Now that we have a complete SHA-256 implementation in Git, let's enable
it so people can use it. Remove the ENABLE_SHA256 define constant
everywhere it's used. Add tests for initializing a repository with
SHA-256.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently we detect the hash algorithm in use by the length of the
object ID. This is inelegant and prevents us from using a different
hash algorithm that is also 256 bits in length.
Since we cannot extend the v2 format in a backward-compatible way, let's
add a v3 format, which is identical, except for the addition of
capabilities, which are prefixed by an at sign. We add "object-format"
as the only capability and reject unknown capabilities, since we do not
have a network connection and therefore cannot negotiate with the other
side.
For compatibility, default to the v2 format for SHA-1 and require v3
for SHA-256.
In t5510, always use format v3 so we can be sure we produce consistent
results across hash algorithms. Since head -n N lists the top N lines
instead of the Nth line, let's run our output through sed to normalize
it and compare it against a fixed value, which will make sure we get
exactly what we're expecting.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A recently added test in t5702 started using git verify-pack outside of
a repository. While this poses no problems with SHA-1, with SHA-256 we
implicitly rely on the setup of the repository to initialize our hash
algorithm settings.
Since we're not in a repository here, we need to provide git verify-pack
help to set things up properly. git index-pack already knows an
--object-format option, so let's accept one as well and pass it down to
our git index-pack invocation. Since we're now dynamically adjusting
the elements in argv, let's switch to using struct argv_array to manage
them. Finally, let's make t5702 pass the proper argument on down to its
git verify-pack caller.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using "--first-parent" to consider history as a single line of
commits, git-log still defaults to treating merges specially, even
though they could be considered as single commits in the linearized
history (that just introduce all of the changes from the second and
higher parents).
Let's instead have "--first-parent" imply "-m", which makes something
like:
git log --first-parent -p
do what you'd expect. Likewise:
git log --first-parent -Sfoo
will find "foo" in merge commits.
No new test is needed; we'll tweak the output of the existing
"--first-parent -p" test, which now matches the "-m --first-parent -p"
test. The unchanged existing test for "--no-diff-merges" confirms that
the user can get the old behavior if they want.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "-m" option sets revs->ignore_merges to "0", but there's no way to
undo it. This probably isn't something anybody overly cares about, since
"1" is already the default, but it will serve as an escape hatch when we
flip the default for ignore_merges to "0" in more situations.
We'll also add a few extra niceties:
- initialize the value to "-1" to indicate "not set", and then resolve
it to the normal 0/1 bool in setup_revisions(). This lets any tweak
functions, as well as setup_revisions() itself, avoid clobbering the
user's preference (which until now they couldn't actually express).
- since we now have --no-diff-merges, let's add the matching
--diff-merges, which is just a synonym for "-m". Then we don't even
need to document --no-diff-merges separately; it countermands the
long form of "-m" in the usual way.
The new test shows that this behaves just the same as the current
behavior without "-m".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This was added by 82dee4160c (log: show merge commit when --cc is given,
2015-08-20), which explains why we need it. But that commit failed to
notice that setup_revisions() already does the same thing, since
cd2bdc5309 (Common option parsing for "git log --diff" and friends,
2006-04-14).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse_object_or_die() is passed an object ID and a name to show if the
object cannot be parsed. If the name is NULL then it shows the
hexadecimal object ID. Use that feature instead of preparing and
passing the hexadecimal representation to the function proactively.
That's shorter and a bit more efficient.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code which split an argv_array call across multiple lines, like:
argv_array_pushl(&args, "one argument",
"another argument", "and more",
NULL);
was recently mechanically renamed to use strvec, which results in
mis-matched indentation like:
strvec_pushl(&args, "one argument",
"another argument", "and more",
NULL);
Let's fix these up to align the arguments with the opening paren. I did
this manually by sifting through the results of:
git jump grep 'strvec_.*,$'
and liberally applying my editor's auto-format. Most of the changes are
of the form shown above, though I also normalized a few that had
originally used a single-tab indentation (rather than our usual style of
aligning with the open paren). I also rewrapped a couple of obvious
cases (e.g., where previously too-long lines became short enough to fit
on one), but I wasn't aggressive about it. In cases broken to three or
more lines, the grouping of arguments is sometimes meaningful, and it
wasn't worth my time or reviewer time to ponder each case individually.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We eventually want to drop the argv_array name and just use strvec
consistently. There's no particular reason we have to do it all at once,
or care about interactions between converted and unconverted bits.
Because of our preprocessor compat layer, the names are interchangeable
to the compiler (so even a definition and declaration using different
names is OK).
This patch converts all of the files in builtin/ to keep the diff to a
manageable size.
The conversion was done purely mechanically with:
git ls-files '*.c' '*.h' |
xargs perl -i -pe '
s/ARGV_ARRAY/STRVEC/g;
s/argv_array/strvec/g;
'
and then selectively staging files with "git add builtin/". We'll deal
with any indentation/style fallouts separately.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want to eventually drop the use of the "argv_array" name in favor of
"strvec." Unlike most other uses of the name, this one is embedded in a
function name, so the definition and all of the callers need to be
updated at the same time.
We don't technically need to update the parameter types here (our
preprocessor compat macros make the two names interchangeable), but
let's do so to keep the site consistent for now.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This requires updating #include lines across the code-base, but that's
all fairly mechanical, and was done with:
git ls-files '*.c' '*.h' |
xargs perl -i -pe 's/argv-array.h/strvec.h/'
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When an object to be packed is noticed to be missing, prefetch all
to-be-packed objects in one batch.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use oid_object_info_extended() instead of oid_object_info() because a
subsequent commit needs to specify an additional flag here.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git mv' has always complained about renaming a conflicted
file, as it cannot handle multiple index entries for one file.
However, the error message it uses has been the same as the
one for an untracked file:
fatal: not under version control, src=...
which is patently wrong. Distinguish the two cases and
add a test to make sure we produce the correct message.
Signed-off-by: Chris Torek <chris.torek@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix to the code to produce progress bar, which is new in the
upcoming release.
* tb/commit-graph-no-check-oids:
commit-graph: fix "Collecting commits from input" progress line
To display a progress line while reading commits from standard input
and looking them up, 5b6653e523 (builtin/commit-graph.c: dereference
tags in builtin, 2020-05-13) should have added a pair of
start_delayed_progress() and stop_progress() calls around the loop
reading stdin. Alas, the stop_progress() call ended up at the wrong
place, after write_commit_graph(), which does all the commit-graph
computation and writing, and has several progress lines of its own.
Consequently, that new
Collecting commits from input: 1234
progress line is overwritten by the first progress line shown by
write_commit_graph(), and its final "done" line is shown last, after
everything is finished:
$ { sleep 3 ; git rev-list -3 HEAD ; sleep 1 ; } | ~/src/git/git commit-graph write --stdin-commits
Expanding reachable commits in commit graph: 873402, done.
Writing out commit graph in 4 passes: 100% (3493608/3493608), done.
Collecting commits from input: 3, done.
Furthermore, that stop_progress() call was added after the 'cleanup'
label, where that loop reading stdin jumps in case of an error. In
case of invalid input this then results in the "done" line shown after
the error message:
$ { sleep 3 ; git rev-list -3 HEAD ; echo junk ; } | ~/src/git/git commit-graph write --stdin-commits
error: unexpected non-hex object ID: junk
Collecting commits from input: 3, done.
Move that stop_progress() call to the right place.
While at it, drop the unnecessary 'if (progress)' condition protecting
the stop_progress() call, because that function is prepared to handle
a NULL progress struct.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rebase is implemented with two different backends - 'apply' and
'merge' each of which support a different set of options. In
particular the apply backend supports a number of options implemented
by 'git am' that are not implemented in the merge backend. This means
that the available options are different depending on which backend is
used which is confusing. This patch adds support for the
--ignore-whitespace option to the merge backend. This option treats
lines with only whitespace changes as unchanged and is implemented in
the merge backend by translating it to -Xignore-space-change.
Signed-off-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both the git-bisect.sh as bisect--helper inspected the file system
directly.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using git clone with --separate-git-dir realgitdir and
realgitdir already exists, it's content is destroyed.
So, make sure we don't clone into an existing non-empty directory.
When d45420c1 (clone: do not clean up directories we didn't create,
2018-01-02) tightened the clean-up procedure after a failed cloning
into an empty directory, it assumed that the existing directory
given is an empty one so it is OK to keep that directory, while
running the clean-up procedure that is designed to remove everything
in it (since there won't be any, anyway). Check and make sure that
the $GIT_DIR is empty even cloning into an existing repository.
Signed-off-by: Ben Wijen <ben@wijen.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recent update to "git diff" meant as a code clean-up introduced a
bug in its error handling code, which has been corrected.
* ct/diff-with-merge-base-clarification:
diff: check for merge bases before assigning sym->base
In symdiff_prepare(), we iterate over the set of parsed objects to pick
out any symmetric differences, including the left, right, and base
elements. We assign the results into pointers in a "struct symdiff", and
then complain if we didn't find a base, like so:
sym->left = rev->pending.objects[lpos].name;
sym->right = rev->pending.objects[rpos].name;
sym->base = rev->pending.objects[basepos].name;
if (basecount == 0)
die(_("%s...%s: no merge base"), sym->left, sym->right);
But the least lines are backwards. If basecount is 0, then basepos will
be -1, and we will access memory outside of the pending array. This
isn't usually that big a deal, since we don't do anything besides a
single pointer-sized read before exiting anyway, but it does violate the
C standard, and of course memory-checking tools like ASan complain.
Let's put the basecount check first. Note that we haveto split it from
the other assignments, since the die() relies on sym->left and
sym->right having been assigned (this isn't strictly necessary, but is
easier to read than dereferencing the pending array again).
Reported-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fast-export --anonymize" learned to take customized mapping to
allow its users to tweak its output more usable for debugging.
* jk/fast-export-anonym-alt:
fast-export: use local array to store anonymized oid
fast-export: anonymize "master" refname
fast-export: allow seeding the anonymized mapping
fast-export: add a "data" callback parameter to anonymize_str()
fast-export: move global "idents" anonymize hashmap into function
fast-export: use a flex array to store anonymized entries
fast-export: stop storing lengths in anonymized hashmaps
fast-export: tighten anonymize_mem() interface to handle only strings
fast-export: store anonymized oids as hex strings
fast-export: use xmemdupz() for anonymizing oids
t9351: derive anonymized tree checks from original repo
The name of the primary branch in existing repositories, and the
default name used for the first branch in newly created
repositories, is made configurable, so that we can eventually wean
ourselves off of the hardcoded 'master'.
* js/default-branch-name:
contrib: subtree: adjust test to change in fmt-merge-msg
testsvn: respect `init.defaultBranch`
remote: use the configured default branch name when appropriate
clone: use configured default branch name when appropriate
init: allow setting the default for the initial branch name via the config
init: allow specifying the initial branch name for the new repository
docs: add missing diamond brackets
submodule: fall back to remote's HEAD for missing remote.<name>.branch
send-pack/transport-helper: avoid mentioning a particular branch
fmt-merge-msg: stop treating `master` specially
API cleanup for get_worktrees()
* es/get-worktrees-unsort:
worktree: drop get_worktrees() unused 'flags' argument
worktree: drop get_worktrees() special-purpose sorting option
A few fields in "struct commit" that do not have to always be
present have been moved to commit slabs.
* ak/commit-graph-to-slab:
commit-graph: minimize commit_graph_data_slab access
commit: move members graph_pos, generation to a slab
commit-graph: introduce commit_graph_data_slab
object: drop parsed_object_pool->commit_count
SHA-256 migration work continues.
* bc/sha-256-part-2: (44 commits)
remote-testgit: adapt for object-format
bundle: detect hash algorithm when reading refs
t5300: pass --object-format to git index-pack
t5704: send object-format capability with SHA-256
t5703: use object-format serve option
t5702: offer an object-format capability in the test
t/helper: initialize the repository for test-sha1-array
remote-curl: avoid truncating refs with ls-remote
t1050: pass algorithm to index-pack when outside repo
builtin/index-pack: add option to specify hash algorithm
remote-curl: detect algorithm for dumb HTTP by size
builtin/ls-remote: initialize repository based on fetch
t5500: make hash independent
serve: advertise object-format capability for protocol v2
connect: parse v2 refs with correct hash algorithm
connect: pass full packet reader when parsing v2 refs
Documentation/technical: document object-format for protocol v2
t1302: expect repo format version 1 for SHA-256
builtin/show-index: provide options to determine hash algo
t5302: modernize test formatting
...
When displaying cat-file usage, the fact that a <format> can
be specified is only visible when lookling at the --batch and
--batch-check options which are shown like this:
--batch[=<format>] show info and content of objects fed from the standard input
--batch-check[=<format>]
show info about objects fed from the standard input
It seems more coherent and improves discovery to also show it
on the usage line.
In the documentation the DESCRIPTION tells us that "The output
format can be overridden using the optional <format> argument",
but we can't see the <format> argument in the SYNOPSIS above
the description which is confusing.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The changed-path Bloom filters were released in v2.27.0, but have a
significant drawback. A user can opt-in to writing the changed-path
filters using the "--changed-paths" option to "git commit-graph write"
but the next write will drop the filters unless that option is
specified.
This becomes even more important when considering the interaction with
gc.writeCommitGraph (on by default) or fetch.writeCommitGraph (part of
features.experimental). These config options trigger commit-graph writes
that the user did not signal, and hence there is no --changed-paths
option available.
Allow a user that opts-in to the changed-path filters to persist the
property of "my commit-graph has changed-path filters" automatically. A
user can drop filters using the --no-changed-paths option.
In the process, we need to be extremely careful to match the Bloom
filter settings as specified by the commit-graph. This will allow future
versions of Git to customize these settings, and the version with this
change will persist those settings as commit-graphs are rewritten on
top.
Use the trace2 API to signal the settings used during the write, and
check that output in a test after manually adjusting the correct bytes
in the commit-graph file.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff-files" has been taught to say paths that are marked as
intent-to-add are new files, not modified from an empty blob.
* sk/diff-files-show-i-t-a-as-new:
diff-files: treat "i-t-a" files as "not-in-index"
An in-code comment in "git diff" has been updated.
* dl/diff-usage-comment-update:
builtin/diff: fix botched update of usage comment
builtin/diff: update usage comment
Allow runtime upgrade of the repository format version, which needs
to be done carefully.
There is a rather unpleasant backward compatibility worry with the
last step of this series, but it is the right thing to do in the
longer term.
* xl/upgrade-repo-format:
check_repository_format_gently(): refuse extensions for old repositories
sparse-checkout: upgrade repository to version 1 when enabling extension
fetch: allow adding a filter after initial clone
repository: add a helper function to perform repository format upgrade
Some older versions of gcc complain about this line:
builtin/fast-export.c:412:2: error: dereferencing type-punned pointer
will break strict-aliasing rules [-Werror=strict-aliasing]
put_be32(oid.hash + hashsz - 4, counter++);
^
This seems to be a false positive, as there's no type-punning at all
here. oid.hash is an array of unsigned char; when we pass it to a
function it decays to a pointer to unsigned char. We do take a void
pointer in put_be32(), but it's immediately aliased with another pointer
to unsigned char (and clearly the compiler is looking inside the inlined
put_be32(), since the warning doesn't happen with -O0).
This happens on gcc 4.8 and 4.9, but not later versions (I tested gcc 6,
7, 8, and 9).
We can work around it by using a local array instead of an object_id
struct. This is a little more intimate with the details of object_id,
but for whatever reason doesn't seem to trigger the compiler warning.
We can revert this patch once we decide that those gcc versions are too
old to care about for a warning like this (gcc 4.8 is the default
compiler for Ubuntu Trusty, which is out-of-support but not fully
end-of-life'd until April 2022).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running "fast-export --anonymize" will leave "refs/heads/master"
untouched in the output, for two reasons:
- it helped to have some known reference point between the original
and anonymized repository
- since it's historically the default branch name, it doesn't leak any
information
Now that we can ask fast-export to retain particular tokens, we have a
much better tool for the first one (because it works for any ref, not
just master).
For the second, the notion of "default branch name" is likely to become
configurable soon, at which point the name _does_ leak information.
Let's drop this special case in preparation.
Note that we have to adjust the test a bit, since it relied on using the
name "master" in the anonymized repos. We could just use
--anonymize-map=master to keep the same output, but then we wouldn't
know if it works because of our hard-coded master or because of the
explicit map.
So let's flip the test a bit, and confirm that we anonymize "master",
but keep "other" in the output.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After you anonymize a repository, it can be hard to find which commits
correspond between the original and the result, and thus hard to
reproduce commands that triggered bugs in the original.
Let's make it possible to seed the anonymization map. This lets users
either:
- mark names to be retained as-is, if they don't consider them secret
(in which case their original commands would just work)
- map names to new values, which lets them adapt the reproduction
recipe to the new names without revealing the originals
The implementation is fairly straight-forward. We already store each
anonymized token in a hashmap (so that the same token appearing twice is
converted to the same result). We can just introduce a new "seed"
hashmap which is consulted first.
This does make a few more promises to the user about how we'll anonymize
things (e.g., token-splitting pathnames). But it's unlikely that we'd
want to change those rules, even if the actual anonymization of a single
token changes. And it makes things much easier for the user, who can
unblind only a directory name without having to specify each path within
it.
One alternative to this approach would be to anonymize as we see fit,
and then dump the whole refname and pathname mappings to a file. This
does work, but it's a bit awkward to use (you have to manually dig the
items you care about out of the mapping).
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "fetch/clone" protocol has been updated to allow the server to
instruct the clients to grab pre-packaged packfile(s) in addition
to the packed object data coming over the wire.
* jt/cdn-offload:
upload-pack: fix a sparse '0 as NULL pointer' warning
upload-pack: send part of packfile response as uri
fetch-pack: support more than one pack lockfile
upload-pack: refactor reading of pack-objects out
Documentation: add Packfile URIs design doc
Documentation: order protocol v2 sections
http-fetch: support fetching packfiles by URL
http-fetch: refactor into function
http: refactor finish_http_pack_request()
http: use --stdin when indexing dumb HTTP pack
Rewrite of parts of the scripted "git submodule" Porcelain command
continues; this time it is "git submodule set-branch" subcommand's
turn.
* ss/submodule-set-branch-in-c:
submodule: port subcommand 'set-branch' from shell to C
Code clean-up around "git branch" with a minor bugfix.
* dl/branch-cleanup:
branch: don't mix --edit-description
t3200: test for specific errors
t3200: rename "expected" to "expect"
"git diff" used to take arguments in random and nonsense range
notation, e.g. "git diff A..B C", "git diff A..B C...D", etc.,
which has been cleaned up.
* ct/diff-with-merge-base-clarification:
Documentation: usage for diff combined commits
git diff: improve range handling
t/t3430: avoid undefined git diff behavior
Code clean-up of "git clean" resulted in a fix of recent
performance regression.
* en/clean-cleanups:
clean: optimize and document cases where we recurse into subdirectories
clean: consolidate handling of ignored parameters
dir, clean: avoid disallowed behavior
dir: fix a few confusing comments
When cloning a repository without any branches, Git chooses a default
branch name for the as-yet unborn branch.
As part of the implicit initialization of the local repository, Git just
learned to respect `init.defaultBranch` to choose a different initial
branch name. We now really want that branch name to be used as a
fall-back.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We just introduced the command-line option
`--initial-branch=<branch-name>` to allow initializing a new repository
with a different initial branch than the hard-coded one.
To allow users to override the initial branch name more permanently
(i.e. without having to specify the name manually for each and every
`git init` invocation), let's introduce the `init.defaultBranch` config
setting.
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Don Goodman-Wilson <don@goodman-wilson.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a growing number of projects and companies desiring to change
the main branch name of their repositories (see e.g.
https://twitter.com/mislav/status/1270388510684598272 for background on
this).
To change that branch name for new repositories, currently the only way
to do that automatically is by copying all of Git's template directory,
then hard-coding the desired default branch name into the `.git/HEAD`
file, and then configuring `init.templateDir` to point to those copied
template files.
To make this process much less cumbersome, let's introduce a new option:
`--initial-branch=<branch-name>`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `remote.<name>.branch` is not configured, `git submodule update`
currently falls back to using the branch name `master`. A much better
idea, however, is to use the remote `HEAD`: on all Git servers running
reasonably recent Git versions, the symref `HEAD` points to the main
branch.
Note: t7419 demonstrates that there _might_ be use cases out there that
_expect_ `git submodule update --remote` to update submodules to the
remote `master` branch even if the remote `HEAD` points to another
branch. Arguably, this patch makes the behavior more intuitive, but
there is a slight possibility that this might cause regressions in
obscure setups.
Even so, it should be okay to fix this behavior without anything like a
longer transition period:
- The `git submodule update --remote` command is not really common.
- Current Git's behavior when running this command is outright
confusing, unless the remote repository's current branch _is_ `master`
(in which case the proposed behavior matches the old behavior).
- If a user encounters a regression due to the changed behavior, the fix
is actually trivial: setting `submodule.<name>.branch` to `master`
will reinstate the old behavior.
Helped-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The anonymize_str() function takes a generator callback, but there's no
way to pass extra context to it. Let's add the usual "void *data"
parameter to the generator interface and pass it along.
This is mildly annoying for existing callers, all of which pass NULL,
but is necessary to avoid extra globals in some cases we'll add in a
subsequent patch.
While we're touching each of these callbacks, we can further observe
that none of them use the existing orig/len parameters at all. This
makes sense, since the point is for their output to have no discernable
basis in the original (my original version had some notion that we might
use a one-way function to obfuscate the names, but it was never
implemented). So let's drop those extra parameters. If a caller really
wants to do something with them, it can pass a struct through the new
data parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the other anonymization functions keep their static mappings
inside the function to avoid polluting the global namespace. Let's do
the same for "idents", as nobody needs it outside of
anonymize_ident_line().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we're using a separate keydata struct for hash lookups, we have
more flexibility in how we allocate anonymized_entry structs. Let's push
the "orig" key into a flex member within the struct. That should save us
a few bytes of memory per entry (a pointer plus any malloc overhead),
and may make lookups a little faster (since it's one less pointer to
chase in the comparison function).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the anonymize_str() interface is restricted to NUL-terminated
strings, there's no need for us to keep track of the length of each
entry in the hashmap. This simplifies the code and saves a bit of
memory.
Note that we do still need to compare the stored results to partial
strings passed in by the callers. We can do that by using hashmap's
keydata feature to get the ptr/len pair into the comparison function,
and then using strncmp().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the anonymize_mem() interface _can_ store arbitrary byte
sequences, none of the callers uses this feature (as of the previous
commit). We'd like to keep it that way, as we'll be exposing the
string-like nature of the anonymization routines to the user. So let's
tighten up the interface a bit:
- don't treat "len" as an out-parameter from anonymize_mem(); this
ensures callers treat the pointer result as a NUL-terminated string
- likewise, don't treat "len" as an out-parameter from generator
functions
- swap out "void *" for "char *" as appropriate to signal that we
don't handle arbitrary memory
- rename the function to anonymize_str()
This will also open up some optimization opportunities in a future
patch.
Note that we can't drop the "len" parameter entirely. Some callers do
pass in partial strings (e.g., "foo/bar", len=3) to avoid copying, and
we need to handle those still.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fast-export stores anonymized oids, it does so as binary strings.
And while the anonymous mapping storage is binary-clean (at least as of
the previous commit), this will become awkward when we start exposing
more of it to the user. In particular, if we allow a method for
retaining token "foo", then users may want to specify a hex oid as such
a token.
Let's just switch to storing the hex strings. The difference in memory
usage is negligible (especially considering how infrequently we'd
generally store an oid compared to, say, path components).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our anonymize_mem() function is careful to take a ptr/len pair to allow
storing binary tokens like object ids, as well as partial strings (e.g.,
just "foo" of "foo/bar"). But it duplicates the hash key using
xstrdup()! That means that:
- for a partial string, we'd store all bytes up to the NUL, even
though we'd never look at anything past "len". This didn't produce
wrong behavior, but was wasteful.
- for a binary oid that doesn't contain a zero byte, we'd copy garbage
bytes off the end of the array (though as long as nothing complained
about reading uninitialized bytes, further reads would be limited by
"len", and we'd produce the correct results)
- for a binary oid that does contain a zero byte, we'd copy _fewer_
bytes than intended into the hashmap struct. When we later try to
look up a value, we'd access uninitialized memory and potentially
falsely claim that a particular oid is not present.
The most common reason to store an oid is an anonymized gitlink, but our
test case doesn't have any gitlinks at all. So let's add one whose oid
contains a NUL and is present at two different paths. ASan catches the
memory error, but even without it we can detect the bug because the oid
is not anonymized the same way for both paths.
And of course the fix is to copy the correct number of bytes. We don't
technically need the appended NUL from xmemdupz(), but it doesn't hurt
as an extra protection against anybody treating it like a string (plus a
future patch will push us more in that direction).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, an attempt was made to correct the "N=1, M=0"
case. However, the fix was botched and it introduced two half-correct
sections by mistake. Combine these half-correct sections into one fully
correct section.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
d91d6fbf26 (commit-reach: create repo_is_descendant_of(), 2020-06-17)
adds a repository aware version of is_descendant_of() and a backward
compatibility shim that is barely used.
Update all callers to directly use the new repo_is_descendant_of()
function instead; making the codebase simpler and pushing more
the_repository references higher up the stack.
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The same worktree directory must be registered only once, but
"git worktree move" allowed this invariant to be violated, which
has been corrected.
* es/worktree-duplicate-paths:
worktree: make "move" refuse to move atop missing registered worktree
worktree: generalize candidate worktree path validation
worktree: prune linked worktree referencing main worktree path
worktree: prune duplicate entries referencing same worktree path
worktree: make high-level pruning re-usable
worktree: give "should be pruned?" function more meaningful name
worktree: factor out repeated string literal
The `diff-files' command and related commands which call the function
`cmd_diff_files()', consider the "intent-to-add" files as a part of the
index when comparing the work-tree against it. This was previously
addressed in commits [1] and [2] by turning the option
`--ita-invisible-in-index' (introduced in [3]) on by default.
For `diff-files' (and `add -p' as a consequence) to show the i-t-a
files as as new, `ita_invisible_in_index' will be enabled by default
here as well.
[1] 0231ae71d3 (diff: turn --ita-invisible-in-index on by default,
2018-05-26)
[2] 425a28e0a4 (diff-lib: allow ita entries treated as "not yet exist
in index", 2016-10-24)
[3] b42b451919 (diff: add --ita-[in]visible-in-index, 2016-10-24)
Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_worktrees() accepts a 'flags' argument, however, there are no
existing flags (the lone flag GWT_SORT_LINKED was recently retired) and
no behavior which can be tweaked. Therefore, drop the 'flags' argument.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Of all the clients of get_worktrees(), only "git worktree list" wants
the list sorted in a very specific way; other clients simply don't care
about the order. Rather than imbuing get_worktrees() with special
knowledge about how various clients -- now and in the future -- may want
the list sorted, drop the sorting capability altogether and make it the
client's responsibility to sort the list if needed.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git index-pack is usually run in a repository, but need not be. Since
packs don't contains information on the algorithm in use, instead
relying on context, add an option to index-pack to tell it which one
we're using in case someone runs it outside of a repository. Since
using --stdin necessarily implies a repository, don't allow specifying
an object format if it's provided to prevent users from passing an
option that won't work. Add documentation for this option.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_pull() builds a commit_list to pass a single potential ancestor to
is_descendant_of(). The latter leaves the list intact. Release the
allocated memory after the call.
Leaking in cmd_*() isn't a big deal, but sets a bad example for other
users of is_descendant_of().
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A comment in cmd_diff() states that if one tree-ish and no blobs are
provided, (the "N=1, M=0" case), it will provide a diff between the tree
and the cache. This is incorrect because a diff happens between the
tree-ish and the working tree. Remove the `--cached` in the comment so
that the correct behavior is shown. Add a new section describing the
"N=1, M=0, --cached" behavior.
Next, describe the "N=0, M=0, --cached" case, similar to the above since
it is undocumented.
Finally, fix some spacing issues. Add spaces between each section for
consistency and readability. Also, change tabs within the comment into
spaces.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The behaviour of "sparse-checkout" in the state "git clone
--no-checkout" left was changed accidentally in 2.27, which has
been corrected.
* en/sparse-checkout:
sparse-checkout: avoid staging deletions of all files
The reflog entries for "git clone" and "git fetch" did not
anonymize the URL they operated on.
* js/reflog-anonymize-for-clone-and-fetch:
clone/fetch: anonymize URLs in the reflog
14ba97f8 (alloc: allow arbitrary repositories for alloc functions,
2018-05-15) introduced parsed_object_pool->commit_count to keep count of
commits per repository and was used to assign commit->index.
However, commit-slab code requires commit->index values to be unique
and a global count would be correct, rather than a per-repo count.
Let's introduce a static counter variable, `parsed_commits_count` to
keep track of parsed commits so far.
As commit_count has no use anymore, let's also drop it from the struct.
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git branch` accepts `--edit-description` in conjunction with other
arguments. However, `--edit-description` is its own mode, similar to
`--set-upstream-to`, which is also made mutually exclusive with other
modes. Prevent `--edit-description` from being mixed with other modes.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 6b1db43109 ("clean: teach clean -d to preserve ignored paths",
2017-05-23) added the following code block (among others) to git-clean:
if (remove_directories)
dir.flags |= DIR_SHOW_IGNORED_TOO | DIR_KEEP_UNTRACKED_CONTENTS;
The reason for these flags is well documented in the commit message, but
isn't obvious just from looking at the code. Add some explanations to
the code to make it clearer.
Further, it appears git-2.26 did not correctly handle this combination
of flags from git-clean. With both these flags and without
DIR_SHOW_IGNORED_TOO_MODE_MATCHING set, git is supposed to recurse into
all untracked AND ignored directories. git-2.26.0 clearly was not doing
that. I don't know the full reasons for that or whether git < 2.27.0
had additional unknown bugs because of that misbehavior, because I don't
feel it's worth digging into. As per the huge changes and craziness
documented in commit 8d92fb2927 ("dir: replace exponential algorithm
with a linear one", 2020-04-01), the old algorithm was a mess and was
thrown out. What I can say is that git-2.27.0 correctly recurses into
untracked AND ignored directories with that combination.
However, in clean's case we don't need to recurse into ignored
directories; that is just a waste of time. Thus, when git-2.27.0
started correctly handling those flags, we got a performance regression
report. Rather than relying on other bugs in fill_directory()'s former
logic to provide the behavior of skipping ignored directories, make use
of the DIR_SHOW_IGNORED_TOO_MODE_MATCHING value specifically added in
commit eec0f7f2b7 ("status: add option to show ignored files
differently", 2017-10-30) for this purpose.
Reported-by: Brian Malehorn <bmalehorn@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I spent a long time trying to figure out how and whether the code worked
with different values of ignore, ignore_only, and remove_directories.
After lots of time setting up lots of testcases, sifting through lots of
print statements, and walking through the debugger, I finally realized
that one piece of code related to how it was all setup was found in
clean.c rather than dir.c. Make a change that would have made it easier
for me to do the extra testing by putting this handling in one spot.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dir.h documented quite clearly that DIR_SHOW_IGNORED and
DIR_SHOW_IGNORED_TOO are mutually exclusive, with a big comment to this
effect by the definition of both enum values. However, a command like
git clean -fx $DIR
would set both values for dir.flags. I _think_ it happened to work
because:
* As dir.h points out, DIR_KEEP_UNTRACKED_CONTENTS only takes effect
if DIR_SHOW_IGNORED_TOO is set.
* As coded, I believe DIR_SHOW_IGNORED would just happen to take
precedence over DIR_SHOW_IGNORED_TOO in the code as currently
constructed.
Which is a long way of saying "we just got lucky".
Fix clean.c to avoid setting these mutually exclusive values at the same
time, and add a check to dir.c that will throw a BUG() to prevent anyone
else from making this mistake.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Document the usage for producing combined commits with "git diff".
This includes updating the synopsis section.
While here, add the three-dot notation to the synopsis.
Make "git diff -h" print the same usage summary as the manual
page synopsis, minus the "A..B" form, which is now discouraged.
Signed-off-by: Chris Torek <chris.torek@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When git diff is given a symmetric difference A...B, it chooses
some merge base from the two specified commits (as documented).
This fails, however, if there is *no* merge base: instead, you
see the differences between A and B, which is certainly not what
is expected.
Moreover, if additional revisions are specified on the command
line ("git diff A...B C"), the results get a bit weird:
* If there is a symmetric difference merge base, this is used
as the left side of the diff. The last final ref is used as
the right side.
* If there is no merge base, the symmetric status is completely
lost. We will produce a combined diff instead.
Similar weirdness occurs if you use, e.g., "git diff C A...B D".
Likewise, using multiple two-dot ranges, or tossing extra
revision specifiers into the command line with two-dot ranges,
or mixing two and three dot ranges, all produce nonsense.
To avoid all this, add a routine to catch the range cases and
verify that that the arguments make sense. As a side effect,
produce a warning showing *which* merge base is being used when
there are multiple choices; die if there is no merge base.
Signed-off-by: Chris Torek <chris.torek@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach upload-pack to send part of its packfile response as URIs.
An administrator may configure a repository with one or more
"uploadpack.blobpackfileuri" lines, each line containing an OID, a pack
hash, and a URI. A client may configure fetch.uriprotocols to be a
comma-separated list of protocols that it is willing to use to fetch
additional packfiles - this list will be sent to the server. Whenever an
object with one of those OIDs would appear in the packfile transmitted
by upload-pack, the server may exclude that object, and instead send the
URI. The client will then download the packs referred to by those URIs
before performing the connectivity check.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Whenever a fetch results in a packfile being downloaded, a .keep file is
generated, so that the packfile can be preserved (from, say, a running
"git repack") until refs are written referring to the contents of the
packfile.
In a subsequent patch, a successful fetch using protocol v2 may result
in more than one .keep file being generated. Therefore, teach
fetch_pack() and the transport mechanism to support multiple .keep
files.
Implementation notes:
- builtin/fetch-pack.c normally does not generate .keep files, and thus
is unaffected by this or future changes. However, it has an
undocumented "--lock-pack" feature, used by remote-curl.c when
implementing the "fetch" remote helper command. In keeping with the
remote helper protocol, only one "lock" line will ever be written;
the rest will result in warnings to stderr. However, in practice,
warnings will never be written because the remote-curl.c "fetch" is
only used for protocol v0/v1 (which will not generate multiple .keep
files). (Protocol v2 uses the "stateless-connect" command, not the
"fetch" command.)
- connected.c has an optimization in that connectivity checks on a ref
need not be done if the target object is in a pack known to be
self-contained and connected. If there are multiple packfiles, this
optimization can no longer be done.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree add" takes special care to avoid creating a new worktree
at a location already registered to an existing worktree even if that
worktree is missing (which can happen, for instance, if the worktree
resides on removable media). "git worktree move", however, is not so
careful when validating the destination location and will happily move
the source worktree atop the location of a missing worktree. This leads
to the anomalous situation of multiple worktrees being associated with
the same path, which is expressly forbidden by design. For example:
$ git clone foo.git
$ cd foo
$ git worktree add ../bar
$ git worktree add ../baz
$ rm -rf ../bar
$ git worktree move ../baz ../bar
$ git worktree list
.../foo beefd00f [master]
.../bar beefd00f [bar]
.../bar beefd00f [baz]
$ git worktree remove ../bar
fatal: validation failed, cannot remove working tree:
'.../bar' does not point back to '.git/worktrees/bar'
Fix this shortcoming by enhancing "git worktree move" to perform the
same additional validation of the destination directory as done by "git
worktree add".
While at it, add a test to verify that "git worktree move" won't move a
worktree atop an existing (non-worktree) path -- a restriction which has
always been in place but was never tested.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree add" checks that the specified path is a valid location
for a new worktree by ensuring that the path does not already exist and
is not already registered to another worktree (a path can be registered
but missing, for instance, if it resides on removable media). Since "git
worktree add" is not the only command which should perform such
validation ("git worktree move" ought to also), generalize the the
validation function for use by other callers, as well.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree prune" detects when multiple entries are associated with
the same path and prunes the duplicates, however, it does not detect
when a linked worktree points at the path of the main worktree.
Although "git worktree add" disallows creating a new worktree with the
same path as the main worktree, such a case can arise outside the
control of Git even without the user mucking with .git/worktree/<id>/
administrative files. For instance:
$ git clone foo.git
$ git -C foo worktree add ../bar
$ rm -rf bar
$ mv foo bar
$ git -C bar worktree list
.../bar deadfeeb [master]
.../bar deadfeeb [bar]
Help the user recover from such corruption by extending "git worktree
prune" to also detect when a linked worktree is associated with the path
of the main worktree.
Reported-by: Jonathan Müller <jonathanmueller.dev@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A fundamental restriction of linked working trees is that there must
only ever be a single worktree associated with a particular path, thus
"git worktree add" explicitly disallows creation of a new worktree at
the same location as an existing registered worktree. Nevertheless,
users can still "shoot themselves in the foot" by mucking with
administrative files in .git/worktree/<id>/. Worse, "git worktree move"
is careless[1] and allows a worktree to be moved atop a registered but
missing worktree (which can happen, for instance, if the worktree is on
removable media). For instance:
$ git clone foo.git
$ cd foo
$ git worktree add ../bar
$ git worktree add ../baz
$ rm -rf ../bar
$ git worktree move ../baz ../bar
$ git worktree list
.../foo beefd00f [master]
.../bar beefd00f [bar]
.../bar beefd00f [baz]
Help users recover from this form of corruption by teaching "git
worktree prune" to detect when multiple worktrees are associated with
the same path.
[1]: A subsequent commit will fix "git worktree move" validation to be
more strict.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The low-level logic for removing a worktree is well encapsulated in
delete_git_dir(). However, high-level details related to pruning a
worktree -- such as dealing with verbosity and dry-run mode -- are not
encapsulated. Factor out this high-level logic into its own function so
it can be re-used as new worktree corruption detectors are added.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Readers of the name prune_worktree() are likely to expect the function
to actually prune a worktree, however, it only answers the question
"should this worktree be pruned?". Give it a name more reflective of its
true purpose to avoid such confusion.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to parse "git bisect start" command line was lax in
validating the arguments.
* cb/bisect-helper-parser-fix:
bisect--helper: avoid segfault with bad syntax in `start --term-*`
On-the-wire protocol v2 easily falls into a deadlock between the
remote-curl helper and the fetch-pack process when the server side
prematurely throws an error and disconnects. The communication has
been updated to make it more robust.
* dl/remote-curl-deadlock-fix:
stateless-connect: send response end packet
pkt-line: define PACKET_READ_RESPONSE_END
remote-curl: error on incomplete packet
pkt-line: extern packet_length()
transport: extract common fetch_pack() call
remote-curl: remove label indentation
remote-curl: fix typo
Code simplification and test coverage enhancement.
* bc/filter-process:
t2060: add a test for switch with --orphan and --discard-changes
builtin/checkout: simplify metadata initialization
For each worktree removed by "git worktree prune", it reports the reason
for the removal. All reasons share the common prefix "Removing
worktrees/%s:". As new removal reasons are added, this prefix needs to
be duplicated, which is error-prone and potentially cumbersome.
Therefore, factor out the common prefix.
Although this change seems to increase the "sentence lego quotient", it
should be reasonably safe, as the reason for removal is a distinct
clause, not strictly related to the prefix. Moreover, the "worktrees" in
"Removing worktrees/%s:" is a path literal which ought not be localized,
so by factoring it out, we can more easily avoid exposing that path
fragment to translators.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'extensions' configuration variable gets special meaning in the new
repository version, so when enabling the extension we should upgrade the
repository to version 1.
Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Retroactively adding a filter can be useful for existing shallow clones as
they allow users to see earlier change histories without downloading all
git objects in a regular --unshallow fetch.
Without this patch, users can make a clone partial by editing the
repository configuration to convert the remote into a promisor, like:
git config core.repositoryFormatVersion 1
git config extensions.partialClone origin
git fetch --unshallow --filter=blob:none origin
Since the hard part of making this work is already in place and such
edits can be error-prone, teach Git to perform the required configuration
change automatically instead.
Note that this change does not modify the existing git behavior which
recognizes setting extensions.partialClone without changing
repositoryFormatVersion.
Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
sparse-checkout's purpose is to update the working tree to have it
reflect a subset of the tracked files. As such, it shouldn't be
switching branches, making commits, downloading or uploading data, or
staging or unstaging changes. Other than updating the worktree, the
only thing sparse-checkout should touch is the SKIP_WORKTREE bit of the
index. In particular, this sets up a nice invariant: running
sparse-checkout will never change the status of any file in `git status`
(reflecting the fact that we only set the SKIP_WORKTREE bit if the file
is safe to delete, i.e. if the file is unmodified).
Traditionally, we did a _really_ bad job with this goal. The
predecessor to sparse-checkout involved manual editing of
.git/info/sparse-checkout and running `git read-tree -mu HEAD`. That
command would stage and unstage changes and overwrite dirty changes in
the working tree.
The initial implementation of the sparse-checkout command was no better;
it simply invoked `git read-tree -mu HEAD` as a subprocess and had the
same caveats, though this issue came up repeatedly in review comments
and workarounds for the problems were put in place before the feature
was merged[1, 2, 3, 4, 5, 6; especially see 4 & 6].
[1] https://lore.kernel.org/git/CABPp-BFT9A5n=_bx5LsjCvbogqwSjiwgr5amcjgbU1iAk4KLJg@mail.gmail.com/
[2] https://lore.kernel.org/git/CABPp-BEmwSwg4tgJg6nVG8a3Hpn_g-=ZjApZF4EiJO+qVgu4uw@mail.gmail.com/
[3] https://lore.kernel.org/git/CABPp-BFV7TA0qwZCQpHCqx9N+JifyRyuBQ-pZ_oGfe-NOgyh7A@mail.gmail.com/
[4] https://lore.kernel.org/git/CABPp-BHYCCD+Vx5fq35jH82eHc1-P53Lz_aGNpHJNcx9kg2K-A@mail.gmail.com/
[5] https://lore.kernel.org/git/CABPp-BF+JWYZfDqp2Tn4AEKVp4b0YMA=Mbz4Nz62D-gGgiduYQ@mail.gmail.com/
[6] https://lore.kernel.org/git/20191121163706.GV23183@szeder.dev/
However, these workarounds, in addition to disabling the feature in a
number of important cases, also missed one special case. I'll get back
to it later.
In the 2.27.0 cycle, the disabling of the feature was lifted by finally
replacing the internal equivalent of `git read-tree -mu HEAD` with
something that did what we wanted: the new update_sparsity() function in
unpack-trees.c that only ever updates SKIP_WORKTREE bits in the index
and updates the working tree to match. This new function handles all
the cases that were problematic for the old implementation, except that
it breaks the same special case that avoided the workarounds of the old
implementation, but broke it in a different way.
So...that brings us to the special case: a git clone performed with
--no-checkout. As per the meaning of the flag, --no-checkout does not
check out any branch, with the implication that you aren't on one and
need to switch to one after the clone. Implementationally, HEAD is
still set (so in some sense you are partially on a branch), but
* the index is "unborn" (non-existent)
* there are no files in the working tree (other than .git/)
* the next time git switch (or git checkout) is run it will run
unpack_trees with `initial_checkout` flag set to true.
It is not until you run, e.g. `git switch <somebranch>` that the index
will be written and files in the working tree populated.
With this special --no-checkout case, the traditional `read-tree -mu
HEAD` behavior would have done the equivalent of acting like checkout --
switch to the default branch (HEAD), write out an index that matches
HEAD, and update the working tree to match. This special case slipped
through the avoid-making-changes checks in the original sparse-checkout
command and thus continued there.
After update_sparsity() was introduced and used (see commit f56f31af03
("sparse-checkout: use new update_sparsity() function", 2020-03-27)),
the behavior for the --no-checkout case changed: Due to git's
auto-vivification of an empty in-memory index (see do_read_index() and
note that `must_exist` is false), and due to sparse-checkout's
update_working_directory() code to always write out the index after it
was done, we got a new bug. That made it so that sparse-checkout would
switch the repository from a clone with an "unborn" index (i.e. still
needing an initial_checkout), to one that had a recorded index with no
entries. Thus, instead of all the files appearing deleted in `git
status` being known to git as a special artifact of not yet being on a
branch, our recording of an empty index made it suddenly look to git as
though it was definitely on a branch with ALL files staged for deletion!
A subsequent checkout or switch then had to contend with the fact that
it wasn't on an initial_checkout but had a bunch of staged deletions.
Make sure that sparse-checkout changes nothing in the index other than
the SKIP_WORKTREE bit; in particular, when the index is unborn we do not
have any branch checked out so there is no sparsification or
de-sparsification work to do. Simply return from
update_working_directory() early.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even if we strongly discourage putting credentials into the URLs passed
via the command-line, there _is_ support for that, and users _do_ do
that.
Let's scrub them before writing them to the reflog.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The error message from "git checkout -b foo -t bar baz" was
confusing.
* rs/checkout-b-track-error:
checkout: improve error messages for -b with extra argument
checkout: add tests for -b and --track
Convert submodule subcommand 'set-branch' to a builtin and call it via
'git-submodule.sh'.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Helped-by: Denton Liu <liu.denton@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ls-remote may or may not operate within a repository, and as such will
not have been initialized with the repository's hash algorithm. Even if
it were, the remote side could be using a different algorithm and we
would still want to display those refs properly. Find the hash
algorithm used by the remote side by querying the transport object and
set our hash algorithm accordingly.
Without this change, if the remote side is using SHA-256, we truncate
the refs to 40 hex characters, since that's the length of the default
hash algorithm (SHA-1).
Note that technically this is not a correct setting of the repository
hash algorithm since, if we are in a repository, it might be one of a
different hash algorithm from the remote side. However, our current
code paths don't handle multiple algorithms and won't for some time, so
this is the best we can do. We rely on the fact that ls-remote never
modifies the current repository, which is a reasonable assumption to
make.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
show-index is capable of reading any possible index file whether or not
the index is inside a repository. However, because our index files lack
metadata about the hash algorithm in use, it's not possible to
autodetect the algorithm that a particular index file is using.
In order to allow us to read index files of any algorithm, let's set up
the .git directory gently so that we default to the algorithm for the
current repository, and add an --object-format option to allow users to
override this setting and continue to run show-index outside of a
repository altogether. Let's also document this new option so that
people can find it and use it.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both v2 pack index files and the v3 format specified as part of the
NewHash work have similar data starting at the CRC table. Much of the
existing code wants to read either this table or the offset entries
following it, and in doing so computes the offset each time.
In order to share as much code between v2 and v3, compute the offset of
the CRC table and store it when the pack is opened. Use this value to
compute offsets to not only the CRC table, but to the offset entries
beyond it.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When performing a clone, we don't know what hash algorithm the other end
will support. Currently, we don't support fetching data belonging to a
different algorithm, so we must know what algorithm the remote side is
using in order to properly initialize the repository. We can know that
only after fetching the refs, so if the remote side has any references,
use that information to reinitialize the repository with the correct
hash algorithm information.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Detect when the server doesn't support our hash algorithm and abort.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Advertise the current hash algorithm in use by using the object-format
capability as part of the ref advertisement.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, remote-curl acts as a proxy and blindly forwards packets
between an HTTP server and fetch-pack. In the case of a stateless RPC
connection where the connection is terminated before the transaction is
complete, remote-curl will blindly forward the packets before waiting on
more input from fetch-pack. Meanwhile, fetch-pack will read the
transaction and continue reading, expecting more input to continue the
transaction. This results in a deadlock between the two processes.
This can be seen in the following command which does not terminate:
$ git -c protocol.version=2 clone https://github.com/git/git.git --shallow-since=20151012
Cloning into 'git'...
whereas the v1 version does terminate as expected:
$ git -c protocol.version=1 clone https://github.com/git/git.git --shallow-since=20151012
Cloning into 'git'...
fatal: the remote end hung up unexpectedly
Instead of blindly forwarding packets, make remote-curl insert a
response end packet after proxying the responses from the remote server
when using stateless_connect(). On the RPC client side, ensure that each
response ends as described.
A separate control packet is chosen because we need to be able to
differentiate between what the remote server sends and remote-curl's
control packets. By ensuring in the remote-curl code that a server
cannot send response end packets, we prevent a malicious server from
being able to perform a denial of service attack in which they spoof a
response end packet and cause the described deadlock to happen.
Reported-by: Force Charlie <charlieio@outlook.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we try to create a branch "foo" based on "origin/master" and give
git commit -b an extra unsupported argument "bar", it confusingly
reports:
$ git checkout -b foo origin/master bar
fatal: 'bar' is not a commit and a branch 'foo' cannot be created from it
$ git checkout --track -b foo origin/master bar
fatal: 'bar' is not a commit and a branch 'foo' cannot be created from it
That's wrong, because it very well understands that "origin/master" is
supposed to be the start point for the new branch and not "bar". Check
if we got a commit and show more fitting messages in that case instead:
$ git checkout -b foo origin/master bar
fatal: Cannot update paths and switch to branch 'foo' at the same time.
$ git checkout --track -b foo origin/master bar
fatal: '--track' cannot be used with updating paths
Original-patch-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
06f5608c14 (bisect--helper: `bisect_start` shell function partially in C,
2019-01-02) adds a lax parser for `git bisect start` which could result
in a segfault under a bad syntax call for start with custom terms.
Detect if there are enough arguments left in the command line to use for
--term-{old,good,new,bad} and abort with the same syntax error the original
implementation will show if not.
While at it, remove an unnecessary (and incomplete) check for unknown
arguments and make sure to add a test to avoid regressions.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Acked-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we call init_checkout_metadata in reset_tree, we want to pass the
object ID of the commit in question so that it can be passed to filters,
or if there is no commit, the tree. We anticipated this latter case,
which can occur elsewhere in the checkout code, but it cannot occur
here. The only case in which we do not have a commit object is when
invoking git switch with --orphan. Moreover, we can only hit this code
path without a commit object additionally with either --force or
--discard-changes.
In such a case, there is no point initializing the checkout metadata
with a commit or tree because (a) there is no commit, only the empty
tree, and (b) we will never use the data, since no files will be smudged
when checking out a branch with no files. Pass the all-zeros object ID
in this case, since we just need some value which is a valid pointer.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The multi-pack-index was added to the data verified by git-fsck in
ea5ae6c3 "fsck: verify multi-pack-index". This implementation was
based on the implementation for verifying the commit-graph, and a
copy-paste error kept the ERROR_COMMIT_GRAPH flag as the bit set
when an error appears in the multi-pack-index.
Add a new flag, ERROR_MULTI_PACK_INDEX, and use that instead.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For a merge with a single strategy, the result of evaluate_result() is
effectively not used and therefore is not needed, so avoid altogether.
On Windows, this optimization can halve the time required to perform a
recursive merge of a single commit with the LLVM repo.
Signed-off-by: Andrew Ng <andrew.ng@sony.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 7c5c9b9c57 (commit-graph: error out on invalid commit oids in
'write --stdin-commits', 2019-08-05), the commit-graph builtin dies on
receiving non-commit OIDs as input to '--stdin-commits'.
This behavior can be cumbersome to work around in, say, the case of
piping 'git for-each-ref' to 'git commit-graph write --stdin-commits' if
the caller does not want to cull out non-commits themselves. In this
situation, it would be ideal if 'git commit-graph write' wrote the graph
containing the inputs that did pertain to commits, and silently ignored
the remainder of the input.
Some options have been proposed to the effect of '--[no-]check-oids'
which would allow callers to have the commit-graph builtin do just that.
After some discussion, it is difficult to imagine a caller who wouldn't
want to pass '--no-check-oids', suggesting that we should get rid of the
behavior of complaining about non-commit inputs altogether.
If callers do wish to retain this behavior, they can easily work around
this change by doing the following:
git for-each-ref --format='%(objectname) %(objecttype) %(*objecttype)' |
awk '
!/commit/ { print "not-a-commit:"$1 }
/commit/ { print $1 }
' |
git commit-graph write --stdin-commits
To make it so that valid OIDs that refer to non-existent objects are
indeed an error after loosening the error handling, perform an extra
lookup to make sure that object indeed exists before sending it to the
commit-graph internals.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When given a list of commits, the commit-graph machinery calls
'lookup_commit_reference_gently()' on each element in the set and treats
the resulting set of OIDs as the base over which to close for
reachability.
In an earlier collection of commits, the 'git commit-graph write
--reachable' case made the inner-most call to
'lookup_commit_reference_gently()' by peeling references before they
were passed over to the commit-graph internals.
Do the analog for 'git commit-graph write --stdin-commits' by calling
'lookup_commit_reference_gently()' outside of the commit-graph
machinery, making the inner-most call a noop.
Since this may incur additional processing time, surround
'read_one_commit' with a progress meter to provide output to the caller.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With either '--stdin-commits' or '--stdin-packs', the commit-graph
builtin will read line-delimited input, and interpret it either as a
series of commit OIDs, or pack names.
In a subsequent commit, we will begin handling '--stdin-commits'
differently by processing each line as it comes in, instead of in one
shot at the end. To make adequate room for this additional logic, split
the '--stdin-commits' case from '--stdin-packs' by only storing the
input when '--stdin-packs' is given.
In the case of '--stdin-commits', feed each line to a new
'read_one_commit' helper, which (for now) will merely call
'parse_oid_hex'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach "am", "commit", "merge" and "rebase", when they are run with
the "--quiet" option, to pass "--quiet" down to "gc --auto".
* jc/auto-gc-quiet:
auto-gc: pass --quiet down from am, commit, merge and rebase
auto-gc: extract a reusable helper from "git fetch"
The <stdlib.h> header on NetBSD brings in its own definition of
hmac() function (eek), which conflicts with our own and unrelated
function with the same name. Our function has been renamed to work
around the issue.
* cb/avoid-colliding-with-netbsd-hmac:
builtin/receive-pack: avoid generic function name hmac()
"git restore --staged --worktree" now defaults to take the contents
out of "HEAD", instead of erring out.
* es/restore-staged-from-head-by-default:
restore: default to HEAD when combining --staged and --worktree
"git branch" and other "for-each-ref" variants accepted multiple
--sort=<key> options in the increasing order of precedence, but it
had a few breakages around "--ignore-case" handling, and tie-breaking
with the refname, which have been fixed.
* jk/for-each-ref-multi-key-sort-fix:
ref-filter: apply fallback refname sort only after all user sorts
ref-filter: apply --ignore-case to all sorting keys
In error messages that "git switch" mentions its option to create a
new branch, "-b/-B" options were shown, where "-c/-C" options
should be, which has been corrected.
* dl/switch-c-option-in-error-message:
switch: fix errors and comments related to -c and -C
Convert submodule subcommand 'set-url' to a builtin. Port 'set-url' to
'submodule--helper.c' and call the latter via 'git-submodule.sh'.
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These commands take the --quiet option for their own operation, but
they forget to pass the option down when they invoke "git gc --auto"
internally.
Teach them to do so using the run_auto_gc() helper we added in the
previous step.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Back in 1991006c (fetch: convert argv_gc_auto to struct argv_array,
2014-08-16), we taught "git fetch --quiet" to pass the "--quiet"
option down to "gc --auto". This issue, however, is not limited to
"fetch":
$ git grep -e 'gc.*--auto' \*.c
finds hits in "am", "commit", "merge", and "rebase" and these
commands do not pass "--quiet" down to "gc --auto" when they
themselves are told to be quiet.
As a preparatory step, let's introduce a helper function
run_auto_gc(), that the caller can pass a boolean "quiet",
and redo the fix to "git fetch" using the helper.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, files are restored from the index for --worktree, and from
HEAD for --staged. When --worktree and --staged are combined, --source
must be specified to disambiguate the restore source[1], thus making it
cumbersome to restore a file in both the worktree and the index.
However, HEAD is also a reasonable default for --worktree when combined
with --staged, so make it the default anytime --staged is used (whether
combined with --worktree or not).
[1]: Due to an oversight, the --source requirement, though documented,
is not actually enforced.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fabec2c5c3 (builtin/receive-pack: switch to use the_hash_algo, 2019-08-18)
renames hmac_sha1 to hmac, as it was updated to use the hash function used
by git (which won't be sha1 in the future).
hmac() is provided by NetBSD >= 8 libc and therefore conflicts as shown by :
builtin/receive-pack.c:421:13: error: conflicting types for 'hmac'
static void hmac(unsigned char *out,
^~~~
In file included from ./git-compat-util.h:172:0,
from ./builtin.h:4,
from builtin/receive-pack.c:1:
/usr/include/stdlib.h:305:10: note: previous declaration of 'hmac' was here
ssize_t hmac(const char *, const void *, size_t, const void *, size_t, void *,
^~~~
Rename it again to hmac_hash to reflect it will use the git's defined hash
function and avoid the conflict, while at it update a comment to better
describe the HMAC function that was used.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the ref-filter users (for-each-ref, branch, and tag) take an
--ignore-case option which makes filtering and sorting case-insensitive.
However, this option was applied only to the first element of the
ref_sorting list. So:
git for-each-ref --ignore-case --sort=refname
would do what you expect, but:
git for-each-ref --ignore-case --sort=refname --sort=taggername
would sort the primary key (taggername) case-insensitively, but sort the
refname case-sensitively. We have two options here:
- teach callers to set ignore_case on the whole list
- replace the ref_sorting list with a struct that contains both the
list of sorting keys, as well as options that apply to _all_
keys
I went with the first one here, as it gives more flexibility if we later
want to let the users set the flag per-key (presumably through some
special syntax when defining the key; for now it's all or nothing
through --ignore-case).
The new test covers this by sorting on both tagger and subject
case-insensitively, which should compare "a" and "A" identically, but
still sort them before "b" and "B". We'll break ties by sorting on the
refname to give ourselves a stable output (this is actually supposed to
be done automatically, but there's another bug which will be fixed in
the next commit).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Error and verbose trace messages from "git push" did not redact
credential material embedded in URLs.
* js/anonymise-push-url-in-errors:
push: anonymize URLs in error messages and warnings
The "bugreport" tool.
* es/bugreport:
bugreport: drop extraneous includes
bugreport: add compiler info
bugreport: add uname info
bugreport: gather git version and build info
bugreport: add tool to generate debugging info
help: move list_config_help to builtin/help
Incompatible options "--root" and "--fork-point" of "git rebase"
have been marked and documented as being incompatible.
* en/rebase-root-and-fork-point-are-incompatible:
rebase: display an error if --root and --fork-point are both provided
"git blame" learns to take advantage of the "changed-paths" Bloom
filter stored in the commit-graph file.
* ds/blame-on-bloom:
test-bloom: check that we have expected arguments
test-bloom: fix some whitespace issues
blame: drop unused parameter from maybe_changed_path
blame: use changed-path Bloom filters
tests: write commit-graph with Bloom filters
revision: complicated pathspecs disable filters
Introduce an extension to the commit-graph to make it efficient to
check for the paths that were modified at each commit using Bloom
filters.
* gs/commit-graph-path-filter:
bloom: ignore renames when computing changed paths
commit-graph: add GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS test flag
t4216: add end to end tests for git log with Bloom filters
revision.c: add trace2 stats around Bloom filter usage
revision.c: use Bloom filters to speed up path based revision walks
commit-graph: add --changed-paths option to write subcommand
commit-graph: reuse existing Bloom filters during write
commit-graph: write Bloom filters to commit graph file
commit-graph: examine commits by generation number
commit-graph: examine changed-path objects in pack order
commit-graph: compute Bloom filters for changed paths
diff: halt tree-diff early after max_changes
bloom.c: core Bloom filter implementation for changed paths.
bloom.c: introduce core Bloom filter constructs
bloom.c: add the murmur3 hash implementation
commit-graph: define and use MAX_NUM_CHUNKS
Fix in-core inconsistency after fetching into a shallow repository
that broke the code to write out commit-graph.
* tb/reset-shallow:
shallow.c: use '{commit,rollback}_shallow_file'
t5537: use test_write_lines and indented heredocs for readability
In previous patches, the functions 'commit_shallow_file' and
'rollback_shallow_file' were introduced to reset the shallowness
validity checks on a repository after potentially modifying
'.git/shallow'.
These functions can be made safer by wrapping the 'struct lockfile *' in
a new type, 'shallow_lock', so that they cannot be called with a raw
lock (and potentially misused by other code that happens to possess a
lockfile, but has nothing to do with shallowness).
This patch introduces that type as a thin wrapper around 'struct
lockfile', and updates the two aforementioned functions and their
callers to use it.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are many functions in commit.h that are more related to shallow
repositories than they are to any sort of generic commit machinery.
Likely this began when there were only a few shallow-related functions,
and commit.h seemed a reasonable enough place to put them.
But, now there are a good number of shallow-related functions, and
placing them all in 'commit.h' doesn't make sense.
This patch extracts a 'shallow.h', which takes all of the declarations
from 'commit.h' for functions which already exist in 'shallow.c'. We
will bring the remaining shallow-related functions defined in 'commit.c'
in a subsequent patch.
For now, move only the ones that already are implemented in 'shallow.c',
and update the necessary includes.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In d787d311db (checkout: split part of it to new command 'switch',
2019-03-29), the `git switch` command was created by extracting the
common functionality of cmd_checkout() in checkout_main(). However, in
b7b5fce270 (switch: better names for -b and -B, 2019-03-29), the branch
creation and force creation options for 'switch' were changed to -c and
-C, respectively. As a result of this, error messages and comments that
previously referred to `-b` and `-B` became invalid for `git switch`.
For error messages that refer to `-b` and `-B`, use a format string
instead so that `-c` and `-C` can be printed when `git switch` is
invoked.
Reported-by: Robert Simpson
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git update-ref --stdin" learned a handful of new verbs to let the
user control ref update transactions more explicitly, which helps
as an ingredient to implement two-phase commit-style atomic
ref-updates across multiple repositories.
* ps/transactional-update-ref-stdin:
update-ref: implement interactive transaction handling
update-ref: read commands in a line-wise fashion
update-ref: move transaction handling into `update_refs_stdin()`
update-ref: pass end pointer instead of strbuf
update-ref: drop unused argument for `parse_refname`
update-ref: organize commands in an array
strbuf: provide function to append whole lines
git-update-ref.txt: add missing word
refs: fix segfault when aborting empty transaction
The directory traversal code had redundant recursive calls which
made its performance characteristics exponential with respect to
the depth of the tree, which was corrected.
* en/fill-directory-exponential:
completion: fix 'git add' on paths under an untracked directory
Fix error-prone fill_directory() API; make it only return matches
dir: replace double pathspec matching with single in treat_directory()
dir: include DIR_KEEP_UNTRACKED_CONTENTS handling in treat_directory()
dir: replace exponential algorithm with a linear one
dir: refactor treat_directory to clarify control flow
dir: fix confusion based on variable tense
dir: fix broken comment
dir: consolidate treat_path() and treat_one_path()
dir: fix simple typo in comment
t3000: add more testcases testing a variety of ls-files issues
t7063: more thorough status checking
"sparse-checkout" UI improvements.
* en/sparse-checkout:
sparse-checkout: provide a new reapply subcommand
unpack-trees: failure to set SKIP_WORKTREE bits always just a warning
unpack-trees: provide warnings on sparse updates for unmerged paths too
unpack-trees: make sparse path messages sound like warnings
unpack-trees: split display_error_msgs() into two
unpack-trees: rename ERROR_* fields meant for warnings to WARNING_*
unpack-trees: move ERROR_WOULD_LOSE_SUBMODULE earlier
sparse-checkout: use improved unpack_trees porcelain messages
sparse-checkout: use new update_sparsity() function
unpack-trees: add a new update_sparsity() function
unpack-trees: pull sparse-checkout pattern reading into a new function
unpack-trees: do not mark a dirty path with SKIP_WORKTREE
unpack-trees: allow check_updates() to work on a different index
t1091: make some tests a little more defensive against failures
unpack-trees: simplify pattern_list freeing
unpack-trees: simplify verify_absent_sparse()
unpack-trees: remove unused error type
unpack-trees: fix minor typo in comment
The stash entry created by "git rebase --autosquash" to keep the
initial dirty state were discarded by mistake upon "git rebase
--quit", which has been corrected.
* dl/merge-autostash-rebase-quit-fix:
rebase: save autostash entry into stash reflog on --quit
"git grep" did not quote a path with unusual character like other
commands (like "git diff", "git status") do, but did quote when run
from a subdirectory, both of which has been corrected.
* mt/grep-cquote-path:
grep: follow conventions for printing paths w/ unusual chars
The "--decorate-refs" and "--decorate-refs-exclude" options "git
log" takes have learned a companion configuration variable
log.excludeDecoration that sits at the lowest priority in the
family.
* ds/log-exclude-decoration-config:
log: add log.excludeDecoration config option
log-tree: make ref_filter_match() a helper method
"git diff-tree --pretty --notes" used to hit an assertion failure,
as it forgot to initialize the notes subsystem.
* tb/diff-tree-with-notes:
diff-tree.c: load notes machinery when required
Allowing the user to split a patch hunk while "git stash -p" does
not work well; a band-aid has been added to make this (partially)
work better.
* js/stash-p-fix:
stash -p: (partially) fix bug concerning split hunks
t3904: fix incorrect demonstration of a bug
Code in builtin/*, i.e. those can only be called from within
built-in subcommands, that implements bulk of a couple of
subcommands have been moved to libgit.a so that they could be used
by others.
* dl/libify-a-few:
Lib-ify prune-packed
Lib-ify fmt-merge-msg
"git diff" in a partial clone learned to avoid lazy loading blob
objects in more casese when they are not needed.
* jt/avoid-prefetch-when-able-in-diff:
diff: restrict when prefetching occurs
diff: refactor object read
diff: make diff_populate_filespec_options struct
promisor-remote: accept 0 as oid_nr in function
"git commit-graph write --expire-time=<timestamp>" did not use the
given timestamp correctly, which has been corrected.
* ds/commit-graph-expiry-fix:
commit-graph: fix buggy --expire-time option
"git log" learns "--[no-]mailmap" as a synonym to "--[no-]use-mailmap"
* jc/log-no-mailmap:
log: give --[no-]use-mailmap a more sensible synonym --[no-]mailmap
clone: reorder --recursive/--recurse-submodules
parse-options: teach "git cmd -h" to show alias as alias
The config API made mixed uses of int and size_t types to represent
length of various pieces of text it parsed, which has been updated
to use the correct type (i.e. size_t) throughout.
* jk/config-use-size-t:
config: reject parsing of files over INT_MAX
config: use size_t to store parsed variable baselen
git_config_parse_key(): return baselen as size_t
config: drop useless length variable in write_pair()
parse_config_key(): return subsection len as size_t
remote: drop auto-strlen behavior of make_branch() and make_rewrite()
Validation of push certificate has been made more robust against
timing attacks.
* bc/constant-memequal:
receive-pack: compilation fix
builtin/receive-pack: use constant-time comparison for HMAC value
Just like 47abd85ba0 (fetch: Strip usernames from url's before storing
them, 2009-04-17) and later 882d49ca5c (push: anonymize URL in status
output, 2016-07-13), and even later c1284b21f2 (curl: anonymize URLs
in error messages and warnings, 2019-03-04) this change anonymizes URLs
(read: strips them of user names and especially passwords) in
user-facing error messages and warnings.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a03b55530a (merge: teach --autostash option, 2020-04-07), the
--autostash option was introduced for `git merge`. Notably, when
`git merge --quit` is run with an autostash entry present, it is saved
into the stash reflog. This is contrasted with the current behaviour of
`git rebase --quit` where the autostash entry is simply just dropped out
of existence.
Adopt the behaviour of `git merge --quit` in `git rebase --quit` and
save the autostash entry into the stash reflog instead of just deleting
it.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the usage for `git push` is shown, it includes the following
lines
--recurse-submodules[=(check|on-demand|no)]
control recursive pushing of submodules
which seem to indicate that the argument for --recurse-submodules is
optional. However, we cannot actually run that optiion without an
argument:
$ git push --recurse-submodules
fatal: recurse-submodules missing parameter
Unset PARSE_OPT_OPTARG so that it is clear that this option requires an
argument. Since the parse-options machinery guarantees that an argument
is present now, assume that `arg` is set in the else of
option_parse_recurse_submodules().
Reported-by: Andrew White <andrew.white@audinate.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the codebase, there are many options which use OPTION_CALLBACK in a
plain ol' struct definition. However, we have the OPT_CALLBACK and
OPT_CALLBACK_F macros which are meant to abstract these plain struct
definitions away. These macros are useful as they semantically signal to
developers that these are just normal callback option with nothing fancy
happening.
Replace plain struct definitions of OPTION_CALLBACK with OPT_CALLBACK or
OPT_CALLBACK_F where applicable. The heavy lifting was done using the
following (disgusting) shell script:
#!/bin/sh
do_replacement () {
tr '\n' '\r' |
sed -e 's/{\s*OPTION_CALLBACK,\s*\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\s*0,\(\s*[^[:space:]}]*\)\s*}/OPT_CALLBACK(\1,\2,\3,\4,\5,\6)/g' |
sed -e 's/{\s*OPTION_CALLBACK,\s*\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\([^,]*\),\(\s*[^[:space:]}]*\)\s*}/OPT_CALLBACK_F(\1,\2,\3,\4,\5,\6,\7)/g' |
tr '\r' '\n'
}
for f in $(git ls-files \*.c)
do
do_replacement <"$f" >"$f.tmp"
mv "$f.tmp" "$f"
done
The result was manually inspected and then reformatted to match the
style of the surrounding code. Finally, using
`git grep OPTION_CALLBACK \*.c`, leftover results which were not handled
by the script were manually transformed.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
--root implies we want to rebase all commits since the beginning of
history. --fork-point means we want to use the reflog of the specified
upstream to find the best common ancestor between <upstream> and
<branch> and only rebase commits since that common ancestor. These
options are clearly contradictory, so throw an error (instead of
segfaulting on a NULL pointer) if both are specified.
Reported-by: Alexander Berg <alexander.berg@atos.net>
Documentation-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In bd0b42aed3 (fetch-pack: do not take shallow lock unnecessarily,
2019-01-10), the author noted that 'is_repository_shallow' produces
visible side-effect(s) by setting 'is_shallow' and 'shallow_stat'.
This is a problem for e.g., fetching with '--update-shallow' in a
shallow repository with 'fetch.writeCommitGraph' enabled, since the
update to '.git/shallow' will cause Git to think that the repository
isn't shallow when it is, thereby circumventing the commit-graph
compatibility check.
This causes problems in shallow repositories with at least shallow refs
that have at least one ancestor (since the client won't have those
objects, and therefore can't take the reachability closure over commits
when writing a commit-graph).
Address this by introducing thin wrappers over 'commit_lock_file' and
'rollback_lock_file' for use specifically when the lock is held over
'.git/shallow'. These wrappers (appropriately called
'commit_shallow_file' and 'rollback_shallow_file') call into their
respective functions in 'lockfile.h', but additionally reset validity
checks used by the shallow machinery.
Replace each instance of 'commit_lock_file' and 'rollback_lock_file'
with 'commit_shallow_file' and 'rollback_shallow_file' when the lock
being held is over the '.git/shallow' file.
As a result, 'prune_shallow' can now only be called once (since
'check_shallow_file_for_update' will die after calling
'reset_repository_shallow'). But, this is OK since we only call
'prune_shallow' at most once per process.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow "git rebase" to reapply all local commits, even if the may be
already in the upstream, without checking first.
* jt/rebase-allow-duplicate:
rebase --merge: optionally skip upstreamed commits
"git rebase" (again) learns to honor "--no-keep-empty", which lets
the user to discard commits that are empty from the beginning (as
opposed to the ones that become empty because of rebasing). The
interactive rebase also marks commits that are empty in the todo.
* en/rebase-no-keep-empty:
rebase: fix an incompatible-options error message
rebase: reinstate --no-keep-empty
rebase -i: mark commits that begin empty in todo editor
The interactive input from various codepaths are consolidated and
any prompt possibly issued earlier are fflush()ed before we read.
* js/flush-prompt-before-interative-input:
interactive: explicitly `fflush` stdout before expecting input
interactive: refactor code asking the user for interactive input