There are a few reasons it's not a good idea to make
.gitmodules a symlink, including:
1. It won't be portable to systems without symlinks.
2. It may behave inconsistently, since Git may look at
this file in the index or a tree without bothering to
resolve any symbolic links. We don't do this _yet_, but
the config infrastructure is there and it's planned for
the future.
With some clever code, we could make (2) work. And some
people may not care about (1) if they only work on one
platform. But there are a few security reasons to simply
disallow it:
a. A symlinked .gitmodules file may circumvent any fsck
checks of the content.
b. Git may read and write from the on-disk file without
sanity checking the symlink target. So for example, if
you link ".gitmodules" to "../oops" and run "git
submodule add", we'll write to the file "oops" outside
the repository.
Again, both of those are problems that _could_ be solved
with sufficient code, but given the complications in (1) and
(2), we're better off just outlawing it explicitly.
Note the slightly tricky call to verify_path() in
update-index's update_one(). There we may not have a mode if
we're not updating from the filesystem (e.g., we might just
be removing the file). Passing "0" as the mode there works
fine; since it's not a symlink, we'll just skip the extra
checks.
Signed-off-by: Jeff King <peff@peff.net>
In the update_one(), we check verify_path() on the proposed
path before doing anything else. In preparation for having
verify_path() look at the file mode, let's stat the file
earlier, so we can check the mode accurately.
This is made a bit trickier by the fact that this function
only does an lstat in a few code paths (the ones that flow
down through process_path()). So we can speculatively do the
lstat() here and pass the results down, and just use a dummy
mode for cases where we won't actually be updating the index
from the filesystem.
Signed-off-by: Jeff King <peff@peff.net>
Submodule "names" come from the untrusted .gitmodules file,
but we blindly append them to $GIT_DIR/modules to create our
on-disk repo paths. This means you can do bad things by
putting "../" into the name (among other things).
Let's sanity-check these names to avoid building a path that
can be exploited. There are two main decisions:
1. What should the allowed syntax be?
It's tempting to reuse verify_path(), since submodule
names typically come from in-repo paths. But there are
two reasons not to:
a. It's technically more strict than what we need, as
we really care only about breaking out of the
$GIT_DIR/modules/ hierarchy. E.g., having a
submodule named "foo/.git" isn't actually
dangerous, and it's possible that somebody has
manually given such a funny name.
b. Since we'll eventually use this checking logic in
fsck to prevent downstream repositories, it should
be consistent across platforms. Because
verify_path() relies on is_dir_sep(), it wouldn't
block "foo\..\bar" on a non-Windows machine.
2. Where should we enforce it? These days most of the
.gitmodules reads go through submodule-config.c, so
I've put it there in the reading step. That should
cover all of the C code.
We also construct the name for "git submodule add"
inside the git-submodule.sh script. This is probably
not a big deal for security since the name is coming
from the user anyway, but it would be polite to remind
them if the name they pick is invalid (and we need to
expose the name-checker to the shell anyway for our
test scripts).
This patch issues a warning when reading .gitmodules
and just ignores the related config entry completely.
This will generally end up producing a sensible error,
as it works the same as a .gitmodules file which is
missing a submodule entry (so "submodule update" will
barf, but "git clone --recurse-submodules" will print
an error but not abort the clone.
There is one minor oddity, which is that we print the
warning once per malformed config key (since that's how
the config subsystem gives us the entries). So in the
new test, for example, the user would see three
warnings. That's OK, since the intent is that this case
should never come up outside of malicious repositories
(and then it might even benefit the user to see the
message multiple times).
Credit for finding this vulnerability and the proof of
concept from which the test script was adapted goes to
Etienne Stalmans.
Signed-off-by: Jeff King <peff@peff.net>
Now that we use generation numbers from the commit-graph, we must
ensure that all commits that exist in the commit-graph are loaded
from that file instead of from the object database. Since the
commit-graph file is only checked if core.commitGraph is true, we
must check the default config before we load any commits.
In the merge builtin, the config was checked after loading the HEAD
commit. This was due to the use of the global 'branch' when checking
merge-specific config settings.
Move the config load to be between the initialization of 'branch' and
the commit lookup.
Without this change, a fast-forward merge would hit a BUG("bad
generation skip") statement in commit.c during paint_down_to_common().
This is because the HEAD commit would be loaded with "infinite"
generation but then reached by commits with "finite" generation
numbers.
Add a test to t5318-commit-graph.sh that exercises this code path to
prevent a regression.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add --dissociate option to add and update commands, both clone helper commands
that already have the --reference option --dissociate pairs with.
Signed-off-by: Casey Fitzpatrick <kcghost@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strings allocated in `setup_unpack_trees_porcelain()` are never
freed. Provide a function `clear_unpack_trees_porcelain()` to do so and
call it where we use `setup_unpack_trees_porcelain()`. The only
non-trivial user is `unpack_trees_start()`, where we should place the
new call in `unpack_trees_finish()`.
We keep the string pointers in an array, mixing pointers to static
memory and memory that we allocate on the heap. We also keep several
copies of the individual pointers. So we need to make sure that we do
not free what we must not free and that we do not double-free. Let a
separate argv_array take ownership of all the strings we create so that
we can easily free them.
Zero the whole array of string pointers to make sure that we do not
leave any dangling pointers.
Note that we only take responsibility for the memory allocated in
`setup_unpack_trees_porcelain()` and not any other members of the
`struct unpack_trees_options`.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is another candidate for commit-slab. Keep Junio's observation in
code so we can search it later on when somebody wants to improve the
code.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of relying on commit->util to store the source string, let the
user provide a commit-slab to store the source strings in.
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The help command currently hard codes the list of guides and their
summary in C. Let's move this list to command-list.txt. This lets us
extract summary lines from Documentation/git*.txt. This also
potentially lets us list guides in git.txt, but I'll leave that for
now.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This lists all recognized commands [1] by category. The group order
follows closely git.txt.
[1] We may actually show commands that are not built (e.g. if you set
NO_PERL you don't have git-instaweb but it's still listed here). I
ignore the problem because on Linux a git package could be split
anyway. The "git-core" package may not contain git-instaweb even if
it's built because it may end up in a separate package. We can't know
anyway.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you run "config --blob" outside of a repository, then we
eventually try to resolve the blob name and hit a BUG().
Let's catch this earlier and provide a useful message.
Note that we could also catch this much lower in the stack,
in git_config_from_blob_ref(). That might cover other
callsites, too, but it's unclear whether those ones would
actually be bugs or not. So let's leave the low-level
functions to assume the caller knows what it's doing (and
BUG() if it turns out it doesn't).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Migrate all git_path_* functions that are defined in path.c to take a
repository argument. Unlike other patches in this series, do not use the
#define trick, as we rewrite the whole function, which is rather small.
This doesn't migrate all the functions, as other builtins have their own
local path functions defined using GIT_PATH_FUNC. So keep that macro
around to serve the other locations.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of is_repository_shallow
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of register_shallow
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach fetch to generate ref-prefixes, to be used for server-side
filtering of the ref-advertisement, based on the configured fetch
refspec ('remote.<name>.fetch') when no user provided refspec exists.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using protocol v2 a client constructs a list of ref-prefixes which
are sent across the wire so that the server can do server-side filtering
of the ref-advertisement. The logic that does this exists for both
fetch and push (even though no push support for v2 currently exists yet)
and is roughly the same so lets consolidate this logic and make it
general enough that it can be used for both the push and fetch cases.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert 'match_push_refs()' to take a 'struct refspec' as a parameter
instead of an array of 'const char *'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove 'transprot_verify_remote_names()' because all callers have
migrated to using 'struct refspec' which performs the same checks in
'parse_refspec()'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert send-pack.c to store refspecs in a 'struct refspec' instead of
as an array of 'const char *'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert 'transport_push()' to take a 'struct refspec' as a
parameter instead of an array of strings which represent
refspecs.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the refspecs in builtin/push.c to be stored in a 'struct
refspec' instead of being stored in a list of 'struct refspec_item's.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the error checking for using the "--mirror", "--all", and "--tags"
options earlier and explicitly check for the presence of the flags
instead of checking for a side-effect of the flag.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert 'query_refspecs()' to take a 'struct refspec' as a parameter instead
of a list of 'struct refspec_item'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert 'apply_refspecs()' to take a 'struct refspec' as a parameter instead
of a list of 'struct refspec_item'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert 'get_stale_heads()' to take a 'struct refspec' as a parameter instead
of a list of 'struct refspec_item'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert 'prune_refs()' to take a 'struct refspec' as a parameter instead
of a list of 'struct refspec_item'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert 'get_ref_map()' to take a 'struct refspec' as a parameter
instead of a list of 'struct refspec_item'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert 'do_fetch()' to take a 'struct refspec' as a parameter instead
of a list of 'struct refspec_item'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the refmap in builtin/fetch.c to be stored in a
'struct refspec'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove 'add_prune_tags_to_fetch_refspec()' function and instead have the
only caller directly add the tag refspec using 'refspec_append()'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the set of fetch refspecs stored in 'struct remote' to use
'struct refspec'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the set of push refspecs stored in 'struct remote' to use
'struct refspec'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert fast-export to use 'struct refspec' instead of using a list of
refspec_item's.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert 'cmd_clone()' to use 'refspec_item_init()' instead of relying on
the old 'parse_fetch_refspec()' to initialize a single refspec item.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert 'get_tracking_branch()' to use 'refspec_item_init()' instead of
the old 'parse_fetch_refspec()' function.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for introducing an abstraction around a collection of
refspecs (much like how a 'struct pathspec' is a collection of 'struct
pathspec_item's) rename the existing 'struct refspec' to 'struct
refspec_item'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for performing a refactor on refspec related code, move
the refspec parsing logic into its own file.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of register_commit_graft to
be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This should make these functions easier to find and cache.h less
overwhelming to read.
In particular, this moves:
- read_object_file
- oid_object_info
- write_object_file
As a result, most of the codebase needs to #include object-store.h.
In this patch the #include is only added to files that would fail to
compile otherwise. It would be better to #include wherever
identifiers from the header are used. That can happen later
when we have better tooling for it.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Any other caller of 'repo_read_index' dies upon a negative return of
it, so grep should, too.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After performing a merge that has conflicts git status will, by default,
attempt to detect renames which causes many objects to be examined. In a
virtualized repo, those objects do not exist locally so the rename logic
triggers them to be fetched from the server. This results in the status call
taking hours to complete on very large repos vs seconds with this patch.
Add a new config status.renames setting to enable turning off rename
detection during status and commit. This setting will default to the value
of diff.renames.
Add a new config status.renamelimit setting to to enable bounding the time
spent finding out inexact renames during status and commit. This setting
will default to the value of diff.renamelimit.
Add --no-renames command line option to status that enables overriding the
config setting from the command line. Add --find-renames[=<n>] command line
option to status that enables detecting renames and optionally setting the
similarity index.
Reviewed-by: Elijah Newren <newren@gmail.com>
Original-Patch-by: Alejandro Pauly <alpauly@microsoft.com>
Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default we want to fill the whole screen if possible, but we do not
want to use up _all_ terminal columns because the last character is
going hit the border, push the cursor over and wrap. Keep it at
default value zero, which will make print_columns() set the width at
term_columns() - 1.
This affects the test in t7004 because effective column width before
was 40 but now 39 so we need to compensate it by one or the output at
39 columns has a different layout.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This aims to make git-submodule foreach a builtin. 'foreach' is ported to
the submodule--helper, and submodule--helper is called from
git-submodule.sh.
Helped-by: Brandon Williams <bmwill@google.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clang 6 reports the following warning, which is turned into an error in a
DEVELOPER build:
builtin/fast-export.c:162:28: error: performing pointer arithmetic on a null pointer has undefined behavior [-Werror,-Wnull-pointer-arithmetic]
return ((uint32_t *)NULL) + mark;
~~~~~~~~~~~~~~~~~~ ^
1 error generated.
The compiler is correct, and the error message speaks for itself. There
is no need for any undefined operation -- just cast mark to void * or
uint32_t after an intermediate cast to uintptr_t. That encodes the
integer value into a pointer and later decodes it as intended.
While at it remove an outdated comment -- intptr_t has been used since
ffe659f94d (parse-options: make some arguments optional, add callbacks),
committed in October 2007.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Placing `struct lock_file`s on the stack used to be a bad idea, because
the temp- and lockfile-machinery would keep a pointer into the struct.
But after 076aa2cbd (tempfile: auto-allocate tempfiles on heap,
2017-09-05), we can safely have lockfiles on the stack. (This applies
even if a user returns early, leaving a locked lock behind.)
Each of these `struct lock_file`s is used from within a single function.
Move them into the respective functions to make the scope clearer and
drop the staticness.
For good measure, I have inspected these sites and come to believe that
they always release the lock, with the possible exception of bailing out
using `die()` or `exit()` or by returning from a `cmd_foo()`.
As pointed out by Jeff King, it would be bad if someone held on to a
`struct lock_file *` for some reason. After some grepping, I agree with
his findings: no-one appears to be doing that.
After this commit, the remaining occurrences of "static struct
lock_file" are locks that are used from within different functions. That
is, they need to remain static. (Short of more intrusive changes like
passing around pointers to non-static locks.)
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Placing `struct lock_file`s on the stack used to be a bad idea, because
the temp- and lockfile-machinery would keep a pointer into the struct.
But after 076aa2cbd (tempfile: auto-allocate tempfiles on heap,
2017-09-05), we can safely have lockfiles on the stack. (This applies
even if a user returns early, leaving a locked lock behind.)
These `struct lock_file`s are local to their respective functions and we
can drop their staticness.
For good measure, I have inspected these sites and come to believe that
they always release the lock, with the possible exception of bailing out
using `die()` or `exit()` or by returning from a `cmd_foo()`.
As pointed out by Jeff King, it would be bad if someone held on to a
`struct lock_file *` for some reason. After some grepping, I agree with
his findings: no-one appears to be doing that.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Error messages from "git push" can be painted for more visibility.
* js/colored-push-errors:
config: document the settings to colorize push errors/hints
push: test to verify that push errors are colored
push: colorize errors
color: introduce support for colorizing stderr
"git gc --prune=nonsense" spent long time repacking and then
silently failed when underlying "git prune --expire=nonsense"
failed to parse its command line. This has been corrected.
* jc/parseopt-expiry-errors:
parseopt: handle malformed --expire arguments more nicely
gc: do not upcase error message shown with die()
"git fast-export" had a regression in v2.15.0 era where it skipped
some merge commits in certain cases, which has been corrected.
* ma/fast-export-skip-merge-fix:
fast-export: fix regression skipping some merge-commits
"git submodule status" did not check the symbolic revision name it
computed for the submodule HEAD is not the NULL, and threw it at
printf routines, which has been corrected.
* nd/submodule-status-fix:
submodule--helper: don't print null in 'submodule status'
"git ls-remote" learned an option to allow sorting its output based
on the refnames being shown.
* hn/sort-ls-remote:
ls-remote: create '--sort' option
"git config --get" learned the "--default" option, to help the
calling script. Building on top of the tb/config-type topic, the
"git config" learns "--type=color" type. Taken together, you can
do things like "git config --get foo.color --default blue" and get
the ANSI color sequence for the color given to foo.color variable,
or "blue" if the variable does not exist.
* tb/config-default:
builtin/config: introduce `color` type specifier
config.c: introduce 'git_config_color' to parse ANSI colors
builtin/config: introduce `--default`
The "git config" command uses separate options e.g. "--int",
"--bool", etc. to specify what type the caller wants the value to
be interpreted as. A new "--type=<typename>" option has been
introduced, which would make it cleaner to define new types.
* tb/config-type:
builtin/config.c: support `--type=<type>` as preferred alias for `--<type>`
builtin/config.c: treat type specifiers singularly
"git worktree remove" learned that "-f" is a shorthand for
"--force" option, just like for "git worktree add".
* sb/worktree-remove-opt-force:
worktree: accept -f as short for --force for removal
The effort to pass the repository in-core structure throughout the
API continues. This round deals with the code that implements the
refs/replace/ mechanism.
* sb/object-store-replace:
replace-object: allow lookup_replace_object to handle arbitrary repositories
replace-object: allow do_lookup_replace_object to handle arbitrary repositories
replace-object: allow prepare_replace_object to handle arbitrary repositories
refs: allow for_each_replace_ref to handle arbitrary repositories
refs: store the main ref store inside the repository struct
replace-object: add repository argument to lookup_replace_object
replace-object: add repository argument to do_lookup_replace_object
replace-object: add repository argument to prepare_replace_object
refs: add repository argument to for_each_replace_ref
refs: add repository argument to get_main_ref_store
replace-object: check_replace_refs is safe in multi repo environment
replace-object: eliminate replace objects prepared flag
object-store: move lookup_replace_object to replace-object.h
replace-object: move replace_map to object store
replace_object: use oidmap
Precompute and store information necessary for ancestry traversal
in a separate file to optimize graph walking.
* ds/commit-graph:
commit-graph: implement "--append" option
commit-graph: build graph from starting commits
commit-graph: read only from specific pack-indexes
commit: integrate commit graph with commit parsing
commit-graph: close under reachability
commit-graph: add core.commitGraph setting
commit-graph: implement git commit-graph read
commit-graph: implement git-commit-graph write
commit-graph: implement write_commit_graph()
commit-graph: create git-commit-graph builtin
graph: add commit graph design document
commit-graph: add format document
csum-file: refactor finalize_hashfile() method
csum-file: rename hashclose() to finalize_hashfile()
Code restructuring, in preparation for further work.
* ot/libify-get-ref-atom-value:
ref-filter: libify get_ref_atom_value()
ref-filter: add return value to parsers
ref-filter: change parsing function error handling
ref-filter: add return value && strbuf to handlers
ref-filter: start adding strbufs with errors
ref-filter: add shortcut to work with strbufs
Moving a submodule that itself has submodule in it with "git mv"
forgot to make necessary adjustment to the nested sub-submodules;
now the codepath learned to recurse into the submodules.
* sb/submodule-move-nested:
submodule: fixup nested submodules after moving the submodule
submodule-config: remove submodule_from_cache
submodule-config: add repository argument to submodule_from_{name, path}
submodule-config: allow submodule_free to handle arbitrary repositories
grep: remove "repo" arg from non-supporting funcs
submodule.h: drop declaration of connect_work_tree_and_git_dir
The beginning of the next-gen transfer protocol.
* bw/protocol-v2: (35 commits)
remote-curl: don't request v2 when pushing
remote-curl: implement stateless-connect command
http: eliminate "# service" line when using protocol v2
http: don't always add Git-Protocol header
http: allow providing extra headers for http requests
remote-curl: store the protocol version the server responded with
remote-curl: create copy of the service name
pkt-line: add packet_buf_write_len function
transport-helper: introduce stateless-connect
transport-helper: refactor process_connect_service
transport-helper: remove name parameter
connect: don't request v2 when pushing
connect: refactor git_connect to only get the protocol version once
fetch-pack: support shallow requests
fetch-pack: perform a fetch using v2
upload-pack: introduce fetch server command
push: pass ref prefixes when pushing
fetch: pass ref prefixes when fetching
ls-remote: pass ref prefixes when requesting a remote's refs
transport: convert transport_get_remote_refs to take a list of ref prefixes
...
In d8193743e0 (usage.c: add BUG() function, 2017-05-12), a new macro
was introduced to use for reporting bugs instead of die(). It was then
subsequently used to convert one single caller in 588a538ae5
(setup_git_env: convert die("BUG") to BUG(), 2017-05-12).
The cover letter of the patch series containing this patch
(cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not
terribly clear why only one call site was converted, or what the plan
is for other, similar calls to die() to report bugs.
Let's just convert all remaining ones in one fell swoop.
This trick was performed by this invocation:
sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These options are added in [1] [2] [3]. All these depend on running
rev-list internally which is normally true since they are always used
with "--all --objects" which implies --revs. But let's keep this
dependency explicit.
While at there, add documentation for them. These are mostly used
internally by git-repack. But it's still good to not chase down the
right commit message to know how they work.
[1] ca11b212eb (let pack-objects do the writing of unreachable objects
as loose objects - 2008-05-14)
[2] 08cdfb1337 (pack-objects --keep-unreachable - 2007-09-16)
[3] e26a8c4721 (repack: extend --keep-unreachable to loose objects -
2016-06-13)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you do something like
- git add .
- git status
- git commit
- git show (or git diff HEAD)
one would expect to have analogous output from git status and git show
(or similar diff-related programs). This is generally not the case, as
git status has hard coded values for diff related options.
With this commit the hard coded settings are dropped from the status
command in favour for values provided by git_diff_ui_config.
What follows are some remarks on the concrete options which were hard
coded in git status:
diffopt.detect_rename
Since the very beginning of git status in a3e870f2e2 ("Add "commit"
helper script", 2005-05-30), git status always used rename detection,
whereas with commands like show and log one had to activate it with a
command line option. After 5404c116aa ("diff: activate diff.renames by
default", 2016-02-25) the default behaves the same by coincidence, but
changing diff.renames to other values can break the consistency between
git status and other commands again. With this commit one control the
same default behaviour with diff.renames.
diffopt.rename_limit
Similarly one has the option diff.renamelimit to adjust this limit for
all commands but git status. With this commit git status will also honor
those.
diffopt.break_opt
Unlike the other two options this cannot be configured by a
configuration option yet. This commit will also change the default
behaviour to not use break rewrites. But as rename detection is most
likely on, this is dangerous to be activated anyway as one can see
here:
https://public-inbox.org/git/xmqqegqaahnh.fsf@gitster.dls.corp.google.com/
Signed-off-by: Eckhard S. Maaß <eckhard.s.maass@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the last use of EMPTY_TREE_SHA1_BIN to use a direct copy from
the_hash_algo->empty_tree to avoid a dependency on a given hash
algorithm.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert one use of EMPTY_TREE_SHA1_HEX to use empty_tree_oid_hex to
avoid a dependency on a given hash algorithm.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The read_empty and reset_hard functions are static and their callers
have already changed to use struct object_id, so convert them as well.
To avoid dependency on the hash algorithm in use, switch from using
EMPTY_TREE_SHA1_HEX to using empty_tree_oid_hex.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert several uses of EMPTY_TREE_SHA1_BIN to use the_hash_algo
and struct object_id instead.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the GIT_SHA1_RAWSZ and GIT_SHA1_HEXSZ macros instead of hard-coding
the constants 20 and 40. Switch one use of 20 with a format specifier
for a hex value to use the hex constant instead, as the original appears
to have been a typo.
At this point, avoid converting the hard-coded use of SHA-1 to use
the_hash_algo. SHA-1, even if not collision resistant, is secure in the
context in which it is used here, and the hash algorithm of the repo
need not match what is used here. When we adopt a new hash algorithm,
we can simply adopt the new algorithm wholesale here, as the nonce is
opaque and its length and validity are entirely controlled by the
server. Consequently, defer updating this code until that point.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code for reading certain pack v2 offsets had a hard-coded 5
representing the number of uint32_t words that we needed to skip over.
Specify this value in terms of a value from the_hash_algo.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert struct llist_item and the rest of the linked list code to use
struct object_id. Add a use of GIT_MAX_HEXSZ to avoid a dependency on a
hard-coded constant.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the base_sha1 member of struct split_index to use struct
object_id and rename it base_oid. Include cache.h to make the structure
visible.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using hard-coded instances of the constant 20, use
the_hash_algo to look up the correct constant.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using hard-coded instances of the constant 20, use
the_hash_algo to look up the correct constant.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the various functions for freshening objects and
has_loose_object_nonlocal to use struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert this function to take a pointer to struct object_id and rename
it has_object_pack for consistency with has_object_file.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When formatting a series of patches using --attach and --cover-letter,
the cover letter lacks the closing MIME boundary, violating RFC 2046.
Certain clients, such as Thunderbird, discard the message body in such a
case.
Since the cover letter is just one part and sending it as
multipart/mixed is not very useful, always emit it as text/plain,
avoiding the boundary problem altogether.
Reported-by: Patrick Hemmer <git@stormcloud9.net>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This option is intended to help with the transition away from the
now-deprecated graft file.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When converting all grafts in a graft file to replace refs, and one of
them happens to leave the original commit's parents unchanged, we do not
want to error out. Instead, we would like to issue a warning.
Prepare the create_graft() function for such a use case by adding a
`gentle` parameter. If set, we do not return an error when the replace ref
is unchanged, but a mere warning.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
File this away as yet another patch in the "libification" category.
As with all useful functions, in the next commit we want to use
create_graft() from a higher-level function where it would be
inconvenient if the called function simply die()s: if there is a
problem, we want to let the user know how to proceed, and the callee
simply has no way of knowing what to say.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently 'git worktree add <path>' creates a new branch named after the
basename of the path by default. If a branch with that name already
exists, the command refuses to do anything, unless the '--force' option
is given.
However we can do a little better than that, and check the branch out if
it is not checked out anywhere else. This will help users who just want
to check an existing branch out into a new worktree, and save a few
keystrokes.
As the current behaviour is to simply 'die()' when a branch with the name
of the basename of the path already exists, there are no backwards
compatibility worries here.
We will still 'die()' if the branch is checked out in another worktree,
unless the --force flag is passed.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Factor out a dwim_branch function, which takes care of the dwim'ery in
'git worktree add <path>'. It's not too much code currently, but we're
adding a new kind of dwim in a subsequent patch, at which point it makes
more sense to have it as a separate function.
Factor it out now to reduce the patch noise in the next patch.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently 'git worktree add' produces output like the following:
Preparing ../foo (identifier foo)
HEAD is now at 26da330922 <title>
The '../foo' is the path where the worktree is created, which the user
has just given on the command line. The identifier is an internal
implementation detail, which is not particularly relevant for the user
and indeed isn't mentioned explicitly anywhere in the man page.
Instead of this message, print a message that gives the user a bit more
detail of what exactly 'git worktree' is doing. There are various dwim
modes which perform some magic under the hood, which should be
helpful to users. Just from the output of the command it is not always
visible to users what exactly has happened.
Help the users a bit more by modifying the "Preparing ..." message and
adding some additional information of what 'git worktree add' did under
the hood, while not displaying the identifier anymore.
Currently there are several different cases:
- 'git worktree add -b ...' or 'git worktree add <path>', both of
which create a new branch, either through the user explicitly
requesting it, or through 'git worktree add' implicitly creating
it. This will end up with the following output:
Preparing worktree (new branch '<branch>')
HEAD is now at 26da330922 <title>
- 'git worktree add -B ...', which may either create a new branch if
the branch with the given name does not exist yet, or resets an
existing branch to the current HEAD, or the commit-ish given.
Depending on which action is taken, we'll end up with the following
output:
Preparing worktree (resetting branch '<branch>'; was at caa68db14)
HEAD is now at 26da330922 <title>
or:
Preparing worktree (new branch '<branch>')
HEAD is now at 26da330922 <title>
- 'git worktree add --detach' or 'git worktree add <path>
<commit-ish>', both of which create a new worktree with a detached
HEAD, for which we will print the following output:
Preparing worktree (detached HEAD 26da330922)
HEAD is now at 26da330922 <title>
- 'git worktree add <path> <local-branch>', which checks out the
branch and prints the following output:
Preparing worktree (checking out '<local-branch>')
HEAD is now at 47007d5 <title>
Additionally currently the "Preparing ..." line is printed to stderr,
while the "HEAD is now at ..." line is printed to stdout by 'git reset
--hard', which is used internally by 'git worktree add'. Fix this
inconsistency by printing the "Preparing ..." message to stdout as
well. As "Preparing ..." is not an error, stdout also seems like the
more appropriate output stream.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two members of 'struct add_opts', which are only used inside
the 'add()' function, but being part of 'struct add_opts' they are
needlessly also passed to the 'add_worktree' function.
Make them local to the 'add()' function to make it clearer where they
are used.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have the BUG() macro for that purpose.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is yet another patch to be filed under the keyword "libification".
There is one subtle change in behavior here, where a `git log` that has
been asked to show the mergetags would now stop reporting the mergetags
upon the first failure, whereas previously, it would have continued to the
next mergetag, if any.
In practice, that change should not matter, as it is 1) uncommon to
perform octopus merges using multiple tags as merge heads, and 2) when the
user asks to be shown those tags, they really should be there.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running `git rebase --rebase-merges` non-interactively with an
ancestor of HEAD as <upstream> (or leaving the todo list unmodified),
we would ideally recreate the exact same commits as before the rebase.
However, if there are commits in the commit range <upstream>.. that do not
have <upstream> as direct ancestor (i.e. if `git log <upstream>..` would
show commits that are omitted by `git log --ancestry-path <upstream>..`),
this is currently not the case: we would turn them into commits that have
<upstream> as direct ancestor.
Let's illustrate that with a diagram:
C
/ \
A - B - E - F
\ /
D
Currently, after running `git rebase -i --rebase-merges B`, the new branch
structure would be (pay particular attention to the commit `D`):
--- C' --
/ \
A - B ------ E' - F'
\ /
D'
This is not really preserving the branch topology from before! The
reason is that the commit `D` does not have `B` as ancestor, and
therefore it gets rebased onto `B`.
This is unintuitive behavior. Even worse, when recreating branch
structure, most use cases would appear to want cousins *not* to be
rebased onto the new base commit. For example, Git for Windows (the
heaviest user of the Git garden shears, which served as the blueprint
for --rebase-merges) frequently merges branches from `next` early, and
these branches certainly do *not* want to be rebased. In the example
above, the desired outcome would look like this:
--- C' --
/ \
A - B ------ E' - F'
\ /
-- D' --
Let's introduce the term "cousins" for such commits ("D" in the
example), and let's not rebase them by default. For hypothetical
use cases where cousins *do* need to be rebased, `git rebase
--rebase=merges=rebase-cousins` needs to be used.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to the `preserve` mode simply passing the `--preserve-merges`
option to the `rebase` command, the `merges` mode simply passes the
`--rebase-merges` option.
This will allow users to conveniently rebase non-trivial commit
topologies when pulling new commits, without flattening them.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sequencer just learned new commands intended to recreate branch
structure (similar in spirit to --preserve-merges, but with a
substantially less-broken design).
Let's allow the rebase--helper to generate todo lists making use of
these commands, triggered by the new --rebase-merges option. For a
commit topology like this (where the HEAD points to C):
- A - B - C
\ /
D
the generated todo list would look like this:
# branch D
pick 0123 A
label branch-point
pick 1234 D
label D
reset branch-point
pick 2345 B
merge -C 3456 D # C
To keep things simple, we first only implement support for merge commits
with exactly two parents, leaving support for octopus merges to a later
patch series.
All merge-rebasing todo lists start with a hard-coded `label onto` line.
This makes it convenient to refer later on to the revision onto which
everything is rebased, e.g. as starting point for branches other than
the very first one.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of packed_object_info to be
more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of oid_object_info
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow oid_object_info_extended callers
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename bunch of source files to more consistently use dashes
instead of underscores to connect words.
* sb/filenames-with-dashes:
replace_object.c: rename to use dash in file name
sha1_file.c: rename to use dash in file name
sha1_name.c: rename to use dash in file name
exec_cmd: rename to use dash in file name
unicode_width.h: rename to use dash in file name
write_or_die.c: rename to use dashes in file name
API clean-up aournd ref-filter code.
* jk/ref-array-push:
ref-filter: factor ref_array pushing into its own function
ref-filter: make ref_array_item allocation more consistent
ref-filter: use "struct object_id" consistently
"cd sub/dir && git commit ../path" ought to record the changes to
the file "sub/path", but this regressed long time ago.
* bw/commit-partial-from-subdirectory-fix:
commit: allow partial commits with relative paths
Teach fetch to optionally accept server options by specifying them on
the cmdline via '-o' or '--server-option'. These server options are
sent to the remote end when performing a fetch communicating using
protocol version 2.
If communicating using a protocol other than v2 the provided options are
ignored and not sent to the remote end.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach ls-remote to optionally accept server options by specifying them
on the cmdline via '-o' or '--server-option'. These server options are
sent to the remote end when querying for the remote end's refs using
protocol version 2.
If communicating using a protocol other than v2 the provided options are
ignored and not sent to the remote end.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* bw/protocol-v2: (35 commits)
remote-curl: don't request v2 when pushing
remote-curl: implement stateless-connect command
http: eliminate "# service" line when using protocol v2
http: don't always add Git-Protocol header
http: allow providing extra headers for http requests
remote-curl: store the protocol version the server responded with
remote-curl: create copy of the service name
pkt-line: add packet_buf_write_len function
transport-helper: introduce stateless-connect
transport-helper: refactor process_connect_service
transport-helper: remove name parameter
connect: don't request v2 when pushing
connect: refactor git_connect to only get the protocol version once
fetch-pack: support shallow requests
fetch-pack: perform a fetch using v2
upload-pack: introduce fetch server command
push: pass ref prefixes when pushing
fetch: pass ref prefixes when fetching
ls-remote: pass ref prefixes when requesting a remote's refs
transport: convert transport_get_remote_refs to take a list of ref prefixes
...
Add a config option that allows selecting the default color scheme for
blame. The command line still takes precedence over the configuration.
It is to be seen, how color.ui will integrate with blame coloring.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Choose a different color for dates and imitate a 'temperature cool down'
depending upon age.
Originally I had planned to have the temperature cool down dependent on
the age of the project or file for example, as that might scale better,
but that can be added on top of this commit, e.g. instead of giving a
date, you could imagine giving a percentage that would be the linearly
interpolated between now and the beginning of the file.
Similarly to the previous patch, this offers the command line option
'--color-by-age' to enable this mode and the config option
'color.blame.highlightrecent' to select colors. A later patch will offer
a config option to select the default mode.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using git-blame lots of lines contain redundant information, for
example in hunks that consist of multiple lines, the metadata (commit
name, author, date) are repeated. A reader may not be interested in those,
so offer an option to color the information that is repeated from the
previous line differently. Traditionally, we use CYAN for lines that
are less interesting than others (e.g. hunk header), so go with that.
The command line option '--color-lines' will trigger the coloring of
repeated lines, and the config option 'color.blame.colorLines' is
provided to select the color. Setting the config option doesn't imply
that repeated lines are colored. A later patch will introduce a config
to enable this mode by default.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is an attempt to resolve an issue I experience with people that are
new to Git -- especially colleagues in a team setting -- where they miss
that their push to a remote location failed because the failure and
success both return a block of white text.
An example is if I push something to a remote repository and then a
colleague attempts to push to the same remote repository and the push
fails because it requires them to pull first, but they don't notice
because a success and failure both return a block of white text. They
then continue about their business, thinking it has been successfully
pushed.
This patch colorizes the errors and hints (in red and yellow,
respectively) so whenever there is a failure when pushing to a remote
repository that fails, it is more noticeable.
[jes: fixed a couple bugs, added the color.{advice,push,transport}
settings, refactored to use want_color_stderr().]
Signed-off-by: Ryan Dammrose ryandammrose@gmail.com
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As of this commit, the canonical way to retreive an ANSI-compatible
color escape sequence from a configuration file is with the
`--get-color` action.
This is to allow Git to "fall back" on a default value for the color
should the given section not exist in the specified configuration(s).
With the addition of `--default`, this is no longer needed since:
$ git config --default red --type=color core.section
will be have exactly as:
$ git config --get-color core.section red
For consistency, let's introduce `--type=color` and encourage its use
with `--default` together over `--get-color` alone.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For some use cases, callers of the `git-config(1)` builtin would like to
fallback to default values when the variable asked for does not exist.
In addition, users would like to use existing type specifiers to ensure
that values are parsed correctly when they do exist in the
configuration.
For example, to fetch a value without a type specifier and fallback to
`$fallback`, the following is required:
$ git config core.foo || echo "$fallback"
This is fine for most values, but can be tricky for difficult-to-express
`$fallback`'s, like ANSI color codes.
This motivates `--get-color`, which is a one-off exception to the normal
type specifier rules wherein a user specifies both the configuration
variable and an optional fallback. Both are formatted according to their
type specifier, which eases the burden on the user to ensure that values
are correctly formatted.
This commit (and those following it in this series) aim to eventually
replace `--get-color` with a consistent alternative. By introducing
`--default`, we allow the `--get-color` action to be promoted to a
`--type=color` type specifier, retaining the "fallback" behavior via the
`--default` flag introduced in this commit.
For example, we aim to replace:
$ git config --get-color variable [default] [...]
with:
$ git config --default default --type=color variable [...]
Values filled by `--default` behave exactly as if they were present in
the affected configuration file; they will be parsed by type specifiers
without the knowledge that they are not themselves present in the
configuration.
Specifically, this means that the following will work:
$ git config --int --default 1M does.not.exist
1048576
In subsequent commits, we will offer `--type=color`, which (in
conjunction with `--default`) will be sufficient to replace
`--get-color`.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few commands that parse --expire=<time> command line option behave
sillily when given nonsense input. For example
$ git prune --no-expire
Segmentation falut
$ git prune --expire=npw; echo $?
129
Both come from parse_opt_expiry_date_cb().
The former is because the function is not prepared to see arg==NULL
(for "--no-expire", it is a norm; "--expire" at the end of the
command line could be made to pass NULL, if it is told that the
argument is optional, but we don't so we do not have to worry about
that case).
The latter is because it does not check the value returned from the
underlying parse_expiry_date().
This seems to be a recent regression introduced while we attempted
to avoid spewing the entire usage message when given a correct
option but with an invalid value at 3bb0923f ("parse-options: do not
show usage upon invalid option value", 2018-03-22). Before that, we
didn't fail silently but showed a full usage help (which arguably is
not all that better).
Also catch this error early when "git gc --prune=<expiration>" is
misspelled by doing a dummy parsing before the main body of "gc"
that is time consuming even begins. Otherwise, we'd spend time to
pack objects and then later have "git prune" first notice the error.
Aborting "gc" in the middle that way is not harmful but is ugly and
can be avoided.
Helped-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
7199203937 (object_array: add and use `object_array_pop()`, 2017-09-23)
noted that the pattern `object = array.objects[--array.nr].item` could
be abstracted as `object = object_array_pop(&array)`.
Unfortunately, one of the conversions was horribly wrong. Between
grabbing the last object (i.e., peeking at it) and decreasing the object
count, the original code would sometimes return early. The updated code
on the other hand, will always pop the last element, then maybe do the
early return without doing anything with the object.
The end result is that merge commits where all the parents have still
not been exported will simply be dropped, meaning that they will be
completely missing from the exported data.
Re-add a commit when it is not yet time to handle it. An alternative
that was considered was to peek-then-pop. That carries some risk with it
since the peeking and popping need to act on the same object, in a
concerted fashion.
Add a test that would have caught this.
Reported-by: Isaac Chou <Isaac.Chou@microfocus.com>
Analyzed-by: Isaac Chou <Isaac.Chou@microfocus.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git config` has long allowed the ability for callers to provide a 'type
specifier', which instructs `git config` to (1) ensure that incoming
values can be interpreted as that type, and (2) that outgoing values are
canonicalized under that type.
In another series, we propose to extend this functionality with
`--type=color` and `--default` to replace `--get-color`.
However, we traditionally use `--color` to mean "colorize this output",
instead of "this value should be treated as a color".
Currently, `git config` does not support this kind of colorization, but
we should be careful to avoid squatting on this option too soon, so that
`git config` can support `--color` (in the traditional sense) in the
future, if that is desired.
In this patch, we support `--type=<int|bool|bool-or-int|...>` in
addition to `--int`, `--bool`, and etc. This allows the aforementioned
upcoming patch to support querying a color value with a default via
`--type=color --default=...`, without squandering `--color`.
We retain the historic behavior of complaining when multiple,
legacy-style `--<type>` flags are given, as well as extend this to
conflicting new-style `--type=<type>` flags. `--int --type=int` (and its
commutative pair) does not complain, but `--bool --type=int` (and its
commutative pair) does.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function compute_rev_name() can return NULL sometimes (e.g. right
after 'submodule init'). The current code makes 'submodule status'
print this:
19d97bf5af05312267c2e874ee6bcf584d9e9681 sha1collisiondetection ((null))
This ugly 'null' adds no value to the user using this command. More
importantly printf() on some platform can't handle NULL as a string
and will crash instead of printing '(null)'.
Check for this and skip printing this part (the alternative is
printing '(n/a)' or something but I think that is just noise).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many commands support a "--force" option, frequently abbreviated as
"-f", however, "git worktree remove"'s hand-rolled OPT_BOOL forgets
to recognize the short form, despite git-worktree.txt documenting
"-f" as supported. Replace OPT_BOOL with OPT__FORCE, which provides
"-f" for free, and makes 'remove' consistent with 'add' option
parsing (which also specifies the PARSE_OPT_NOCOMPLETE flag).
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We only show progress when there are new objects to be packed. But
when --keep-pack is specified on the base pack, we will exclude most
of objects. This makes 'pack-objects' stay silent for a long time
while the counting phase is going.
Let's show some progress whenever we visit an object instead. The old
"Counting objects" is renamed to "Enumerating objects" and a new
progress "Counting objects" line is added.
This new "Counting objects" line should progress pretty quick when the
system is beefy. But when the system is under pressure, the reading
object header done in this phase could be slow and showing progress is
an improvement over staying silent in the current code.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
pack-objects could be a big memory hog especially on large repos,
everybody knows that. The suggestion to stick a .keep file on the
giant base pack to avoid this problem is also known for a long time.
Recent patches add an option to do just this, but it has to be either
configured or activated manually. This patch lets `git gc --auto`
activate this mode automatically when it thinks `repack -ad` will use
a lot of memory and start affecting the system due to swapping or
flushing OS cache.
gc --auto decides to do this based on an estimation of pack-objects
memory usage, which is quite accurate at least for the heap part, and
whether that fits in half of system memory (the assumption here is for
desktop environment where there are many other applications running).
This mechanism only kicks in if gc.bigBasePackThreshold is not configured.
If it is, it is assumed that the user already knows what they want.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This config allows us to keep <N> packs back if their size is larger
than a limit. But if this N >= gc.autoPackLimit, we may have a
problem. We are supposed to reduce the number of packs after a
threshold because it affects performance.
We could tell the user that they have incompatible gc.bigPackThreshold
and gc.autoPackLimit, but it's kinda hard when 'git gc --auto' runs in
background. Instead let's fall back to the next best stategy: try to
reduce the number of packs anyway, but keep the base pack out. This
reduces the number of packs to two and hopefully won't take up too
much resources to repack (the assumption still is the base pack takes
most resources to handle).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --keep-largest-pack option is not very convenient to use because
you need to tell gc to do this explicitly (and probably on just a few
large repos).
Add a config key that enables this mode when packs larger than a limit
are found. Note that there's a slight behavior difference compared to
--keep-largest-pack: all packs larger than the threshold are kept, not
just the largest one.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds a new repack mode that combines everything into a secondary
pack, leaving the largest pack alone.
This could help reduce memory pressure. On linux-2.6.git, valgrind
massif reports 1.6GB heap in "pack all" case, and 535MB in "pack
all except the base pack" case. We save roughly 1GB memory by
excluding the base pack.
This should also lower I/O because we don't have to rewrite a giant
pack every time (e.g. for linux-2.6.git that's a 1.4GB pack file)..
PS. The use of string_list here seems overkill, but we'll need it in
the next patch...
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We allow to keep existing packs by having companion .keep files. This
is helpful when a pack is permanently kept. In the next patch, git-gc
just wants to keep a pack temporarily, for one pack-objects
run. git-gc can use --keep-pack for this use case.
A note about why the pack_keep field cannot be reused and
pack_keep_in_core has to be added. This is about the case when
--keep-pack is specified together with either --keep-unreachable or
--unpack-unreachable, but --honor-pack-keep is NOT specified.
In this case, we want to exclude objects from the packs specified on
command line, not from ones with .keep files. If only one bit flag is
used, we have to clear pack_keep on pack files with the .keep file.
But we can't make any assumption about unreachable objects in .keep
packs. If "pack_keep" field is false for .keep packs, we could
potentially pull lots of unreachable objects into the new pack, or
unpack them loose. The safer approach is ignore all packs with either
.keep file or --keep-pack.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allowing a delta size of 64 bits is crazy. Shrink this field down to
20 bits with one overflow bit.
If we find an existing delta larger than 1MB, we do not cache
delta_size at all and will get the value from oe_size(), potentially
from disk if it's larger than 4GB.
Note, since DELTA_SIZE() is used in try_delta() code, it must be
thread-safe. Luckily oe_size() does guarantee this so we it is
thread-safe.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's very very rare that an uncompressed object is larger than 4GB
(partly because Git does not handle those large files very well to
begin with). Let's optimize it for the common case where object size
is smaller than this limit.
Shrink size field down to 31 bits and one overflow bit. If the size is
too large, we read it back from disk. As noted in the previous patch,
we need to return the delta size instead of canonical size when the
to-be-reused object entry type is a delta instead of a canonical one.
Add two compare helpers that can take advantage of the overflow
bit (e.g. if the file is 4GB+, chances are it's already larger than
core.bigFileThreshold and there's no point in comparing the actual
value).
Another note about oe_get_size_slow(). This function MUST be thread
safe because SIZE() macro is used inside try_delta() which may run in
parallel. Outside parallel code, no-contention locking should be dirt
cheap (or insignificant compared to i/o access anyway). To exercise
this code, it's best to run the test suite with something like
make test GIT_TEST_OE_SIZE=4
which forces this code on all objects larger than 3 bytes.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While this field most of the time contains the canonical object size,
there is one case it does not: when we have found that the base object
of the delta in question is also to be packed, we will very happily
reuse the delta by copying it over instead of regenerating the new
delta.
"size" in this case will record the delta size, not canonical object
size. Later on in write_reuse_object(), we reconstruct the delta
header and "size" is used for this purpose. When this happens, the
"type" field contains a delta type instead of a canonical type.
Highlight this in the code since it could be tricky to see.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
sha1_object_info() in check_objects() may fail to locate an object in
the pack and return type OBJ_BAD. In that case, it will likely leave
the "size" field untouched. We delay error handling until later in
prepare_pack() though. Until then, do not touch "size" field.
This field should contain the default value zero, but we can't say
sha1_object_info() cannot damage it. This becomes more important later
when the object size may have to be retrieved back from the
(non-existing) pack.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We only cache deltas when it's smaller than a certain limit. This limit
defaults to 1000 but save its compressed length in a 64-bit field.
Shrink that field down to 20 bits, so you can only cache 1MB deltas.
Larger deltas must be recomputed at when the pack is written down.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These delta pointers always point to elements in the objects[] array
in packing_data struct. We can only hold maximum 4G of those objects
because the array size in nr_objects is uint32_t. We could use
uint32_t indexes to address these elements instead of pointers. On
64-bit architecture (8 bytes per pointer) this would save 4 bytes per
pointer.
Convert these delta pointers to indexes. Since we need to handle NULL
pointers as well, the index is shifted by one [1].
[1] This means we can only index 2^32-2 objects even though nr_objects
could contain 2^32-1 objects. It should not be a problem in
practice because when we grow objects[], nr_alloc would probably
blow up long before nr_objects hits the wall.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using 8 bytes (on 64 bit arch) to store a pointer to a
pack. Use an index instead since the number of packs should be
relatively small.
This limits the number of packs we can handle to 1k. Since we can't be
sure people can never run into the situation where they have more than
1k pack files. Provide a fall back route for it.
If we find out they have too many packs, the new in_pack_by_idx[]
array (which has at most 1k elements) will not be used. Instead we
allocate in_pack[] array that holds nr_objects elements. This is
similar to how the optional in_pack_pos field is handled.
The new simple test is just to make sure the too-many-packs code path
is at least executed. The true test is running
make test GIT_TEST_FULL_IN_PACK_ARRAY=1
to take advantage of other special case tests.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This field is only need for pack-bitmap, which is an optional
feature. Move it to a separate array that is only allocated when
pack-bitmap is used (like objects[], it is not freed, since we need it
until the end of the process)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because of struct packing from now on we can only handle max depth
4095 (or even lower when new booleans are added in this struct). This
should be ok since long delta chain will cause significant slow down
anyway.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An extra field type_valid is added to carry the equivalent of OBJ_BAD
in the original "type" field. in_pack_type always contains a valid
type so we only need 3 bits for it.
A note about accepting OBJ_NONE as "valid" type. The function
read_object_list_from_stdin() can pass this value [1] and it
eventually calls create_object_entry() where current code skip setting
"type" field if the incoming type is zero. This does not have any bad
side effects because "type" field should be memset()'d anyway.
But since we also need to set type_valid now, skipping oe_set_type()
leaves type_valid zero/false, which will make oe_type() return
OBJ_BAD, not OBJ_NONE anymore. Apparently we do care about OBJ_NONE in
prepare_pack(). This switch from OBJ_NONE to OBJ_BAD may trigger
fatal: unable to get type of object ...
Accepting OBJ_NONE [2] does sound wrong, but this is how it is has
been for a very long time and I haven't time to dig in further.
[1] See 5c49c11686 (pack-objects: better check_object() performances -
2007-04-16)
[2] 21666f1aae (convert object type handling from a string to a number
- 2007-02-26)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of lookup_replace_object
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow for_each_replace_ref callers to be
more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the get_main_ref_store caller
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
lookup_replace_object is a low-level function that most users of the
object store do not need to use directly.
Move it to replace-object.h to avoid a dependency loop in an upcoming
change to its inline definition that will make use of repository.h.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is more consistent with the project style. The majority of Git's
source files use dashes in preference to underscores in their file names.
Signed-off-by: Stefan Beller <sbeller@google.com>
This is more consistent with the project style. The majority of Git's
source files use dashes in preference to underscores in their file names.
Signed-off-by: Stefan Beller <sbeller@google.com>
"git submodule status" misbehaved on a submodule that has been
removed from the working tree.
* rs/status-with-removed-submodule:
submodule: check for NULL return of get_submodule_ref_store()
Refactoring the internal global data structure to make it possible
to open multiple repositories, work with and then close them.
Rerolled by Duy on top of a separate preliminary clean-up topic.
The resulting structure of the topics looked very sensible.
* sb/object-store: (27 commits)
sha1_file: allow sha1_loose_object_info to handle arbitrary repositories
sha1_file: allow map_sha1_file to handle arbitrary repositories
sha1_file: allow map_sha1_file_1 to handle arbitrary repositories
sha1_file: allow open_sha1_file to handle arbitrary repositories
sha1_file: allow stat_sha1_file to handle arbitrary repositories
sha1_file: allow sha1_file_name to handle arbitrary repositories
sha1_file: add repository argument to sha1_loose_object_info
sha1_file: add repository argument to map_sha1_file
sha1_file: add repository argument to map_sha1_file_1
sha1_file: add repository argument to open_sha1_file
sha1_file: add repository argument to stat_sha1_file
sha1_file: add repository argument to sha1_file_name
sha1_file: allow prepare_alt_odb to handle arbitrary repositories
sha1_file: allow link_alt_odb_entries to handle arbitrary repositories
sha1_file: add repository argument to prepare_alt_odb
sha1_file: add repository argument to link_alt_odb_entries
sha1_file: add repository argument to read_info_alternates
sha1_file: add repository argument to link_alt_odb_entry
sha1_file: add raw_object_store argument to alt_odb_usable
pack: move approximate object count to object store
...
In anticipation of making trees load lazily, create a Coccinelle
script (contrib/coccinelle/commit.cocci) to ensure that all
references to the 'maybe_tree' member of struct commit are either
mutations or accesses through get_commit_tree() or
get_commit_tree_oid().
Apply the Coccinelle script to create the rest of the patch.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using the commit-graph file to walk commit history removes the large
cost of parsing commits during the walk. This exposes a performance
issue: lookup_tree() takes a large portion of the computation time,
even when Git never uses those trees.
In anticipation of lazy-loading these trees, rename the 'tree' member
of struct commit to 'maybe_tree'. This serves two purposes: it hints
at the future role of possibly being NULL even if the commit has a
valid tree, and it allows for unambiguous transformation from simple
member access (i.e. commit->maybe_tree) to method access.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach git-commit-graph to add all commits from the existing
commit-graph file to the file about to be written. This should be
used when adding new commits without performing garbage collection.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach git-commit-graph to read commits from stdin when the
--stdin-commits flag is specified. Commits reachable from these
commits are added to the graph. This is a much faster way to construct
the graph than inspecting all packed objects, but is restricted to
known tips.
For the Linux repository, 700,000+ commits were added to the graph
file starting from 'master' in 7-9 seconds, depending on the number
of packfiles in the repo (1, 24, or 120).
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach git-commit-graph to inspect the objects only in a certain list
of pack-indexes within the given pack directory. This allows updating
the commit graph iteratively.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach git-commit-graph to read commit graph files and summarize their contents.
Use the read subcommand to verify the contents of a commit graph file in the
tests.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>