The list of commands with their various attributes were spread
across a few places in the build procedure, but it now is getting a
bit more consolidated to allow more automation.
* nd/command-list:
completion: allow to customize the completable command list
completion: add and use --list-cmds=alias
completion: add and use --list-cmds=nohelpers
Move declaration for alias.c to alias.h
completion: reduce completable command list
completion: let git provide the completable command list
command-list.txt: documentation and guide line
help: use command-list.txt for the source of guides
help: add "-a --verbose" to list all commands with synopsis
git: support --list-cmds=list-<category>
completion: implement and use --list-cmds=main,others
git --list-cmds: collect command list in a string_list
git.c: convert --list-* to --list-cmds=*
Remove common-cmds.h
help: use command-list.h for common command list
generate-cmds.sh: export all commands to command-list.h
generate-cmds.sh: factor out synopsis extract code
Conversion from uchar[20] to struct object_id continues.
* bc/object-id: (42 commits)
merge-one-file: compute empty blob object ID
add--interactive: compute the empty tree value
Update shell scripts to compute empty tree object ID
sha1_file: only expose empty object constants through git_hash_algo
dir: use the_hash_algo for empty blob object ID
sequencer: use the_hash_algo for empty tree object ID
cache-tree: use is_empty_tree_oid
sha1_file: convert cached object code to struct object_id
builtin/reset: convert use of EMPTY_TREE_SHA1_BIN
builtin/receive-pack: convert one use of EMPTY_TREE_SHA1_HEX
wt-status: convert two uses of EMPTY_TREE_SHA1_HEX
submodule: convert several uses of EMPTY_TREE_SHA1_HEX
sequencer: convert one use of EMPTY_TREE_SHA1_HEX
merge: convert empty tree constant to the_hash_algo
builtin/merge: switch tree functions to use object_id
builtin/am: convert uses of EMPTY_TREE_SHA1_BIN to the_hash_algo
sha1-file: add functions for hex empty tree and blob OIDs
builtin/receive-pack: avoid hard-coded constants for push certs
diff: specify abbreviation size in terms of the_hash_algo
upload-pack: replace use of several hard-coded constants
...
"git pack-objects" needs to allocate tons of "struct object_entry"
while doing its work, and shrinking its size helps the performance
quite a bit.
* nd/pack-objects-pack-struct:
ci: exercise the whole test suite with uncommon code in pack-objects
pack-objects: reorder members to shrink struct object_entry
pack-objects: shrink delta_size field in struct object_entry
pack-objects: shrink size field in struct object_entry
pack-objects: clarify the use of object_entry::size
pack-objects: don't check size when the object is bad
pack-objects: shrink z_delta_size field in struct object_entry
pack-objects: refer to delta objects by index instead of pointer
pack-objects: move in_pack out of struct object_entry
pack-objects: move in_pack_pos out of struct object_entry
pack-objects: use bitfield for object_entry::depth
pack-objects: use bitfield for object_entry::dfs_state
pack-objects: turn type and in_pack_type to bitfields
pack-objects: a bit of document about struct object_entry
read-cache.c: make $GIT_TEST_SPLIT_INDEX boolean
The codepath around object-info API has been taught to take the
repository object (which in turn tells the API which object store
the objects are to be located).
* sb/oid-object-info:
cache.h: allow oid_object_info to handle arbitrary repositories
packfile: add repository argument to cache_or_unpack_entry
packfile: add repository argument to unpack_entry
packfile: add repository argument to read_object
packfile: add repository argument to packed_object_info
packfile: add repository argument to packed_to_object_type
packfile: add repository argument to retry_bad_packed_offset
cache.h: add repository argument to oid_object_info
cache.h: add repository argument to oid_object_info_extended
* maint-2.16:
Git 2.16.4
Git 2.15.2
Git 2.14.4
Git 2.13.7
verify_path: disallow symlinks in .gitmodules
update-index: stat updated files earlier
verify_dotfile: mention case-insensitivity in comment
verify_path: drop clever fallthrough
skip_prefix: add case-insensitive variant
is_{hfs,ntfs}_dotgitmodules: add tests
is_ntfs_dotgit: match other .git files
is_hfs_dotgit: match other .git files
is_ntfs_dotgit: use a size_t for traversing string
submodule-config: verify submodule names as paths
* maint-2.15:
Git 2.15.2
Git 2.14.4
Git 2.13.7
verify_path: disallow symlinks in .gitmodules
update-index: stat updated files earlier
verify_dotfile: mention case-insensitivity in comment
verify_path: drop clever fallthrough
skip_prefix: add case-insensitive variant
is_{hfs,ntfs}_dotgitmodules: add tests
is_ntfs_dotgit: match other .git files
is_hfs_dotgit: match other .git files
is_ntfs_dotgit: use a size_t for traversing string
submodule-config: verify submodule names as paths
* maint-2.14:
Git 2.14.4
Git 2.13.7
verify_path: disallow symlinks in .gitmodules
update-index: stat updated files earlier
verify_dotfile: mention case-insensitivity in comment
verify_path: drop clever fallthrough
skip_prefix: add case-insensitive variant
is_{hfs,ntfs}_dotgitmodules: add tests
is_ntfs_dotgit: match other .git files
is_hfs_dotgit: match other .git files
is_ntfs_dotgit: use a size_t for traversing string
submodule-config: verify submodule names as paths
* maint-2.13:
Git 2.13.7
verify_path: disallow symlinks in .gitmodules
update-index: stat updated files earlier
verify_dotfile: mention case-insensitivity in comment
verify_path: drop clever fallthrough
skip_prefix: add case-insensitive variant
is_{hfs,ntfs}_dotgitmodules: add tests
is_ntfs_dotgit: match other .git files
is_hfs_dotgit: match other .git files
is_ntfs_dotgit: use a size_t for traversing string
submodule-config: verify submodule names as paths
There are a few reasons it's not a good idea to make
.gitmodules a symlink, including:
1. It won't be portable to systems without symlinks.
2. It may behave inconsistently, since Git may look at
this file in the index or a tree without bothering to
resolve any symbolic links. We don't do this _yet_, but
the config infrastructure is there and it's planned for
the future.
With some clever code, we could make (2) work. And some
people may not care about (1) if they only work on one
platform. But there are a few security reasons to simply
disallow it:
a. A symlinked .gitmodules file may circumvent any fsck
checks of the content.
b. Git may read and write from the on-disk file without
sanity checking the symlink target. So for example, if
you link ".gitmodules" to "../oops" and run "git
submodule add", we'll write to the file "oops" outside
the repository.
Again, both of those are problems that _could_ be solved
with sufficient code, but given the complications in (1) and
(2), we're better off just outlawing it explicitly.
Note the slightly tricky call to verify_path() in
update-index's update_one(). There we may not have a mode if
we're not updating from the filesystem (e.g., we might just
be removing the file). Passing "0" as the mode there works
fine; since it's not a symlink, we'll just skip the extra
checks.
Signed-off-by: Jeff King <peff@peff.net>
When we started to catch NTFS short names that clash with .git, we only
looked for GIT~1. This is sufficient because we only ever clone into an
empty directory, so .git is guaranteed to be the first subdirectory or
file in that directory.
However, even with a fresh clone, .gitmodules is *not* necessarily the
first file to be written that would want the NTFS short name GITMOD~1: a
malicious repository can add .gitmodul0000 and friends, which sorts
before `.gitmodules` and is therefore checked out *first*. For that
reason, we have to test not only for ~1 short names, but for others,
too.
It's hard to just adapt the existing checks in is_ntfs_dotgit(): since
Windows 2000 (i.e., in all Windows versions still supported by Git),
NTFS short names are only generated in the <prefix>~<number> form up to
number 4. After that, a *different* prefix is used, calculated from the
long file name using an undocumented, but stable algorithm.
For example, the short name of .gitmodules would be GITMOD~1, but if it
is taken, and all of ~2, ~3 and ~4 are taken, too, the short name
GI7EBA~1 will be used. From there, collisions are handled by
incrementing the number, shortening the prefix as needed (until ~9999999
is reached, in which case NTFS will not allow the file to be created).
We'd also want to handle .gitignore and .gitattributes, which suffer
from a similar problem, using the fall-back short names GI250A~1 and
GI7D29~1, respectively.
To accommodate for that, we could reimplement the hashing algorithm, but
it is just safer and simpler to provide the known prefixes. This
algorithm has been reverse-engineered and described at
https://usn.pw/blog/gen/2015/06/09/filenames/, which is defunct but
still available via https://web.archive.org/.
These can be recomputed by running the following Perl script:
-- snip --
use warnings;
use strict;
sub compute_short_name_hash ($) {
my $checksum = 0;
foreach (split('', $_[0])) {
$checksum = ($checksum * 0x25 + ord($_)) & 0xffff;
}
$checksum = ($checksum * 314159269) & 0xffffffff;
$checksum = 1 + (~$checksum & 0x7fffffff) if ($checksum & 0x80000000);
$checksum -= (($checksum * 1152921497) >> 60) * 1000000007;
return scalar reverse sprintf("%x", $checksum & 0xffff);
}
print compute_short_name_hash($ARGV[0]);
-- snap --
E.g., running that with the argument ".gitignore" will
result in "250a" (which then becomes "gi250a" in the code).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
The current document mentions OBJ_* constants without their actual
values. A git developer would know these are from cache.h but that's
not very friendly to a person who wants to read this file to implement
a pack file parser.
Similarly, the deltified representation is not documented at all (the
"document" is basically patch-delta.c). Translate that C code to
English with a bit more about what ofs-delta and ref-delta mean.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The effort to pass the repository in-core structure throughout the
API continues. This round deals with the code that implements the
refs/replace/ mechanism.
* sb/object-store-replace:
replace-object: allow lookup_replace_object to handle arbitrary repositories
replace-object: allow do_lookup_replace_object to handle arbitrary repositories
replace-object: allow prepare_replace_object to handle arbitrary repositories
refs: allow for_each_replace_ref to handle arbitrary repositories
refs: store the main ref store inside the repository struct
replace-object: add repository argument to lookup_replace_object
replace-object: add repository argument to do_lookup_replace_object
replace-object: add repository argument to prepare_replace_object
refs: add repository argument to for_each_replace_ref
refs: add repository argument to get_main_ref_store
replace-object: check_replace_refs is safe in multi repo environment
replace-object: eliminate replace objects prepared flag
object-store: move lookup_replace_object to replace-object.h
replace-object: move replace_map to object store
replace_object: use oidmap
Precompute and store information necessary for ancestry traversal
in a separate file to optimize graph walking.
* ds/commit-graph:
commit-graph: implement "--append" option
commit-graph: build graph from starting commits
commit-graph: read only from specific pack-indexes
commit: integrate commit graph with commit parsing
commit-graph: close under reachability
commit-graph: add core.commitGraph setting
commit-graph: implement git commit-graph read
commit-graph: implement git-commit-graph write
commit-graph: implement write_commit_graph()
commit-graph: create git-commit-graph builtin
graph: add commit graph design document
commit-graph: add format document
csum-file: refactor finalize_hashfile() method
csum-file: rename hashclose() to finalize_hashfile()
A build-time option has been added to allow Git to be told to refer
to its associated files relative to the main binary, in the same
way that has been possible on Windows for quite some time, for
Linux, BSDs and Darwin.
* dj/runtime-prefix:
Makefile: quote $INSTLIBDIR when passing it to sed
Makefile: remove unused @@PERLLIBDIR@@ substitution variable
mingw/msvc: use the new-style RUNTIME_PREFIX helper
exec_cmd: provide a new-style RUNTIME_PREFIX helper for Windows
exec_cmd: RUNTIME_PREFIX on some POSIX systems
Makefile: add Perl runtime prefix support
Makefile: generate Perl header from template file
There really isn't any case in which we want to expose the constants for
empty trees and blobs outside of using the hash algorithm abstraction.
Make these constants static and stop exposing the defines in cache.h.
Remove the constants which are no longer in use.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Oftentimes, we'll want to refer to an empty tree or empty blob by its
hex name without having to call oid_to_hex or explicitly refer to
the_hash_algo. Add helper functions that format these values into
static buffers and return them for easy use.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust struct index_state to use struct object_id instead of unsigned
char [20].
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the various functions for freshening objects and
has_loose_object_nonlocal to use struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sha1 member in struct pack_entry is unused except for one instance
in which we store a value in it. Since nobody ever reads this value,
don't bother to compute it and remove the member from struct pack_entry.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tree member of struct object_context is unused except in one place
where we write to it. Since there are no users of this member, remove
it.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In various places throughout the codebase, we need to read data into a
struct object_id from a pack or other unsigned char buffer. Add an
inline function that does this based on the current hash algorithm in
use, and use it in several places.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This involves also adapting oid_object_info_extended and a some
internal functions that are used to implement these. It all has to
happen in one patch, because of a single recursive chain of calls visits
all these functions.
oid_object_info_extended is also used in partial clones, which allow
fetching missing objects. As this series will not add the repository
struct to the transport code and fetch_object(), add a TODO note and
omit fetching if a user tries to use a partial clone in a repository
other than the_repository.
Among the functions modified to handle arbitrary repositories,
unpack_entry() is one of them. Note that it still references the globals
"delta_base_cache" and "delta_base_cached", but those are safe to be
referenced (the former is indexed partly by "struct packed_git *", which
is repo-specific, and the latter is only used to limit the size of the
former as an optimization).
Helped-by: Brandon Williams <bmwill@google.com>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of oid_object_info
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow oid_object_info_extended callers
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some codepaths, including the refs API, get and keep relative
paths, that go out of sync when the process does chdir(2). The
chdir-notify API is introduced to let these codepaths adjust these
cached paths to the new current directory.
* jk/relative-directory-fix:
refs: use chdir_notify to update cached relative paths
set_work_tree: use chdir_notify
add chdir-notify API
trace.c: export trace_setup_key
set_git_dir: die when setenv() fails
An extra field type_valid is added to carry the equivalent of OBJ_BAD
in the original "type" field. in_pack_type always contains a valid
type so we only need 3 bits for it.
A note about accepting OBJ_NONE as "valid" type. The function
read_object_list_from_stdin() can pass this value [1] and it
eventually calls create_object_entry() where current code skip setting
"type" field if the incoming type is zero. This does not have any bad
side effects because "type" field should be memset()'d anyway.
But since we also need to set type_valid now, skipping oe_set_type()
leaves type_valid zero/false, which will make oe_type() return
OBJ_BAD, not OBJ_NONE anymore. Apparently we do care about OBJ_NONE in
prepare_pack(). This switch from OBJ_NONE to OBJ_BAD may trigger
fatal: unable to get type of object ...
Accepting OBJ_NONE [2] does sound wrong, but this is how it is has
been for a very long time and I haven't time to dig in further.
[1] See 5c49c11686 (pack-objects: better check_object() performances -
2007-04-16)
[2] 21666f1aae (convert object type handling from a string to a number
- 2007-02-26)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
lookup_replace_object is a low-level function that most users of the
object store do not need to use directly.
Move it to replace-object.h to avoid a dependency loop in an upcoming
change to its inline definition that will make use of repository.h.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enable Git to resolve its own binary location using a variety of
OS-specific and generic methods, including:
- procfs via "/proc/self/exe" (Linux)
- _NSGetExecutablePath (Darwin)
- KERN_PROC_PATHNAME sysctl on BSDs.
- argv0, if absolute (all, including Windows).
This is used to enable RUNTIME_PREFIX support for non-Windows systems,
notably Linux and Darwin. When configured with RUNTIME_PREFIX, Git will
do a best-effort resolution of its executable path and automatically use
this as its "exec_path" for relative helper and data lookups, unless
explicitly overridden.
Small incidental formatting cleanup of "exec_cmd.c".
Signed-off-by: Dan Jacques <dnj@google.com>
Thanks-to: Robbie Iannucci <iannucci@google.com>
Thanks-to: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactoring the internal global data structure to make it possible
to open multiple repositories, work with and then close them.
Rerolled by Duy on top of a separate preliminary clean-up topic.
The resulting structure of the topics looked very sensible.
* sb/object-store: (27 commits)
sha1_file: allow sha1_loose_object_info to handle arbitrary repositories
sha1_file: allow map_sha1_file to handle arbitrary repositories
sha1_file: allow map_sha1_file_1 to handle arbitrary repositories
sha1_file: allow open_sha1_file to handle arbitrary repositories
sha1_file: allow stat_sha1_file to handle arbitrary repositories
sha1_file: allow sha1_file_name to handle arbitrary repositories
sha1_file: add repository argument to sha1_loose_object_info
sha1_file: add repository argument to map_sha1_file
sha1_file: add repository argument to map_sha1_file_1
sha1_file: add repository argument to open_sha1_file
sha1_file: add repository argument to stat_sha1_file
sha1_file: add repository argument to sha1_file_name
sha1_file: allow prepare_alt_odb to handle arbitrary repositories
sha1_file: allow link_alt_odb_entries to handle arbitrary repositories
sha1_file: add repository argument to prepare_alt_odb
sha1_file: add repository argument to link_alt_odb_entries
sha1_file: add repository argument to read_info_alternates
sha1_file: add repository argument to link_alt_odb_entry
sha1_file: add raw_object_store argument to alt_odb_usable
pack: move approximate object count to object store
...
The commit graph feature is controlled by the new core.commitGraph config
setting. This defaults to 0, so the feature is opt-in.
The intention of core.commitGraph is that a user can always stop checking
for or parsing commit graph files if core.commitGraph=0.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up for the "repository" abstraction.
* nd/remove-ignore-env-field:
repository.h: add comment and clarify repo_set_gitdir
repository: delete ignore_env member
sha1_file.c: move delayed getenv(altdb) back to setup_git_env()
repository.c: delete dead functions
repository.c: move env-related setup code back to environment.c
repository: initialize the_repository in main()
Conversion from uchar[20] to struct object_id continues.
* bc/object-id: (36 commits)
convert: convert to struct object_id
sha1_file: introduce a constant for max header length
Convert lookup_replace_object to struct object_id
sha1_file: convert read_sha1_file to struct object_id
sha1_file: convert read_object_with_reference to object_id
tree-walk: convert tree entry functions to object_id
streaming: convert istream internals to struct object_id
tree-walk: convert get_tree_entry_follow_symlinks internals to object_id
builtin/notes: convert static functions to object_id
builtin/fmt-merge-msg: convert remaining code to object_id
sha1_file: convert sha1_object_info* to object_id
Convert remaining callers of sha1_object_info_extended to object_id
packfile: convert unpack_entry to struct object_id
sha1_file: convert retry_bad_packed_offset to struct object_id
sha1_file: convert assert_sha1_type to object_id
builtin/mktree: convert to struct object_id
streaming: convert open_istream to use struct object_id
sha1_file: convert check_sha1_signature to struct object_id
sha1_file: convert read_loose_object to use struct object_id
builtin/index-pack: convert struct ref_delta_entry to object_id
...
A "git fetch" from a repository with insane number of refs into a
repository that is already up-to-date still wasted too many cycles
making many lstat(2) calls to see if these objects at the tips
exist as loose objects locally. These lstat(2) calls are optimized
away by enumerating all loose objects beforehand.
It is unknown if the new strategy negatively affects existing use
cases, fetching into a repository with many loose objects from a
repository with small number of refs.
* ti/fetch-everything-local-optim:
fetch-pack.c: use oidset to check existence of loose object
The set_git_dir() function returns an error if setenv()
fails, but there are zero callers who pay attention to this
return value. If this ever were to happen, it could cause
confusing results, as sub-processes would see a potentially
stale GIT_DIR (e.g., if it is relative and we chdir()-ed to
the root of the working tree).
We _could_ try to fix each caller, but there's really
nothing useful to do after this failure except die. Let's
just lump setenv() failure into the same category as malloc
failure: things that should never happen and cause us to
abort catastrophically.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow map_sha1_file callers to be more
specific about which repository to handle. This is a small mechanical
change; it doesn't change the implementation to handle repositories
other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
While at it, move the declaration to object-store.h, where it should
be easier to find.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow sha1_file_name callers to be more
specific about which repository to handle. This is a small mechanical
change; it doesn't change the implementation to handle repositories
other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
While at it, move the declaration to object-store.h, where it should
be easier to find.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a process with multiple repositories open, packfile accessors
should be associated to a single repository and not shared globally.
Move packed_git and packed_git_mru into the_repository and adjust
callers to reflect this.
[nd: while at there, wrap access to these two fields in get_packed_git()
and get_packed_git_mru(). This allows us to lazily initialize these
fields without caller doing that explicitly]
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Migrate the struct alternate_object_database and all its related
functions to the object store as these functions are easier found in
that header. The migration is just a verbatim copy, no need to
include the object store header at any C file, because cache.h includes
repository.h which in turn includes the object-store.h
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The split-index mode had a few corner case bugs fixed.
* tg/split-index-fixes:
travis: run tests with GIT_TEST_SPLIT_INDEX
split-index: don't write cache tree with null oid entries
read-cache: fix reading the shared index for other repos
Internal API clean-up to allow write_locked_index() optionally skip
writing the in-core index when it is not modified.
* ma/skip-writing-unchanged-index:
write_locked_index(): add flag to avoid writing unchanged index
When fetching from a repository with large number of refs, because to
check existence of each refs in local repository to packed and loose
objects, 'git fetch' ends up doing a lot of lstat(2) to non-existing
loose form, which makes it slow.
Instead of making as many lstat(2) calls as the refs the remote side
advertised to see if these objects exist in the loose form, first
enumerate all the existing loose objects in hashmap beforehand and use
it to check existence of them if the number of refs is larger than the
number of loose objects.
With this patch, the number of lstat(2) calls in `git fetch` is reduced
from 411412 to 13794 for chromium repository, it has more than 480000
remote refs.
I took time stat of `git fetch` when fetch-pack happens for chromium
repository 3 times on linux with SSD.
* with this patch
8.105s
8.309s
7.640s
avg: 8.018s
* master
12.287s
11.175s
12.227s
avg: 11.896s
On my MacBook Air which has slower lstat(2).
* with this patch
14.501s
* master
1m16.027s
`git fetch` on slow disk will be improved largely.
Signed-off-by: Takuto Ikuta <tikuta@chromium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert both the argument and the return value to be pointers to struct
object_id. Update the callers and their internals to deal with the new
type. Remove several temporaries which are no longer needed.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert read_sha1_file to take a pointer to struct object_id and rename
it read_object_file. Do the same for read_sha1_file_extended.
Convert one use in grep.c to use the new function without any other code
change, since the pointer being passed is a void pointer that is already
initialized with a pointer to struct object_id. Update the declaration
and definitions of the modified functions, and apply the following
semantic patch to convert the remaining callers:
@@
expression E1, E2, E3;
@@
- read_sha1_file(E1.hash, E2, E3)
+ read_object_file(&E1, E2, E3)
@@
expression E1, E2, E3;
@@
- read_sha1_file(E1->hash, E2, E3)
+ read_object_file(E1, E2, E3)
@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1.hash, E2, E3, E4)
+ read_object_file_extended(&E1, E2, E3, E4)
@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1->hash, E2, E3, E4)
+ read_object_file_extended(E1, E2, E3, E4)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert read_object_with_reference to take pointers to struct object_id.
Update the internals of the function accordingly.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert sha1_object_info and sha1_object_info_extended to take pointers
to struct object_id and rename them to use "oid" instead of "sha1" in
their names. Update the declaration and definition and apply the
following semantic patch, plus the standard object_id transforms:
@@
expression E1, E2;
@@
- sha1_object_info(E1.hash, E2)
+ oid_object_info(&E1, E2)
@@
expression E1, E2;
@@
- sha1_object_info(E1->hash, E2)
+ oid_object_info(E1, E2)
@@
expression E1, E2, E3;
@@
- sha1_object_info_extended(E1.hash, E2, E3)
+ oid_object_info_extended(&E1, E2, E3)
@@
expression E1, E2, E3;
@@
- sha1_object_info_extended(E1->hash, E2, E3)
+ oid_object_info_extended(E1, E2, E3)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert this function to take a pointer to struct object_id and rename
it to assert_oid_type.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert this function to take a pointer to struct object_id and rename
it check_object_signature. Introduce temporaries to convert the return
values of lookup_replace_object and lookup_replace_object_extended into
struct object_id.
The temporaries are needed because in order to convert
lookup_replace_object, open_istream needs to be converted, and
open_istream needs check_sha1_signature to be converted, causing a loop
of dependencies. The temporaries will be removed in a future patch.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>