Add Documentation/technical/commit-graph.txt with details of the planned
commit graph feature, including future plans.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add document specifying the binary format for commit graphs. This
format allows for:
* New versions.
* New hash functions and hash lengths.
* Optional extensions.
Basic header information is followed by a binary table of contents
into "chunks" that include:
* An ordered list of commit object IDs.
* A 256-entry fanout into that list of OIDs.
* A list of metadata for the commits.
* A list of "large edges" to enable octopus merges.
The format automatically includes two parent positions for every
commit. This favors speed over space, since using only one position
per commit would cause an extra level of indirection for every merge
commit. (Octopus merges suffer from this indirection, but they are
very rare.)
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At some point we may want to rename the function so that it describes what
it actually does as 'submodule_free' doesn't quite describe that this
clears a repository's submodule cache. But that's beyond the scope of
this series.
While at it remove the extern key word from its declaration.
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Attempt to clarify what the SHAttered attack means in practice for
Git. The previous version of the text made no mention whatsoever of
Git already having a mitigation for this specific attack, which the
SHAttered researchers claim will detect cryptanalytic collision
attacks.
I may have gotten some of the nuances wrong, but as far as I know this
new text accurately summarizes the current situation with SHA-1 in
git. I.e. git doesn't really use SHA-1 anymore, it uses
Hardened-SHA-1 (they just so happen to produce the same outputs
99.99999999999...% of the time).
Thus the previous text was incorrect in asserting that:
[...]As a result [of SHAttered], SHA-1 cannot be considered
cryptographically secure any more[...]
That's not the case. We have a mitigation against SHAttered, *however*
we consider it prudent to move to work towards a NewHash should future
vulnerabilities in either SHA-1 or Hardened-SHA-1 emerge.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "Repository format extension" to accurately describe what
happens with different versions of Git when they encounter NewHash
repositories, instead of only saying what happens with versions v2.7.0
and later.
See ab9cb76f66 ("Repository format version check.", 2005-11-25) and
00a09d57eb ("introduce "extensions" form of
core.repositoryformatversion", 2015-06-23) for the relevant changes to
the setup code where these variables are checked.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enable shallow clones and deepen requests using protocol version 2 if
the server 'fetch' command supports the 'shallow' feature.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When communicating with a v2 server, perform a fetch by requesting the
'fetch' command.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce the ls-refs server command. In protocol v2, the ls-refs
command is used to request the ref advertisement from the server. Since
it is a command which can be requested (as opposed to mandatory in v1),
a client can sent a number of parameters in its request to limit the ref
advertisement based on provided ref-prefixes.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce git-serve, the base server for protocol version 2.
Protocol version 2 is intended to be a replacement for Git's current
wire protocol. The intention is that it will be a simpler, less
wasteful protocol which can evolve over time.
Protocol version 2 improves upon version 1 by eliminating the initial
ref advertisement. In its place a server will export a list of
capabilities and commands which it supports in a capability
advertisement. A client can then request that a particular command be
executed by providing a number of capabilities and command specific
parameters. At the completion of a command, a client can request that
another command be executed or can terminate the connection by sending a
flush packet.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The http-protocol.txt spec fails to mention that a flush packet
comes in the smart server response after sending the "service"
header.
Technically the client code is actually ready to receive an
arbitrary number of headers here, but since we haven't
introduced any other headers in the past decade (and the
client would just throw them away), let's not mention it in
the spec.
This fixes both BNF and the example. While we're fixing the
latter, let's also add the missing flush after the ref list.
Reported-by: Dorian Taylor <dorian.taylor.lists@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a small number of misspellings, ".gitmodule", scattered
throughout the code base, correct them ... no apparent functional
changes.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The machinery to clone & fetch, which in turn involves packing and
unpacking objects, have been told how to omit certain objects using
the filtering mechanism introduced by the jh/object-filtering
topic, and also mark the resulting pack as a promisor pack to
tolerate missing objects, taking advantage of the mechanism
introduced by the jh/fsck-promisors topic.
* jh/partial-clone:
t5616: test bulk prefetch after partial fetch
fetch: inherit filter-spec from partial clone
t5616: end-to-end tests for partial clone
fetch-pack: restore save_commit_buffer after use
unpack-trees: batch fetching of missing blobs
clone: partial clone
partial-clone: define partial clone settings in config
fetch: support filters
fetch: refactor calculation of remote list
fetch-pack: test support excluding large blobs
fetch-pack: add --no-filter
fetch-pack, index-pack, transport: partial clone
upload-pack: add object filtering for partial clone
In preparation for implementing narrow/partial clone, the machinery
for checking object connectivity used by gc and fsck has been
taught that a missing object is OK when it is referenced by a
packfile specially marked as coming from trusted repository that
promises to make them available on-demand and lazily.
* jh/fsck-promisors:
gc: do not repack promisor packfiles
rev-list: support termination at promisor objects
sha1_file: support lazily fetching missing objects
introduce fetch-object: fetch one promisor object
index-pack: refactor writing of .keep files
fsck: support promisor objects as CLI argument
fsck: support referenced promisor objects
fsck: support refs pointing to promisor objects
fsck: introduce partialclone extension
extension.partialclone: introduce partial clone extension
Convert the declaration and definition of pretend_sha1_file to use
struct object_id and adjust all usages of this function. Rename it to
pretend_object_file.
Signed-off-by: Patryk Obara <patryk.obara@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Separating out the implementation of the handshake when starting a
long-running subprocess (for example, as is done for a clean/smudge
filter) was done in commit fa64a2fdbe ("sub-process: refactor
handshake to common function", 2017-07-26), but its documentation still
resides in gitattributes. Split out the documentation as well.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach upload-pack to negotiate object filtering over the protocol and
to send filter parameters to pack-objects. This is intended for partial
clone and fetch.
The idea to make upload-pack configurable using uploadpack.allowFilter
comes from Jonathan Tan's work in [1].
[1] https://public-inbox.org/git/f211093280b422c32cc1b7034130072f35c5ed51.1506714999.git.jonathantanmy@google.com/
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve the names of the identifiers in decorate.h, document them, and
add an example of how to use these functions.
The example is compiled and run as part of the test suite.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new mechanism to upgrade the wire protocol in place is proposed
and demonstrated that it works with the older versions of Git
without harming them.
* bw/protocol-v1:
Documentation: document Extra Parameters
ssh: introduce a 'simple' ssh variant
i5700: add interop test for protocol transition
http: tell server that the client understands v1
connect: tell server that the client understands v1
connect: teach client to recognize v1 server response
upload-pack, receive-pack: introduce protocol version 1
daemon: recognize hidden request arguments
protocol: introduce protocol extension mechanisms
pkt-line: add packet_write function
connect: in ref advertisement, shallows are last
Introduce new repository extension option:
`extensions.partialclone`
See the update to Documentation/technical/repository-version.txt
in this patch for more information.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We learned to talk to watchman to speed up "git status" and other
operations that need to see which paths have been modified.
* bp/fsmonitor:
fsmonitor: preserve utf8 filenames in fsmonitor-watchman log
fsmonitor: read entirety of watchman output
fsmonitor: MINGW support for watchman integration
fsmonitor: add a performance test
fsmonitor: add a sample integration script for Watchman
fsmonitor: add test cases for fsmonitor extension
split-index: disable the fsmonitor extension when running the split index test
fsmonitor: add a test tool to dump the index extension
update-index: add fsmonitor support to update-index
ls-files: Add support in ls-files to display the fsmonitor valid bit
fsmonitor: add documentation for the fsmonitor extension.
fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
update-index: add a new --force-write-index option
preload-index: add override to enable testing preload-index
bswap: add 64 bit endianness helper get_be64
Document the server support for Extra Parameters, additional information
that the client can send in its first message to the server during a
Git client-server interaction.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 4f665f2cf3 (string-list.h: move documentation from Documentation/api/
into header, 2017-09-26) the string-list API documentation was moved to
string-list.h. The argv-array API documentation may follow a similar
course in the future. Until then, prevent the broken link from making
it to the end-user documentation.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This includes the core.fsmonitor setting, the fsmonitor integration hook,
and the fsmonitor index extension.
Also add documentation for the new fsmonitor options to ls-files and
update-index.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This document describes what a transition to a new hash function for
Git would look like. Add it to Documentation/technical/ as the plan
of record so that future changes can be recorded as patches.
Also-by: Brandon Williams <bmwill@google.com>
Also-by: Jonathan Tan <jonathantanmy@google.com>
Also-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This mirrors commit 'bdfdaa497 ("strbuf.h: integrate api-strbuf.txt
documentation, 2015-01-16") which did the same for strbuf.h:
* API documentation uses /** */ to set it apart from other comments.
* Function names were stripped from the comments.
* Ordering of the header was adjusted to follow the one from the text
file.
* Edited some existing comments from string-list.h for consistency.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git gc" and friends when multiple worktrees are used off of a
single repository did not consider the index and per-worktree refs
of other worktrees as the root for reachability traversal, making
objects that are in use only in other worktrees to be subject to
garbage collection.
* nd/prune-in-worktree:
refs.c: reindent get_submodule_ref_store()
refs.c: remove fallback-to-main-store code get_submodule_ref_store()
rev-list: expose and document --single-worktree
revision.c: --reflog add HEAD reflog from all worktrees
files-backend: make reflog iterator go through per-worktree reflog
revision.c: --all adds HEAD from all worktrees
refs: remove dead for_each_*_submodule()
refs.c: move for_each_remote_ref_submodule() to submodule.c
revision.c: use refs_for_each*() instead of for_each_*_submodule()
refs: add refs_head_ref()
refs: move submodule slash stripping code to get_submodule_ref_store
refs.c: refactor get_submodule_ref_store(), share common free block
revision.c: --indexed-objects add objects from all worktrees
revision.c: refactor add_index_objects_to_pending()
refs.c: use is_dir_sep() in resolve_gitlink_ref()
revision.h: new flag in struct rev_info wrt. worktree-related refs
Message and doc updates.
* ma/up-to-date:
treewide: correct several "up-to-date" to "up to date"
Documentation/user-manual: update outdated example output
The function was deprecated in commit 89576613 ("treewide: deprecate
git_config_maybe_bool, use git_parse_maybe_bool", 2017-08-07) and has no
users.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These are used in revision.c. After the last patch they are replaced
with the refs_ version. Delete them.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Follow the Oxford style, which says to use "up-to-date" before the noun,
but "up to date" after it. Don't change plumbing (specifically
send-pack.c, but transport.c (git push) also has the same string).
This was produced by grepping for "up-to-date" and "up to date". It
turned out we only had to edit in one direction, removing the hyphens.
Fix a typo in Documentation/git-diff-index.txt while we're there.
Reported-by: Jeffrey Manian <jeffrey.manian@gmail.com>
Reported-by: STEVEN WHITE <stevencharleswhitevoices@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* ma/parse-maybe-bool:
parse_decoration_style: drop unused argument `var`
treewide: deprecate git_config_maybe_bool, use git_parse_maybe_bool
config: make git_{config,parse}_maybe_bool equivalent
config: introduce git_parse_maybe_bool_text
t5334: document that git push --signed=1 does not work
Doc/git-{push,send-pack}: correct --sign= to --signed=
All callers of fill_tree_descriptor() have been converted to object_id
already, so convert that function as well. As a nice side-effect we get
rid of NULL checks in tree-diff.c, as fill_tree_descriptor() already
does them for us.
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "tag.pager" configuration variable was useless for those who
actually create tag objects, as it interfered with the use of an
editor. A new mechanism has been introduced for commands to enable
pager depending on what operation is being carried out to fix this,
and then "git tag -l" is made to run pager by default.
* ma/pager-per-subcommand-action:
git.c: ignore pager.* when launching builtin as dashed external
tag: change default of `pager.tag` to "on"
tag: respect `pager.tag` in list-mode only
t7006: add tests for how git tag paginates
git.c: provide setup_auto_pager()
git.c: let builtins opt for handling `pager.foo` themselves
builtin.h: take over documentation from api-builtin.txt
The only difference between these is that the former takes an argument
`name` which it ignores completely. Still, the callers are quite careful
to provide reasonable values for it.
Once in-flight topics have landed, we should be able to remove
git_config_maybe_bool. In the meantime, document it as deprecated in the
technical documentation. While at it, document git_parse_maybe_bool.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Delete Documentation/technical/api-builtin.txt and move its content
into builtin.h. Format it as a comment. Remove a '+' which was needed
when the information was formatted for AsciiDoc. Similarly, change
"::" to ":".
Document SUPPORT_SUPER_PREFIX, thereby bringing the documentation up to
date with the available flags.
While at it, correct '3 more things to do' to '4 more things to do'.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the documentation for the sub-process API from a separate txt file
to its header file.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While at it, clarify the use of `key`, `keydata`, `entry_or_key` as well
as documenting the new data pointer for the compare function.
Rework the example.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git clean -d" used to clean directories that has ignored files,
even though the command should not lose ignored ones without "-x".
"git status --ignored" did not list ignored and untracked files
without "-uall". These have been corrected.
* sl/clean-d-ignored-fix:
clean: teach clean -d to preserve ignored paths
dir: expose cmp_name() and check_contains()
dir: hide untracked contents of untracked dirs
dir: recurse into untracked dirs for ignored files
t7061: status --ignored should search untracked dirs
t7300: clean -d should skip dirs with ignored files
Code from "conversion using external process" codepath has been
extracted to a separate sub-process.[ch] module.
* bp/sub-process-convert-filter:
convert: update subprocess_read_status() to not die on EOF
sub-process: move sub-process functions into separate files
convert: rename reusable sub-process functions
convert: update generic functions to only use generic data structures
convert: separate generic structures and variables from the filter specific ones
convert: split start_multi_file_filter() into two separate functions
pkt-line: annotate packet_writel with LAST_ARG_MUST_BE_NULL
convert: move packet_write_line() into pkt-line as packet_writel()
pkt-line: add packet_read_line_gently()
pkt-line: fix packet_read_line() to handle len < 0 errors
convert: remove erroneous tests for errno == EPIPE
The receive-pack program now makes sure that the push certificate
records the same set of push options used for pushing.
* jt/push-options-doc:
receive-pack: verify push options in cert
docs: correct receive.advertisePushOptions default
When we taught read_directory_recursive() to recurse into untracked
directories in search of ignored files given DIR_SHOW_IGNORED_TOO, that
had the side effect of teaching it to collect the untracked contents of
untracked directories. It doesn't always make sense to return these,
though (we do need them for `clean -d`), so we introduce a flag
(DIR_KEEP_UNTRACKED_CONTENTS) to control whether or not read_directory()
strips dir->entries of the untracked contents of untracked dirs.
We also introduce check_contains() to check if one dir_entry corresponds
to a path which contains the path corresponding to another dir_entry.
This also fixes known breakages in t7061, since status --ignored now
searches untracked directories for ignored files.
Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some platforms have ulong that is smaller than time_t, and our
historical use of ulong for timestamp would mean they cannot
represent some timestamp that the platform allows. Invent a
separate and dedicated timestamp_t (so that we can distingiuish
timestamps and a vanilla ulongs, which along is already a good
move), and then declare uintmax_t is the type to be used as the
timestamp_t.
* js/larger-timestamps:
archive-tar: fix a sparse 'constant too large' warning
use uintmax_t for timestamps
date.c: abort if the system time cannot handle one of our timestamps
timestamp_t: a new data type for timestamps
PRItime: introduce a new "printf format" for timestamps
parse_timestamp(): specify explicitly where we parse timestamps
t0006 & t5000: skip "far in the future" test when time_t is too limited
t0006 & t5000: prepare for 64-bit timestamps
ref-filter: avoid using `unsigned long` for catch-all data type
Move the sub-proces functions into sub-process.h/c. Add documentation
for the new module in Documentation/technical/api-sub-process.txt
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit f6a4e61 ("push: accept push options", 2016-07-14), send-pack
was taught to include push options both within the signed cert (if the
push is a signed push) and outside the signed cert; however,
receive-pack ignores push options within the cert, only handling push
options outside the cert.
Teach receive-pack, in the case that push options are provided for a
signed push, to verify that the push options both within the cert and
outside the cert are consistent.
This sets in stone the requirement that send-pack redundantly send its
push options in 2 places, but I think that this is better than the
alternatives. Sending push options only within the cert is
backwards-incompatible with existing Git servers (which read push
options only from outside the cert), and sending push options only
outside the cert means that the push options are not signed for.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's source code assumes that unsigned long is at least as precise as
time_t. Which is incorrect, and causes a lot of problems, in particular
where unsigned long is only 32-bit (notably on Windows, even in 64-bit
versions).
So let's just use a more appropriate data type instead. In preparation
for this, we introduce the new `timestamp_t` data type.
By necessity, this is a very, very large patch, as it has to replace all
timestamps' data type in one go.
As we will use a data type that is not necessarily identical to `time_t`,
we need to be very careful to use `time_t` whenever we interact with the
system functions, and `timestamp_t` everywhere else.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch-pack" was not prepared to accept ERR packet that the
upload-pack can send with a human-readable error message. It
showed the packet contents with ERR prefix, so there was no data
loss, but it was redundant to say "ERR" in an error message.
* jt/fetch-pack-error-reporting:
fetch-pack: show clearer error message upon ERR
Conversion from unsigned char [40] to struct object_id continues.
* bc/object-id:
Documentation: update and rename api-sha1-array.txt
Rename sha1_array to oid_array
Convert sha1_array_for_each_unique and for_each_abbrev to object_id
Convert sha1_array_lookup to take struct object_id
Convert remaining callers of sha1_array_lookup to object_id
Make sha1_array_append take a struct object_id *
sha1-array: convert internal storage for struct sha1_array to object_id
builtin/pull: convert to struct object_id
submodule: convert check_for_new_submodule_commits to object_id
sha1_name: convert disambiguate_hint_fn to take object_id
sha1_name: convert struct disambiguate_state to object_id
test-sha1-array: convert most code to struct object_id
parse-options-cb: convert sha1_array_append caller to struct object_id
fsck: convert init_skiplist to struct object_id
builtin/receive-pack: convert portions to struct object_id
builtin/pull: convert portions to struct object_id
builtin/diff: convert to struct object_id
Convert GIT_SHA1_RAWSZ used for allocation to GIT_MAX_RAWSZ
Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ
Define new hash-size constants for allocating memory
Currently, fetch-pack prints a confusing error message ("expected
ACK/NAK") when the server it's communicating with sends a pkt-line
starting with "ERR". Replace it with a less confusing error message.
Also update the documentation describing the fetch-pack/upload-pack
protocol (pack-protocol.txt) to indicate that "ERR" can be sent in the
place of "ACK" or "NAK". In practice, this has been done for quite some
time by other Git implementations (e.g. JGit sends "want $id not valid")
and by Git itself (since commit bdb31ea: "upload-pack: report "not our
ref" to client", 2017-02-23) whenever a "want" line references an object
that it does not have. (This is uncommon, but can happen if a repository
is garbage-collected during a negotiation.)
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the structure and functions have changed names, update the code
examples and the documentation. Rename the file to match the new name
of the API.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The name-hash used for detecting paths that are different only in
cases (which matter on case insensitive filesystems) has been
optimized to take advantage of multi-threading when it makes sense.
* jh/memihash-opt:
name-hash: add test-lazy-init-name-hash to .gitignore
name-hash: add perf test for lazy_init_name_hash
name-hash: add test-lazy-init-name-hash
name-hash: perf improvement for lazy_init_name_hash
hashmap: document memihash_cont, hashmap_disallow_rehash api
hashmap: add disallow_rehash setting
hashmap: allow memihash computation to be continued
name-hash: specify initial size for istate.dir_hash table
Document memihash_cont() and hashmap_disallow_rehash()
in Documentation/technical/api-hashmap.txt.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The gitattributes machinery is being taught to work better in a
multi-threaded environment.
* bw/attr: (27 commits)
attr: reformat git_attr_set_direction() function
attr: push the bare repo check into read_attr()
attr: store attribute stack in attr_check structure
attr: tighten const correctness with git_attr and match_attr
attr: remove maybe-real, maybe-macro from git_attr
attr: eliminate global check_all_attr array
attr: use hashmap for attribute dictionary
attr: change validity check for attribute names to use positive logic
attr: pass struct attr_check to collect_some_attrs
attr: retire git_check_attrs() API
attr: convert git_check_attrs() callers to use the new API
attr: convert git_all_attrs() to use "struct attr_check"
attr: (re)introduce git_check_attr() and struct attr_check
attr: rename function and struct related to checking attributes
attr.c: outline the future plans by heavily commenting
Documentation: fix a typo
attr.c: add push_stack() helper
attr: support quoting pathname patterns in C style
attr.c: plug small leak in parse_attr_line()
attr.c: tighten constness around "git_attr" structure
...
"git describe" and "git name-rev" have been taught to take more
than one refname patterns to restrict the set of refs to base their
naming output on, and also learned to take negative patterns to
name refs not to be used for naming via their "--exclude" option.
* jk/describe-omit-some-refs:
describe: teach describe negative pattern matches
describe: teach --match to accept multiple patterns
name-rev: add support to exclude refs by pattern match
name-rev: extend --refs to accept multiple patterns
doc: add documentation for OPT_STRING_LIST
Since nobody uses the old API, make it file-scope static, and update
the documentation to describe the new API.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A crashing bug introduced in v2.11 timeframe has been found (it is
triggerable only in fast-import) and fixed.
* jk/clear-delta-base-cache-fix:
clear_delta_base_cache(): don't modify hashmap while iterating
Commit c8ba163916 ("parse-options: add OPT_STRING_LIST helper",
2011-06-09) added the OPT_STRING_LIST as a way to accumulate a repeated
list of strings. However, this was not documented in the
api-parse-options documentation. Add documentation now so that future
developers may learn of its existence.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When looking for documentation for a specific function, you may be tempted
to run
git -C Documentation grep index_name_pos
only to find the file technical/api-in-core-index.txt, which doesn't
help for understanding the given function. It would be better to not find
these functions in the documentation, such that people directly dive into
the code instead.
In the previous patches we have documented
* index_name_pos()
* remove_index_entry_at()
* add_[file_]to_index()
in cache.h
We already have documentation for:
* add_index_entry()
* read_index()
Which leaves us with a TODO for:
* cache -> the_index macros
* refresh_index()
* discard_index()
* ie_match_stat() and ie_modified(); how they are different and when to
use which.
* write_index() that was renamed to write_locked_index
* cache_tree_invalidate_path()
* cache_tree_update()
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Thu, Jan 19, 2017 at 03:03:46PM +0100, Ulrich Spörlein wrote:
> > I suspect the patch below may fix things for you. It works around it by
> > walking over the lru list (either is fine, as they both contain all
> > entries, and since we're clearing everything, we don't care about the
> > order).
>
> Confirmed. With the patch applied, I can import the whole 55G in one go
> without any crashes or aborts. Thanks much!
Thanks. Here it is rolled up with a commit message.
-- >8 --
Subject: clear_delta_base_cache(): don't modify hashmap while iterating
Removing entries while iterating causes fast-import to
access an already-freed `struct packed_git`, leading to
various confusing errors.
What happens is that clear_delta_base_cache() drops the
whole contents of the cache by iterating over the hashmap,
calling release_delta_base_cache() on each entry. That
function removes the item from the hashmap. The hashmap code
may then shrink the table, but the hashmap_iter struct
retains an offset from the old table.
As a result, the next call to hashmap_iter_next() may claim
that the iteration is done, even though some items haven't
been visited.
The only caller of clear_delta_base_cache() is fast-import,
which wants to clear the cache because it is discarding the
packed_git struct for its temporary pack. So by failing to
remove all of the entries, we still have references to the
freed packed_git.
To make things even more confusing, this doesn't seem to
trigger with the test suite, because it depends on
complexities like the size of the hash table, which entries
got cleared, whether we try to access them before they're
evicted from the cache, etc.
So I've been able to identify the problem with large
imports like freebsd's svn import, or a fast-export of
linux.git. But nothing that would be reasonable to run as
part of the normal test suite.
We can fix this easily by iterating over the lru linked list
instead of the hashmap. They both contain the same entries,
and we can use the "safe" variant of the list iterator,
which exists for exactly this case.
Let's also add a warning to the hashmap API documentation to
reduce the chances of getting bit by this again.
Reported-by: Ulrich Spörlein <uqs@freebsd.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up in the pathspec API.
* bw/pathspec-cleanup:
pathspec: rename prefix_pathspec to init_pathspec_item
pathspec: small readability changes
pathspec: create strip submodule slash helpers
pathspec: create parse_element_magic helper
pathspec: create parse_long_magic function
pathspec: create parse_short_magic function
pathspec: factor global magic into its own function
pathspec: simpler logic to prefix original pathspec elements
pathspec: always show mnemonic and name in unsupported_magic
pathspec: remove unused variable from unsupported_magic
pathspec: copy and free owned memory
pathspec: remove the deprecated get_pathspec function
ls-tree: convert show_recursive to use the pathspec struct interface
dir: convert fill_directory to use the pathspec struct interface
dir: remove struct path_simplify
mv: remove use of deprecated 'get_pathspec()'
Now that all callers of the old 'get_pathspec' interface have been
migrated to use the new pathspec struct interface it can be removed
from the codebase.
Since there are no more users of the '_raw' field in the pathspec struct
it can also be removed. This patch also removes the old functionality
of modifying the const char **argv array that was passed into
parse_pathspec. Instead the constructed 'match' string (which is a
pathspec element with the prefix prepended) is only stored in its
corresponding pathspec_item entry.
Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is also possible to pass in any treeish name to lookup a submodule
config. Make it clear by naming the variables accordingly. Looking up
a submodule config by tree hash will come in handy in a later patch.
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing "git fetch --depth=<n>" option was hard to use
correctly when making the history of an existing shallow clone
deeper. A new option, "--deepen=<n>", has been added to make this
easier to use. "git clone" also learned "--shallow-since=<date>"
and "--shallow-exclude=<tag>" options to make it easier to specify
"I am interested only in the recent N months worth of history" and
"Give me only the history since that version".
* nd/shallow-deepen: (27 commits)
fetch, upload-pack: --deepen=N extends shallow boundary by N commits
upload-pack: add get_reachable_list()
upload-pack: split check_unreachable() in two, prep for get_reachable_list()
t5500, t5539: tests for shallow depth excluding a ref
clone: define shallow clone boundary with --shallow-exclude
fetch: define shallow boundary with --shallow-exclude
upload-pack: support define shallow boundary by excluding revisions
refs: add expand_ref()
t5500, t5539: tests for shallow depth since a specific date
clone: define shallow clone boundary based on time with --shallow-since
fetch: define shallow boundary with --shallow-since
upload-pack: add deepen-since to cut shallow repos based on time
shallow.c: implement a generic shallow boundary finder based on rev-list
fetch-pack: use a separate flag for fetch in deepening mode
fetch-pack.c: mark strings for translating
fetch-pack: use a common function for verbose printing
fetch-pack: use skip_prefix() instead of starts_with()
upload-pack: move rev-list code out of check_non_tip()
upload-pack: make check_non_tip() clean things up on error
upload-pack: tighten number parsing at "deepen" lines
...
The callbacks for iterating a sha1_array must have a void
return. This is unlike our usual for_each semantics, where
a callback may interrupt iteration and have its value
propagated. Let's switch it to the usual form, which will
enable its use in more places (e.g., where we are replacing
an existing iteration with a different data structure).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Correct an age-old calco (is that a typo-like word for calc)
in the documentation.
* ls/packet-line-protocol-doc-fix:
pack-protocol: fix maximum pkt-line size
According to LARGE_PACKET_MAX in pkt-line.h the maximal length of a
pkt-line packet is 65520 bytes. The pkt-line header takes 4 bytes and
therefore the pkt-line data component must not exceed 65516 bytes.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The API documentation for hashmap was unclear if hashmap_entry
can be safely discarded without any other consideration. State
that it is safe to do so.
* jc/hashmap-doc-init:
hashmap: clarify that hashmap_entry can safely be discarded
The API documentation said that the hashmap_entry structure to be
embedded in the caller's structure is to be treated as opaque, which
left the reader wondering if it can safely be discarded when it no
longer is necessary. If the hashmap_entry structure had references
to external resources such as allocated memory or an open file
descriptor, merely free(3)ing the containing structure (when the
caller's structure is on the heap) or letting it go out of scope
(when it is on the stack) would end up leaking the external
resource.
Document that there is no need for hashmap_entry_clear() that
corresponds to hashmap_entry_init() to give the API users a little
bit of peace of mind.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the transport protocol we use NAK to signal the non existence of a
common base, so fix the documentation. This helps readers of the document,
as they don't have to wonder about the difference between NAK and NACK.
As NACK is used in git archive and upload-archive, this is easy to get
wrong.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.
Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use different types of signature formats in different places.
Set up the infrastructure and overview to describe them systematically
in our technical documentation.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In git-fetch, --depth argument is always relative with the latest
remote refs. This makes it a bit difficult to cover this use case,
where the user wants to make the shallow history, say 3 levels
deeper. It would work if remote refs have not moved yet, but nobody
can guarantee that, especially when that use case is performed a
couple months after the last clone or "git fetch --depth". Also,
modifying shallow boundary using --depth does not work well with
clones created by --since or --not.
This patch fixes that. A new argument --deepen=<N> will add <N> more (*)
parent commits to the current history regardless of where remote refs
are.
Have/Want negotiation is still respected. So if remote refs move, the
server will send two chunks: one between "have" and "want" and another
to extend shallow history. In theory, the client could send no "want"s
in order to get the second chunk only. But the protocol does not allow
that. Either you send no want lines, which means ls-remote; or you
have to send at least one want line that carries deep-relative to the
server..
The main work was done by Dongcan Jiang. I fixed it up here and there.
And of course all the bugs belong to me.
(*) We could even support --deepen=<N> where <N> is negative. In that
case we can cut some history from the shallow clone. This operation
(and --depth=<shorter depth>) does not require interaction with remote
side (and more complicated to implement as a result).
Helped-by: Duy Nguyen <pclouds@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This should allow the user to say "create a shallow clone of this branch
after version <some-tag>".
Short refs are accepted and expanded at the server side with expand_ref()
because we cannot expand (unknown) refs from the client side.
Like deepen-since, deepen-not cannot be used with deepen. But deepen-not
can be mixed with deepen-since. The result is exactly how you do the
command "git rev-list --since=... --not ref".
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This should allow the user to say "create a shallow clone containing the
work from last year" (once the client side is fixed up, of course).
In theory deepen-since and deepen (aka --depth) can be used together to
draw the shallow boundary (whether it's intersection or union is up to
discussion, but if rev-list is used, it's likely intersection). However,
because deepen goes with a custom commit walker, we can't mix the two
yet.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git commit" learned to pay attention to "commit.verbose"
configuration variable and act as if "--verbose" option was
given from the command line.
* pb/commit-verbose-config:
commit: add a commit.verbose config variable
t7507-commit-verbose: improve test coverage by testing number of diffs
parse-options.c: make OPTION_COUNTUP respect "unspecified" values
t/t7507: improve test coverage
t0040-parse-options: improve test coverage
test-parse-options: print quiet as integer
t0040-test-parse-options.sh: fix style issues
There are a handful of incorrect "linkgit:<page>[<section>]"
instances in our documentation set.
* Some have an extra colon after "linkgit:"; fix them by removing
the extra colon;
* Some refer to a page outside the Git suite, namely curl(1); fix
them by using the `curl(1)` that already appears on the same page
for the same purpose of referring the readers to its manual page.
* Some spell the name of the page incorrectly, e.g. "rev-list" when
they mean "git-rev-list"; fix them.
* Some list the manual section incorrectly; fix them to make sure
they match what is at the top of the target of the link.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many instances of duplicate words (e.g. "the the path") and
a few typoes are fixed, originally in multiple patches.
wildmatch: fix duplicate words of "the"
t: fix duplicate words of "output"
transport-helper: fix duplicate words of "read"
Git.pm: fix duplicate words of "return"
path: fix duplicate words of "look"
pack-protocol.txt: fix duplicate words of "the"
precompose-utf8: fix typo of "sequences"
split-index: fix typo
worktree.c: fix typo
remote-ext: fix typo
utf8: fix duplicate words of "the"
git-cvsserver: fix duplicate words
Signed-off-by: Li Peng <lip@dtdream.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
OPT_COUNTUP() merely increments the counter upon --option, and resets it
to 0 upon --no-option, which means that there is no "unspecified" value
with which a client can initialize the counter to determine whether or
not --[no]-option was seen at all.
Make OPT_COUNTUP() treat any negative number as an "unspecified" value
to address this shortcoming. In particular, if a client initializes the
counter to -1, then if it is still -1 after parse_options(), then
neither --option nor --no-option was seen; if it is 0, then --no-option
was seen last, and if it is 1 or greater, than --option was seen last.
This change does not affect the behavior of existing clients because
they all use the initial value of 0 (or more).
Note that builtin/clean.c initializes the variable used with
OPT__FORCE (which uses OPT_COUNTUP()) to a negative value, but it is set
to either 0 or 1 by reading the configuration before the code calls
parse_options(), i.e. as far as parse_options() is concerned, the
initial value of the variable is not negative.
To test this behavior, in test-parse-options.c, "verbose" is set to
"unspecified" while quiet is set to 0 which will test the new behavior
with all sets of values.
Helped-by: Jeff King <peff@peff.net>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The repository set-up sequence has been streamlined (the biggest
change is that there is no longer git_config_early()), so that we
do not attempt to look into refs/* when we know we do not have a
Git repository.
* jk/check-repository-format:
verify_repository_format: mark messages for translation
setup: drop repository_format_version global
setup: unify repository version callbacks
init: use setup.c's repo version verification
setup: refactor repo format reading and verification
config: drop git_config_early
check_repository_format_gently: stop using git_config_early
lazily load core.sharedrepository
wrap shared_repository global in get/set accessors
setup: document check_repository_format()