A crashing bug introduced in v2.11 timeframe has been found (it is
triggerable only in fast-import) and fixed.
* jk/clear-delta-base-cache-fix:
clear_delta_base_cache(): don't modify hashmap while iterating
When looking for documentation for a specific function, you may be tempted
to run
git -C Documentation grep index_name_pos
only to find the file technical/api-in-core-index.txt, which doesn't
help for understanding the given function. It would be better to not find
these functions in the documentation, such that people directly dive into
the code instead.
In the previous patches we have documented
* index_name_pos()
* remove_index_entry_at()
* add_[file_]to_index()
in cache.h
We already have documentation for:
* add_index_entry()
* read_index()
Which leaves us with a TODO for:
* cache -> the_index macros
* refresh_index()
* discard_index()
* ie_match_stat() and ie_modified(); how they are different and when to
use which.
* write_index() that was renamed to write_locked_index
* cache_tree_invalidate_path()
* cache_tree_update()
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Thu, Jan 19, 2017 at 03:03:46PM +0100, Ulrich Spörlein wrote:
> > I suspect the patch below may fix things for you. It works around it by
> > walking over the lru list (either is fine, as they both contain all
> > entries, and since we're clearing everything, we don't care about the
> > order).
>
> Confirmed. With the patch applied, I can import the whole 55G in one go
> without any crashes or aborts. Thanks much!
Thanks. Here it is rolled up with a commit message.
-- >8 --
Subject: clear_delta_base_cache(): don't modify hashmap while iterating
Removing entries while iterating causes fast-import to
access an already-freed `struct packed_git`, leading to
various confusing errors.
What happens is that clear_delta_base_cache() drops the
whole contents of the cache by iterating over the hashmap,
calling release_delta_base_cache() on each entry. That
function removes the item from the hashmap. The hashmap code
may then shrink the table, but the hashmap_iter struct
retains an offset from the old table.
As a result, the next call to hashmap_iter_next() may claim
that the iteration is done, even though some items haven't
been visited.
The only caller of clear_delta_base_cache() is fast-import,
which wants to clear the cache because it is discarding the
packed_git struct for its temporary pack. So by failing to
remove all of the entries, we still have references to the
freed packed_git.
To make things even more confusing, this doesn't seem to
trigger with the test suite, because it depends on
complexities like the size of the hash table, which entries
got cleared, whether we try to access them before they're
evicted from the cache, etc.
So I've been able to identify the problem with large
imports like freebsd's svn import, or a fast-export of
linux.git. But nothing that would be reasonable to run as
part of the normal test suite.
We can fix this easily by iterating over the lru linked list
instead of the hashmap. They both contain the same entries,
and we can use the "safe" variant of the list iterator,
which exists for exactly this case.
Let's also add a warning to the hashmap API documentation to
reduce the chances of getting bit by this again.
Reported-by: Ulrich Spörlein <uqs@freebsd.org>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up in the pathspec API.
* bw/pathspec-cleanup:
pathspec: rename prefix_pathspec to init_pathspec_item
pathspec: small readability changes
pathspec: create strip submodule slash helpers
pathspec: create parse_element_magic helper
pathspec: create parse_long_magic function
pathspec: create parse_short_magic function
pathspec: factor global magic into its own function
pathspec: simpler logic to prefix original pathspec elements
pathspec: always show mnemonic and name in unsupported_magic
pathspec: remove unused variable from unsupported_magic
pathspec: copy and free owned memory
pathspec: remove the deprecated get_pathspec function
ls-tree: convert show_recursive to use the pathspec struct interface
dir: convert fill_directory to use the pathspec struct interface
dir: remove struct path_simplify
mv: remove use of deprecated 'get_pathspec()'
Now that all callers of the old 'get_pathspec' interface have been
migrated to use the new pathspec struct interface it can be removed
from the codebase.
Since there are no more users of the '_raw' field in the pathspec struct
it can also be removed. This patch also removes the old functionality
of modifying the const char **argv array that was passed into
parse_pathspec. Instead the constructed 'match' string (which is a
pathspec element with the prefix prepended) is only stored in its
corresponding pathspec_item entry.
Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is also possible to pass in any treeish name to lookup a submodule
config. Make it clear by naming the variables accordingly. Looking up
a submodule config by tree hash will come in handy in a later patch.
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing "git fetch --depth=<n>" option was hard to use
correctly when making the history of an existing shallow clone
deeper. A new option, "--deepen=<n>", has been added to make this
easier to use. "git clone" also learned "--shallow-since=<date>"
and "--shallow-exclude=<tag>" options to make it easier to specify
"I am interested only in the recent N months worth of history" and
"Give me only the history since that version".
* nd/shallow-deepen: (27 commits)
fetch, upload-pack: --deepen=N extends shallow boundary by N commits
upload-pack: add get_reachable_list()
upload-pack: split check_unreachable() in two, prep for get_reachable_list()
t5500, t5539: tests for shallow depth excluding a ref
clone: define shallow clone boundary with --shallow-exclude
fetch: define shallow boundary with --shallow-exclude
upload-pack: support define shallow boundary by excluding revisions
refs: add expand_ref()
t5500, t5539: tests for shallow depth since a specific date
clone: define shallow clone boundary based on time with --shallow-since
fetch: define shallow boundary with --shallow-since
upload-pack: add deepen-since to cut shallow repos based on time
shallow.c: implement a generic shallow boundary finder based on rev-list
fetch-pack: use a separate flag for fetch in deepening mode
fetch-pack.c: mark strings for translating
fetch-pack: use a common function for verbose printing
fetch-pack: use skip_prefix() instead of starts_with()
upload-pack: move rev-list code out of check_non_tip()
upload-pack: make check_non_tip() clean things up on error
upload-pack: tighten number parsing at "deepen" lines
...
The callbacks for iterating a sha1_array must have a void
return. This is unlike our usual for_each semantics, where
a callback may interrupt iteration and have its value
propagated. Let's switch it to the usual form, which will
enable its use in more places (e.g., where we are replacing
an existing iteration with a different data structure).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Correct an age-old calco (is that a typo-like word for calc)
in the documentation.
* ls/packet-line-protocol-doc-fix:
pack-protocol: fix maximum pkt-line size
According to LARGE_PACKET_MAX in pkt-line.h the maximal length of a
pkt-line packet is 65520 bytes. The pkt-line header takes 4 bytes and
therefore the pkt-line data component must not exceed 65516 bytes.
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The API documentation for hashmap was unclear if hashmap_entry
can be safely discarded without any other consideration. State
that it is safe to do so.
* jc/hashmap-doc-init:
hashmap: clarify that hashmap_entry can safely be discarded
The API documentation said that the hashmap_entry structure to be
embedded in the caller's structure is to be treated as opaque, which
left the reader wondering if it can safely be discarded when it no
longer is necessary. If the hashmap_entry structure had references
to external resources such as allocated memory or an open file
descriptor, merely free(3)ing the containing structure (when the
caller's structure is on the heap) or letting it go out of scope
(when it is on the stack) would end up leaking the external
resource.
Document that there is no need for hashmap_entry_clear() that
corresponds to hashmap_entry_init() to give the API users a little
bit of peace of mind.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the transport protocol we use NAK to signal the non existence of a
common base, so fix the documentation. This helps readers of the document,
as they don't have to wonder about the difference between NAK and NACK.
As NACK is used in git archive and upload-archive, this is easy to get
wrong.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.
Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use different types of signature formats in different places.
Set up the infrastructure and overview to describe them systematically
in our technical documentation.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In git-fetch, --depth argument is always relative with the latest
remote refs. This makes it a bit difficult to cover this use case,
where the user wants to make the shallow history, say 3 levels
deeper. It would work if remote refs have not moved yet, but nobody
can guarantee that, especially when that use case is performed a
couple months after the last clone or "git fetch --depth". Also,
modifying shallow boundary using --depth does not work well with
clones created by --since or --not.
This patch fixes that. A new argument --deepen=<N> will add <N> more (*)
parent commits to the current history regardless of where remote refs
are.
Have/Want negotiation is still respected. So if remote refs move, the
server will send two chunks: one between "have" and "want" and another
to extend shallow history. In theory, the client could send no "want"s
in order to get the second chunk only. But the protocol does not allow
that. Either you send no want lines, which means ls-remote; or you
have to send at least one want line that carries deep-relative to the
server..
The main work was done by Dongcan Jiang. I fixed it up here and there.
And of course all the bugs belong to me.
(*) We could even support --deepen=<N> where <N> is negative. In that
case we can cut some history from the shallow clone. This operation
(and --depth=<shorter depth>) does not require interaction with remote
side (and more complicated to implement as a result).
Helped-by: Duy Nguyen <pclouds@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This should allow the user to say "create a shallow clone of this branch
after version <some-tag>".
Short refs are accepted and expanded at the server side with expand_ref()
because we cannot expand (unknown) refs from the client side.
Like deepen-since, deepen-not cannot be used with deepen. But deepen-not
can be mixed with deepen-since. The result is exactly how you do the
command "git rev-list --since=... --not ref".
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This should allow the user to say "create a shallow clone containing the
work from last year" (once the client side is fixed up, of course).
In theory deepen-since and deepen (aka --depth) can be used together to
draw the shallow boundary (whether it's intersection or union is up to
discussion, but if rev-list is used, it's likely intersection). However,
because deepen goes with a custom commit walker, we can't mix the two
yet.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git commit" learned to pay attention to "commit.verbose"
configuration variable and act as if "--verbose" option was
given from the command line.
* pb/commit-verbose-config:
commit: add a commit.verbose config variable
t7507-commit-verbose: improve test coverage by testing number of diffs
parse-options.c: make OPTION_COUNTUP respect "unspecified" values
t/t7507: improve test coverage
t0040-parse-options: improve test coverage
test-parse-options: print quiet as integer
t0040-test-parse-options.sh: fix style issues
There are a handful of incorrect "linkgit:<page>[<section>]"
instances in our documentation set.
* Some have an extra colon after "linkgit:"; fix them by removing
the extra colon;
* Some refer to a page outside the Git suite, namely curl(1); fix
them by using the `curl(1)` that already appears on the same page
for the same purpose of referring the readers to its manual page.
* Some spell the name of the page incorrectly, e.g. "rev-list" when
they mean "git-rev-list"; fix them.
* Some list the manual section incorrectly; fix them to make sure
they match what is at the top of the target of the link.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many instances of duplicate words (e.g. "the the path") and
a few typoes are fixed, originally in multiple patches.
wildmatch: fix duplicate words of "the"
t: fix duplicate words of "output"
transport-helper: fix duplicate words of "read"
Git.pm: fix duplicate words of "return"
path: fix duplicate words of "look"
pack-protocol.txt: fix duplicate words of "the"
precompose-utf8: fix typo of "sequences"
split-index: fix typo
worktree.c: fix typo
remote-ext: fix typo
utf8: fix duplicate words of "the"
git-cvsserver: fix duplicate words
Signed-off-by: Li Peng <lip@dtdream.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
OPT_COUNTUP() merely increments the counter upon --option, and resets it
to 0 upon --no-option, which means that there is no "unspecified" value
with which a client can initialize the counter to determine whether or
not --[no]-option was seen at all.
Make OPT_COUNTUP() treat any negative number as an "unspecified" value
to address this shortcoming. In particular, if a client initializes the
counter to -1, then if it is still -1 after parse_options(), then
neither --option nor --no-option was seen; if it is 0, then --no-option
was seen last, and if it is 1 or greater, than --option was seen last.
This change does not affect the behavior of existing clients because
they all use the initial value of 0 (or more).
Note that builtin/clean.c initializes the variable used with
OPT__FORCE (which uses OPT_COUNTUP()) to a negative value, but it is set
to either 0 or 1 by reading the configuration before the code calls
parse_options(), i.e. as far as parse_options() is concerned, the
initial value of the variable is not negative.
To test this behavior, in test-parse-options.c, "verbose" is set to
"unspecified" while quiet is set to 0 which will test the new behavior
with all sets of values.
Helped-by: Jeff King <peff@peff.net>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The repository set-up sequence has been streamlined (the biggest
change is that there is no longer git_config_early()), so that we
do not attempt to look into refs/* when we know we do not have a
Git repository.
* jk/check-repository-format:
verify_repository_format: mark messages for translation
setup: drop repository_format_version global
setup: unify repository version callbacks
init: use setup.c's repo version verification
setup: refactor repo format reading and verification
config: drop git_config_early
check_repository_format_gently: stop using git_config_early
lazily load core.sharedrepository
wrap shared_repository global in get/set accessors
setup: document check_repository_format()
The repository set-up sequence has been streamlined (the biggest
change is that there is no longer git_config_early()), so that we
do not attempt to look into refs/* when we know we do not have a
Git repository.
* jk/check-repository-format:
verify_repository_format: mark messages for translation
setup: drop repository_format_version global
setup: unify repository version callbacks
init: use setup.c's repo version verification
setup: refactor repo format reading and verification
config: drop git_config_early
check_repository_format_gently: stop using git_config_early
lazily load core.sharedrepository
wrap shared_repository global in get/set accessors
setup: document check_repository_format()
The correct api is trace_printf_key(), not trace_print_key().
Also do not throw a random string at printf(3)-like function;
instead, feed it as a parameter that is fed to a "%s" conversion
specifier.
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
OPT_CMDMODE mechanism was introduced in the release of 1.8.5 to actively
notice when multiple "operation mode" options that specify mutually
incompatible operation modes are given.
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are no more callers, and it's a rather confusing
interface. This could just be folded into
git_config_with_options(), but for the sake of readability,
we'll leave it as a separate (static) helper function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update various codepaths to avoid manually-counted malloc().
* jk/tighten-alloc: (22 commits)
ewah: convert to REALLOC_ARRAY, etc
convert ewah/bitmap code to use xmalloc
diff_populate_gitlink: use a strbuf
transport_anonymize_url: use xstrfmt
git-compat-util: drop mempcpy compat code
sequencer: simplify memory allocation of get_message
test-path-utils: fix normalize_path_copy output buffer size
fetch-pack: simplify add_sought_entry
fast-import: simplify allocation in start_packfile
write_untracked_extension: use FLEX_ALLOC helper
prepare_{git,shell}_cmd: use argv_array
use st_add and st_mult for allocation size computation
convert trivial cases to FLEX_ARRAY macros
use xmallocz to avoid size arithmetic
convert trivial cases to ALLOC_ARRAY
convert manual allocations to argv_array
argv-array: add detach function
add helpers for allocating flex-array structs
harden REALLOC_ARRAY and xcalloc against size_t overflow
tree-diff: catch integer overflow in combine_diff_path allocation
...
The usual pattern for an argv array is to initialize it,
push in some strings, and then clear it when done. Very
occasionally, though, we must do other exotic things with
the memory, like freeing the list but keeping the strings.
Let's provide a detach function so that callers can make use
of our API to build up the array, and then take ownership of
it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
CURLAUTH_ANY does not work with proxies which answer unauthenticated requests
with a 307 redirect to an error page instead of a 407 listing supported
authentication methods. Therefore, allow the authentication method to be set
using the environment variable GIT_HTTP_PROXY_AUTHMETHOD or configuration
variables http.proxyAuthmethod and remote.<name>.proxyAuthmethod (in analogy
to http.proxy and remote.<name>.proxy).
The following values are supported:
* anyauth (default)
* basic
* digest
* negotiate
* ntlm
Signed-off-by: Knut Franke <k.franke@science-computing.de>
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git daemon" uses "run_command()" without "finish_command()", so it
needs to release resources itself, which it forgot to do.
* rs/daemon-plug-child-leak:
daemon: plug memory leak
run-command: factor out child_process_clear()
Prepare for Git on-disk repository representation to undergo
backward incompatible changes by introducing a new repository
format version "1", with an extension mechanism.
* jk/repository-extension:
introduce "preciousObjects" repository extension
introduce "extensions" form of core.repositoryformatversion
"git daemon" uses "run_command()" without "finish_command()", so it
needs to release resources itself, which it forgot to do.
* rs/daemon-plug-child-leak:
daemon: plug memory leak
run-command: factor out child_process_clear()
Avoid duplication by moving the code to release allocated memory for
arguments and environment to its own function, child_process_clear().
Export it to provide a counterpart to child_process_init().
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prepare for Git on-disk repository representation to undergo
backward incompatible changes by introducing a new repository
format version "1", with an extension mechanism.
* jk/repository-extension:
introduce "preciousObjects" repository extension
introduce "extensions" form of core.repositoryformatversion