There are a few things that need to move around a little before
making a big refactoring in the topo-order logic:
1. We need access to record_author_date() and
compare_commits_by_author_date() in revision.c. These are used
currently by sort_in_topological_order() in commit.c.
2. Moving these methods to commit.h requires adding an author_date_slab
declaration to commit.h. Consumers will need their own implementation.
3. The add_parents_to_list() method in revision.c performs logic
around the UNINTERESTING flag and other special cases depending
on the struct rev_info. Allow this method to ignore a NULL 'list'
parameter, as we will not be populating the list for our walk.
Also rename the method to the slightly more generic name
process_parents() to make clear that this method does more than
add to a list (and no list is required anymore).
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running 'git rev-list --topo-order' and its kin, the topo_order
setting in struct rev_info implies the limited setting. This means
that the following things happen during prepare_revision_walk():
* revs->limited implies we run limit_list() to walk the entire
reachable set. There are some short-cuts here, such as if we
perform a range query like 'git rev-list COMPARE..HEAD' and we
can stop limit_list() when all queued commits are uninteresting.
* revs->topo_order implies we run sort_in_topological_order(). See
the implementation of that method in commit.c. It implies that
the full set of commits to order is in the given commit_list.
These two methods imply that a 'git rev-list --topo-order HEAD'
command must walk the entire reachable set of commits _twice_ before
returning a single result.
If we have a commit-graph file with generation numbers computed, then
there is a better way. This patch introduces some necessary logic
redirection when we are in this situation.
In v2.18.0, the commit-graph file contains zero-valued bytes in the
positions where the generation number is stored in v2.19.0 and later.
Thus, we use generation_numbers_enabled() to check if the commit-graph
is available and has non-zero generation numbers.
When setting revs->limited only because revs->topo_order is true,
only do so if generation numbers are not available. There is no
reason to use the new logic as it will behave similarly when all
generation numbers are INFINITY or ZERO.
In prepare_revision_walk(), if we have revs->topo_order but not
revs->limited, then we trigger the new logic. It breaks the logic
into three pieces, to fit with the existing framework:
1. init_topo_walk() fills a new struct topo_walk_info in the rev_info
struct. We use the presence of this struct as a signal to use the
new methods during our walk. In this patch, this method simply
calls limit_list() and sort_in_topological_order(). In the future,
this method will set up a new data structure to perform that logic
in-line.
2. next_topo_commit() provides get_revision_1() with the next topo-
ordered commit in the list. Currently, this simply pops the commit
from revs->commits.
3. expand_topo_walk() provides get_revision_1() with a way to signal
walking beyond the latest commit. Currently, this calls
add_parents_to_list() exactly like the old logic.
While this commit presents method redirection for performing the
exact same logic as before, it allows the next commit to focus only
on the new logic.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The rev-list command is critical to Git's functionality. Ensure it
works in the three commit-graph environments constructed in
t6600-test-reach.sh. Here are a few important types of rev-list
operations:
* Basic: git rev-list --topo-order HEAD
* Range: git rev-list --topo-order compare..HEAD
* Ancestry: git rev-list --topo-order --ancestry-path compare..HEAD
* Symmetric Difference: git rev-list --topo-order compare...HEAD
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'test_three_modes' method assumes we are using the 'test-tool
reach' command for our test. However, we may want to use the data
shape of our commit graph and the three modes (no commit-graph,
full commit-graph, partial commit-graph) for other git commands.
Split test_three_modes to be a simple translation on a more general
run_three_modes method that executes the given command and tests
the actual output to the expected output.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When consuming a priority queue, it can be convenient to inspect
the next object that will be dequeued without actually dequeueing
it. Our existing library did not have such a 'peek' operation, so
add it as prio_queue_peek().
Add a reference-level comparison in t/helper/test-prio-queue.c
so this method is exercised by t0009-prio-queue.sh. Further, add
a test that checks the behavior when the compare function is NULL
(i.e. the queue becomes a stack).
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The rules used by "git push" and "git fetch" to determine if a ref
can or cannot be updated were inconsistent; specifically, fetching
to update existing tags were allowed even though tags are supposed
to be unmoving anchoring points. "git fetch" was taught to forbid
updates to existing tags without the "--force" option.
* ab/fetch-tags-noclobber:
fetch: stop clobbering existing tags without --force
fetch: document local ref updates with/without --force
push doc: correct lies about how push refspecs work
push doc: move mention of "tag <tag>" later in the prose
push doc: remove confusing mention of remote merger
fetch tests: add a test for clobbering tag behavior
push tests: use spaces in interpolated string
push tests: make use of unused $1 in test description
fetch: change "branch" to "reference" in --force -h output
Fix a bug in which the same path could be registered under multiple
worktree entries if the path was missing (for instance, was removed
manually). Also, as a convenience, expand the number of cases in
which --force is applicable.
* es/worktree-forced-ops-fix:
doc-diff: force worktree add
worktree: delete .git/worktrees if empty after 'remove'
worktree: teach 'remove' to override lock when --force given twice
worktree: teach 'move' to override lock when --force given twice
worktree: teach 'add' to respect --force for registered but missing path
worktree: disallow adding same path multiple times
worktree: prepare for more checks of whether path can become worktree
worktree: generalize delete_git_dir() to reduce code duplication
worktree: move delete_git_dir() earlier in file for upcoming new callers
worktree: don't die() in library function find_worktree()
Dev doc update.
* jk/diff-rendered-docs:
Revert "doc/Makefile: drop doc-diff worktree and temporary files on "make clean""
doc/Makefile: drop doc-diff worktree and temporary files on "make clean"
doc-diff: add --clean mode to remove temporary working gunk
doc-diff: fix non-portable 'man' invocation
doc-diff: always use oids inside worktree
SubmittingPatches: mention doc-diff
Malformed or crafted data in packstream can make our code attempt
to read or write past the allocated buffer and abort, instead of
reporting an error, which has been fixed.
* jk/patch-corrupted-delta-fix:
t5303: use printf to generate delta bases
patch-delta: handle truncated copy parameters
patch-delta: consistently report corruption
patch-delta: fix oob read
t5303: test some corrupt deltas
test-delta: read input into a heap buffer
Hotfix of the base topic.
* jk/pack-objects-with-bitmap-fix:
pack-bitmap: drop "loaded" flag
traverse_bitmap_commit_list(): don't free result
t5310: test delta reuse with bitmaps
bitmap_has_sha1_in_uninteresting(): drop BUG check
"git mailinfo" used in "git am" learned to make a best-effort
recovery of a patch corrupted by MUA that sends text/plain with
format=flawed option.
* rs/mailinfo-format-flowed:
mailinfo: support format=flowed
spatch transformation to replace boolean uses of !hashcmp() to
newly introduced oideq() is added, and applied, to regain
performance lost due to support of multiple hash algorithms.
* jk/cocci:
show_dirstat: simplify same-content check
read-cache: use oideq() in ce_compare functions
convert hashmap comparison functions to oideq()
convert "hashcmp() != 0" to "!hasheq()"
convert "oidcmp() != 0" to "!oideq()"
convert "hashcmp() == 0" to hasheq()
convert "oidcmp() == 0" to oideq()
introduce hasheq() and oideq()
coccinelle: use <...> for function exclusion
Clarify a part of technical documentation for rerere.
* tg/rerere-doc-updates:
rerere: add note about files with existing conflict markers
rerere: mention caveat about unmatched conflict markers
"git format-patch" learned a new "--range-diff" option to explain
the difference between this version and the previous attempt in
the cover letter (or after the tree-dashes as a comment).
* es/format-patch-rangediff:
format-patch: allow --range-diff to apply to a lone-patch
format-patch: add --creation-factor tweak for --range-diff
format-patch: teach --range-diff to respect -v/--reroll-count
format-patch: extend --range-diff to accept revision range
format-patch: add --range-diff option to embed diff in cover letter
range-diff: relieve callers of low-level configuration burden
range-diff: publish default creation factor
range-diff: respect diff_option.file rather than assuming 'stdout'
"git format-patch" learned a new "--interdiff" option to explain
the difference between this version and the previous atttempt in
the cover letter (or after the tree-dashes as a comment).
* es/format-patch-interdiff:
format-patch: allow --interdiff to apply to a lone-patch
log-tree: show_log: make commentary block delimiting reusable
interdiff: teach show_interdiff() to indent interdiff
format-patch: teach --interdiff to respect -v/--reroll-count
format-patch: add --interdiff option to embed diff in cover letter
format-patch: allow additional generated content in make_cover_letter()
Lift code from GitHub to restrict delta computation so that an
object that exists in one fork is not made into a delta against
another object that does not appear in the same forked repository.
* cc/delta-islands:
pack-objects: move 'layer' into 'struct packing_data'
pack-objects: move tree_depth into 'struct packing_data'
t5320: tests for delta islands
repack: add delta-islands support
pack-objects: add delta-islands support
pack-objects: refactor code into compute_layer_order()
Add delta-islands.{c,h}
"git interpret-trailers" and its underlying machinery had a buggy
code that attempted to ignore patch text after commit log message,
which triggered in various codepaths that will always get the log
message alone and never get such an input.
* jk/trailer-fixes:
append_signoff: use size_t for string offsets
sequencer: ignore "---" divider when parsing trailers
pretty, ref-filter: format %(trailers) with no_divider option
interpret-trailers: allow suppressing "---" divider
interpret-trailers: tighten check for "---" patch boundary
trailer: pass process_trailer_opts to trailer_info_get()
trailer: use size_t for iterating trailer list
trailer: use size_t for string offsets
The color output support for recently introduced "range-diff"
command got tweaked a bit.
* sb/range-diff-colors:
range-diff: indent special lines as context
range-diff: make use of different output indicators
diff.c: add --output-indicator-{new, old, context}
diff.c: rewrite emit_line_0 more understandably
diff.c: omit check for line prefix in emit_line_0
diff: use emit_line_0 once per line
diff.c: add set_sign to emit_line_0
diff.c: reorder arguments for emit_line_ws_markup
diff.c: simplify caller of emit_line_0
t3206: add color test for range-diff --dual-color
test_decode_color: understand FAINT and ITALIC
When creating a thin pack, which allows objects to be made into a
delta against another object that is not in the resulting pack but
is known to be present on the receiving end, the code learned to
take advantage of the reachability bitmap; this allows the server
to send a delta against a base beyond the "boundary" commit.
* jk/pack-delta-reuse-with-bitmap:
pack-objects: reuse on-disk deltas for thin "have" objects
pack-bitmap: save "have" bitmap from walk
t/perf: add perf tests for fetches from a bitmapped server
t/perf: add infrastructure for measuring sizes
t/perf: factor out percent calculations
t/perf: factor boilerplate out of test_perf
The unpack_trees() API used in checking out a branch and merging
walks one or more trees along with the index. When the cache-tree
in the index tells us that we are walking a tree whose flattened
contents is known (i.e. matches a span in the index), as linearly
scanning a span in the index is much more efficient than having to
open tree objects recursively and listing their entries, the walk
can be optimized, which is done in this topic.
* nd/unpack-trees-with-cache-tree:
Document update for nd/unpack-trees-with-cache-tree
cache-tree: verify valid cache-tree in the test suite
unpack-trees: add missing cache invalidation
unpack-trees: reuse (still valid) cache-tree from src_index
unpack-trees: reduce malloc in cache-tree walk
unpack-trees: optimize walking same trees with cache-tree
unpack-trees: add performance tracing
trace.h: support nested performance tracing
The code for computing history reachability has been shuffled,
obtained a bunch of new tests to cover them, and then being
improved.
* ds/reachable:
commit-reach: correct accidental #include of C file
commit-reach: use can_all_from_reach
commit-reach: make can_all_from_reach... linear
commit-reach: replace ref_newer logic
test-reach: test commit_contains
test-reach: test can_all_from_reach_with_flags
test-reach: test reduce_heads
test-reach: test get_merge_bases_many
test-reach: test is_descendant_of
test-reach: test in_merge_bases
test-reach: create new test tool for ref_newer
commit-reach: move can_all_from_reach_with_flags
upload-pack: generalize commit date cutoff
upload-pack: refactor ok_to_give_up()
upload-pack: make reachable() more generic
commit-reach: move commit_contains from ref-filter
commit-reach: move ref_newer from remote.c
commit.h: remove method declarations
commit-reach: move walk methods from commit.c
"git submodule update" is getting rewritten piece-by-piece into C.
* sb/submodule-update-in-c:
submodule--helper: introduce new update-module-mode helper
submodule--helper: replace connect-gitdir-workingtree by ensure-core-worktree
builtin/submodule--helper: factor out method to update a single submodule
builtin/submodule--helper: store update_clone information in a struct
builtin/submodule--helper: factor out submodule updating
git-submodule.sh: rename unused variables
git-submodule.sh: align error reporting for update mode to use path
Fixes to "git rerere" corner cases, especially when conflict
markers cannot be parsed in the file.
* tg/rerere:
rerere: recalculate conflict ID when unresolved conflict is committed
rerere: teach rerere to handle nested conflicts
rerere: return strbuf from handle path
rerere: factor out handle_conflict function
rerere: only return whether a path has conflicts or not
rerere: fix crash with files rerere can't handle
rerere: add documentation for conflict normalization
rerere: mark strings for translation
rerere: wrap paths in output in sq
rerere: lowercase error messages
rerere: unify error messages when read_cache fails
When there are too many packfiles in a repository (which is not
recommended), looking up an object in these would require
consulting many pack .idx files; a new mechanism to have a single
file that consolidates all of these .idx files is introduced.
* ds/multi-pack-index: (32 commits)
pack-objects: consider packs in multi-pack-index
midx: test a few commands that use get_all_packs
treewide: use get_all_packs
packfile: add all_packs list
midx: fix bug that skips midx with alternates
midx: stop reporting garbage
midx: mark bad packed objects
multi-pack-index: store local property
multi-pack-index: provide more helpful usage info
midx: clear midx on repack
packfile: skip loading index if in multi-pack-index
midx: prevent duplicate packfile loads
midx: use midx in approximate_object_count
midx: use existing midx when writing new one
midx: use midx in abbreviation calculations
midx: read objects from multi-pack-index
config: create core.multiPackIndex setting
midx: write object offsets
midx: write object id fanout chunk
midx: write object ids in a chunk
...
Updated plan to repurpose the "-l" option to "git branch".
* jk/branch-l-1-repurpose:
doc/git-branch: remove obsolete "-l" references
branch: make "-l" a synonym for "--list"
"git rev-list --stdin </dev/null" used to be an error; it now shows
no output without an error. "git rev-list --stdin --default HEAD"
still falls back to the given default when nothing is given on the
standard input.
* jk/rev-list-stdin-noop-is-ok:
rev-list: make empty --stdin not an error
"git checkout -b newbranch [HEAD]" should not have to do as much as
checking out a commit different from HEAD. An attempt is made to
optimize this special case.
* bp/checkout-new-branch-optim:
checkout: optimize "git checkout -b <new_branch>"
Running "git clone" against a project that contain two files with
pathnames that differ only in cases on a case insensitive
filesystem would result in one of the files lost because the
underlying filesystem is incapable of holding both at the same
time. An attempt is made to detect such a case and warn.
* nd/clone-case-smashing-warning:
clone: report duplicate entries on case-insensitive filesystems
This reverts commit 6f924265a0, which
started to require that we have an executable git available in order
to say "make clean", which gives us a chicken-and-egg problem.
Having to have Git installed, or be in a repository, in order to be
able to run an optional "doc-diff" tool is fine. Requiring either
in order to run "make clean" is a different story.
Reported by Jonathan Nieder <jrnieder@gmail.com>.
This is a test of smart HTTP, so it should use the smart HTTP endpoints
(e.g. /info/refs?service=git-receive-pack), not dumb HTTP (HEAD).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Max Kirillov <max@max630.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The earlier attempt barfed when given a CONTENT_LENGTH that is
set to an empty string. RFC 3875 is fairly clear that in this
case we should not read any message body, but we've been reading
through to the EOF in previous versions (which did not even pay
attention to the environment variable), so keep that behaviour for
now in this late update.
* mk/http-backend-content-length:
http-backend: allow empty CONTENT_LENGTH
This reverts commit 7e25437d35, reversing
changes made to 00624d608c.
v2.19.0-rc0~165^2~1 (submodule: ensure core.worktree is set after
update, 2018-06-18) assumes an "absorbed" submodule layout, where the
submodule's Git directory is in the superproject's .git/modules/
directory and .git in the submodule worktree is a .git file pointing
there. In particular, it uses $GIT_DIR/modules/$name to find the
submodule to find out whether it already has core.worktree set, and it
uses connect_work_tree_and_git_dir if not, resulting in
fatal: could not open sub/.git for writing
The context behind that patch: v2.19.0-rc0~165^2~2 (submodule: unset
core.worktree if no working tree is present, 2018-06-12) unsets
core.worktree when running commands like "git checkout
--recurse-submodules" to switch to a branch without the submodule. If
a user then uses "git checkout --no-recurse-submodules" to switch back
to a branch with the submodule and runs "git submodule update", this
patch is needed to ensure that commands using the submodule directly
are aware of the path to the worktree.
It is late in the release cycle, so revert the whole 3-patch series.
We can try again later for 2.20.
Reported-by: Allan Sandfeld Jensen <allan.jensen@qt.io>
Helped-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to RFC3875, empty environment variable is equivalent to unset,
and for CONTENT_LENGTH it should mean zero body to read.
However, unset CONTENT_LENGTH is also used for chunked encoding to indicate
reading until EOF. At least, the test "large fetch-pack requests can be split
across POSTs" from t5551 starts faliing, if unset or empty CONTENT_LENGTH is
treated as zero length body. So keep the existing behavior as much as possible.
Add a test for the case.
Reported-By: Jelmer Vernooij <jelmer@jelmer.uk>
Signed-off-by: Max Kirillov <max@max630.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>