"git clone --reject-shallow" option fails the clone as soon as we
notice that we are cloning from a shallow repository.
* ll/clone-reject-shallow:
builtin/clone.c: add --reject-shallow option
An on-disk reverse-index to map the in-pack location of an object
back to its object name across multiple packfiles is introduced.
* tb/reverse-midx:
midx.c: improve cache locality in midx_pack_order_cmp()
pack-revindex: write multi-pack reverse indexes
pack-write.c: extract 'write_rev_file_order'
pack-revindex: read multi-pack reverse indexes
Documentation/technical: describe multi-pack reverse indexes
midx: make some functions non-static
midx: keep track of the checksum
midx: don't free midx_name early
midx: allow marking a pack as preferred
t/helper/test-read-midx.c: add '--show-objects'
builtin/multi-pack-index.c: display usage on unrecognized command
builtin/multi-pack-index.c: don't enter bogus cmd_mode
builtin/multi-pack-index.c: split sub-commands
builtin/multi-pack-index.c: define common usage with a macro
builtin/multi-pack-index.c: don't handle 'progress' separately
builtin/multi-pack-index.c: inline 'flags' with options
Fsck API clean-up.
* ab/fsck-api-cleanup:
fetch-pack: use new fsck API to printing dangling submodules
fetch-pack: use file-scope static struct for fsck_options
fetch-pack: don't needlessly copy fsck_options
fsck.c: move gitmodules_{found,done} into fsck_options
fsck.c: add an fsck_set_msg_type() API that takes enums
fsck.c: pass along the fsck_msg_id in the fsck_error callback
fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h
fsck.c: give "FOREACH_MSG_ID" a more specific name
fsck.c: undefine temporary STR macro after use
fsck.c: call parse_msg_type() early in fsck_set_msg_type()
fsck.h: re-order and re-assign "enum fsck_msg_type"
fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum
fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type"
fsck.c: rename remaining fsck_msg_id "id" to "msg_id"
fsck.c: remove (mostly) redundant append_msg_id() function
fsck.c: rename variables in fsck_set_msg_type() for less confusion
fsck.h: use "enum object_type" instead of "int"
fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT}
fsck.c: refactor and rename common config callback
A few option description strings started with capital letters,
which were corrected.
* cc/downcase-opt-help:
column, range-diff: downcase option description
"git commit" learned "--trailer <key>[=<value>]" option; together
with the interpret-trailers command, this will make it easier to
support custom trailers.
* zh/commit-trailer:
commit: add --trailer option
Plug or annotate remaining leaks that trigger while running the
very basic set of tests.
* ah/plugleaks:
transport: also free remote_refs in transport_disconnect()
parse-options: don't leak alias help messages
parse-options: convert bitfield values to use binary shift
init-db: silence template_dir leak when converting to absolute path
init: remove git_init_db_config() while fixing leaks
worktree: fix leak in dwim_branch()
clone: free or UNLEAK further pointers when finished
reset: free instead of leaking unneeded ref
symbolic-ref: don't leak shortened refname in check_symref()
The previous logic filled a string list with the names of each remote,
but instead we could simply run the appropriate 'git fetch' data
directly in the remote iterator. Do this for reduced code size, but also
because it sets up an upcoming change to use the remote's refspec. This
data is accessible from the 'struct remote' data that is now accessible
in fetch_remote().
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git format-patch -v<n>" learned to allow a reroll count that is
not an integer.
* zh/format-patch-fractional-reroll-count:
format-patch: allow a non-integral version numbers
A simple IPC interface gets introduced to build services like
fsmonitor on top.
* jh/simple-ipc:
t0052: add simple-ipc tests and t/helper/test-simple-ipc tool
simple-ipc: add Unix domain socket implementation
unix-stream-server: create unix domain socket under lock
unix-socket: disallow chdir() when creating unix domain sockets
unix-socket: add backlog size option to unix_stream_listen()
unix-socket: eliminate static unix_stream_socket() helper function
simple-ipc: add win32 implementation
simple-ipc: design documentation for new IPC mechanism
pkt-line: add options argument to read_packetized_to_strbuf()
pkt-line: add PACKET_READ_GENTLE_ON_READ_ERROR option
pkt-line: do not issue flush packets in write_packetized_*()
pkt-line: eliminate the need for static buffer in packet_write_gently()
Preparatory API changes for parallel checkout.
* mt/parallel-checkout-part-1:
entry: add checkout_entry_ca() taking preloaded conv_attrs
entry: move conv_attrs lookup up to checkout_entry()
entry: extract update_ce_after_write() from write_entry()
entry: make fstat_output() and read_blob_entry() public
entry: extract a header file for entry.c functions
convert: add classification for conv_attrs struct
convert: add get_stream_filter_ca() variant
convert: add [async_]convert_to_working_tree_ca() variants
convert: make convert_attrs() and convert structs public
When multiple packs in the multi-pack index contain the same object, the
MIDX machinery must make a choice about which pack it associates with
that object. Prior to this patch, the lowest-ordered[1] pack was always
selected.
Pack selection for duplicate objects is relatively unimportant today,
but it will become important for multi-pack bitmaps. This is because we
can only invoke the pack-reuse mechanism when all of the bits for reused
objects come from the reuse pack (in order to ensure that all reused
deltas can find their base objects in the same pack).
To encourage the pack selection process to prefer one pack over another
(the pack to be preferred is the one a caller would like to later use as
a reuse pack), introduce the concept of a "preferred pack". When
provided, the MIDX code will always prefer an object found in a
preferred pack over any other.
No format changes are required to store the preferred pack, since it
will be able to be inferred with a corresponding MIDX bitmap, by looking
up the pack associated with the object in the first bit position (this
ordering is described in detail in a subsequent commit).
[1]: the ordering is specified by MIDX internals; for our purposes we
can consider the "lowest ordered" pack to be "the one with the
most-recent mtime.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some scenarios, users may want more history than the repository
offered for cloning, which happens to be a shallow repository, can
give them. But because users don't know it is a shallow repository
until they download it to local, we may want to refuse to clone
this kind of repository, without creating any unnecessary files.
The '--depth=x' option cannot be used as a solution; the source may
be deep enough to give us 'x' commits when cloned, but the user may
later need to deepen the history to arbitrary depth.
Teach '--reject-shallow' option to "git clone" to abort as soon as
we find out that we are cloning from a shallow repository.
Signed-off-by: Li Linchao <lilinchao@oschina.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing a new pack with a bitmap, it is sometimes convenient to
indicate some reference prefixes which should receive priority when
selecting which commits to receive bitmaps.
A truly motivated caller could accomplish this by setting
'pack.islandCore', (since all commits in the core island are similarly
marked as preferred) but this requires callers to opt into using delta
islands, which they may or may not want to do.
Introduce a new multi-valued configuration, 'pack.preferBitmapTips' to
allow callers to specify a list of reference prefixes. All references
which have a prefix contained in 'pack.preferBitmapTips' will mark their
tips as "preferred" in the same way as commits are marked as preferred
for selection by 'pack.islandCore'.
The choice of the verb "prefer" is intentional: marking the NEEDS_BITMAP
flag on an object does *not* guarantee that that object will receive a
bitmap. It merely guarantees that that commit will receive a bitmap over
any *other* commit in the same window by bitmap_writer_select_commits().
The test this patch adds reflects this quirk, too. It only tests that
a commit (which didn't receive bitmaps by default) is selected for
bitmaps after changing the value of 'pack.preferBitmapTips' to include
it. Other commits may lose their bitmaps as a byproduct of how the
selection process works (bitmap_writer_select_commits() ignores the
remainder of a window after seeing a commit with the NEEDS_BITMAP flag).
This configuration will aide in selecting important references for
multi-pack bitmaps, since they do not respect the same pack.islandCore
configuration. (They could, but doing so may be confusing, since it is
packs--not bitmaps--which are influenced by the delta-islands
configuration).
In a fork network repository (one which lists all forks of a given
repository as remotes), for example, it is useful to set
pack.preferBitmapTips to 'refs/remotes/<root>/heads' and
'refs/remotes/<root>/tags', where '<root>' is an opaque identifier
referring to the repository which is at the base of the fork chain.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
save_opts() should save any non-default values. It was intended to do
this, but since most options in struct replay_opts default to 0, it only
saved non-zero values. Unfortunately, this does not always work for
options.edit. Roughly speaking, options.edit had a default value of 0
for cherry-pick but a default value of 1 for revert. Make save_opts()
record a value whenever it differs from the default.
options.edit was also overly simplistic; we had more than two cases.
The behavior that previously existed was as follows:
Non-conflict commits Right after Conflict
revert Edit iff isatty(0) Edit (ignore isatty(0))
cherry-pick No edit See above
Specify --edit Edit (ignore isatty(0)) See above
Specify --no-edit (*) See above
(*) Before stopping for conflicts, No edit is the behavior. After
stopping for conflicts, the --no-edit flag is not saved so see
the first two rows.
However, the expected behavior is:
Non-conflict commits Right after Conflict
revert Edit iff isatty(0) Edit iff isatty(0)
cherry-pick No edit Edit iff isatty(0)
Specify --edit Edit (ignore isatty(0)) Edit (ignore isatty(0))
Specify --no-edit No edit No edit
In order to get the expected behavior, we need to change options.edit
to a tri-state: unspecified, false, or true. When specified, we follow
what it says. When unspecified, we need to check whether the current
commit being created is resolving a conflict as well as consulting
options.action and isatty(0). While at it, add a should_edit() utility
function that compresses options.edit down to a boolean based on the
additional information for the non-conflict case.
continue_single_pick() is the function responsible for resuming after
conflict cases, regardless of whether there is one commit being picked
or many. Make this function stop assuming edit behavior in all cases,
so that it can correctly handle !isatty(0) and specific requests to not
edit the commit message.
Reported-by: Renato Botelho <garga@freebsd.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the final hint that we used to have a scripted "git rebase".
* ab/remove-rebase-usebuiltin:
rebase: remove transitory rebase.useBuiltin setting & env
Code simplification by removing support for a caller that is long gone.
* ab/read-tree:
tree.h API: simplify read_tree_recursive() signature
tree.h API: expose read_tree_1() as read_tree_at()
archive: stop passing "stage" through read_tree_recursive()
ls-files: refactor away read_tree()
ls-files: don't needlessly pass around stage variable
tree.c API: move read_tree() into builtin/ls-files.c
ls-files tests: add meaningful --with-tree tests
show tests: add test for "git show <tree>"
We use 'git sparse-checkout init --cone --sparse-index' to toggle the
sparse-index feature. It makes sense to also disable it when running
'git sparse-checkout disable'. This is particularly important because it
removes the extensions.sparseIndex config option, allowing other tools
to use this Git repository again.
This does mean that 'git sparse-checkout init' will not re-enable the
sparse-index feature, even if it was previously enabled.
While testing this feature, I noticed that the sparse-index was not
being written on the first run, but by a second. This was caught by the
call to 'test-tool read-cache --table'. This requires adjusting some
assignments to core_apply_sparse_checkout and pl.use_cone_patterns in
the sparse_checkout_init() logic.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse index extension is used to signal that index writes should be
in sparse mode. This was only updated using GIT_TEST_SPARSE_INDEX=1.
Add a '--[no-]sparse-index' option to 'git sparse-checkout init' that
specifies if the sparse index should be used. It also updates the index
to use the correct format, either way. Add a warning in the
documentation that the use of a repository extension might reduce
compatibility with third-party tools. 'git sparse-checkout init' already
sets extension.worktreeConfig, which places most sparse-checkout users
outside of the scope of most third-party tools.
Update t1092-sparse-checkout-compatibility.sh to use this CLI instead of
GIT_TEST_SPARSE_INDEX=1.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we modify the sparse-checkout definition, we perform index operations
on a pattern_list that only exists in-memory. This allows easy backing
out in case the index update fails.
However, if the index write itself cares about the sparse-checkout
pattern set, we need access to that in-memory copy. Place a pointer to
a 'struct pattern_list' in the index so we can access this on-demand.
This will be used in the next change which uses the sparse-checkout
definition to filter out directories that are outside the sparse cone.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When given a sub-command that it doesn't understand, 'git
multi-pack-index' dies with the following message:
$ git multi-pack-index bogus
fatal: unrecognized subcommand: bogus
Instead of 'die()'-ing, we can display the usage text, which is much
more helpful:
$ git.compile multi-pack-index bogus
error: unrecognized subcommand: bogus
usage: git multi-pack-index [<options>] write
or: git multi-pack-index [<options>] verify
or: git multi-pack-index [<options>] expire
or: git multi-pack-index [<options>] repack [--batch-size=<size>]
--object-dir <file> object directory containing set of packfile and pack-index pairs
--progress force progress reporting
While we're at it, clean up some duplication between the "no sub-command"
and "unrecognized sub-command" conditionals.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even before the recent refactoring, 'git multi-pack-index' calls
'trace2_cmd_mode()' before verifying that the sub-command is recognized.
Push this call down into the individual sub-commands so that we don't
enter a bogus command mode.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Handle sub-commands of the 'git multi-pack-index' builtin (e.g.,
"write", "repack", etc.) separately from one another. This allows
sub-commands with unique options, without forcing cmd_multi_pack_index()
to reject invalid combinations itself.
This comes at the cost of some duplication and boilerplate. Luckily, the
duplication is reduced to a minimum, since common options are shared
among sub-commands due to a suggestion by Ævar. (Sub-commands do have to
retain the common options, too, since this builtin accepts common
options on either side of the sub-command).
Roughly speaking, cmd_multi_pack_index() parses options (including
common ones), and stops at the first non-option, which is the
sub-command. It then dispatches to the appropriate sub-command, which
parses the remaining options (also including common options).
Unknown options are kept by the sub-commands in order to detect their
presence (and complain that too many arguments were given).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Factor out the usage message into pieces corresponding to each mode.
This avoids options specific to one sub-command from being shared with
another in the usage.
A subsequent commit will use these #define macros to have usage
variables for each sub-command without duplicating their contents.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that there is a shared 'flags' member in the options structure,
there is no need to keep track of whether to force progress or not,
since ultimately the decision of whether or not to show a progress meter
is controlled by a bit in the flags member.
Manipulate that bit directly, and drop the now-unnecessary 'progress'
field while we're at it.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Subcommands of the 'git multi-pack-index' command (e.g., 'write',
'verify', etc.) will want to optionally change a set of shared flags
that are eventually passed to the MIDX libraries.
Right now, options and flags are handled separately. That's fine, since
the options structure is never passed around. But a future patch will
make it so that common options shared by all sub-commands are defined in
a common location. That means that "flags" would have to become a global
variable.
Group it with the options structure so that we reduce the number of
global variables we have overall.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is customary not to begin the help text for each option given to
the parse-options API with a capital letter. Various (sub)commands'
option arrays don't follow the guideline provided by the parse_options
Documentation regarding the descriptions.
Downcase the first word of some option descriptions for "column"
and "range-diff".
Signed-off-by: Chinmoy Chakraborty <chinmoy12c@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the check added in 5476e1efde (fetch-pack: print and use
dangling .gitmodules, 2021-02-22) to make use of us now passing the
"msg_id" to the user defined "error_func". We can now compare against
the FSCK_MSG_GITMODULES_MISSING instead of parsing the generated
message.
Let's also replace register_found_gitmodules() with directly
manipulating the "gitmodules_found" member. A recent commit moved it
into "fsck_options" so we could do this here.
I'm sticking this callback in fsck.c. Perhaps in the future we'd like
to accumulate such callbacks into another file (maybe fsck-cb.c,
similar to parse-options-cb.c?), but while we've got just the one
let's just put it into fsck.c.
A better alternative in this case would be some library some more
obvious library shared by fetch-pack.c ad builtin/index-pack.c, but
there isn't such a thing.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the behavior of the .gitmodules validation added in
5476e1efde (fetch-pack: print and use dangling .gitmodules,
2021-02-22) so we're using one "fsck_options".
I found that code confusing to read. One might think that not setting
up the error_func earlier means that we're relying on the "error_func"
not being set in some code in between the two hunks being modified
here.
But we're not, all we're doing in the rest of "cmd_index_pack()" is
further setup by calling fsck_set_msg_types(), and assigning to
do_fsck_object.
So there was no reason in 5476e1efde to make a shallow copy of the
fsck_options struct before setting error_func. Let's just do this
setup at the top of the function, along with the "walk" assignment.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change code I added in acf9de4c94 (mktag: use fsck instead of custom
verify_tag(), 2021-01-05) to make use of a new API function that takes
the fsck_msg_{id,type} types, instead of arbitrary strings that
we'll (hopefully) parse into those types.
At the time that the fsck_set_msg_type() API was introduced in
0282f4dced (fsck: offer a function to demote fsck errors to warnings,
2015-06-22) it was only intended to be used to parse user-supplied
data.
For things that are purely internal to the C code it makes sense to
have the compiler check these arguments, and to skip the sanity
checking of the data in fsck_set_msg_type() which is redundant to
checks we get from the compiler.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the fsck_error callback to also pass along the
fsck_msg_id. Before this change the only way to get the message id was
to parse it back out of the "message".
Let's pass it down explicitly for the benefit of callers that might
want to use it, as discussed in [1].
Passing the msg_type is now redundant, as you can always get it back
from the msg_id, but I'm not changing that convention. It's really
common to need the msg_type, and the report() function itself (which
calls "fsck_error") needs to call fsck_msg_type() to discover
it. Let's not needlessly re-do that work in the user callback.
1. https://lore.kernel.org/git/87blcja2ha.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} defines into a new
fsck_msg_type enum.
These defines were originally introduced in:
- ba002f3b28 (builtin-fsck: move common object checking code to
fsck.c, 2008-02-25)
- f50c440730 (fsck: disallow demoting grave fsck errors to warnings,
2015-06-22)
- efaba7cc77 (fsck: optionally ignore specific fsck issues
completely, 2015-06-22)
- f27d05b170 (fsck: allow upgrading fsck warnings to errors,
2015-06-22)
The reason these were defined in two different places is because we
use FSCK_{IGNORE,INFO,FATAL} only in fsck.c, but FSCK_{ERROR,WARN} are
used by external callbacks.
Untangling that would take some more work, since we expose the new
"enum fsck_msg_type" to both. Similar to "enum object_type" it's not
worth structuring the API in such a way that only those who need
FSCK_{ERROR,WARN} pass around a different type.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the fsck_walk_func to use an "enum object_type" instead of an
"int" type. The types are compatible, and ever since this was added in
355885d531 (add generic, type aware object chain walker, 2008-02-25)
we've used entries from object_type (OBJ_BLOB etc.).
So this doesn't really change anything as far as the generated code is
concerned, it just gives the compiler more information and makes this
easier to read.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git commit --fixup=<commit>", which was to tweak the changes made
to the contents while keeping the original log message intact,
learned "--fixup=(amend|reword):<commit>", that can be used to
tweak both the message and the contents, and only the message,
respectively.
* cm/rebase-i-fixup-amend-reword:
doc/git-commit: add documentation for fixup=[amend|reword] options
t3437: use --fixup with options to create amend! commit
t7500: add tests for --fixup=[amend|reword] options
commit: add a reword suboption to --fixup
commit: add amend suboption to --fixup to create amend! commit
sequencer: export and rename subject_length()
"git repack" so far has been only capable of repacking everything
under the sun into a single pack (or split by size). A cleverer
strategy to reduce the cost of repacking a repository has been
introduced.
* tb/geometric-repack:
builtin/pack-objects.c: ignore missing links with --stdin-packs
builtin/repack.c: reword comment around pack-objects flags
builtin/repack.c: be more conservative with unsigned overflows
builtin/repack.c: assign pack split later
t7703: test --geometric repack with loose objects
builtin/repack.c: do not repack single packs with --geometric
builtin/repack.c: add '--geometric' option
packfile: add kept-pack cache for find_kept_pack_entry()
builtin/pack-objects.c: rewrite honor-pack-keep logic
p5303: measure time to repack with keep
p5303: add missing &&-chains
builtin/pack-objects.c: add '--stdin-packs' option
revision: learn '--no-kept-objects'
packfile: introduce 'find_kept_pack_entry()'
As record_reused_object(offset, offset - hashfile_total(out)) said,
reused_chunk.difference should be the offset of original packfile minus
the offset of the generated packfile. But the comment presented an opposite way.
Signed-off-by: Han Xin <hanxin.hx@alibaba-inc.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the rebase.useBuiltin setting and the now-obsolete
GIT_TEST_REBASE_USE_BUILTIN test flag.
This was left in place after my d03ebd411c (rebase: remove the
rebase.useBuiltin setting, 2019-03-18) to help anyone who'd used the
experimental flag and wanted to know that it was the default, or that
they should transition their test environment to use the builtin
rebase unconditionally.
It's been more than long enough for those users to get a headsup about
this. So remove all the scaffolding that was left inplace after
d03ebd411c. I'm also removing the documentation entry, if anyone
still has this left in their configuration they can do some source
archaeology to figure out what it used to do, which makes more sense
than exposing every git user reading the documentation to this legacy
configuration switch.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `-v<n>` option of `format-patch` can give nothing but an
integral iteration number to patches in a series. Some people,
however, prefer to mark a new iteration with only a small fixup
with a non integral iteration number (e.g. an "oops, that was
wrong" fix-up patch for v4 iteration may be labeled as "v4.1").
Allow `format-patch` to take such a non-integral iteration
number.
`<n>` can be any string, such as '3.1' or '4rev2'. In the case
where it is a non-integral value, the "Range-diff" and "Interdiff"
headers will not include the previous version.
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The declarations of entry.c's public functions and structures currently
reside in cache.h. Although not many, they contribute to the size of
cache.h and, when changed, cause the unnecessary recompilation of
modules that don't really use these functions. So let's move them to a
new entry.h header. While at it let's also move a comment related to
checkout_entry() from entry.c to entry.h as it's more useful to describe
the function there.
Original-patch-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Historically, Git has supported the 'Signed-off-by' commit trailer
using the '--signoff' and the '-s' option from the command line.
But users may need to provide other trailer information from the
command line such as "Helped-by", "Reported-by", "Mentored-by",
Now implement a new `--trailer <token>[(=|:)<value>]` option to pass
other trailers to `interpret-trailers` and insert them into commit
messages.
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git -c core.bare=false clone --bare ..." would have segfaulted,
which has been corrected.
* bc/clone-bare-with-conflicting-config:
builtin/init-db: handle bare clones when core.bare set to false
"git stash show" learned to optionally show untracked part of the
stash.
* dl/stash-show-untracked:
stash show: learn stash.showIncludeUntracked
stash show: teach --include-untracked and --only-untracked
Updates to memory allocation code around the use of pcre2 library.
* ab/grep-pcre2-allocfix:
grep/pcre2: move definitions of pcre2_{malloc,free}
grep/pcre2: move back to thread-only PCREv2 structures
grep/pcre2: actually make pcre2 use custom allocator
grep/pcre2: use pcre2_maketables_free() function
grep/pcre2: use compile-time PCREv2 version test
grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode
grep/pcre2: prepare to add debugging to pcre2_malloc()
grep/pcre2: correct reference to grep_init() in comment
grep/pcre2: drop needless assignment to NULL
grep/pcre2: drop needless assignment + assert() on opt->pcre2
Update C code that sets a few configuration variables when a remote
is configured so that it spells configuration variable names in the
canonical camelCase.
* ab/remote-write-config-in-camel-case:
remote: write camel-cased *.pushRemote on rename
remote: add camel-cased *.tagOpt key, like clone
It does not make sense to make ".gitattributes", ".gitignore" and
".mailmap" symlinks, as they are supposed to be usable from the
object store (think: bare repositories where HEAD:.mailmap etc. are
used). When these files are symbolic links, we used to read the
contents of the files pointed by them by mistake, which has been
corrected.
* jk/open-dotgitx-with-nofollow:
mailmap: do not respect symlinks for in-tree .mailmap
exclude: do not respect symlinks for in-tree .gitignore
attr: do not respect symlinks for in-tree .gitattributes
exclude: add flags parameter to add_patterns()
attr: convert "macro_ok" into a flags field
add open_nofollow() helper
transport_get_remote_refs() can populate the transport struct's
remote_refs. transport_disconnect() is already responsible for most of
transport's cleanup - therefore we also take care of freeing remote_refs
there.
There are 2 locations where transport_disconnect() is called before
we're done using the returned remote_refs. This patch changes those
callsites to only call transport_disconnect() after the returned refs
are no longer being used - which is necessary to safely be able to
free remote_refs during transport_disconnect().
This commit fixes the following leak which was found while running
t0000, but is expected to also fix the same pattern of leak in all
locations that use transport_get_remote_refs():
Direct leak of 165 byte(s) in 1 object(s) allocated from:
#0 0x49a6b2 in calloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
#1 0x9a72f2 in xcalloc /home/ahunt/oss-fuzz/git/wrapper.c:140:8
#2 0x8ce203 in alloc_ref_with_prefix /home/ahunt/oss-fuzz/git/remote.c:867:20
#3 0x8ce1a2 in alloc_ref /home/ahunt/oss-fuzz/git/remote.c:875:9
#4 0x72f63e in process_ref_v2 /home/ahunt/oss-fuzz/git/connect.c:426:8
#5 0x72f21a in get_remote_refs /home/ahunt/oss-fuzz/git/connect.c:525:8
#6 0x979ab7 in handshake /home/ahunt/oss-fuzz/git/transport.c:305:4
#7 0x97872d in get_refs_via_connect /home/ahunt/oss-fuzz/git/transport.c:339:9
#8 0x9774b5 in transport_get_remote_refs /home/ahunt/oss-fuzz/git/transport.c:1388:4
#9 0x51cf80 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1271:9
#10 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
#11 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
#12 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
#13 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
#14 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
#15 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Simplify the signature of read_tree_recursive() to omit the "base",
"baselen" and "stage" arguments. No callers of it use these parameters
for anything anymore.
The last function to call read_tree_recursive() with a non-"" path was
read_tree_recursive() itself, but that was changed in
ffd31f661d (Reimplement read_tree_recursive() using
tree_entry_interesting(), 2011-03-25).
The last user of the "stage" parameter went away in the last commit,
and even that use was mere boilerplate.
So let's remove those and rename the read_tree_recursive() function to
just read_tree(). We had another read_tree() function that I've
refactored away in preceding commits, since all in-tree users read
trees recursively with a callback we can change the name to signify
that this is the norm.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor away the read_tree() function into its only user,
overlay_tree_on_index().
First, change read_one_entry_opt() to use the strbuf parameter
read_tree_recursive() passes down in place. This finishes up a partial
refactoring started in 6a0b0b6de9 (tree.c: update read_tree_recursive
callback to pass strbuf as base, 2014-11-30).
Moving the rest into overlay_tree_on_index() makes this index juggling
we're doing easier to read.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that read_tree() has been moved to ls-files.c we can get rid of
the stage != 1 case that'll never happen.
Let's not use read_tree_recursive() as a pass-through to pass "stage =
1" either. For now we'll pass an unused "stage = 0" for consistency
with other read_tree_recursive() callers, that argument will be
removed in a follow-up commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the read_tree() API was added around the same time as
read_tree_recursive() in 94537c78a8 (Move "read_tree()" to
"tree.c"[...], 2005-04-22) and b12ec373b8 ([PATCH] Teach read-tree
about commit objects, 2005-04-20) things have gradually migrated over
to the read_tree_recursive() version.
Now builtin/ls-files.c is the last user of this code, let's move all
the relevant code there. This allows for subsequent simplification of
it, and an eventual move to read_tree_recursive().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a variety of questions users might ask while resolving
conflicts:
* What changes have been made since the previous (first) parent?
* What changes are staged?
* What is still unstaged? (or what is still conflicted?)
* What changes did I make to resolve conflicts so far?
The first three of these have simple answers:
* git diff HEAD
* git diff --cached
* git diff
There was no way to answer the final question previously. Adding one
is trivial in merge-ort, since it works by creating a tree representing
what should be written to the working copy complete with conflict
markers. Simply write that tree to .git/AUTO_MERGE, allowing users to
answer the fourth question with
* git diff AUTO_MERGE
I avoided using a name like "MERGE_AUTO", because that would be
merge-specific (much like MERGE_HEAD, REBASE_HEAD, REVERT_HEAD,
CHERRY_PICK_HEAD) and I wanted a name that didn't change depending on
which type of operation the merge was part of.
Ensure that paths which clean out other temporary operation-specific
files (e.g. CHERRY_PICK_HEAD, MERGE_MSG, rebase-merge/ state directory)
also clean out this AUTO_MERGE file.
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a corner case bug in "git mv" on case insensitive systems,
which was introduced in 2.29 timeframe.
* tb/git-mv-icase-fix:
git mv foo FOO ; git mv foo bar gave an assert
CALLOC_ARRAY() macro replaces many uses of xcalloc().
* rs/calloc-array:
cocci: allow xcalloc(1, size)
use CALLOC_ARRAY
git-compat-util.h: drop trailing semicolon from macro definition
"git bisect" reimplemented more in C during 2.30 timeframe did not
take an annotated tag as a good/bad endpoint well. This regression
has been corrected.
* jk/bisect-peel-tag-fix:
bisect: peel annotated tags to commits
When 'git pack-objects --stdin-packs' encounters a commit in a pack, it
marks it as a starting point of a best-effort reachability traversal
that is used to populate the name-hash of the objects listed in the
given packs.
The traversal expects that it should be able to walk the ancestors of
all commits in a pack without issue. Ordinarily this is the case, but it
is possible to having missing parents from an unreachable part of the
repository. In that case, we'd consider any missing objects in the
unreachable portion of the graph to be junk.
This should be handled gracefully: since the traversal is best-effort
(i.e., we don't strictly need to fill in all of the name-hash fields),
we should simply ignore any missing links.
This patch does that (by setting the 'ignore_missing_links' bit on the
rev_info struct), and ensures we don't regress in the future by adding a
test which demonstrates this case.
It is a little over-eager, since it will also ignore missing links in
reachable parts of the packs (which would indicate a corrupted
repository), but '--stdin-packs' is explicitly *not* about reachability.
So this step isn't making anything worse for a repository which contains
packs missing reachable objects (since we never drop objects with
'--stdin-packs').
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor code I recently changed in 1f3299fda9 (fsck: make
fsck_config() re-usable, 2021-01-05) so that I could use fsck's config
callback in mktag in 1f3299fda9 (fsck: make fsck_config() re-usable,
2021-01-05).
I don't know what I was thinking in structuring the code this way, but
it clearly makes no sense to have an fsck_config_internal() at all
just so it can get a fsck_options when git_config() already supports
passing along some void* data.
Let's just make use of that instead, which gets us rid of the two
wrapper functions, and brings fsck's common config callback in line
with other such reusable config callbacks.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch fixes a bug where git-bisect doesn't handle receiving
annotated tags as "git bisect good <tag>", etc. It's a regression in
27257bc466 (bisect--helper: reimplement `bisect_state` & `bisect_head`
shell functions in C, 2020-10-15).
The original shell code called:
sha=$(git rev-parse --verify "$rev^{commit}") ||
die "$(eval_gettext "Bad rev input: \$rev")"
which will peel the input to a commit (or complain if that's not
possible). But the C code just calls get_oid(), which will yield the oid
of the tag.
The fix is to peel to a commit. The error message here is a little
non-idiomatic for Git (since it starts with a capital). I've mostly left
it, as it matches the other converted messages (like the "Bad rev input"
we print when get_oid() fails), though I did add an indication that it
was the peeling that was the problem. It might be worth taking a pass
through this converted code to modernize some of the error messages.
Note also that the test does a bare "grep" (not i18ngrep) on the
expected "X is the first bad commit" output message. This matches the
rest of the test script.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Calls to `chdir()` are dangerous in a multi-threaded context. If
`unix_stream_listen()` or `unix_stream_connect()` is given a socket
pathname that is too long to fit in a `sockaddr_un` structure, it will
`chdir()` to the parent directory of the requested socket pathname,
create the socket using a relative pathname, and then `chdir()` back.
This is not thread-safe.
Teach `unix_sockaddr_init()` to not allow calls to `chdir()` when this
flag is set.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update `unix_stream_listen()` to take an options structure to override
default behaviors. This commit includes the size of the `listen()` backlog.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git commit --fixup=reword:<commit>` aliases
`--fixup=amend:<commit> --only`, where it creates an empty "amend!"
commit that will reword <commit> without changing its contents when
it is rebased with `--autosquash`.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git commit --fixup=amend:<commit>` will create an "amend!" commit.
The resulting commit message subject will be "amend! ..." where
"..." is the subject line of <commit> and the initial message
body will be <commit>'s message.
The "amend!" commit when rebased with --autosquash will fixup the
contents and replace the commit message of <commit> with the
"amend!" commit's message body.
In order to prevent rebase from creating commits with an empty
message we refuse to create an "amend!" commit if commit message
body is empty.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
template_dir starts off pointing to either argv or nothing. However if
the value supplied in argv is a relative path, absolute_pathdup() is
used to turn it into an absolute path. absolute_pathdup() allocates
a new string, and we then "leak" it when cmd_init_db() completes.
We don't bother to actually free the return value (instead we UNLEAK
it), because there's no significant advantage to doing so here.
Correctly freeing it would require more significant changes to code flow
which would be more noisy than beneficial.
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The primary goal of this change is to stop leaking init_db_template_dir.
This leak can happen because:
1. git_init_db_config() allocates new memory into init_db_template_dir
without first freeing the existing value.
2. init_db_template_dir might already contain data, either because:
2.1 git_config() can be invoked twice with this callback in a single
process - at least 2 allocations are likely.
2.2 A single git_config() allocation can invoke the callback multiple
times for a given key (see further explanation in the function
docs) - each of those calls will trigger another leak.
The simplest fix for the leak would be to free(init_db_template_dir)
before overwriting it. Instead we choose to convert to fetching
init.templatedir via git_config_get_value() as that is more explicit,
more efficient, and avoids allocations (the returned result is owned by
the config cache, so we aren't responsible for freeing it).
If we remove init_db_template_dir, git_init_db_config() ends up being
responsible only for forwarding core.* config values to
platform_core_config(). However platform_core_config() already ignores
non-core.* config values, so we can safely remove git_init_db_config()
and invoke git_config() directly with platform_core_config() as the
callback.
The platform_core_config forwarding was originally added in:
287853392a (mingw: respect core.hidedotfiles = false in git-init again, 2019-03-11
And I suspect the potential for a leak existed since the original
implementation of git_init_db_config in:
90b45187ba (Add `init.templatedir` configuration variable., 2010-02-17)
LSAN output from t0001:
Direct leak of 73 byte(s) in 1 object(s) allocated from:
#0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9a7276 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8
#2 0x9362ad in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2
#3 0x936eaa in strbuf_add /home/ahunt/oss-fuzz/git/strbuf.c:295:2
#4 0x868112 in strbuf_addstr /home/ahunt/oss-fuzz/git/./strbuf.h:304:2
#5 0x86a8ad in expand_user_path /home/ahunt/oss-fuzz/git/path.c:758:2
#6 0x720bb1 in git_config_pathname /home/ahunt/oss-fuzz/git/config.c:1287:10
#7 0x5960e2 in git_init_db_config /home/ahunt/oss-fuzz/git/builtin/init-db.c:161:11
#8 0x7255b8 in configset_iter /home/ahunt/oss-fuzz/git/config.c:1982:7
#9 0x7253fc in repo_config /home/ahunt/oss-fuzz/git/config.c:2311:2
#10 0x725ca7 in git_config /home/ahunt/oss-fuzz/git/config.c:2399:2
#11 0x593e8d in create_default_files /home/ahunt/oss-fuzz/git/builtin/init-db.c:225:2
#12 0x5935c6 in init_db /home/ahunt/oss-fuzz/git/builtin/init-db.c:449:11
#13 0x59588e in cmd_init_db /home/ahunt/oss-fuzz/git/builtin/init-db.c:714:9
#14 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
#15 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
#16 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
#17 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
#18 0x69c4de in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
#19 0x7f23552d6349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make sure that we release the temporary strbuf during dwim_branch() for
all codepaths (and not just for the early return).
This leak appears to have been introduced in:
f60a7b763f (worktree: teach "add" to check out existing branches, 2018-04-24)
Note that UNLEAK(branchname) is still needed: the returned result is
used in add(), and is stored in a pointer which is used to point at one
of:
- a string literal ("HEAD")
- member of argv (whatever the user specified in their invocation)
- or our newly allocated string returned from dwim_branch()
Fixing the branchname leak isn't impossible, but does not seem
worthwhile given that add() is called directly from cmd_main(), and
cmd_main() returns immediately thereafter - UNLEAK is good enough.
This leak was found when running t0001 with LSAN, see also LSAN output
below:
Direct leak of 60 byte(s) in 1 object(s) allocated from:
#0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9ab076 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8
#2 0x939fcd in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2
#3 0x93af53 in strbuf_splice /home/ahunt/oss-fuzz/git/strbuf.c:239:3
#4 0x83559a in strbuf_check_branch_ref /home/ahunt/oss-fuzz/git/object-name.c:1593:2
#5 0x6988b9 in dwim_branch /home/ahunt/oss-fuzz/git/builtin/worktree.c:454:20
#6 0x695f8f in add /home/ahunt/oss-fuzz/git/builtin/worktree.c:525:19
#7 0x694a04 in cmd_worktree /home/ahunt/oss-fuzz/git/builtin/worktree.c:1036:10
#8 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
#9 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
#10 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
#11 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
#12 0x69caee in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
#13 0x7f7b7dd10349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of these pointers can safely be freed when cmd_clone() completes,
therefore we make sure to free them. The one exception is that we
have to UNLEAK(repo) because it can point either to argv[0], or a
malloc'd string returned by absolute_pathdup().
We also have to free(path) in the middle of cmd_clone(): later during
cmd_clone(), path is unconditionally overwritten with a different path,
triggering a leak. Freeing the first path immediately after use (but
only in the case where it contains data) seems like the cleanest
solution, as opposed to freeing it unconditionally before path is reused
for another path. This leak appears to have been introduced in:
f38aa83f9a (use local cloning if insteadOf makes a local URL, 2014-07-17)
These leaks were found when running t0001 with LSAN, see also an excerpt
of the LSAN output below (the full list is omitted because it's far too
long, and mostly consists of indirect leakage of members of the refs we
are freeing).
Direct leak of 178 byte(s) in 1 object(s) allocated from:
#0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9a6ff4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8
#2 0x9a6fca in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9
#3 0x8ce296 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8
#4 0x8d2ebd in guess_remote_head /home/ahunt/oss-fuzz/git/remote.c:2215:10
#5 0x51d0c5 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1308:4
#6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
#7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
#8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
#9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
#10 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
#11 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 165 byte(s) in 1 object(s) allocated from:
#0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9a6fc4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8
#2 0x9a6f9a in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9
#3 0x8ce266 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8
#4 0x51e9bd in wanted_peer_refs /home/ahunt/oss-fuzz/git/builtin/clone.c:574:21
#5 0x51cfe1 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1284:17
#6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
#7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
#8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
#9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
#10 0x69c42e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
#11 0x7f8fef0c2349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 178 byte(s) in 1 object(s) allocated from:
#0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9a6ff4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8
#2 0x9a6fca in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9
#3 0x8ce296 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8
#4 0x8d2ebd in guess_remote_head /home/ahunt/oss-fuzz/git/remote.c:2215:10
#5 0x51d0c5 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1308:4
#6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
#7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
#8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
#9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
#10 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
#11 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 165 byte(s) in 1 object(s) allocated from:
#0 0x49a6b2 in calloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
#1 0x9a72f2 in xcalloc /home/ahunt/oss-fuzz/git/wrapper.c:140:8
#2 0x8ce203 in alloc_ref_with_prefix /home/ahunt/oss-fuzz/git/remote.c:867:20
#3 0x8ce1a2 in alloc_ref /home/ahunt/oss-fuzz/git/remote.c:875:9
#4 0x72f63e in process_ref_v2 /home/ahunt/oss-fuzz/git/connect.c:426:8
#5 0x72f21a in get_remote_refs /home/ahunt/oss-fuzz/git/connect.c:525:8
#6 0x979ab7 in handshake /home/ahunt/oss-fuzz/git/transport.c:305:4
#7 0x97872d in get_refs_via_connect /home/ahunt/oss-fuzz/git/transport.c:339:9
#8 0x9774b5 in transport_get_remote_refs /home/ahunt/oss-fuzz/git/transport.c:1388:4
#9 0x51cf80 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1271:9
#10 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
#11 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
#12 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
#13 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
#14 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
#15 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 105 byte(s) in 1 object(s) allocated from:
#0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9a71f6 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8
#2 0x93622d in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2
#3 0x937a73 in strbuf_addch /home/ahunt/oss-fuzz/git/./strbuf.h:231:3
#4 0x939fcd in strbuf_add_absolute_path /home/ahunt/oss-fuzz/git/strbuf.c:911:4
#5 0x69d3ce in absolute_pathdup /home/ahunt/oss-fuzz/git/abspath.c:261:2
#6 0x51c688 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1021:10
#7 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
#8 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
#9 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
#10 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
#11 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
#12 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dwim_ref() allocs a new string into ref. Instead of setting to NULL to
discard it, we can FREE_AND_NULL.
This leak appears to have been introduced in:
4cf76f6bbf (builtin/reset: compute checkout metadata for reset, 2020-03-16)
This leak was found when running t0001 with LSAN, see also LSAN output below:
Direct leak of 5 byte(s) in 1 object(s) allocated from:
#0 0x486514 in strdup /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
#1 0x9a7108 in xstrdup /home/ahunt/oss-fuzz/git/wrapper.c:29:14
#2 0x8add6b in expand_ref /home/ahunt/oss-fuzz/git/refs.c:670:12
#3 0x8ad777 in repo_dwim_ref /home/ahunt/oss-fuzz/git/refs.c:644:22
#4 0x6394af in dwim_ref /home/ahunt/oss-fuzz/git/./refs.h:162:9
#5 0x637e5c in cmd_reset /home/ahunt/oss-fuzz/git/builtin/reset.c:426:4
#6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
#7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
#8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
#9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
#10 0x69c5ce in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
#11 0x7f57ebb9d349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
shorten_unambiguous_ref() returns an allocated string. We have to
track it separately from the const refname.
This leak has existed since:
9ab55daa55 (git symbolic-ref --delete $symref, 2012-10-21)
This leak was found when running t0001 with LSAN, see also LSAN output
below:
Direct leak of 19 byte(s) in 1 object(s) allocated from:
#0 0x486514 in strdup /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
#1 0x9ab048 in xstrdup /home/ahunt/oss-fuzz/git/wrapper.c:29:14
#2 0x8b452f in refs_shorten_unambiguous_ref /home/ahunt/oss-fuzz/git/refs.c
#3 0x8b47e8 in shorten_unambiguous_ref /home/ahunt/oss-fuzz/git/refs.c:1287:9
#4 0x679fce in check_symref /home/ahunt/oss-fuzz/git/builtin/symbolic-ref.c:28:14
#5 0x679ad8 in cmd_symbolic_ref /home/ahunt/oss-fuzz/git/builtin/symbolic-ref.c:70:9
#6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
#7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
#8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
#9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
#10 0x69cc6e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
#11 0x7f98388a4349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add and apply a semantic patch for converting code that open-codes
CALLOC_ARRAY to use it instead. It shortens the code and infers the
element size automatically.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 552955ed7f ("clone: use more conventional config/option layering",
2020-10-01), clone learned to read configuration options earlier in its
execution, before creating the new repository. However, that led to a
problem: if the core.bare setting is set to false in the global config,
cloning a bare repository segfaults. This happens because the
repository is falsely thought to be non-bare, but clone has set the work
tree to NULL, which is then dereferenced.
The code to initialize the repository already considers the fact that a
user might want to override the --bare option for git init, but it
doesn't take into account clone, which uses a different option. Let's
just check that the work tree is not NULL, since that's how clone
indicates that the repository is bare. This is also the case for git
init, so we won't be regressing that case.
Reported-by: Joseph Vusich <jvusich@amazon.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit teaches `git stash show --include-untracked`. It
may be desirable for a user to be able to always enable the
--include-untracked behavior. Teach the stash.showIncludeUntracked
config option which allows users to do this in a similar manner to
stash.showPatch.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stash entries can be made with untracked files via
`git stash push --include-untracked`. However, because the untracked
files are stored in the third parent of the stash entry and not the
stash entry itself, running `git stash show` does not include the
untracked files as part of the diff.
With --include-untracked, untracked paths, which are recorded in the
third-parent if it exists, are shown in addition to the paths that have
modifications between the stash base and the working tree in the stash.
It is possible to manually craft a malformed stash entry where duplicate
untracked files in the stash entry will mask tracked files. We detect
and error out in that case via a custom unpack_trees() callback:
stash_worktree_untracked_merge().
Also, teach stash the --only-untracked option which only shows the
untracked files of a stash entry. This is similar to `git show stash^3`
but it is nice to provide a convenient abstraction for it so that users
do not have to think about the underlying implementation.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The comment in this block is meant to indicate that passing '--all',
'--reflog', and so on aren't necessary when repacking with the
'--geometric' option.
But, it has two problems: first, it is factually incorrect ('--all' is
*not* incompatible with '--stdin-packs' as the comment suggests);
second, it is quite focused on the geometric case for a block that is
guarding against it.
Reword this comment to address both issues.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a number of places in the geometric repack code where we
multiply the number of objects in a pack by another unsigned value. We
trust that the number of objects in a pack is always representable by a
uint32_t, but we don't necessarily trust that that number can be
multiplied without overflow.
Sprinkle some unsigned_add_overflows() and unsigned_mult_overflows() in
split_pack_geometry() to check that we never overflow any unsigned types
when adding or multiplying them.
Arguably these checks are a little too conservative, and certainly they
do not help the readability of this function. But they are serving a
useful purpose, so I think they are worthwhile overall.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To determine the where to place the split when repacking with the
'--geometric' option, split_pack_geometry() assigns the "split" variable
and then decrements it in a loop.
It would be equivalent (and more readable) to assign the split to the
loop position after exiting the loop, so do that instead.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 0fabafd0b9 (builtin/repack.c: add '--geometric' option, 2021-02-22),
the 'git repack --geometric' code aborts early when there is zero or one
pack.
When there are no packs, this code does the right thing by placing the
split at "0". But when there is exactly one pack, the split is placed at
"1", which means that "git repack --geometric" (with any factor)
repacks all of the objects in a single pack.
This is wasteful, and the remaining code in split_pack_geometry() does
the right thing (not repacking the objects in a single pack) even when
only one pack is present.
Loosen the guard to only stop when there aren't any packs, and let the
rest of the code do the right thing. Add a test to ensure that this is
the case.
Noticed-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The following sequence, on a case-insensitive file system,
(strictly speeking with core.ignorecase=true)
leads to an assertion failure and leaves .git/index.lock behind.
git init
echo foo >foo
git add foo
git mv foo FOO
git mv foo bar
This regression was introduced in Commit 9b906af657,
"git-mv: improve error message for conflicted file"
The bugfix is to change the "file exist case-insensitive in the index"
into the correct "file exist (case-sensitive) in the index".
This avoids the "assert" later in the code and keeps setting up the
"ce" pointer for ce_stage(ce) done in the next else if.
This fixes
https://github.com/git-for-windows/git/issues/2920
Reported-By: Dan Moseley <Dan.Moseley@microsoft.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The approach to "fsck" the incoming objects in "index-pack" is
attractive for performance reasons (we have them already in core,
inflated and ready to be inspected), but fundamentally cannot be
applied fully when we receive more than one pack stream, as a tree
object in one pack may refer to a blob object in another pack as
".gitmodules", when we want to inspect blobs that are used as
".gitmodules" file, for example. Teach "index-pack" to emit
objects that must be inspected later and check them in the calling
"fetch-pack" process.
* jt/transfer-fsck-across-packs:
fetch-pack: print and use dangling .gitmodules
fetch-pack: with packfile URIs, use index-pack arg
http-fetch: allow custom index-pack args
http: allow custom index-pack args
"git push $there --delete ''" should have been diagnosed as an
error, but instead turned into a matching push, which has been
corrected.
* jc/push-delete-nothing:
push: do not turn --delete '' into a matching push
The "git maintenance register" command had trouble registering bare
repositories, which had been corrected.
* es/maintenance-of-bare-repositories:
maintenance: fix incorrect `maintenance.repo` path with bare repository
Various fixes on "git add --chmod".
* mt/add-chmod-fixes:
add: propagate --chmod errors to exit status
add: mark --chmod error string for translation
add --chmod: don't update index when --dry-run is used
"git rebase --[no-]fork-point" gained a configuration variable
rebase.forkPoint so that users do not have to keep specifying a
non-default setting.
* ah/rebase-no-fork-point-config:
rebase: add a config option for --no-fork-point
"git grep" has been tweaked to be limited to the sparse checkout
paths.
* mt/grep-sparse-checkout:
grep: honor sparse-checkout on working tree searches
"git {diff,log} --{skip,rotate}-to=<path>" allows the user to
discard diff output for early paths or move them to the end of the
output.
* jc/diffcore-rotate:
diff: --{rotate,skip}-to=<path>
The error codepath around the "--temp/--prefix" feature of "git
checkout-index" has been improved.
* mt/checkout-index-corner-cases:
checkout-index: omit entries with no tempname from --temp output
write_entry(): fix misuses of `path` in error messages
When a remote is renamed don't change the canonical "*.pushRemote"
form to "*.pushremote". Fixes and tests for a minor bug in
923d4a5ca4 (remote rename/remove: handle branch.<name>.pushRemote
config values, 2020-01-27). See the preceding commit for why this does
& doesn't matter.
While we're at it let's also test that we handle the "*.pushDefault"
key correctly. The code to handle that was added in
b3fd6cbf29 (remote rename/remove: gently handle remote.pushDefault
config, 2020-02-01) and does the right thing, but nothing tested that
we wrote out the canonical camel-cased form.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change "git remote add" so that it adds a *.tagOpt key, and not the
lower-cased *.tagopt on "git remote add --no-tags", just as "git clone
--no-tags" would do.
This doesn't matter for anything that reads the config. It's just
prettier if we write config keys in their documented camelCase form to
user-readable config files.
When I added support for "clone -no-tags" in 0dab2468ee (clone: add a
--no-tags option to clone without tags, 2017-04-26) I made it use
the *.tagOpt form, but the older "git remote add" added in
111fb85865 (remote add: add a --[no-]tags option, 2010-04-20) has
been using *.tagopt all this time.
It's easy enough to add a test for this, so let's do that. We can't
use "git config -l" there, because it'll normalize the keys to their
lower-cased form. Let's add the test for "git clone" too for good
measure, not just to the "git remote" codepath we're fixing.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If `add` encounters an error while applying the --chmod changes, it
prints a message to stderr, but exits with a success code. This might
have been an oversight, as the command does exit with a non-zero code in
other situations where it cannot (or refuses to) update all of the
requested paths (e.g. when some of the given paths are ignored). So make
the exit behavior more consistent by also propagating --chmod errors to
the exit status.
Note: the test "all statuses changed in folder if . is given" uses paths
added by previous test cases, some of which might be symbolic links.
Because `git add --chmod` will now fail with such paths, this test would
depend on whether all the previous tests were executed, or only some
of them. Avoid that by running the test on a fresh repo with only
regular files.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This error message is intended for humans, so mark it for translation.
Also use error() instead of fprintf(stderr, ...), to make the
corresponding line a bit cleaner, and to display the "error:" prefix,
which helps classifying the nature/severity of the message.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git add --chmod` applies the mode changes even when `--dry-run` is
used. Fix that and add some tests for this option combination.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some users (myself included) would prefer to have this feature off by
default because it can silently drop commits.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we added a syntax sugar "git push remote --delete <ref>" to
"git push" as a synonym to the canonical "git push remote :<ref>"
syntax at f517f1f2 (builtin-push: add --delete as syntactic sugar
for :foo, 2009-12-30), we weren't careful enough to make sure that
<ref> is not empty.
Blindly rewriting "--delete <ref>" to ":<ref>" means that an empty
string <ref> results in refspec ":", which is the syntax to ask for
"matching" push that does not delete anything.
Worse yet, if there were matching refs that can be fast-forwarded,
they would have been published prematurely, even if the user feels
that they are not ready yet to be pushed out, which would be a real
disaster.
Noticed-by: Tilman Vogel <tilman.vogel@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When an error message informs the user about an incorrect command
invocation, it should refer to "arguments", not "parameters".
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The periodic maintenance tasks configured by `git maintenance start`
invoke `git for-each-repo` to run `git maintenance run` on each path
specified by the multi-value global configuration variable
`maintenance.repo`. Because `git for-each-repo` will likely be run
outside of the repositories which require periodic maintenance, it is
mandatory that the repository paths specified by `maintenance.repo` are
absolute.
Unfortunately, however, `git maintenance register` does nothing to
ensure that the paths it assigns to `maintenance.repo` are indeed
absolute, and may in fact -- especially in the case of a bare repository
-- assign a relative path to `maintenance.repo` instead. Fix this
problem by converting all paths to absolute before assigning them to
`maintenance.repo`.
While at it, also fix `git maintenance unregister` to convert paths to
absolute, as well, in order to ensure that it can correctly remove from
`maintenance.repo` a path assigned via `git maintenance register`.
Reported-by: Clement Moyroud <clement.moyroud@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Often it is useful to both:
- have relatively few packfiles in a repository, and
- avoid having so few packfiles in a repository that we repack its
entire contents regularly
This patch implements a '--geometric=<n>' option in 'git repack'. This
allows the caller to specify that they would like each pack to be at
least a factor times as large as the previous largest pack (by object
count).
Concretely, say that a repository has 'n' packfiles, labeled P1, P2,
..., up to Pn. Each packfile has an object count equal to 'objects(Pn)'.
With a geometric factor of 'r', it should be that:
objects(Pi) > r*objects(P(i-1))
for all i in [1, n], where the packs are sorted by
objects(P1) <= objects(P2) <= ... <= objects(Pn).
Since finding a true optimal repacking is NP-hard, we approximate it
along two directions:
1. We assume that there is a cutoff of packs _before starting the
repack_ where everything to the right of that cut-off already forms
a geometric progression (or no cutoff exists and everything must be
repacked).
2. We assume that everything smaller than the cutoff count must be
repacked. This forms our base assumption, but it can also cause
even the "heavy" packs to get repacked, for e.g., if we have 6
packs containing the following number of objects:
1, 1, 1, 2, 4, 32
then we would place the cutoff between '1, 1' and '1, 2, 4, 32',
rolling up the first two packs into a pack with 2 objects. That
breaks our progression and leaves us:
2, 1, 2, 4, 32
^
(where the '^' indicates the position of our split). To restore a
progression, we move the split forward (towards larger packs)
joining each pack into our new pack until a geometric progression
is restored. Here, that looks like:
2, 1, 2, 4, 32 ~> 3, 2, 4, 32 ~> 5, 4, 32 ~> ... ~> 9, 32
^ ^ ^ ^
This has the advantage of not repacking the heavy-side of packs too
often while also only creating one new pack at a time. Another wrinkle
is that we assume that loose, indexed, and reflog'd objects are
insignificant, and lump them into any new pack that we create. This can
lead to non-idempotent results.
Suggested-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we have find_kept_pack_entry(), we don't have to manually keep
hunting through every pack to find a possible "kept" duplicate of the
object. This should be faster, assuming only a portion of your total
packs are actually kept.
Note that we have to re-order the logic a bit here; we can deal with the
disqualifying situations first (e.g., finding the object in a non-local
pack with --local), then "kept" situation(s), and then just fall back to
other "--local" conditions.
Here are the results from p5303 (measurements again taken on the
kernel):
Test HEAD^ HEAD
-----------------------------------------------------------------------------------------------
5303.5: repack (1) 57.26(54.59+10.84) 57.34(54.66+10.88) +0.1%
5303.6: repack with kept (1) 57.33(54.80+10.51) 57.38(54.83+10.49) +0.1%
5303.11: repack (50) 71.54(88.57+4.84) 71.70(88.99+4.74) +0.2%
5303.12: repack with kept (50) 85.12(102.05+4.94) 72.58(89.61+4.78) -14.7%
5303.17: repack (1000) 216.87(490.79+14.57) 217.19(491.72+14.25) +0.1%
5303.18: repack with kept (1000) 665.63(938.87+15.76) 246.12(520.07+14.93) -63.0%
and the --stdin-packs timings:
5303.7: repack with --stdin-packs (1) 0.01(0.01+0.00) 0.00(0.00+0.00) -100.0%
5303.13: repack with --stdin-packs (50) 3.53(12.07+0.24) 3.43(11.75+0.24) -2.8%
5303.19: repack with --stdin-packs (1000) 195.83(371.82+8.10) 130.50(307.15+7.66) -33.4%
So our repack with an empty .keep pack is roughly as fast as one without
a .keep pack up to 50 packs. But the --stdin-packs case scales a little
better, too.
Notably, it is faster than a repack of the same size and a kept pack. It
looks at fewer objects, of course, but the penalty for looking at many
packs isn't as costly.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In an upcoming commit, 'git repack' will want to create a pack comprised
of all of the objects in some packs (the included packs) excluding any
objects in some other packs (the excluded packs).
This caller could iterate those packs themselves and feed the objects it
finds to 'git pack-objects' directly over stdin, but this approach has a
few downsides:
- It requires every caller that wants to drive 'git pack-objects' in
this way to implement pack iteration themselves. This forces the
caller to think about details like what order objects are fed to
pack-objects, which callers would likely rather not do.
- If the set of objects in included packs is large, it requires
sending a lot of data over a pipe, which is inefficient.
- The caller is forced to keep track of the excluded objects, too, and
make sure that it doesn't send any objects that appear in both
included and excluded packs.
But the biggest downside is the lack of a reachability traversal.
Because the caller passes in a list of objects directly, those objects
don't get a namehash assigned to them, which can have a negative impact
on the delta selection process, causing 'git pack-objects' to fail to
find good deltas even when they exist.
The caller could formulate a reachability traversal themselves, but the
only way to drive 'git pack-objects' in this way is to do a full
traversal, and then remove objects in the excluded packs after the
traversal is complete. This can be detrimental to callers who care
about performance, especially in repositories with many objects.
Introduce 'git pack-objects --stdin-packs' which remedies these four
concerns.
'git pack-objects --stdin-packs' expects a list of pack names on stdin,
where 'pack-xyz.pack' denotes that pack as included, and
'^pack-xyz.pack' denotes it as excluded. The resulting pack includes all
objects that are present in at least one included pack, and aren't
present in any excluded pack.
To address the delta selection problem, 'git pack-objects --stdin-packs'
works as follows. First, it assembles a list of objects that it is going
to pack, as above. Then, a reachability traversal is started, whose tips
are any commits mentioned in included packs. Upon visiting an object, we
find its corresponding object_entry in the to_pack list, and set its
namehash parameter appropriately.
To avoid the traversal visiting more objects than it needs to, the
traversal is halted upon encountering an object which can be found in an
excluded pack (by marking the excluded packs as kept in-core, and
passing --no-kept-objects=in-core to the revision machinery).
This can cause the traversal to halt early, for example if an object in
an included pack is an ancestor of ones in excluded packs. But stopping
early is OK, since filling in the namehash fields of objects in the
to_pack list is only additive (i.e., having it helps the delta selection
process, but leaving it blank doesn't impact the correctness of the
resulting pack).
Even still, it is unlikely that this hurts us much in practice, since
the 'git repack --geometric' caller (which is introduced in a later
commit) marks small packs as included, and large ones as excluded.
During ordinary use, the small packs usually represent pushes after a
large repack, and so are unlikely to be ancestors of objects that
already exist in the repository.
(I found it convenient while developing this patch to have 'git
pack-objects' report the number of objects which were visited and got
their namehash fields filled in during traversal. This is also included
in the below patch via trace2 data lines).
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A small memleak in "diff -I<regexp>" has been corrected.
* ab/diff-deferred-free:
diff: plug memory leak from regcomp() on {log,diff} -I
diff: add an API for deferred freeing
Signed commits and tags now allow verification of objects, whose
two object names (one in SHA-1, the other in SHA-256) are both
signed.
* bc/signed-objects-with-both-hashes:
gpg-interface: remove other signature headers before verifying
ref-filter: hoist signature parsing
commit: allow parsing arbitrary buffers with headers
gpg-interface: improve interface for parsing tags
commit: ignore additional signatures when parsing signed commits
ref-filter: switch some uses of unsigned long to size_t
Documentation, code and test clean-up around "git stash".
* dl/stash-cleanup:
stash: declare ref_stash as an array
t3905: use test_cmp() to check file contents
t3905: replace test -s with test_file_not_empty
t3905: remove nested git in command substitution
t3905: move all commands into test cases
t3905: remove spaces after redirect operators
git-stash.txt: be explicit about subcommand options
Teach index-pack to print dangling .gitmodules links after its "keep" or
"pack" line instead of declaring an error, and teach fetch-pack to check
such lines printed.
This allows the tree side of the .gitmodules link to be in one packfile
and the blob side to be in another without failing the fsck check,
because it is now fetch-pack which checks such objects after all
packfiles have been downloaded and indexed (and not index-pack on an
individual packfile, as it is before this commit).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git reflog expire --stale-fix" can be used to repair the reflog by
removing entries that refer to objects that have been pruned away,
but was not careful to tolerate missing objects.
* js/reflog-expire-stale-fix:
reflog expire --stale-fix: be generous about missing objects
"git grep --untracked" is meant to be "let's ALSO find in these
files on the filesystem" when looking for matches in the working
tree files, and does not make any sense if the primary search is
done against the index, or the tree objects. The "--cached" and
"--untracked" options have been marked as mutually incompatible.
* mt/grep-cached-untracked:
grep: error out if --untracked is used with --cached
The "git range-diff" command learned "--(left|right)-only" option
to show only one side of the compared range.
* js/range-diff-one-side-only:
range-diff: offer --left-only/--right-only options
range-diff: move the diffopt initialization down one layer
range-diff: combine all options in a single data structure
range-diff: simplify code spawning `git log`
range-diff: libify the read_patches() function again
range-diff: avoid leaking memory in two error code paths
There are other ways than ".." for a single token to denote a
"commit range", namely "<rev>^!" and "<rev>^-<n>", but "git
range-diff" did not understand them.
* js/range-diff-wo-dotdot:
range-diff(docs): explain how to specify commit ranges
range-diff/format-patch: handle commit ranges other than A..B
range-diff/format-patch: refactor check for commit range
"git clone" tries to locally check out the branch pointed at by
HEAD of the remote repository after it is done, but the protocol
did not convey the information necessary to do so when copying an
empty repository. The protocol v2 learned how to do so.
* jt/clone-unborn-head:
clone: respect remote unborn HEAD
connect, transport: encapsulate arg in struct
ls-refs: report unborn targets of symrefs
Piecemeal of rewrite of "git bisect" in C continues.
* mr/bisect-in-c-4:
bisect--helper: retire `--check-and-set-terms` subcommand
bisect--helper: reimplement `bisect_skip` shell function in C
bisect--helper: retire `--bisect-auto-next` subcommand
bisect--helper: use `res` instead of return in BISECT_RESET case option
bisect--helper: retire `--bisect-write` subcommand
bisect--helper: reimplement `bisect_replay` shell function in C
bisect--helper: reimplement `bisect_log` shell function in C
Change the setup of the "pcre2_general_context" to happen per-thread
in compile_pcre2_pattern() instead of in grep_init().
This change brings it in line with how the rest of the pcre2_* members
in the grep_pat structure are set up.
As noted in the preceding commit the approach 513f2b0bbd (grep: make
PCRE2 aware of custom allocator, 2019-10-16) took to allocate the
pcre2_general_context seems to have been initially based on a
misunderstanding of how PCREv2 memory allocation works.
The approach of creating a global context in grep_init() is just added
complexity for almost zero gain. On my system it's 24 bytes saved
per-thread. For comparison PCREv2 will then go on to allocate at least
a kilobyte for its own thread-local state.
As noted in 6d423dd542 (grep: don't redundantly compile throwaway
patterns under threading, 2017-05-25) the grep code is intentionally
not trying to micro-optimize allocations by e.g. sharing some PCREv2
structures globally, while making others thread-local.
So let's remove this special case and make all of them thread-local
again for simplicity. With this change we could move the
pcre2_{malloc,free} functions around to live closer to their current
use. I'm not doing that here to keep this change small, that cleanup
will be done in a follow-up commit.
See also the discussion in 94da9193a6 (grep: add support for PCRE v2,
2017-06-01) about thread safety, and Johannes's comments[1] to the
effect that we should be doing what this patch is doing.
1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.1908052120302.46@tvgsbejvaqbjf.bet/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `git blame --color-by-age`, the determine_line_heat() is called to
select how to color the output based on the commit's author date. It
uses the get_commit_info() to parse the information into a `commit_info`
structure, however, this is actually unnecessary because the
determine_line_heat() caller also does the same.
Instead, let's change the determine_line_heat() to take a `commit_info`
structure and remove the internal call to get_commit_info() thus
cleaning up and optimizing the code path.
Enabling Git's trace2 API in order to record the execution time for
every call to determine_line_heat() function:
+ trace2_region_enter("blame", "determine_line_heat", the_repository);
determine_line_heat(ent, &default_color);
+ trace2_region_enter("blame", "determine_line_heat", the_repository);
Then, running `git blame` for "kernel/fork.c" in linux.git and summing
all the execution time for every call (around 1.3k calls) resulted in
2.6x faster execution (best out 3):
git built from 328c109303 (The eighth batch, 2021-02-12) = 42ms
git built from 328c109303 + this change = 16ms
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With --temp (or --stage=all, which implies --temp), checkout-index
writes a list to stdout associating temporary file names to the entries'
names. But if it fails to write an entry, and the failure happens before
even assigning a temporary filename to that entry, we get an odd output
line. This can be seen when trying to check out a symlink whose blob is
missing:
$ missing_blob=$(git hash-object --stdin </dev/null)
$ git update-index --add --cacheinfo 120000,$missing_blob,foo
$ git checkout-index --temp foo
error: unable to read sha1 file of foo (e69de29bb2)
foo
The 'TAB foo' line is not much useful and it might break scripts that
expect the 'tempname TAB foo' output. So let's omit such entries from
the stdout list (but leaving the error message on stderr).
We could also consider omitting _all_ failed entries from the output
list, but that's probably not a good idea as the associated tempfiles
may have been created even when checkout failed, so scripts may want to
use the output list for cleanup.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a number of callers of add_patterns() and its sibling
functions. Let's give them a "flags" parameter for adding new options
without having to touch each caller. We'll use this in a future patch to
add O_NOFOLLOW support. But for now each caller just passes 0.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the implementation of "git difftool", there is a case where the
user wants to start viewing the diffs at a specific path and
continue on to the rest, optionally wrapping around to the
beginning. Since it is somewhat cumbersome to implement such a
feature as a post-processing step of "git diff" output, let's
support it internally with two new options.
- "git diff --rotate-to=C", when the resulting patch would show
paths A B C D E without the option, would "rotate" the paths to
shows patch to C D E A B instead. It is an error when there is
no patch for C is shown.
- "git diff --skip-to=C" would instead "skip" the paths before C,
and shows patch to C D E. Again, it is an error when there is no
patch for C is shown.
- "git log [-p]" also accepts these two options, but it is not an
error if there is no change to the specified path. Instead, the
set of output paths are rotated or skipped to the specified path
or the first path that sorts after the specified path.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When commands are started from a subdirectory, they may have to
compare the path to the subdirectory (called prefix and found out
from $(pwd)) with the tracked paths. On macOS, $(pwd) and
readdir() yield decomposed path, while the tracked paths are
usually normalized to the precomposed form, causing mismatch. This
has been fixed by taking the same approach used to normalize the
command line arguments.
* tb/precompose-prefix-too:
MacOS: precompose_argv_prefix()
Introduce an on-disk file to record revindex for packdata, which
traditionally was always created on the fly and only in-core.
* tb/pack-revindex-on-disk:
t5325: check both on-disk and in-memory reverse index
pack-revindex: ensure that on-disk reverse indexes are given precedence
t: support GIT_TEST_WRITE_REV_INDEX
t: prepare for GIT_TEST_WRITE_REV_INDEX
Documentation/config/pack.txt: advertise 'pack.writeReverseIndex'
builtin/pack-objects.c: respect 'pack.writeReverseIndex'
builtin/index-pack.c: write reverse indexes
builtin/index-pack.c: allow stripping arbitrary extensions
pack-write.c: prepare to write 'pack-*.rev' files
packfile: prepare for the existence of '*.rev' files
Save sizeof(const char *) bytes by declaring ref_stash as an array
instead of having a redundant pointer to an array.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It can sometimes be useful to see which refs are contributing to the
overall repository size (e.g., does some branch have a bunch of objects
not found elsewhere in history, which indicates that deleting it would
shrink the size of a clone).
You can find that out by generating a list of objects, getting their
sizes from cat-file, and then summing them, like:
git rev-list --objects --no-object-names main..branch
git cat-file --batch-check='%(objectsize:disk)' |
perl -lne '$total += $_; END { print $total }'
Though note that the caveats from git-cat-file(1) apply here. We "blame"
base objects more than their deltas, even though the relationship could
easily be flipped. Still, it can be a useful rough measure.
But one problem is that it's slow to run. Teaching rev-list to sum up
the sizes can be much faster for two reasons:
1. It skips all of the piping of object names and sizes.
2. If bitmaps are in use, for objects that are in the
bitmapped packfile we can skip the oid_object_info()
lookup entirely, and just ask the revindex for the
on-disk size.
This patch implements a --disk-usage option which produces the same
answer in a fraction of the time. Here are some timings using a clone of
torvalds/linux:
[rev-list piped to cat-file, no bitmaps]
$ time git rev-list --objects --no-object-names --all |
git cat-file --buffer --batch-check='%(objectsize:disk)' |
perl -lne '$total += $_; END { print $total }'
1459938510
real 0m29.635s
user 0m38.003s
sys 0m1.093s
[internal, no bitmaps]
$ time git rev-list --disk-usage --objects --all
1459938510
real 0m31.262s
user 0m30.885s
sys 0m0.376s
Even though the wall-clock time is slightly worse due to parallelism,
notice the CPU savings between the two. We saved 21% of the CPU just by
avoiding the pipes.
But the real win is with bitmaps. If we use them without the new option:
[rev-list piped to cat-file, bitmaps]
$ time git rev-list --objects --no-object-names --all --use-bitmap-index |
git cat-file --batch-check='%(objectsize:disk)' |
perl -lne '$total += $_; END { print $total }'
1459938510
real 0m6.244s
user 0m8.452s
sys 0m0.311s
then we're faster to generate the list of objects, but we still spend a
lot of time piping and looking things up. But if we do both together:
[internal, bitmaps]
$ time git rev-list --disk-usage --objects --all --use-bitmap-index
1459938510
real 0m0.219s
user 0m0.169s
sys 0m0.049s
then we get the same answer much faster.
For "--all", that answer will correspond closely to "du objects/pack",
of course. But we're actually checking reachability here, so we're still
fast when we ask for more interesting things:
$ time git rev-list --disk-usage --use-bitmap-index v5.0..v5.10
374798628
real 0m0.429s
user 0m0.356s
sys 0m0.072s
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Whenever a user runs `git reflog expire --stale-fix`, the most likely
reason is that their repository is at least _somewhat_ corrupt. Which
means that it is more than just possible that some objects are missing.
If that is the case, that can currently let the command abort through
the phase where it tries to mark all reachable objects.
Instead of adding insult to injury, let's be gentle and continue as best
as we can in such a scenario, simply by ignoring the missing objects and
moving on.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a diff_free() function to free anything we may have allocated in
the "diff_options" struct, and the ability to make calling it a noop
by setting "no_free" in "diff_options".
This is required because when e.g. "git diff" is run we'll allocate
things in that struct, use the diff machinery once, and then exit.
But if we run e.g. "git log -p" we're going to re-use what we
allocated across multiple diff_flush() calls, and only want to free
things at the end.
We've thus ended up with features like the recently added "diff -I"[1]
where we'll leak memory. As it turns out it could have simply used the
pattern established in 6ea57703f6 (log: prepare log/log-tree to reuse
the diffopt.close_file attribute, 2016-06-22).
Manually adding more such flags to things log_tree_commit() every time
we need to allocate something would be tedious. Let's instead move
that fclose() code it to a new diff_free(), in anticipation of freeing
more things in that function in follow-up commits.
Some functions such as log_tree_commit() need an idiom of optionally
retaining a previous "no_free", as they may either free the memory
themselves, or their caller may do so. I'm keeping that idiom in
log_show_early() for good measure, even though I don't think it's
currently called in this manner. It also gets passed an existing
"struct rev_info", so future callers may want to set the "no_free"
flag.
This change is a bit hard to read because while the freeing pattern
we're introducing isn't unusual, the "file" member is a special
snowflake. We usually don't want to fclose() it. This is because
"file" is usually stdout, in which case we don't want to fclose()
it. We only want to opt-in to closing it when we e.g. open a file on
the filesystem. Thus the opt-in "close_file" flag.
So the API in general just needs a "no_free" flag to defer freeing,
but the "file" member still needs its "close_file" flag. This is made
more confusing because while refactoring this code we could replace
some "close_file=0" with "no_free=1", whereas others need to set both
flags.
This is because there were some cases where an existing "close_file=0"
meant "let's defer deallocation", and others where it meant "we don't
want to close this file handle at all".
1. 296d4a94e7 (diff: add -I<regex> that ignores matching changes,
2020-10-20)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a function which parses a buffer with a signature at the end,
parse_signature, and this function is used for signed tags. However,
we'll need to store values for multiple algorithms, and we'll do this by
using a header for the non-default algorithm.
Adjust the parse_signature interface to store the parsed data in two
strbufs and turn the existing function into parse_signed_buffer. The
latter is still used in places where we know we always have a signed
buffer, such as push certs.
Adjust all the callers to deal with this new interface.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cleaning various codepaths up.
* ds/more-index-cleanups:
t1092: test interesting sparse-checkout scenarios
test-lib: test_region looks for trace2 regions
sparse-checkout: load sparse-checkout patterns
name-hash: use trace2 regions for init
repository: add repo reference to index_state
fsmonitor: de-duplicate BUG()s around dirty bits
cache-tree: extract subtree_pos()
cache-tree: simplify verify_cache() prototype
cache-tree: clean up cache_tree_update()
Lose the debugging aid that may have been useful in the past, but
no longer is, in the "grep" codepaths.
* ab/lose-grep-debug:
grep/log: remove hidden --debug and --grep-debug options
Code clean-up to ensure our use of hashtables using object names as
keys use the "struct object_id" objects, not the raw hash values.
* jk/use-oid-pos:
oid_pos(): access table through const pointers
hash_pos(): convert to oid_pos()
rerere: use strmap to store rerere directories
rerere: tighten rr-cache dirname check
rerere: check dirname format while iterating rr_cache directory
commit_graft_pos(): take an oid instead of a bare hash
On a sparse checked out repository, `git grep` (without --cached) ends
up searching the cache when an entry matches the search pathspec and has
the SKIP_WORKTREE bit set. This is confusing both because the sparse
paths are not expected to be in a working tree search (as they are not
checked out), and because the output mixes working tree and cache
results without distinguishing them. (Note that grep also resorts to the
cache on working tree searches that include --assume-unchanged paths.
But the whole point in that case is to assume that the contents of the
index entry and the file are the same. This does not apply to the case
of sparse paths, where the file isn't even expected to be present.)
Fix that by teaching grep to honor the sparse-checkout rules for working
tree searches. If the user wants to grep paths outside the current
sparse-checkout definition, they may either update the sparsity rules to
materialize the files, or use --cached to search all blobs registered in
the index.
Note: it might also be interesting to add a configuration option that
allow users to search paths that are present despite having the
SKIP_WORKTREE bit set, and/or to restrict searches in the index and past
revisions too. These ideas are left as future improvements to avoid
conflicting with other sparse-checkout topics currently in flight.
Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the 'maintenance.strategy' config option is set to 'incremental',
a default maintenance schedule is enabled. Add the 'pack-refs' task to
that strategy at the weekly cadence.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is valuable to collect loose refs into a more compressed form. This
is typically the packed-refs file, although this could be the reftable
in the future. Having packed refs can be extremely valuable in repos
with many tags or remote branches that are not modified by the local
user, but still are necessary for other queries.
For instance, with many exploded refs, commands such as
git describe --tags --exact-match HEAD
can be very slow (multiple seconds). This command in particular is used
by terminal prompts to show when a detatched HEAD is pointing to an
existing tag, so having it be slow causes significant delays for users.
Add a new 'pack-refs' maintenance task. It runs 'git pack-refs --all
--prune' to move loose refs into a packed form. For now, that is the
packed-refs file, but could adjust to other file formats in the future.
This is the first of several sub-tasks of the 'gc' task that could be
extracted to their own tasks. In this process, we should not change the
behavior of the 'gc' task since that remains the default way to keep
repositories maintained. Creating a new task for one of these sub-tasks
only provides more customization options for those choosing to not use
the 'gc' task. It is certainly possible to have both the 'gc' and
'pack-refs' tasks enabled and run regularly. While they may repeat
effort, they do not conflict in a destructive way.
The 'auto_condition' function pointer is left NULL for now. We could
extend this in the future to have a condition check if pack-refs should
be run during 'git maintenance run --auto'.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The options --untracked and --cached are not compatible, but if they are
used together, grep just silently ignores --cached and searches the
working tree. Error out, instead, to avoid any potential confusion.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The implementation of "git branch --sort" wrt the detached HEAD
display has always been hacky, which has been cleaned up.
* ab/branch-sort:
branch: show "HEAD detached" first under reverse sort
branch: sort detached HEAD based on a flag
ref-filter: move ref_sorting flags to a bitfield
ref-filter: move "cmp_fn" assignment into "else if" arm
ref-filter: add braces to if/else if/else chain
branch tests: add to --sort tests
branch: change "--local" to "--list" in comment
When comparing commit ranges, one is frequently interested only in one
side, such as asking the question "Has this patch that I submitted to
the Git mailing list been applied?": one would only care about the part
of the output that corresponds to the commits in a local branch.
To make that possible, imitate the `git rev-list` options `--left-only`
and `--right-only`.
This addresses https://github.com/gitgitgadget/git/issues/206
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This will make it easier to implement the `--left-only` and
`--right-only` options.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "pack-objects" command needs to iterate over all the tags when
automatic tag following is enabled, but it actually iterated over
all refs and then discarded everything outside "refs/tags/"
hierarchy, which was quite wasteful.
* jv/pack-objects-narrower-ref-iteration:
builtin/pack-objects.c: avoid iterating all refs
When removing many branches and tags, the code used to do so one
ref at a time. There is another API it can use to delete multiple
refs, and it makes quite a lot of performance difference when the
refs are packed.
* ph/use-delete-refs:
use delete_refs when deleting tags or branches
"git ls-files" can and does show multiple entries when the index is
unmerged, which is a source for confusion unless -s/-u option is in
use. A new option --deduplicate has been introduced.
* zh/ls-files-deduplicate:
ls-files.c: add --deduplicate option
ls_files.c: consolidate two for loops into one
ls_files.c: bugfix for --deleted and --modified
"git log" learned a new "--diff-merges=<how>" option.
* so/log-diff-merge: (32 commits)
t4013: add tests for --diff-merges=first-parent
doc/git-show: include --diff-merges description
doc/rev-list-options: document --first-parent changes merges format
doc/diff-generate-patch: mention new --diff-merges option
doc/git-log: describe new --diff-merges options
diff-merges: add '--diff-merges=1' as synonym for 'first-parent'
diff-merges: add old mnemonic counterparts to --diff-merges
diff-merges: let new options enable diff without -p
diff-merges: do not imply -p for new options
diff-merges: implement new values for --diff-merges
diff-merges: make -m/-c/--cc explicitly mutually exclusive
diff-merges: refactor opt settings into separate functions
diff-merges: get rid of now empty diff_merges_init_revs()
diff-merges: group diff-merge flags next to each other inside 'rev_info'
diff-merges: split 'ignore_merges' field
diff-merges: fix -m to properly override -c/--cc
t4013: add tests for -m failing to override -c/--cc
t4013: support test_expect_failure through ':failure' magic
diff-merges: revise revs->diff flag handling
diff-merges: handle imply -p on -c/--cc logic for log.c
...
"git for-each-repo --config=<var> <cmd>" should not run <cmd> for
any repository when the configuration variable <var> is not defined
even once.
* ds/for-each-repo-noopfix:
for-each-repo: do nothing on empty config
"git stash" did not work well in a sparsely checked out working
tree.
* en/stash-apply-sparse-checkout:
stash: fix stash application in sparse-checkouts
stash: remove unnecessary process forking
t7012: add a testcase demonstrating stash apply bugs in sparse checkouts
Teach Git to use the "unborn" feature introduced in a previous patch as
follows: Git will always send the "unborn" argument if it is supported
by the server. During "git clone", if cloning an empty repository, Git
will use the new information to determine the local branch to create. In
all other cases, Git will ignore it.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a future patch we plan to return the name of an unborn current branch
from deep in the callchain to a caller via a new pointer parameter that
points at a variable in the caller when the caller calls
get_remote_refs() and transport_get_remote_refs().
In preparation for that, encapsulate the existing ref_prefixes
parameter into a struct. The aforementioned unborn current branch will
go into this new struct in the future patch.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test clean-up plus UI improvement by hiding extra refs that
the prefetch task uses from "log --decorate" output.
* ds/maintenance-prefetch-cleanup:
t7900: clean up some broken refs
maintenance: set log.excludeDecoration durin prefetch
The `--check-and-set-terms` subcommand is no longer from the
git-bisect.sh shell script. Instead the function
`check_and_set_terms()` is called from the C implementation.
Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reimplement the `bisect_skip()` shell function in C and also add
`bisect-skip` subcommand to `git bisect--helper` to call it from
git-bisect.sh
Using `--bisect-skip` subcommand is a temporary measure to port shell
function to C so as to use the existing test suite.
Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>