Using the commit-graph file to walk commit history removes the large
cost of parsing commits during the walk. This exposes a performance
issue: lookup_tree() takes a large portion of the computation time,
even when Git never uses those trees.
In anticipation of lazy-loading these trees, rename the 'tree' member
of struct commit to 'maybe_tree'. This serves two purposes: it hints
at the future role of possibly being NULL even if the commit has a
valid tree, and it allows for unambiguous transformation from simple
member access (i.e. commit->maybe_tree) to method access.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"diff" family of commands learned "--find-object=<object-id>" option
to limit the findings to changes that involve the named object.
* sb/diff-blobfind-pickaxe:
diff: use HAS_MULTI_BITS instead of counting bits manually
diff: properly error out when combining multiple pickaxe options
diffcore: add a pickaxe option to find a specific blob
diff: introduce DIFF_PICKAXE_KINDS_MASK
diff: migrate diff_flags.pickaxe_ignore_case to a pickaxe_opts bit
diff.h: make pickaxe_opts an unsigned bit field
Currently the check whether to perform pickaxing is done via checking
`diffopt->pickaxe`, which contains the command line argument that we
want to pickaxe for. Soon we'll introduce a new type of pickaxing, that
will not store anything in the `.pickaxe` field, so let's migrate the
check to be dependent on pickaxe_opts.
It is not enough to just replace the check for pickaxe by pickaxe_opts,
because flags might be set, but pickaxing was not requested ('-i').
To cope with that, introduce a mask to check only for the bits indicating
the modes of operation.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `log --decorate` is used, git will decorate commits with all
available refs. While in most cases this may give the desired effect,
under some conditions it can lead to excessively verbose output.
Introduce two command line options, `--decorate-refs=<pattern>` and
`--decorate-refs-exclude=<pattern>` to allow the user to select which
refs are used in decoration.
When "--decorate-refs=<pattern>" is given, only the refs that match the
pattern are used in decoration. The refs that match the pattern when
"--decorate-refs-exclude=<pattern>" is given, are never used in
decoration.
These options follow the same convention for mixing negative and
positive patterns across the system, assuming that the inclusive default
is to match all refs available.
(1) if there is no positive pattern given, pretend as if an
inclusive default positive pattern was given;
(2) for each candidate, reject it if it matches no positive
pattern, or if it matches any one of the negative patterns.
The rules for what is considered a match are slightly different from the
rules used elsewhere.
Commands like `log --glob` assume a trailing '/*' when glob chars are
not present in the pattern. This makes it difficult to specify a single
ref. On the other hand, commands like `describe --match --all` allow
specifying exact refs, but do not have the convenience of allowing
"shorthand refs" like 'refs/heads' or 'heads' to refer to
'refs/heads/*'.
The commands introduced in this patch consider a match if:
(a) the pattern contains globs chars,
and regular pattern matching returns a match.
(b) the pattern does not contain glob chars,
and ref '<pattern>' exists, or if ref exists under '<pattern>/'
This allows both behaviours (allowing single refs and shorthand refs)
yet remaining compatible with existent commands.
Helped-by: Kevin Daudt <me@ikke.info>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rafael Ascensão <rafa.almas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A single-word "unsigned flags" in the diff options is being split
into a structure with many bitfields.
* bw/diff-opt-impl-to-bitfields:
diff: make struct diff_flags members lowercase
diff: remove DIFF_OPT_CLR macro
diff: remove DIFF_OPT_SET macro
diff: remove DIFF_OPT_TST macro
diff: remove touched flags
diff: add flag to indicate textconv was set via cmdline
diff: convert flags to be stored in bitfields
add, reset: use DIFF_OPT_SET macro to set a diff flag
Remove the `DIFF_OPT_SET` macro and instead set the flags directly.
This conversion is done using the following semantic patch:
@@
expression E;
identifier fld;
@@
- DIFF_OPT_SET(&E, fld)
+ E.flags.fld = 1
@@
type T;
T *ptr;
identifier fld;
@@
- DIFF_OPT_SET(ptr, fld)
+ ptr->flags.fld = 1
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the `DIFF_OPT_TST` macro and instead access the flags directly.
This conversion is done using the following semantic patch:
@@
expression E;
identifier fld;
@@
- DIFF_OPT_TST(&E, fld)
+ E.flags.fld
@@
type T;
T *ptr;
identifier fld;
@@
- DIFF_OPT_TST(ptr, fld)
+ ptr->flags.fld
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the set of parallel touched flags are no longer being used,
remove them.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-show is unique in that it wants to use textconv by default except
for when it is showing blobs. When asked to show a blob, show doesn't
want to use textconv unless the user explicitly requested that it be
used by providing the command line flag '--textconv'.
Currently this is done by using a parallel set of 'touched' flags which
get set every time a particular flag is set or cleared. In a future
patch we want to eliminate this parallel set of flags so instead of
relying on if the textconv flag has been touched, add a new flag
'TEXTCONV_SET_VIA_CMDLINE' which is only set if textconv is set to true
via the command line.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We cannot add many more flags to the diff machinery due to the
limitations of the number of flags that can be stored in a single
unsigned int. In order to allow for more flags to be added to the diff
machinery in the future this patch converts the flags to be stored in
bitfields in 'struct diff_flags'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the callers of these functions just pass the hash member of a
struct object_id, so convert them to use a pointer to struct object_id
directly. Insert a check for NULL in expand_ref on a temporary basis;
this check can be removed when resolve_ref_unsafe is converted as well.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows us to get rid of some write-only variables, among them seven
SHA1 buffers.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rebase", especially when it is run by mistake and ends up
trying to replay many changes, spent long time in silence. The
command has been taught to show progress report when it spends
long time preparing these many changes to replay (which would give
the user a chance to abort with ^C).
* kw/rebase-progress:
rebase: turn on progress option by default for format-patch
format-patch: have progress option while generating patches
Code clean-up.
* ma/parse-maybe-bool:
parse_decoration_style: drop unused argument `var`
treewide: deprecate git_config_maybe_bool, use git_parse_maybe_bool
config: make git_{config,parse}_maybe_bool equivalent
config: introduce git_parse_maybe_bool_text
t5334: document that git push --signed=1 does not work
Doc/git-{push,send-pack}: correct --sign= to --signed=
When generating patches for the rebase command, if the user does
not realize the branch they are rebasing onto is thousands of
commits different, there is no progress indication after initial
rewinding message.
The progress meter as presented in this patch assumes the thousands of
patches to have a fine granularity as well as assuming to require all
the same amount of work/time for each, such that a steady progress bar
is achieved.
We do not want to estimate the time for each patch based e.g.
on their size or number of touched files (or parents) as that is too
expensive for just a progress meter.
This patch allows a progress option to be passed to format-patch
so that the user can be informed the progress of generating the
patch. This option is then used by the rebase command when
calling format-patch.
Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Numerous bugs in walking of reflogs via "log -g" and friends have
been fixed.
* jk/reflog-walk:
reflog-walk: apply --since/--until to reflog dates
reflog-walk: stop using fake parents
rev-list: check reflog_info before showing usage
get_revision_1(): replace do-while with an early return
log: do not free parents when walking reflog
log: clarify comment about reflog cycles
revision: disallow reflog walking with revs->limited
t1414: document some reflog-walk oddities
The only difference between these is that the former takes an argument
`name` which it ignores completely. Still, the callers are quite careful
to provide reasonable values for it.
Once in-flight topics have landed, we should be able to remove
git_config_maybe_bool. In the meantime, document it as deprecated in the
technical documentation. While at it, document git_parse_maybe_bool.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the flags for get_oid_with_context and friends to use "OID"
instead of "SHA1" in their names.
This transform was made by running the following one-liner on the
affected files:
perl -pi -e 's/GET_SHA1/GET_OID/g'
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert code that divides and rounds up to use DIV_ROUND_UP to make the
intent clearer and reduce the number of magic constants.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're doing a reflog walk (instead of walking the
actual parent pointers), we may see commits multiple times.
For this reason, we hold on to the commit buffer for each
commit rather than freeing it after we've showed the commit.
We should do the same for the parent list. Right now this is
just a minor optimization. But once we refactor how reflog
walks are performed, keeping the parents will avoid
confusing us the second time we see the commit.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're walking reflogs, we leave the commit buffer and
parents in place. A comment explains that this is due to
"cycles". But the interesting thing is the unsaid
implication: that the cycles (plus our clearing of the SEEN
flag) will cause us to show commits multiple times. Let's
spell it out.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The result from "git diff" that compares two blobs, e.g. "git diff
$commit1:$path $commit2:$path", used to be shown with the full
object name as given on the command line, but it is more natural to
use the $path in the output and use it to look up .gitattributes.
* jk/diff-blob:
diff: use blob path for blob/file diffs
diff: use pending "path" if it is available
diff: use the word "path" instead of "name" for blobs
diff: pass whole pending entry in blobinfo
handle_revision_arg: record paths for pending objects
handle_revision_arg: record modes for "a..b" endpoints
t4063: add tests of direct blob diffs
get_sha1_with_context: dynamically allocate oc->path
get_sha1_with_context: always initialize oc->symlink_path
sha1_name: consistently refer to object_context as "oc"
handle_revision_arg: add handle_dotdot() helper
handle_revision_arg: hoist ".." check out of range parsing
handle_revision_arg: stop using "dotdot" as a generic pointer
handle_revision_arg: simplify commit reference lookups
handle_revision_arg: reset "dotdot" consistently
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.
* bw/config-h:
config: don't implicitly use gitdir or commondir
config: respect commondir
setup: teach discover_git_directory to respect the commondir
config: don't include config.h by default
config: remove git_config_iter
config: create config.h
Stop including config.h by default in cache.h. Instead only include
config.h in those files which require use of the config system.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We often try to open a file for reading whose existence is
optional, and silently ignore errors from open/fopen; report such
errors if they are not due to missing files.
* nd/fopen-errors:
mingw_fopen: report ENOENT for invalid file names
mingw: verify that paths are not mistaken for remote nicknames
log: fix memory leak in open_next_file()
rerere.c: move error_errno() closer to the source system call
print errno when reporting a system call error
wrapper.c: make warn_on_inaccessible() static
wrapper.c: add and use fopen_or_warn()
wrapper.c: add and use warn_on_fopen_errors()
config.mak.uname: set FREAD_READS_DIRECTORIES for Darwin, too
config.mak.uname: set FREAD_READS_DIRECTORIES for Linux and FreeBSD
clone: use xfopen() instead of fopen()
use xfopen() in more places
git_fopen: fix a sparse 'not declared' warning
Setting "log.decorate=false" in the configuration file did not take
effect in v2.13, which has been corrected.
* ah/log-decorate-default-to-auto:
builtin/log: honor log.decorate
The result from "git diff" that compares two blobs, e.g. "git diff
$commit1:$path $commit2:$path", used to be shown with the full
object name as given on the command line, but it is more natural to
use the $path in the output and use it to look up .gitattributes.
* jk/diff-blob:
diff: use blob path for blob/file diffs
diff: use pending "path" if it is available
diff: use the word "path" instead of "name" for blobs
diff: pass whole pending entry in blobinfo
handle_revision_arg: record paths for pending objects
handle_revision_arg: record modes for "a..b" endpoints
t4063: add tests of direct blob diffs
get_sha1_with_context: dynamically allocate oc->path
get_sha1_with_context: always initialize oc->symlink_path
sha1_name: consistently refer to object_context as "oc"
handle_revision_arg: add handle_dotdot() helper
handle_revision_arg: hoist ".." check out of range parsing
handle_revision_arg: stop using "dotdot" as a generic pointer
handle_revision_arg: simplify commit reference lookups
handle_revision_arg: reset "dotdot" consistently
Conversion from uchar[20] to struct object_id continues.
* bc/object-id: (53 commits)
object: convert parse_object* to take struct object_id
tree: convert parse_tree_indirect to struct object_id
sequencer: convert do_recursive_merge to struct object_id
diff-lib: convert do_diff_cache to struct object_id
builtin/ls-tree: convert to struct object_id
merge: convert checkout_fast_forward to struct object_id
sequencer: convert fast_forward_to to struct object_id
builtin/ls-files: convert overlay_tree_on_cache to object_id
builtin/read-tree: convert to struct object_id
sha1_name: convert internals of peel_onion to object_id
upload-pack: convert remaining parse_object callers to object_id
revision: convert remaining parse_object callers to object_id
revision: rename add_pending_sha1 to add_pending_oid
http-push: convert process_ls_object and descendants to object_id
refs/files-backend: convert many internals to struct object_id
refs: convert struct ref_update to use struct object_id
ref-filter: convert some static functions to struct object_id
Convert struct ref_array_item to struct object_id
Convert the verify_pack callback to struct object_id
Convert lookup_tag to struct object_id
...
When a sha1 lookup returns the tree path via "struct
object_context", it just copies it into a fixed-size buffer.
This means the result can be truncated, and it means our
"struct object_context" consumes a lot of stack space.
Instead, let's allocate a string on the heap. Because most
callers don't care about this information, we'll avoid doing
it by default (so they don't all have to start calling
free() on the result). There are basically two options for
the caller to signal to us that it's interested:
1. By setting a pointer to storage in the object_context.
2. By passing a flag in another parameter.
Doing (1) would match the way that sha1_object_info_extended()
works. But it would mean that every caller would have to
initialize the object_context, which they don't currently
have to do.
This patch does (2), and adds a new bit to the function's
flags field. All of the callers that look at the "path"
field are updated to pass the new flag.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Setting "log.decorate=false" in the configuration file did not take
effect in v2.13, which has been corrected.
* ah/log-decorate-default-to-auto:
builtin/log: honor log.decorate
The recent change that introduced autodecorating of refs accidentally
broke the ability of users to set log.decorate = false to override it.
When the git_log_config was traversed a second time with an option other
than log.decorate, the decoration style would be set to the automatic
style, even if the user had already overridden it. Instead of setting
the option in config parsing, set it in init_log_defaults instead.
Add a test for this case. The actual additional config option doesn't
matter, but it needs to be something not already set in the
configuration file.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert lookup_commit, lookup_commit_or_die,
lookup_commit_reference, and lookup_commit_reference_gently to take
struct object_id arguments.
Introduce a temporary in parse_object buffer in order to convert this
function. This is required since in order to convert parse_object and
parse_object_buffer, lookup_commit_reference_gently and
lookup_commit_or_die would need to be converted. Not introducing a
temporary would therefore require that lookup_commit_or_die take a
struct object_id *, but lookup_commit would take unsigned char *,
leaving a confusing and hard-to-use interface.
parse_object_buffer will lose this temporary in a later patch.
This commit was created with manual changes to commit.c, commit.h, and
object.c, plus the following semantic patch:
@@
expression E1, E2;
@@
- lookup_commit_reference_gently(E1.hash, E2)
+ lookup_commit_reference_gently(&E1, E2)
@@
expression E1, E2;
@@
- lookup_commit_reference_gently(E1->hash, E2)
+ lookup_commit_reference_gently(E1, E2)
@@
expression E1;
@@
- lookup_commit_reference(E1.hash)
+ lookup_commit_reference(&E1)
@@
expression E1;
@@
- lookup_commit_reference(E1->hash)
+ lookup_commit_reference(E1)
@@
expression E1;
@@
- lookup_commit(E1.hash)
+ lookup_commit(&E1)
@@
expression E1;
@@
- lookup_commit(E1->hash)
+ lookup_commit(E1)
@@
expression E1, E2;
@@
- lookup_commit_or_die(E1.hash, E2)
+ lookup_commit_or_die(&E1, E2)
@@
expression E1, E2;
@@
- lookup_commit_or_die(E1->hash, E2)
+ lookup_commit_or_die(E1, E2)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's source code assumes that unsigned long is at least as precise as
time_t. Which is incorrect, and causes a lot of problems, in particular
where unsigned long is only 32-bit (notably on Windows, even in 64-bit
versions).
So let's just use a more appropriate data type instead. In preparation
for this, we introduce the new `timestamp_t` data type.
By necessity, this is a very, very large patch, as it has to replace all
timestamps' data type in one go.
As we will use a data type that is not necessarily identical to `time_t`,
we need to be very careful to use `time_t` whenever we interact with the
system functions, and `timestamp_t` everywhere else.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, Git's source code treats all timestamps as if they were
unsigned longs. Therefore, it is okay to write "%lu" when printing them.
There is a substantial problem with that, though: at least on Windows,
time_t is *larger* than unsigned long, and hence we will want to switch
away from the ill-specified `unsigned long` data type.
So let's introduce the pseudo format "PRItime" (currently simply being
defined to "lu") to make it easier to change the data type used for
timestamps.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The default behaviour of "git log" in an interactive session has
been changed to enable "--decorate".
* ah/log-decorate-default-to-auto:
log: if --decorate is not given, default to --decorate=auto
The prefix_filename() function returns a pointer to static
storage, which makes it easy to use dangerously. We already
fixed one buggy caller in hash-object recently, and the
calls in apply.c are suspicious (I didn't dig in enough to
confirm that there is a bug, but we call the function once
in apply_all_patches() and then again indirectly from
parse_chunk()).
Let's make it harder to get wrong by allocating the return
value. For simplicity, we'll do this even when the prefix is
empty (and we could just return the original file pointer).
That will cause us to allocate sometimes when we wouldn't
otherwise need to, but this function isn't called in
performance critical code-paths (and it already _might_
allocate on any given call, so a caller that cares about
performance is questionable anyway).
The downside is that the callers need to remember to free()
the result to avoid leaking. Most of them already used
xstrdup() on the result, so we know they are OK. The
remainder have been converted to use free() as appropriate.
I considered retaining a prefix_filename_unsafe() for cases
where we know the static lifetime is OK (and handling the
cleanup is awkward). This is only a handful of cases,
though, and it's not worth the mental energy in worrying
about whether the "unsafe" variant is OK to use in any
situation.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function takes the prefix as a ptr/len pair, but in
every caller the length is exactly strlen(ptr). Let's
simplify the interface and just take the string. This saves
callers specifying it (and in some cases handling a NULL
prefix).
In a handful of cases we had the length already without
calling strlen, so this is technically slower. But it's not
likely to matter (after all, if the prefix is non-empty
we'll allocate and copy it into a buffer anyway).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the email-style subject prefix (e.g. "Subject: [PATCH] ") directly
when it's needed instead of letting log_write_email_headers() prepare
it in a static buffer in advance. This simplifies storage ownership and
code flow.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>