"git pull --rebase" on a corrupt HEAD caused a segfault. In
general we substitute an empty tree object when running the in-core
equivalent of the diff-index command, and the codepath has been
corrected to do so as well to fix this issue.
* jk/has-uncommitted-changes-fix:
has_uncommitted_changes(): fall back to empty tree
If has_uncommitted_changes() can't resolve HEAD (e.g.,
because it's unborn or corrupt), then we end up calling
run_diff_index() with an empty revs.pending array. This
causes a segfault, as run_diff_index() blindly looks at the
first pending item.
Fixing this raises a question of fault: should
run_diff_index() handle this case, or is the caller wrong to
pass an empty pending list?
Looking at the other callers of run_diff_index(), they
handle this in one of three ways:
- they resolve the object themselves, and avoid doing the
diff if it's not valid
- they resolve the object themselves, and fall back to the
empty tree
- they use setup_revisions(), which will die() if the
object isn't valid
Since this is the only broken caller, that argues that the
fix should go there. Falling back to the empty tree makes
sense here, as we'd claim uncommitted changes if and only if
the index is non-empty. This may be a little funny in the
case of corruption (the corrupt HEAD probably _isn't_
empty), but:
- we don't actually know the reason here that HEAD didn't
resolve (the much more likely case is that we have an
unborn HEAD, in which case the empty tree comparison is
the right thing)
- this matches how other code, like "git diff", behaves
While we're thinking about it, let's add an assertion to
run_diff_index(). It should always be passed a single
object, and as this bug shows, it's easy to get it wrong
(and an assertion is easier to hunt down than a segfault, or
a quietly ignored extra tree).
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This option is supposed to fix the diff of "diff-files" (not reporting
ita entries as new files) and "diff-index --cached <tree>" (showing ita
entries as present in the index with empty content) but not
"diff-index <tree>".
When --ita-invisible-in-index is set on "git diff-index <tree>",
unpack_trees() will eventually call oneway_diff() on the ita entry
with the same code flow as "diff-index --cached <tree>". We want to
ignore the ita entry for "diff-index --cached <tree>" but not
"diff-index <tree>" since the latter will examine and produce a diff
based on worktree entry's (real) content, not ita index entry's
(empty) content.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename C++ keyword in order to bring the codebase closer to being able
to be compiled with a C++ compiler.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All the known heavy code blocks are measured (except object database
access). This should help identify if an optimization is effective or
not. An unoptimized git-status would give something like below:
0.001791141 s: read cache ...
0.004011363 s: preload index
0.000516161 s: refresh index
0.003139257 s: git command: ... 'status' '--porcelain=2'
0.006788129 s: diff-files
0.002090267 s: diff-index
0.001885735 s: initialize name hash
0.032013138 s: read directory
0.051781209 s: git command: './git' 'status'
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An infrastructure to define what hash function is used in Git is
introduced, and an effort to plumb that throughout various
codepaths has been started.
* bc/hash-algo:
repository: fix a sparse 'using integer as NULL pointer' warning
Switch empty tree and blob lookups to use hash abstraction
Integrate hash algorithm support with repo setup
Add structure representing hash algorithm
setup: expose enumerated repo info
We learned to talk to watchman to speed up "git status" and other
operations that need to see which paths have been modified.
* bp/fsmonitor:
fsmonitor: preserve utf8 filenames in fsmonitor-watchman log
fsmonitor: read entirety of watchman output
fsmonitor: MINGW support for watchman integration
fsmonitor: add a performance test
fsmonitor: add a sample integration script for Watchman
fsmonitor: add test cases for fsmonitor extension
split-index: disable the fsmonitor extension when running the split index test
fsmonitor: add a test tool to dump the index extension
update-index: add fsmonitor support to update-index
ls-files: Add support in ls-files to display the fsmonitor valid bit
fsmonitor: add documentation for the fsmonitor extension.
fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
update-index: add a new --force-write-index option
preload-index: add override to enable testing preload-index
bswap: add 64 bit endianness helper get_be64
Switch the uses of empty_tree_oid and empty_blob_oid to use the
current_hash abstraction that represents the current hash algorithm in
use.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A single-word "unsigned flags" in the diff options is being split
into a structure with many bitfields.
* bw/diff-opt-impl-to-bitfields:
diff: make struct diff_flags members lowercase
diff: remove DIFF_OPT_CLR macro
diff: remove DIFF_OPT_SET macro
diff: remove DIFF_OPT_TST macro
diff: remove touched flags
diff: add flag to indicate textconv was set via cmdline
diff: convert flags to be stored in bitfields
add, reset: use DIFF_OPT_SET macro to set a diff flag
Remove the `DIFF_OPT_SET` macro and instead set the flags directly.
This conversion is done using the following semantic patch:
@@
expression E;
identifier fld;
@@
- DIFF_OPT_SET(&E, fld)
+ E.flags.fld = 1
@@
type T;
T *ptr;
identifier fld;
@@
- DIFF_OPT_SET(ptr, fld)
+ ptr->flags.fld = 1
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the `DIFF_OPT_TST` macro and instead access the flags directly.
This conversion is done using the following semantic patch:
@@
expression E;
identifier fld;
@@
- DIFF_OPT_TST(&E, fld)
+ E.flags.fld
@@
type T;
T *ptr;
identifier fld;
@@
- DIFF_OPT_TST(ptr, fld)
+ ptr->flags.fld
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We cannot add many more flags to the diff machinery due to the
limitations of the number of flags that can be stored in a single
unsigned int. In order to allow for more flags to be added to the diff
machinery in the future this patch converts the flags to be stored in
bitfields in 'struct diff_flags'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the declaration and definition of resolve_gitlink_ref to use
struct object_id and apply the following semantic patch:
@@
expression E1, E2, E3;
@@
- resolve_gitlink_ref(E1, E2, E3.hash)
+ resolve_gitlink_ref(E1, E2, &E3)
@@
expression E1, E2, E3;
@@
- resolve_gitlink_ref(E1, E2, E3->hash)
+ resolve_gitlink_ref(E1, E2, E3)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the index is read from disk, the fsmonitor index extension is used
to flag the last known potentially dirty index entries. The registered
core.fsmonitor command is called with the time the index was last
updated and returns the list of files changed since that time. This list
is used to flag any additional dirty cache entries and untracked cache
directories.
We can then use this valid state to speed up preload_index(),
ie_match_stat(), and refresh_cache_ent() as they do not need to lstat()
files to detect potential changes for those entries marked
CE_FSMONITOR_VALID.
In addition, if the untracked cache is turned on valid_cached_dir() can
skip checking directories for new or changed files as fsmonitor will
invalidate the cache only for those directories that have been
identified as having potential changes.
To keep the CE_FSMONITOR_VALID state accurate during git operations;
when git updates a cache entry to match the current state on disk,
it will now set the CE_FSMONITOR_VALID bit.
Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit
is cleared and the corresponding untracked cache directory is marked
invalid.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of freeing `foo.objects` for an object array `foo` (sometimes
conditionally), call `object_array_clear(&foo)`. This means we don't
poke as much into the implementation, which is already a good thing, but
also that we release the individual entries as well, thereby fixing at
least one memory-leak (in diff-lib.c).
If someone is holding on to a pointer to an element's `name` or `path`,
that is now a dangling pointer, i.e., we'd be turning an unpleasant
situation into an outright bug. To the best of my understanding no such
long-term pointers are being taken.
The way we handle `study` in builting/reflog.c still looks like it might
leak. That will be addressed in the next commit.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A common pattern to free a piece of memory and assign NULL to the
pointer that used to point at it has been replaced with a new
FREE_AND_NULL() macro.
* ab/free-and-null:
*.[ch] refactoring: make use of the FREE_AND_NULL() macro
coccinelle: make use of the "expression" FREE_AND_NULL() rule
coccinelle: add a rule to make "expression" code use FREE_AND_NULL()
coccinelle: make use of the "type" FREE_AND_NULL() rule
coccinelle: add a rule to make "type" code use FREE_AND_NULL()
git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
Apply the result of the just-added coccinelle rule. This manually
excludes a few occurrences, mostly things that resulted in many
FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent
change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our code often opens a path to an optional file, to work on its
contents when we can successfully open it. We can ignore a failure
to open if such an optional file does not exist, but we do want to
report a failure in opening for other reasons (e.g. we got an I/O
error, or the file is there, but we lack the permission to open).
The exact errors we need to ignore are ENOENT (obviously) and
ENOTDIR (less obvious). Instead of repeating comparison of errno
with these two constants, introduce a helper function to do so.
* jc/noent-notdir:
treewide: use is_missing_file_error() where ENOENT and ENOTDIR are checked
compat-util: is_missing_file_error()
Convert diff_change to take a struct object_id. In addition convert the
function pointer type 'change_fn_t' to also take a struct object_id.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert diff_addremove to take a struct object_id. In addtion convert
the function pointer type 'add_remove_fn_t' to also take a struct
object_id.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using the is_missing_file_error() helper introduced in the previous
step, update all hits from
$ git grep -e ENOENT --and -e ENOTDIR
There are codepaths that only check ENOENT, and it is possible that
some of them should be checking both. Updating them is kept out of
this step deliberately, as we do not want to change behaviour in this
step.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert parse_tree_indirect to take a pointer to struct object_id.
Update all the callers. This transformation was achieved using the
following semantic patch and manual updates to the declaration and
definition. Update builtin/checkout.c manually as well, since it uses a
ternary expression not handled by the semantic patch.
@@
expression E1;
@@
- parse_tree_indirect(E1.hash)
+ parse_tree_indirect(&E1)
@@
expression E1;
@@
- parse_tree_indirect(E1->hash)
+ parse_tree_indirect(E1)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is needed to convert parse_tree_indirect.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If i-t-a entries are present and there is no change between the index
and HEAD i-t-a entries, index_differs_from() still returns "dirty, new
entries" (aka, the resulting commit is not empty), but cache-tree will
skip i-t-a entries and produce the exact same tree of current
commit.
index_differs_from() is supposed to catch this so we can abort
git-commit (unless --no-empty is specified). Update it to optionally
ignore i-t-a entries when doing a diff between the index and HEAD so
that it would return "no change" in this case and abort commit.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When comparing the index and the working tree to show which paths are
new, and comparing the tree recorded in the HEAD and the index to see if
committing the contents recorded in the index would result in an empty
commit, we would want the former comparison to say "these are new paths"
and the latter to say "there is no change" for paths that are marked as
intent-to-add.
We made a similar attempt at d95d728a ("diff-lib.c: adjust position of
i-t-a entries in diff", 2015-03-16), which redefined the semantics of
these two comparison modes globally, which was a disaster and had to be
reverted at 78cc1a54 ("Revert "diff-lib.c: adjust position of i-t-a
entries in diff"", 2015-06-23).
To make sure we do not repeat the same mistake, introduce a new internal
diffopt option so that this different semantics can be asked for only by
callers that ask it, while making sure other unaudited callers will get
the same comparison result.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert struct cache_entry to use struct object_id by applying the
following semantic patch and the object_id transforms from contrib, plus
the actual change to the struct:
@@
struct cache_entry E1;
@@
- E1.sha1
+ E1.oid.hash
@@
struct cache_entry *E1;
@@
- E1->sha1
+ E1->oid.hash
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert all instances of get_object_hash to use an appropriate reference
to the hash member of the oid member of struct object. This provides no
functional change, as it is essentially a macro substitution.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
Convert most instances where the sha1 member of struct object is
dereferenced to use get_object_hash. Most instances that are passed to
functions that have versions taking struct object_id, such as
get_sha1_hex/get_oid_hex, or instances that can be trivially converted
to use struct object_id instead, are not converted.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
This reverts commit d95d728aba.
It turns out that many other commands that need to interact with the
result of running diff-files and diff-index, e.g. "git apply", "git
rm", etc., need to be adjusted to the new world order it brings in.
For example, it would break this sequence to correct a whitespace
breakage in the parts you changed:
git add -N file
git diff --cached file | git apply --cached --whitespace=fix
git checkout file
In the old world order, "diff" showed a patch to modify an existing
empty file by adding its full contents, and "apply" updated the
index by modifying the existing empty blob (which is what an
Intent-to-Add entry records in the index) with that patch.
In the new world order, "diff" shows a patch to create a new file
with its full contents, but because "apply" thinks that the i-t-a
entry already exists in the index, it refused to accept a creation.
Adjusting "apply" to this new world order is easy, but we need to
assess the extent of the damage to the rest of the system the new
world order brought in before going forward and adjust them all,
after which we can resurrect the commit being reverted here.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After "git add -N", the path appeared in output of "git diff HEAD"
and "git diff --cached HEAD", leading "git status" to classify it
as "Changes to be committed". Such a path, however, is not yet to
be scheduled to be committed. "git diff" showed the change to the
path as modification, not as a "new file", in the header of its
output.
Treat such paths as "yet to be added to the index but Git already
know about them"; "git diff HEAD" and "git diff --cached HEAD"
should not talk about them, and "git diff" should show them as new
files yet to be added to the index.
* nd/diff-i-t-a:
diff-lib.c: adjust position of i-t-a entries in diff
Identify parts of the code that knows that we use SHA-1 hash to
name our objects too much, and use (1) symbolic constants instead
of hardcoded 20 as byte count and/or (2) use struct object_id
instead of unsigned char [20] for object names.
* bc/object-id:
apply: convert threeway_stage to object_id
patch-id: convert to use struct object_id
commit: convert parts to struct object_id
diff: convert struct combine_diff_path to object_id
bulk-checkin.c: convert to use struct object_id
zip: use GIT_SHA1_HEXSZ for trailers
archive.c: convert to use struct object_id
bisect.c: convert leaf functions to use struct object_id
define utility functions for object IDs
define a structure for object IDs
Entries added by "git add -N" are reminder for the user so that they
don't forget to add them before committing. These entries appear in
the index even though they are not real. Their presence in the index
leads to a confusing "git status" like this:
On branch master
Changes to be committed:
new file: foo
Changes not staged for commit:
modified: foo
If you do a "git commit", "foo" will not be included even though
"status" reports it as "to be committed". This patch changes the
output to become
On branch master
Changes not staged for commit:
new file: foo
no changes added to commit
The two hunks in diff-lib.c adjust "diff-index" and "diff-files" so
that i-t-a entries appear as new files in diff-files and nothing in
diff-index.
Due to this change, diff-files may start to report "new files" for the
first time. "add -u" needs to be told about this or it will die in
denial, screaming "new files can't exist! Reality is wrong." Luckily,
it's the only one among run_diff_files() callers that needs fixing.
Now in the new world order, a hierarchy in the index that contain
i-t-a paths is written out as a tree object as if these i-t-a
entries do not exist, and comparing the index with such a tree
object that would result from writing out the hierarchy will result
in no difference. Update a test in t2203 that expected the i-t-a
entries to appear as "added to the index" in the comparison to
instead expect no output.
An earlier change eec3e7e4 (cache-tree: invalidate i-t-a paths after
generating trees, 2012-12-16) becomes an unnecessary pessimization
in the new world order---a cache-tree in the index that corresponds
to a hierarchy with i-t-a paths can now be marked as valid and
record the object name of the tree that results from writing a tree
object out of that hierarchy, as it will compare equal to that tree.
Reverting the commit is left for the future, though, as it is purely
a performance issue and no longer affects correctness.
Helped-by: Michael J Gruber <git@drmicha.warpmail.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also, convert a constant to GIT_SHA1_HEXSZ.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the need to have duplicated "if there is a change then feed
null_sha1 and otherwise sha1 from the cache entry" for the "new"
side of the diff by introducing two temporary variables to point
at the object name of the old and the new side of the blobs.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we try to diff an index entry marked CE_VALID (because it
was marked with --assume-unchanged), we do not bother even
running stat() on the file to see if it was removed. This
started long ago with 540e694 (Prevent diff machinery from
examining assume-unchanged entries on worktree, 2009-08-11).
However, the subsequent code may look at our "struct stat"
and expect to find actual data; currently it will find
whatever cruft was left on the stack. This can cause
problems in two situations:
1. We call match_stat_with_submodule with the stat data,
so a submodule may be erroneously marked as changed.
2. If --find-copies-harder is in effect, we pass all
entries, even unchanged ones, to diff_change, so it can
list them as rename/copy sources. Since we found no
change, we assume that function will realize it and not
actually display any diff output. However, we end up
feeding it a bogus mode, leading it to sometimes claim
there was a mode change.
We can fix both by splitting the CE_VALID and regular code
paths, and making sure only to look at the stat information
in the latter. Furthermore, we push the declaration of our
"struct stat" down into the code paths that actually set it,
so we cannot accidentally access it uninitialized in future
code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach combine-diff to honour the path-output-order imposed by
diffcore-order, and optimize how matching paths are found in
the N-way diffs made with parents.
* ks/combine-diff:
tests: add checking that combine-diff emits only correct paths
combine-diff: simplify intersect_paths() further
combine-diff: combine_diff_path.len is not needed anymore
combine-diff: optimize combine_diff_path sets intersection
diff test: add tests for combine-diff with orderfile
diffcore-order: export generic ordering interface
The field was used in order to speed-up name comparison and also to
mark removed paths by setting it to 0.
Because the updated code does significantly less strcmp and also
just removes paths from the list and free right after we know a path
will not be needed, it is not needed anymore.
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This helps reduce the number of match_pathspec_depth() call sites and
show how match_pathspec_depth() is used.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git mv A B" when moving a submodule A does "the right thing",
inclusing relocating its working tree and adjusting the paths in
the .gitmodules file.
* jl/submodule-mv: (53 commits)
rm: delete .gitmodules entry of submodules removed from the work tree
mv: update the path entry in .gitmodules for moved submodules
submodule.c: add .gitmodules staging helper functions
mv: move submodules using a gitfile
mv: move submodules together with their work trees
rm: do not set a variable twice without intermediate reading.
t6131 - skip tests if on case-insensitive file system
parse_pathspec: accept :(icase)path syntax
pathspec: support :(glob) syntax
pathspec: make --literal-pathspecs disable pathspec magic
pathspec: support :(literal) syntax for noglob pathspec
kill limit_pathspec_to_literal() as it's only used by parse_pathspec()
parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN
parse_pathspec: make sure the prefix part is wildcard-free
rename field "raw" to "_raw" in struct pathspec
tree-diff: remove the use of pathspec's raw[] in follow-rename codepath
remove match_pathspec() in favor of match_pathspec_depth()
remove init_pathspec() in favor of parse_pathspec()
remove diff_tree_{setup,release}_paths
convert common_prefix() to use struct pathspec
...