Add a new command "git check-ignore" for debugging .gitignore
files.
The variable names may want to get cleaned up but that can be done
in-tree.
* as/check-ignore:
clean.c, ls-files.c: respect encapsulation of exclude_list_groups
t0008: avoid brace expansion
add git-check-ignore sub-command
setup.c: document get_pathspec()
add.c: extract new die_if_path_beyond_symlink() for reuse
add.c: extract check_path_for_gitlink() from treat_gitlinks() for reuse
pathspec.c: rename newly public functions for clarity
add.c: move pathspec matchers into new pathspec.c for reuse
add.c: remove unused argument from validate_pathspec()
dir.c: improve docs for match_pathspec() and match_pathspec_depth()
dir.c: provide clear_directory() for reclaiming dir_struct memory
dir.c: keep track of where patterns came from
dir.c: use a single struct exclude_list per source of excludes
Conflicts:
builtin/ls-files.c
dir.c
Refactor and generally clean up the directory traversal API
implementation.
* as/dir-c-cleanup:
dir.c: rename free_excludes() to clear_exclude_list()
dir.c: refactor is_path_excluded()
dir.c: refactor is_excluded()
dir.c: refactor is_excluded_from_list()
dir.c: rename excluded() to is_excluded()
dir.c: rename excluded_from_list() to is_excluded_from_list()
dir.c: rename path_excluded() to is_path_excluded()
dir.c: rename cryptic 'which' variable to more consistent name
Improve documentation and comments regarding directory traversal API
api-directory-listing.txt: update to match code
Previously each exclude_list could potentially contain patterns
from multiple sources. For example dir->exclude_list[EXC_FILE]
would typically contain patterns from .git/info/exclude and
core.excludesfile, and dir->exclude_list[EXC_DIRS] could contain
patterns from multiple per-directory .gitignore files during
directory traversal (i.e. when dir->exclude_stack was more than
one item deep).
We split these composite exclude_lists up into three groups of
exclude_lists (EXC_CMDL / EXC_DIRS / EXC_FILE as before), so that each
exclude_list now contains patterns from a single source. This will
allow us to cleanly track the origin of each pattern simply by adding
a src field to struct exclude_list, rather than to struct exclude,
which would make memory management of the source string tricky in the
EXC_DIRS case where its contents are dynamically generated.
Similarly, by moving the filebuf member from struct exclude_stack to
struct exclude_list, it allows us to track and subsequently free
memory buffers allocated during the parsing of all exclude files,
rather than only tracking buffers allocated for files in the EXC_DIRS
group.
Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is clearer to use a 'clear_' prefix for functions which empty
and deallocate the contents of a data structure without freeing
the structure itself, and a 'free_' prefix for functions which
also free the structure itself.
http://article.gmane.org/gmane.comp.version-control.git/206128
Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Continue adopting clearer names for exclude functions. This 'is_*'
naming pattern for functions returning booleans was discussed here:
http://thread.gmane.org/gmane.comp.version-control.git/204661/focus=204924
Also adjust their callers as necessary.
Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Start adopting clearer names for exclude functions. This 'is_*'
naming pattern for functions returning booleans was agreed here:
http://thread.gmane.org/gmane.comp.version-control.git/204661/focus=204924
Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although the subject line of 613f027 (read-tree -u one-way merge fix
to check out locally modified paths., 2006-05-15) mentions "read-tree
-u", it did not seem to check whether -u was in effect. Not checking
whether -u is in effect makes e.g. "read-tree --reset" lstat() the
worktree, even though the worktree stat should not matter for that
operation.
This speeds up e.g. "git reset" a little on the linux-2.6 repo (best
of five, warm cache):
Before After
real 0m0.288s 0m0.233s
user 0m0.190s 0m0.150s
sys 0m0.090s 0m0.080s
Signed-off-by: Martin von Zweigbergk <martinvonz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split lower bits of ce_flags field and creates a new ce_namelen
field in the in-core index structure.
* tg/ce-namelen-field:
Strip namelen out of ce_flags into a ce_namelen field
"git ls-files --exclude=t -i" did not consider anything under t/ as
excluded, as it did not pay attention to exclusion of leading paths
while walking the index. Other two users of excluded() are also
updated.
* jc/ls-files-i-dir:
dir.c: make excluded() file scope static
unpack-trees.c: use path_excluded() in check_ok_to_remove()
builtin/add.c: use path_excluded()
path_excluded(): update API to less cache-entry centric
ls-files -i: micro-optimize path_excluded()
ls-files -i: pay attention to exclusion of leading paths
Strip the name length from the ce_flags field and move it
into its own ce_namelen field in struct cache_entry. This
will both give us a tiny bit of a performance enhancement
when working with long pathnames and is a refactoring for
more readability of the code.
It enhances readability, by making it more clear what
is a flag, and where the length is stored and make it clear
which functions use stages in comparisions and which only
use the length.
It also makes CE_NAMEMASK private, so that users don't
mistakenly write the name length in the flags.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace strlen(ce->name) with ce_namelen() in a couple
of places which gives us some additional bits of
performance.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git ls-files --exclude=t -i" did not consider anything under t/
as excluded, as it did not pay attention to exclusion of leading
paths while walking the index. Other two users of excluded() are
also updated.
* jc/ls-files-i-dir:
dir.c: make excluded() file scope static
unpack-trees.c: use path_excluded() in check_ok_to_remove()
builtin/add.c: use path_excluded()
path_excluded(): update API to less cache-entry centric
ls-files -i: micro-optimize path_excluded()
ls-files -i: pay attention to exclusion of leading paths
This function is responsible for determining if a path that is not
tracked is ignored and allow "checkout" to overwrite it as needed.
It used excluded() without checking if higher level directory in the
path is ignored; correct it to use path_excluded() for this check.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
* There are uses of lower-level interface excluded_from_list() in
the codepath for narrow-checkout hack; they are supposed to be
already checking each level as they descend, and are not touched
with this patch.
By Jens Lehmann (1) and Johannes Sixt (1)
* maint:
Consistently use "superproject" instead of "supermodule"
t3404: begin "exchange commits with -p" test with correct preconditions
We fairly consistently say "superproject" and never "supermodule" these
days. But there are seven occurrences of "supermodule" left in the current
work tree. Three appear in Release Notes for 1.5.3 and 1.7.7, three in
test names and one in a C-code comment.
Replace all occurrences of "supermodule" outside of the Release Notes
(which shouldn't be changed after the fact) with "superproject" for
consistency.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many error/warning messages had extra trailing newlines that are
unnecessary.
By Pete Wyckoff
* pw/message-cleanup:
remove blank filename in error message
remove superfluous newlines in error messages
Trivially shrinks the on-disk size of the index file to save both I/O and
checksum overhead.
The topic should give a solid base to build on further updates, with the
code refactoring in its earlier parts, and the backward compatibility
mechanism in its later parts.
* jc/index-v4:
index-v4: document the entry format
unpack-trees: preserve the index file version of original
update-index: upgrade/downgrade on-disk index version
read-cache.c: write prefix-compressed names in the index
read-cache.c: read prefix-compressed names in index on-disk version v4
read-cache.c: move code to copy incore to ondisk cache to a helper function
read-cache.c: move code to copy ondisk to incore cache to a helper function
read-cache.c: report the header version we do not understand
read-cache.c: make create_from_disk() report number of bytes it consumed
read-cache.c: allow unaligned mapping of the index file
cache.h: hide on-disk index details
varint: make it available outside the context of pack
The error handling routines add a newline. Remove
the duplicate ones in error messages.
Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Otherwise "git checkout $other_branch" (or even "git checkout HEAD")
would end up writing the index out in the default format.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The allocations made by unpack_nondirectories() using create_ce_entry()
are never freed.
In the non-merge case, we duplicate them using add_entry() and later
only look at the first allocated element (src[0]), perhaps even only
by mistake. Split out the actual addition from add_entry() into the
new helper do_add_entry() and call this non-duplicating function
instead of add_entry() to avoid the leak.
Valgrind reports this for the command "git archive v1.7.9" without
the patch:
==13372== LEAK SUMMARY:
==13372== definitely lost: 230,986 bytes in 2,325 blocks
==13372== indirectly lost: 0 bytes in 0 blocks
==13372== possibly lost: 98 bytes in 1 blocks
==13372== still reachable: 2,259,198 bytes in 3,243 blocks
==13372== suppressed: 0 bytes in 0 blocks
And with the patch applied:
==13375== LEAK SUMMARY:
==13375== definitely lost: 65 bytes in 1 blocks
==13375== indirectly lost: 0 bytes in 0 blocks
==13375== possibly lost: 0 bytes in 0 blocks
==13375== still reachable: 2,364,417 bytes in 3,245 blocks
==13375== suppressed: 0 bytes in 0 blocks
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
src[0] points to the index entry in the merge case and to the first
tree to unpack in the non-merge case. We only want to mark the index
entry, so check first if we're merging.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
tree_entry_len() does not simply take two random arguments and return
a tree length. The two pointers must point to a tree item structure,
or struct name_entry. Passing random pointers will return incorrect
value.
Force callers to pass struct name_entry instead of two pointers (with
hope that they don't manually construct struct name_entry themselves)
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* nd/maint-sparse-errors:
Add explanation why we do not allow to sparse checkout to empty working tree
sparse checkout: show error messages when worktree shaping fails
* mg/maint-doc-sparse-checkout:
git-read-tree.txt: correct sparse-checkout and skip-worktree description
git-read-tree.txt: language and typography fixes
unpack-trees: print "Aborting" to stderr
* jc/diff-index-unpack:
diff-index: pass pathspec down to unpack-trees machinery
unpack-trees: allow pruning with pathspec
traverse_trees(): allow pruning with pathspec
verify_* functions can queue errors up and to be printed later at
label return_failed. In case of errors, do not go to label "done"
directly because all queued messages would be dropped on the floor.
Found-by: Joshua Jensen <jjensen@workspacewhiz.com>
Tracked-down-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
display_error_msgs() prints all the errors to stderr already (if any),
followed by "Aborting" (if any) to stdout. Make the latter go to stderr
instead.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the pathspec pruning of traverse_trees() from unpack_trees(). Again,
the unpack_trees() machinery is primarily meant for merging two (or more)
trees, and because a merge is a full tree operation, it didn't support any
pruning with pathspec, and this codepath probably should not be enabled
while running a merge, but the caller in diff-lib.c::diff_cache() should
be able to take advantage of it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Break down no-lstat() condition checks in verify_uptodate()
t7400: fix bogus test failure with symlinked trash
Documentation: clarify the invalidated tree entry format
Make it easier to grok under what conditions we can skip lstat().
While at there, shorten ie_match_stat() line for the sake of my eyes.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A negative return from the unpack callback function usually means unpack
failed for the entry and signals the unpack_trees() machinery to fail the
entire merge operation, immediately and there is no other way for the
callback to tell the machinery to exit early without reporting an error.
This is what we usually want to make a merge all-or-nothing operation, but
the machinery is also used for diff-index codepath by using a custom
unpack callback function. And we do sometimes want to exit early without
failing, namely when we are under --quiet and can short-cut the diff upon
finding the first difference.
Add "exiting_early" field to unpack_trees_options structure, to signal the
unpack_trees() machinery that the negative return value is not signaling
an error but an early return from the unpack_trees() machinery. As this by
definition hasn't unpacked everything, discard the resulting index just
like the failure codepath.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Until now there was no way to test if unpack_trees() with update=1 would
succeed without really updating the work tree. The reason for that is that
setting update to 0 does skip the tests for new files and deactivates the
sparse handling, thereby making that unsuitable as a dry run.
Add the new dry_run flag to struct unpack_trees_options unpack_trees().
Setting that together with the update flag will check if the work tree
update would be successful without doing it for real.
The only class of problems that is not detected at the moment are file
system conditions like ENOSPC or missing permissions. Also the index
entries of updated files are not as they would be after a real checkout
because lstat() isn't run as the files aren't updated for real.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sparse-setting code follows closely how files are excluded in
read_directory(), every entry (including directories) are fed to
excluded_from_list() to decide if the entry is suitable. Directories
are treated no different than files. If a directory is matched (or
not), the whole directory is considered matched (or not) and the
process moves on.
This generally works as long as there are no patterns to exclude parts
of the directory. In case of sparse checkout code, the following patterns
t
!t/t0000-basic.sh
will produce a worktree with full directory "t" even if t0000-basic.sh
is requested to stay out.
By the same reasoning, if a directory is to be excluded, any rules to
re-include certain files within that directory will be ignored.
Fix it by always checking files against patterns. If no pattern can be
used to decide whether an entry is in our out
(ie. excluded_from_list() returns -1), the entry will be
included/excluded the same as their parent directory.
Noticed-by: <skillzero@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix warnings from 'make check'.
- These files don't include 'builtin.h' causing sparse to complain that
cmd_* isn't declared:
builtin/clone.c:364, builtin/fetch-pack.c:797,
builtin/fmt-merge-msg.c:34, builtin/hash-object.c:78,
builtin/merge-index.c:69, builtin/merge-recursive.c:22
builtin/merge-tree.c:341, builtin/mktag.c:156, builtin/notes.c:426
builtin/notes.c:822, builtin/pack-redundant.c:596,
builtin/pack-refs.c:10, builtin/patch-id.c:60, builtin/patch-id.c:149,
builtin/remote.c:1512, builtin/remote-ext.c:240,
builtin/remote-fd.c:53, builtin/reset.c:236, builtin/send-pack.c:384,
builtin/unpack-file.c:25, builtin/var.c:75
- These files have symbols which should be marked static since they're
only file scope:
submodule.c:12, diff.c:631, replace_object.c:92, submodule.c:13,
submodule.c:14, trace.c:78, transport.c:195, transport-helper.c:79,
unpack-trees.c:19, url.c:3, url.c:18, url.c:104, url.c:117, url.c:123,
url.c:129, url.c:136, thread-utils.c:21, thread-utils.c:48
- These files redeclare symbols to be different types:
builtin/index-pack.c:210, parse-options.c:564, parse-options.c:571,
usage.c:49, usage.c:58, usage.c:63, usage.c:72
- These files use a literal integer 0 when they really should use a NULL
pointer:
daemon.c:663, fast-import.c:2942, imap-send.c:1072, notes-merge.c:362
While we're in the area, clean up some unused #includes in builtin files
(mostly exec_cmd.h).
Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When check_leading_path notices a file in the way of a new entry to be
checked out, verify_absent uses (1) the mode to determine whether it
is a directory (2) the rest of the stat information to check if this
is actually an old entry, disguised by a change in filename (e.g.,
README -> Readme) that is significant to git but insignificant to the
underlying filesystem. If lstat fails, these checks are performed
with an uninitialied stat structure, producing essentially random
results.
Better to just error out when lstat fails.
The easiest way to reproduce this is to remove a file after the
check_leading_path call and before the lstat in verify_absent. An
lstat failure other than ENOENT in check_leading_path would also
trigger the same code path.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When check_leading_path notices no file in the way of the new entry to
be checked out, verify_absent checks whether there is a directory
there or nothing at all. If that lstat call fails (for example due to
ENOMEM), it assumes ENOENT, meaning a directory with untracked files
would be clobbered in that case.
Check errno after calling lstat, and for conditions other than ENOENT,
just error out.
This is a theoretical race condition. lstat has to succeed moments
before it fails for there to be trouble.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
unpack_trees() merges two trees (the current HEAD and the destination
commit) when switching to another branch, checking and updating the index
entry where the destination differs from the current HEAD. It merges three
trees (the common ancestor, the current HEAD and the other commit) when
performing a three-way merge, checking and updating the index entry when
the merge result differs from the current HEAD. It does so by walking the
input trees in parallel all the way down to the leaves.
One common special case is a directory is identical across the trees
involved in the merge. In such a case, we do not have to descend into the
directory at all---we know that the end result is to keep the entries in
the current index.
This optimization cannot be applied in a few special cases in
unpack_trees(), though. We need to descend into the directory and update
the index entries from the target tree in the following cases:
- When resetting (e.g. "git reset --hard"); and
- When checking out a tree for the first time into an empty working tree
(e.g. "git read-tree -m -u HEAD HEAD" with missing .git/index).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* nd/maint-fix-add-typo-detection:
Revert "excluded_1(): support exclude files in index"
unpack-trees: fix sparse checkout's "unable to match directories"
unpack-trees: move all skip-worktree checks back to unpack_trees()
dir.c: add free_excludes()
cache.h: realign and use (1 << x) form for CE_* constants
An aborted merge prints the list of rejected paths as part of the
error message. Since commit f66caaf9 (do not overwrite files in
leading path), some of those paths do not have static buffers, so
we have to keep a copy. Use string_list's to accomplish this.
This changes the order of the list to the order in which the paths
are processed. Previously, it was reversed.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the work tree contains an untracked file x, and
unpack-trees wants to checkout a path x/*, the
file x is removed unconditionally.
Instead, apply the same checks that are normally
used for untracked files, and abort if the file
cannot be removed.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This wraps some inline code into the function check_ok_to_remove(),
which will later be used for leading path components as well.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Matching index entries against an excludes file currently has two
problems.
First, there's no function to do it. Code paths (like sparse
checkout) that wanted to try it would iterate over index entries and
for each index entry pass that path to excluded_from_list(). But that
is not how excluded_from_list() works; one is supposed to feed in each
ancester of a path before a given path to find out if it was excluded
because of some parent or grandparent matching a
bigsubdirectory/
pattern despite the path not matching any .gitignore pattern directly.
Second, it's inefficient. The excludes mechanism is supposed to let
us block off vast swaths of the filesystem as uninteresting; separately
checking every index entry doesn't fit that model.
Introduce a new function to take care of both these problems. This
traverses the index in depth-first order (well, that's what order the
index is in) to mark un-excluded entries.
Maybe some day the in-core index format will be restructured to make
this sort of operation easier. Or maybe we will want to try some
binary search based thing. The interface is simple enough to allow
all those things. Example:
clear_ce_flags(the_index.cache, the_index.cache_nr,
CE_CANDIDATE, CE_CLEARME, exclude_list);
would clear the CE_CLEARME flag on all index entries with
CE_CANDIDATE flag and not matched by exclude_list.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Earlier, the will_have_skip_worktree() checks are done in various
places, which makes it hard to traverse the index tree-alike, required
by excluded_from_list(). This patch moves all the checks into two
loops in unpack_trees().
Entries in index in this operation can be classified into two
groups: ones already in index before unpack_trees() is called and ones
added to index after traverse_trees() is called.
In both groups, before checking file status on worktree, the future
skip-worktree bit must be checked, so that if an entry will be outside
worktree, worktree should not be checked.
For the first group, the future skip-worktree bit is precomputed and
stored as CE_NEW_SKIP_WORKTREE in the first loop before
traverse_trees() is called so that *way_merge() function does not need
to compute it again.
For the second group, because we don't know what entries will be in
this group until traverse_trees() finishes, operations that need
future skip-worktree check is delayed until CE_NEW_SKIP_WORKTREE is
computed in the second loop. CE_ADDED is used to mark entries in the
second group.
CE_ADDED and CE_NEW_SKIP_WORKTREE are temporary flags used in
unpack_trees(). CE_ADDED is only used by add_to_index(), which should
not be called while unpack_trees() is running.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>