Commit Graph

477 Commits

Author SHA1 Message Date
Nguyễn Thái Ngọc Duy
b673155074 dir.c: stop ignoring opendir() error in open_cached_dir()
A follow-up to the recently fixed bugs in the untracked
invalidation. If opendir() fails it should show a warning, perhaps
this should die, but if this ever happens the error is probably
recoverable for the user, and dying would just make things worse.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-02 10:16:23 -08:00
Patryk Obara
f070faccc1 sha1_file: convert hash_sha1_file to object_id
Convert the declaration and definition of hash_sha1_file to use
struct object_id and adjust all function calls.

Rename this function to hash_object_file.

Signed-off-by: Patryk Obara <patryk.obara@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-30 10:42:36 -08:00
Patryk Obara
4b33e60201 dir: convert struct sha1_stat to use object_id
Convert the declaration of struct sha1_stat. Adjust all usages of this
struct and replace hash{clr,cmp,cpy} with oid{clr,cmp,cpy} wherever
possible.  Rename it to struct oid_stat.

Rename static function load_sha1_stat to load_oid_stat.

Remove macro EMPTY_BLOB_SHA1_BIN, as it's no longer used.

Signed-off-by: Patryk Obara <patryk.obara@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-30 10:42:36 -08:00
Nguyễn Thái Ngọc Duy
b640313110 dir.c: fix missing dir invalidation in untracked code
Let's start with how create a new directory cache after the last one
becomes invalid (e.g. because its dir mtime has changed...). In
open_cached_dir():

1. We start out with valid_cached_dir() returning false, which should
   call invalidate_directory() to put a directory state back to
   initial state, no untracked entries (untracked_nr zero), no sub
   directory traversal (dirs[].recurse zero).

2. Since the cache cannot be used, we go the slow path opendir() and
   go through items one by one via readdir(). All the directories on
   disk will be added back to the cache (if not already exist in
   dirs[]) and its flag "recurse" gets changed to one to note that
   it's part of the cached dir travesal next time.

3. By the time we reach close_cached_dir() we should have a good
   subdir list in dirs[]. Those with "recurse" flag set are the ones
   present in the on-disk directory. The directory is now marked
   "valid".

Next time read_directory() is called, since the directory is marked
valid, it will skip readdir(), go fast path and traverse through
dirs[] array instead.

Steps one and two need some tight cooperation. If a subdir is removed,
readdir() will not find it and of course we cannot examine/invalidate
it. To make sure removed directories on disk are gone from the cache,
step one must make sure recurse flag of all subdirs are zero.

But that's not true. If "valid" flag is already false, there is a
chance we go straight to the end of valid_cached_dir() without calling
invalidate_directory(). Or we fail to meet the "if (untracked-valid)"
condition and skip over the invalidate_directory().

After step 3, we mark the cache valid. Any stale subdir with incorrect
recurse flag becomes a real subdir next time we traverse the directory
using dirs[] array.

We could avoid this by making sure invalidate_directory() is always
called (therefore dirs[].recurse cleared) at the beginning of
open_cached_dir(). Which is what this patch does.

As to how we get into this situation, the key in the test is this
command

    git checkout master

where "one/file" is replaced with "one" in the index. This index
update triggers untracked_cache_invalidate_path(), which clears valid
flag of the root directory while keeping "recurse" flag on the subdir
"one" on. On the next git-status, we go through steps 1-3 above and
save an incorrect cache on disk. The second git-status blindly follows
the bad cache data and shows the problem.

This is arguably because of a bad design where "recurse" flag plays
double roles: whether a directory should be saved on disk, and whether
it is part of a directory traversal.

We need to keep recurse flag set at "checkout master" because of the
first role: we need to keep subdir caches (dir "two" for example has
not been touched at all, no reason to throw its cache away).

As long as we make sure to ignore/reset "recurse" flag at the
beginning of a directory traversal, we're good. But maybe eventually
we should separate these two roles.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-24 12:40:14 -08:00
Nguyễn Thái Ngọc Duy
2523c4be85 dir.c: avoid stat() in valid_cached_dir()
stat() may follow a symlink and return stat data of the link's target
instead of the link itself. We are concerned about the link itself.

It's kind of hard to demonstrate the bug. I think when path->buf is a
symlink, we most likely find that its target's stat data does not
match our cached one, which means we ignore the cache and fall back to
slow path.

This is performance issue, not correctness (though we could still
catch it by verifying test-dump-untracked-cache. The less unlikely
case is, link target stat data matches the cached version and we
incorrectly go fast path, ignoring real data on disk. A test for this
may involve manipulating stat data, which may be not portable.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-24 12:40:13 -08:00
SZEDER Gábor
f919ffebed Use MOVE_ARRAY
Use the helper macro MOVE_ARRAY to move arrays.  This is shorter and
safer, as it automatically infers the size of elements.

Patch generated by Coccinelle and contrib/coccinelle/array.cocci in
Travis CI's static analysis build job.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-22 11:32:51 -08:00
Junio C Hamano
61061abba7 Merge branch 'jh/object-filtering'
In preparation for implementing narrow/partial clone, the object
walking machinery has been taught a way to tell it to "filter" some
objects from enumeration.

* jh/object-filtering:
  rev-list: support --no-filter argument
  list-objects-filter-options: support --no-filter
  list-objects-filter-options: fix 'keword' typo in comment
  pack-objects: add list-objects filtering
  rev-list: add list-objects filtering support
  list-objects: filter objects in traverse_commit_list
  oidset: add iterator methods to oidset
  oidmap: add oidmap iterator methods
  dir: allow exclusions from blob in addition to file
2017-12-27 11:16:21 -08:00
Jeff Hostetler
578d81d0c4 dir: allow exclusions from blob in addition to file
Refactor add_excludes() to separate the reading of the
exclude file into a buffer and the parsing of the buffer
into exclude_list items.

Add add_excludes_from_blob_to_list() to allow an exclude
file be specified with an OID without assuming a local
worktree or index exists.

Refactor read_skip_worktree_file_from_index() and add
do_read_blob() to eliminate duplication of preliminary
processing of blob contents.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-22 14:11:56 +09:00
Junio C Hamano
e05336bdda Merge branch 'bp/fsmonitor'
We learned to talk to watchman to speed up "git status" and other
operations that need to see which paths have been modified.

* bp/fsmonitor:
  fsmonitor: preserve utf8 filenames in fsmonitor-watchman log
  fsmonitor: read entirety of watchman output
  fsmonitor: MINGW support for watchman integration
  fsmonitor: add a performance test
  fsmonitor: add a sample integration script for Watchman
  fsmonitor: add test cases for fsmonitor extension
  split-index: disable the fsmonitor extension when running the split index test
  fsmonitor: add a test tool to dump the index extension
  update-index: add fsmonitor support to update-index
  ls-files: Add support in ls-files to display the fsmonitor valid bit
  fsmonitor: add documentation for the fsmonitor extension.
  fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  update-index: add a new --force-write-index option
  preload-index: add override to enable testing preload-index
  bswap: add 64 bit endianness helper get_be64
2017-11-21 14:07:50 +09:00
Junio C Hamano
d8df70f273 Merge branch 'jm/status-ignored-files-list'
The set of paths output from "git status --ignored" was tied
closely with its "--untracked=<mode>" option, but now it can be
controlled more flexibly.  Most notably, a directory that is
ignored because it is listed to be ignored in the ignore/exclude
mechanism can be handled differently from a directory that ends up
to be ignored only because all files in it are ignored.

* jm/status-ignored-files-list:
  status: test ignored modes
  status: document options to show matching ignored files
  status: report matching ignored and normal untracked
  status: add option to show ignored files differently
2017-11-13 14:44:59 +09:00
Junio C Hamano
e7e456f500 Merge branch 'bc/object-id'
Conversion from uchar[20] to struct object_id continues.

* bc/object-id: (25 commits)
  refs/files-backend: convert static functions to object_id
  refs: convert read_raw_ref backends to struct object_id
  refs: convert peel_object to struct object_id
  refs: convert resolve_ref_unsafe to struct object_id
  worktree: convert struct worktree to object_id
  refs: convert resolve_gitlink_ref to struct object_id
  Convert remaining callers of resolve_gitlink_ref to object_id
  sha1_file: convert index_path and index_fd to struct object_id
  refs: convert reflog_expire parameter to struct object_id
  refs: convert read_ref_at to struct object_id
  refs: convert peel_ref to struct object_id
  builtin/pack-objects: convert to struct object_id
  pack-bitmap: convert traverse_bitmap_commit_list to object_id
  refs: convert dwim_log to struct object_id
  builtin/reflog: convert remaining unsigned char uses to object_id
  refs: convert dwim_ref and expand_ref to struct object_id
  refs: convert read_ref and read_ref_full to object_id
  refs: convert resolve_refdup and refs_resolve_refdup to struct object_id
  Convert check_connected to use struct object_id
  refs: update ref transactions to use struct object_id
  ...
2017-11-06 14:24:27 +09:00
Junio C Hamano
da7996aaf7 Merge branch 'js/submodule-in-excluded'
"git status --ignored -u" did not stop at a working tree of a
separate project that is embedded in an ignored directory and
listed files in that other project, instead of just showing the
directory itself as ignored.

* js/submodule-in-excluded:
  status: do not get confused by submodules in excluded directories
2017-11-06 13:11:26 +09:00
Jameson Miller
07966ed19e status: report matching ignored and normal untracked
Teach status command to handle `--ignored=matching` with
`--untracked-files=normal`

Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-31 11:54:21 +09:00
Jameson Miller
eec0f7f2b7 status: add option to show ignored files differently
Teach the status command more flexibility in how ignored files are
reported. Currently, the reporting of ignored files and untracked
files are linked. You cannot control how ignored files are reported
independently of how untracked files are reported (i.e. `all` vs
`normal`). This makes it impossible to show untracked files with the
`all` option, but show ignored files with the `normal` option.

This work 1) adds the ability to control the reporting of ignored
files independently of untracked files and 2) introduces the concept
of status reporting ignored paths that explicitly match an ignored
pattern. There are 2 benefits to these changes: 1) if a consumer needs
all untracked files but not all ignored files, there is a performance
benefit to not scanning all contents of an ignored directory and 2)
returning ignored files that explicitly match a path allow a consumer
to make more informed decisions about when a status result might be
stale.

This commit implements --ignored=matching with --untracked-files=all.
The following commit will implement --ignored=matching with
--untracked=files=normal.

As an example of where this flexibility could be useful is that our
application (Visual Studio) runs the status command and presents the
output. It shows all untracked files individually (e.g. using the
'--untracked-files==all' option), and would like to know about which
paths are ignored. It uses information about ignored paths to make
decisions about when the status result might have changed.
Additionally, many projects place build output into directories inside
a repository's working directory (e.g. in "bin/" and "obj/"
directories). Normal usage is to explicitly ignore these 2 directory
names in the .gitignore file (rather than or in addition to the *.obj
pattern).If an application could know that these directories are
explicitly ignored, it could infer that all contents are ignored as
well and make better informed decisions about files in these
directories. It could infer that any changes under these paths would
not affect the output of status. Additionally, there can be a
significant performance benefit by avoiding scanning through ignored
directories.

When status is set to report matching ignored files, it has the
following behavior. Ignored files and directories that explicitly
match an exclude pattern are reported. If an ignored directory matches
an exclude pattern, then the path of the directory is returned. If a
directory does not match an exclude pattern, but all of its contents
are ignored, then the contained files are reported instead of the
directory.

Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-31 11:54:21 +09:00
Johannes Schindelin
fadb4820c4 status: do not get confused by submodules in excluded directories
We meticulously pass the `exclude` flag to the `treat_directory()`
function so that we can indicate that files in it are excluded rather
than untracked when recursing.

But we did not yet treat submodules the same way.

Because of that, `git status --ignored --untracked` with a submodule
`submodule` in a gitignored `tracked/` would show the submodule in the
"Untracked files" section, e.g.

	On branch master
	Untracked files:
	  (use "git add <file>..." to include in what will be committed)

		tracked/submodule/

	Ignored files:
	  (use "git add -f <file>..." to include in what will be committed)

		tracked/submodule/initial.t

Instead, we would want it to show the submodule in the "Ignored files"
section:

	On branch master
	Ignored files:
	  (use "git add -f <file>..." to include in what will be committed)

		tracked/submodule/

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-26 11:29:06 +09:00
brian m. carlson
a98e6101f0 refs: convert resolve_gitlink_ref to struct object_id
Convert the declaration and definition of resolve_gitlink_ref to use
struct object_id and apply the following semantic patch:

@@
expression E1, E2, E3;
@@
- resolve_gitlink_ref(E1, E2, E3.hash)
+ resolve_gitlink_ref(E1, E2, &E3)

@@
expression E1, E2, E3;
@@
- resolve_gitlink_ref(E1, E2, E3->hash)
+ resolve_gitlink_ref(E1, E2, E3)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 11:05:51 +09:00
brian m. carlson
1053fe829c Convert remaining callers of resolve_gitlink_ref to object_id
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 11:05:51 +09:00
Ben Peart
883e248b8a fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
When the index is read from disk, the fsmonitor index extension is used
to flag the last known potentially dirty index entries. The registered
core.fsmonitor command is called with the time the index was last
updated and returns the list of files changed since that time. This list
is used to flag any additional dirty cache entries and untracked cache
directories.

We can then use this valid state to speed up preload_index(),
ie_match_stat(), and refresh_cache_ent() as they do not need to lstat()
files to detect potential changes for those entries marked
CE_FSMONITOR_VALID.

In addition, if the untracked cache is turned on valid_cached_dir() can
skip checking directories for new or changed files as fsmonitor will
invalidate the cache only for those directories that have been
identified as having potential changes.

To keep the CE_FSMONITOR_VALID state accurate during git operations;
when git updates a cache entry to match the current state on disk,
it will now set the CE_FSMONITOR_VALID bit.

Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit
is cleared and the corresponding untracked cache directory is marked
invalid.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:23:01 +09:00
Jameson Miller
5aaa7fd39a Improve performance of git status --ignored
Improve the performance of the directory listing logic when it wants to list
non-empty ignored directories. In order to show non-empty ignored directories,
the existing logic will recursively iterate through all contents of an ignored
directory. This change introduces the optimization to stop iterating through
the contents once it finds the first file. This can have a significant
improvement in 'git status --ignored' performance in repositories with a large
number of files in ignored directories.

For an example of the performance difference on an example repository with
196,000 files in 400 ignored directories:

| Command                    |  Time (s) |
| -------------------------- | --------- |
| git status                 |   1.2     |
| git status --ignored (old) |   3.9     |
| git status --ignored (new) |   1.4     |

Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-19 12:28:06 +09:00
Junio C Hamano
0d824bc7f6 Merge branch 'rs/stat-data-unaligned-reads-fix' into maint
Code clean-up.

* rs/stat-data-unaligned-reads-fix:
  dir: support platforms that require aligned reads
2017-08-23 14:33:48 -07:00
Junio C Hamano
12deaf66d4 Merge branch 'rs/stat-data-unaligned-reads-fix'
Code clean-up.

* rs/stat-data-unaligned-reads-fix:
  dir: support platforms that require aligned reads
2017-08-11 13:26:58 -07:00
René Scharfe
268ba20110 dir: support platforms that require aligned reads
The untracked cache is stored on disk by concatenating its memory
structures without any padding.  Consequently some of the structs are
not aligned at a particular boundary when the whole extension is read
back in one go.  That's only OK on platforms without strict alignment
requirements, or for byte-aligned data like strings or hash values.

Decode struct ondisk_untracked_cache carefully from the extension
blob by using explicit pointer arithmetic with offsets, avoiding
alignment issues.  Use char pointers for passing stat_data objects to
stat_data_from_disk(), and use memcpy(3) in that function to  get the
contents into a properly aligned struct, then perform the byte-order
adjustment in place there.

Found with Clang's UBSan.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-17 14:56:16 -07:00
Junio C Hamano
0c6435a4d6 Merge branch 'ab/wildmatch'
Minor code cleanup.

* ab/wildmatch:
  wildmatch: remove unused wildopts parameter
2017-07-10 13:42:51 -07:00
Junio C Hamano
50f03c6676 Merge branch 'ab/free-and-null'
A common pattern to free a piece of memory and assign NULL to the
pointer that used to point at it has been replaced with a new
FREE_AND_NULL() macro.

* ab/free-and-null:
  *.[ch] refactoring: make use of the FREE_AND_NULL() macro
  coccinelle: make use of the "expression" FREE_AND_NULL() rule
  coccinelle: add a rule to make "expression" code use FREE_AND_NULL()
  coccinelle: make use of the "type" FREE_AND_NULL() rule
  coccinelle: add a rule to make "type" code use FREE_AND_NULL()
  git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
2017-06-24 14:28:41 -07:00
Junio C Hamano
f31d23a399 Merge branch 'bw/config-h'
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.

* bw/config-h:
  config: don't implicitly use gitdir or commondir
  config: respect commondir
  setup: teach discover_git_directory to respect the commondir
  config: don't include config.h by default
  config: remove git_config_iter
  config: create config.h
2017-06-24 14:28:41 -07:00
Junio C Hamano
5812b3f73b Merge branch 'bw/ls-files-sans-the-index'
Code clean-up.

* bw/ls-files-sans-the-index:
  ls-files: factor out tag calculation
  ls-files: factor out debug info into a function
  ls-files: convert show_files to take an index
  ls-files: convert show_ce_entry to take an index
  ls-files: convert prune_cache to take an index
  ls-files: convert ce_excluded to take an index
  ls-files: convert show_ru_info to take an index
  ls-files: convert show_other_files to take an index
  ls-files: convert show_killed_files to take an index
  ls-files: convert write_eolinfo to take an index
  ls-files: convert overlay_tree_on_cache to take an index
  tree: convert read_tree to take an index parameter
  convert: convert renormalize_buffer to take an index
  convert: convert convert_to_git to take an index
  convert: convert convert_to_git_filter_fd to take an index
  convert: convert crlf_to_git to take an index
  convert: convert get_cached_convert_stats_ascii to take an index
2017-06-24 14:28:40 -07:00
Ævar Arnfjörð Bjarmason
55d3426929 wildmatch: remove unused wildopts parameter
Remove the unused wildopts placeholder struct from being passed to all
wildmatch() invocations, or rather remove all the boilerplate NULL
parameters.

This parameter was added back in commit 9b3497cab9 ("wildmatch: rename
constants and update prototype", 2013-01-01) as a placeholder for
future use. Over 4 years later nothing has made use of it, let's just
remove it. It can be added in the future if we find some reason to
start using such a parameter.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23 18:27:07 -07:00
Junio C Hamano
52ab95cfea Merge branch 'pc/dir-count-slashes'
Three instances of the same helper function have been consolidated
to one.

* pc/dir-count-slashes:
  dir: create function count_slashes()
2017-06-22 14:15:21 -07:00
Ævar Arnfjörð Bjarmason
6a83d90207 coccinelle: make use of the "type" FREE_AND_NULL() rule
Apply the result of the just-added coccinelle rule. This manually
excludes a few occurrences, mostly things that resulted in many
FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent
change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-16 12:44:03 -07:00
Brandon Williams
b2141fc1d2 config: don't include config.h by default
Stop including config.h by default in cache.h.  Instead only include
config.h in those files which require use of the config system.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15 12:56:22 -07:00
Junio C Hamano
b9a7d55d93 Merge branch 'nd/fopen-errors'
We often try to open a file for reading whose existence is
optional, and silently ignore errors from open/fopen; report such
errors if they are not due to missing files.

* nd/fopen-errors:
  mingw_fopen: report ENOENT for invalid file names
  mingw: verify that paths are not mistaken for remote nicknames
  log: fix memory leak in open_next_file()
  rerere.c: move error_errno() closer to the source system call
  print errno when reporting a system call error
  wrapper.c: make warn_on_inaccessible() static
  wrapper.c: add and use fopen_or_warn()
  wrapper.c: add and use warn_on_fopen_errors()
  config.mak.uname: set FREAD_READS_DIRECTORIES for Darwin, too
  config.mak.uname: set FREAD_READS_DIRECTORIES for Linux and FreeBSD
  clone: use xfopen() instead of fopen()
  use xfopen() in more places
  git_fopen: fix a sparse 'not declared' warning
2017-06-13 13:47:09 -07:00
Junio C Hamano
93dd544f54 Merge branch 'jc/noent-notdir'
Our code often opens a path to an optional file, to work on its
contents when we can successfully open it.  We can ignore a failure
to open if such an optional file does not exist, but we do want to
report a failure in opening for other reasons (e.g. we got an I/O
error, or the file is there, but we lack the permission to open).

The exact errors we need to ignore are ENOENT (obviously) and
ENOTDIR (less obvious).  Instead of repeating comparison of errno
with these two constants, introduce a helper function to do so.

* jc/noent-notdir:
  treewide: use is_missing_file_error() where ENOENT and ENOTDIR are checked
  compat-util: is_missing_file_error()
2017-06-13 13:47:07 -07:00
Junio C Hamano
f4683b4e9c Merge branch 'sl/clean-d-ignored-fix' into maint
"git clean -d" used to clean directories that has ignored files,
even though the command should not lose ignored ones without "-x".
"git status --ignored"  did not list ignored and untracked files
without "-uall".  These have been corrected.

* sl/clean-d-ignored-fix:
  clean: teach clean -d to preserve ignored paths
  dir: expose cmp_name() and check_contains()
  dir: hide untracked contents of untracked dirs
  dir: recurse into untracked dirs for ignored files
  t7061: status --ignored should search untracked dirs
  t7300: clean -d should skip dirs with ignored files
2017-06-13 13:27:02 -07:00
Brandon Williams
82b474e025 convert: convert convert_to_git to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13 11:40:51 -07:00
Prathamesh Chavan
e0556a928f dir: create function count_slashes()
Similar functions exist in apply.c and builtin/show-branch.c for
counting the number of slashes in a string. Also in the later
patches, we introduce a third caller for the same. Hence, we unify
it now by cleaning the existing functions and declaring a common
function count_slashes in dir.h and implementing it in dir.c to
remove this code duplication.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-12 13:26:55 -07:00
Junio C Hamano
f4fd99bf6e Merge branch 'sl/clean-d-ignored-fix'
"git clean -d" used to clean directories that has ignored files,
even though the command should not lose ignored ones without "-x".
"git status --ignored"  did not list ignored and untracked files
without "-uall".  These have been corrected.

* sl/clean-d-ignored-fix:
  clean: teach clean -d to preserve ignored paths
  dir: expose cmp_name() and check_contains()
  dir: hide untracked contents of untracked dirs
  dir: recurse into untracked dirs for ignored files
  t7061: status --ignored should search untracked dirs
  t7300: clean -d should skip dirs with ignored files
2017-06-02 15:06:05 +09:00
Junio C Hamano
c7054209d6 treewide: use is_missing_file_error() where ENOENT and ENOTDIR are checked
Using the is_missing_file_error() helper introduced in the previous
step, update all hits from

  $ git grep -e ENOENT --and -e ENOTDIR

There are codepaths that only check ENOENT, and it is possible that
some of them should be checking both.  Updating them is kept out of
this step deliberately, as we do not want to change behaviour in this
step.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-30 09:29:00 +09:00
Nguyễn Thái Ngọc Duy
11dc1fcb3f wrapper.c: add and use warn_on_fopen_errors()
In many places, Git warns about an inaccessible file after a fopen()
failed. To discern these cases from other cases where we want to warn
about inaccessible files, introduce a new helper specifically to test
whether fopen() failed because the current user lacks the permission to
open file in question.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26 12:33:55 +09:00
Samuel Lijin
bbf504a995 dir: expose cmp_name() and check_contains()
We want to use cmp_name() and check_contains() (which both compare
`struct dir_entry`s, the former in terms of the sort order, the latter
in terms of whether one lexically contains another) outside of dir.c,
so we have to (1) change their linkage and (2) rename them as
appropriate for the global namespace. The second is achieved by
renaming cmp_name() to cmp_dir_entry() and check_contains() to
check_dir_entry_contains().

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-22 12:14:13 +09:00
Samuel Lijin
fb89888849 dir: hide untracked contents of untracked dirs
When we taught read_directory_recursive() to recurse into untracked
directories in search of ignored files given DIR_SHOW_IGNORED_TOO, that
had the side effect of teaching it to collect the untracked contents of
untracked directories. It doesn't always make sense to return these,
though (we do need them for `clean -d`), so we introduce a flag
(DIR_KEEP_UNTRACKED_CONTENTS) to control whether or not read_directory()
strips dir->entries of the untracked contents of untracked dirs.

We also introduce check_contains() to check if one dir_entry corresponds
to a path which contains the path corresponding to another dir_entry.

This also fixes known breakages in t7061, since status --ignored now
searches untracked directories for ignored files.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-22 12:14:09 +09:00
Samuel Lijin
df5bcdf83a dir: recurse into untracked dirs for ignored files
We consider directories containing only untracked and ignored files to
be themselves untracked, which in the usual case means we don't have to
search these directories. This is problematic when we want to collect
ignored files with DIR_SHOW_IGNORED_TOO, though, so we teach
read_directory_recursive() to recurse into untracked directories to find
the ignored files they contain when DIR_SHOW_IGNORED_TOO is set. This
has the side effect of also collecting all untracked files in untracked
directories as well.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-22 12:06:52 +09:00
Brandon Williams
0d32c183b6 dir: convert fill_directory to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:39 +09:00
Brandon Williams
2c1eb10454 dir: convert read_directory to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:39 +09:00
Brandon Williams
0ef8e169aa dir: convert read_directory_recursive to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:39 +09:00
Brandon Williams
207a06cea3 dir: convert open_cached_dir to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:39 +09:00
Brandon Williams
a0bba65b10 dir: convert is_excluded to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:39 +09:00
Brandon Williams
e799ed408e dir: convert prep_exclude to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:39 +09:00
Brandon Williams
473e39307d dir: convert add_excludes to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:39 +09:00
Brandon Williams
fba92be8f7 dir: convert is_excluded_from_list to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:39 +09:00
Brandon Williams
2b70e88d36 dir: convert last_exclude_matching_from_list to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:39 +09:00
Brandon Williams
9e58becab9 dir: convert dir_add* to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:39 +09:00
Brandon Williams
98f2a687b9 dir: convert get_dtype to take index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:39 +09:00
Brandon Williams
ae520e3675 dir: convert directory_exists_in_index to take index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:38 +09:00
Brandon Williams
6f52b741a7 dir: convert read_skip_worktree_file_from_index to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:38 +09:00
Brandon Williams
12cd0bf9b0 dir: stop using the index compatibility macros
In order to make it clearer where the_index is being referenced, stop
using the index compatibility macros in dir.c.  This is to make it
easier to identify the functions which need to be convert to taking in a
'struct index_state' as a parameter.

The end goal would be to eliminate the need to reference global index
state in dir.c.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06 19:15:38 +09:00
Junio C Hamano
e394fa01d6 Merge branch 'sb/checkout-recurse-submodules'
"git checkout" is taught the "--recurse-submodules" option.

* sb/checkout-recurse-submodules:
  builtin/read-tree: add --recurse-submodules switch
  builtin/checkout: add --recurse-submodules switch
  entry.c: create submodules when interesting
  unpack-trees: check if we can perform the operation for submodules
  unpack-trees: pass old oid to verify_clean_submodule
  update submodules: add submodule_move_head
  submodule.c: get_super_prefix_or_empty
  update submodules: move up prepare_submodule_repo_env
  submodules: introduce check to see whether to touch a submodule
  update submodules: add a config option to determine if submodules are updated
  update submodules: add submodule config parsing
  make is_submodule_populated gently
  lib-submodule-update.sh: define tests for recursing into submodules
  lib-submodule-update.sh: replace sha1 by hash
  lib-submodule-update: teach test_submodule_content the -C <dir> flag
  lib-submodule-update.sh: do not use ./. as submodule remote
  lib-submodule-update.sh: reorder create_lib_submodule_repo
  submodule--helper.c: remove duplicate code
  connect_work_tree_and_git_dir: safely create leading directories
2017-03-28 14:05:58 -07:00
Junio C Hamano
f6c64c648a Merge branch 'bw/attr-pathspec'
The pathspec mechanism learned to further limit the paths that
match the pattern to those that have specified attributes attached
via the gitattributes mechanism.

* bw/attr-pathspec:
  pathspec: allow escaped query values
  pathspec: allow querying for attributes
2017-03-17 13:50:26 -07:00
Stefan Beller
365444a6a5 connect_work_tree_and_git_dir: safely create leading directories
In a later patch we'll use connect_work_tree_and_git_dir when the
directory for the gitlink file doesn't exist yet. This patch makes
connect_work_tree_and_git_dir safe to use for both cases of
either the git dir or the working dir missing.

To do so, we need to call safe_create_leading_directories[_const]
on both directories. However this has to happen before we construct
the absolute paths as real_pathdup assumes the directories to
be there already.

So for both the config file in the git dir as well as the .git link
file we need to
a) construct the name
b) call SCLD
c) get the absolute path
d) once a-c is done for both we can consume the absolute path
   to compute the relative path to each other and store those
   relative paths.

The implementation provided here puts a) and b) for both cases first,
and then performs c and d after.

One of the two users of 'connect_work_tree_and_git_dir' already checked
for the directory being there, so we can loose that check as
connect_work_tree_and_git_dir handles this functionality now.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-15 18:15:53 -07:00
Brandon Williams
b0db704652 pathspec: allow querying for attributes
The pathspec mechanism is extended via the new
":(attr:eol=input)pattern/to/match" syntax to filter paths so that it
requires paths to not just match the given pattern but also have the
specified attrs attached for them to be chosen.

Based on a patch by Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-13 15:28:54 -07:00
Johannes Schindelin
ce83eadd9a real_pathdup(): fix callsites that wanted it to die on error
In 4ac9006f83 (real_path: have callers use real_pathdup and
strbuf_realpath, 2016-12-12), we changed the xstrdup(real_path())
pattern to use real_pathdup() directly.

The problem with this change is that real_path() calls
strbuf_realpath() with die_on_error = 1 while real_pathdup() calls
it with die_on_error = 0. Meaning that in cases where real_path()
causes Git to die() with an error message, real_pathdup() is silent
and returns NULL instead.

The callers, however, are ill-prepared for that change, as they expect
the return value to be non-NULL (and otherwise the function died
with an appropriate error message).

Fix this by extending real_pathdup()'s signature to accept the
die_on_error flag and simply pass it through to strbuf_realpath(),
and then adjust all callers after a careful audit whether they would
handle NULLs well.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-08 14:38:41 -08:00
René Scharfe
bec5ab8997 dir: avoid allocation in fill_directory()
Pass the match member of the first pathspec item directly to
read_directory() instead of using common_prefix() to duplicate it first,
thus avoiding memory duplication, strlen(3) and free(3).

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 13:38:41 -08:00
Junio C Hamano
1c16df23b1 Merge branch 'bw/realpath-wo-chdir'
The implementation of "real_path()" was to go there with chdir(2)
and call getcwd(3), but this obviously wouldn't be usable in a
threaded environment.  Rewrite it to manually resolve relative
paths including symbolic links in path components.

* bw/realpath-wo-chdir:
  real_path: set errno when max number of symlinks is exceeded
  real_path: prevent redefinition of MAXSYMLINKS
2017-01-18 15:12:16 -08:00
Junio C Hamano
fe9ec8bdf6 Merge branch 'bw/pathspec-cleanup'
Code clean-up in the pathspec API.

* bw/pathspec-cleanup:
  pathspec: rename prefix_pathspec to init_pathspec_item
  pathspec: small readability changes
  pathspec: create strip submodule slash helpers
  pathspec: create parse_element_magic helper
  pathspec: create parse_long_magic function
  pathspec: create parse_short_magic function
  pathspec: factor global magic into its own function
  pathspec: simpler logic to prefix original pathspec elements
  pathspec: always show mnemonic and name in unsupported_magic
  pathspec: remove unused variable from unsupported_magic
  pathspec: copy and free owned memory
  pathspec: remove the deprecated get_pathspec function
  ls-tree: convert show_recursive to use the pathspec struct interface
  dir: convert fill_directory to use the pathspec struct interface
  dir: remove struct path_simplify
  mv: remove use of deprecated 'get_pathspec()'
2017-01-18 15:12:15 -08:00
Brandon Williams
966de3028b dir: convert fill_directory to use the pathspec struct interface
Convert 'fill_directory()' to use the pathspec struct interface from
using the '_raw' entry in the pathspec struct.

Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-08 18:04:17 -08:00
Brandon Williams
e1b8c7bdc0 dir: remove struct path_simplify
Teach simplify_away() and exclude_matches_pathspec() to handle struct
pathspec directly, eliminating the need for the struct path_simplify.

Also renamed the len parameter to pathlen in exclude_matches_pathspec()
to match the parameter names used in simplify_away().

Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-08 18:04:17 -08:00
Stefan Beller
f6f8586140 submodule: add absorb-git-dir function
When a submodule has its git dir inside the working dir, the submodule
support for checkout that we plan to add in a later patch will fail.

Add functionality to migrate the git directory to be absorbed
into the superprojects git directory.

The newly added code in this patch is structured such that other areas of
Git can also make use of it. The code in the submodule--helper is a mere
wrapper and option parser for the function
`absorb_git_dir_into_superproject`, that takes care of embedding the
submodules git directory into the superprojects git dir. That function
makes use of the more abstract function for this use case
`relocate_gitdir`, which can be used by e.g. the worktree code eventually
to move around a git directory.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-12 15:15:07 -08:00
Stefan Beller
47e83eb3b7 move connect_work_tree_and_git_dir to dir.h
That function was primarily used by submodule code, but the function
itself is not inherently about submodules. In the next patch we'll
introduce relocate_git_dir, which can be used by worktrees as well,
so find a neutral middle ground in dir.h.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-12 15:15:07 -08:00
Jeff King
f0056f6419 read info/{attributes,exclude} only when in repository
The low-level attribute and gitignore code will try to look
in $GIT_DIR/info for any repo-level configuration files,
even if we have not actually determined that we are in a
repository (e.g., running "git grep --no-index"). In such a
case they end up looking for ".git/info/attributes", etc.

This is generally harmless, as such a file is unlikely to
exist outside of a repository, but it's still conceptually
the wrong thing to do.

Let's detect this situation explicitly and skip reading the
file (i.e., the same behavior we'd get if we were in a
repository and the file did not exist).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-26 13:30:51 -07:00
Junio C Hamano
1c2b1f7018 Merge branch 'bw/ls-files-recurse-submodules'
"git ls-files" learned "--recurse-submodules" option that can be
used to get a listing of tracked files across submodules (i.e. this
only works with "--cached" option, not for listing untracked or
ignored files).  This would be a useful tool to sit on the upstream
side of a pipe that is read with xargs to work on all working tree
files from the top-level superproject.

* bw/ls-files-recurse-submodules:
  ls-files: add pathspec matching for submodules
  ls-files: pass through safe options for --recurse-submodules
  ls-files: optionally recurse into submodules
  git: make super-prefix option
2016-10-26 13:14:44 -07:00
Brandon Williams
75a6315f74 ls-files: add pathspec matching for submodules
Pathspecs can be a bit tricky when trying to apply them to submodules.
The main challenge is that the pathspecs will be with respect to the
superproject and not with respect to paths in the submodule.  The
approach this patch takes is to pass in the identical pathspec from the
superproject to the submodule in addition to the submodule-prefix, which
is the path from the root of the superproject to the submodule, and then
we can compare an entry in the submodule prepended with the
submodule-prefix to the pathspec in order to determine if there is a
match.

This patch also permits the pathspec logic to perform a prefix match against
submodules since a pathspec could refer to a file inside of a submodule.
Due to limitations in the wildmatch logic, a prefix match is only done
literally.  If any wildcard character is encountered we'll simply punt
and produce a false positive match.  More accurate matching will be done
once inside the submodule.  This is due to the superproject not knowing
what files could exist in the submodule.

Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10 12:14:58 -07:00
René Scharfe
9ed0d8d6e6 use QSORT
Apply the semantic patch contrib/coccinelle/qsort.cocci to the code
base, replacing calls of qsort(3) with QSORT.  The resulting code is
shorter and supports empty arrays with NULL pointers.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-29 15:42:18 -07:00
brian m. carlson
99d1a9861a cache: convert struct cache_entry to use struct object_id
Convert struct cache_entry to use struct object_id by applying the
following semantic patch and the object_id transforms from contrib, plus
the actual change to the struct:

@@
struct cache_entry E1;
@@
- E1.sha1
+ E1.oid.hash

@@
struct cache_entry *E1;
@@
- E1->sha1
+ E1->oid.hash

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 12:59:42 -07:00
Junio C Hamano
b4e8a847ba Merge branch 'rs/use-strbuf-addbuf'
Code cleanup.

* rs/use-strbuf-addbuf:
  strbuf: avoid calling strbuf_grow() twice in strbuf_addbuf()
  use strbuf_addbuf() for appending a strbuf to another
2016-07-25 14:13:47 -07:00
René Scharfe
8109984d61 use strbuf_addbuf() for appending a strbuf to another
Use strbuf_addbuf() where possible; it's shorter and more efficient.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-19 11:48:35 -07:00
Junio C Hamano
352d72a30e Merge branch 'nd/worktree-various-heads'
The experimental "multiple worktree" feature gains more safety to
forbid operations on a branch that is checked out or being actively
worked on elsewhere, by noticing that e.g. it is being rebased.

* nd/worktree-various-heads:
  branch: do not rename a branch under bisect or rebase
  worktree.c: check whether branch is bisected in another worktree
  wt-status.c: split bisect detection out of wt_status_get_state()
  worktree.c: check whether branch is rebased in another worktree
  worktree.c: avoid referencing to worktrees[i] multiple times
  wt-status.c: make wt_status_check_rebase() work on any worktree
  wt-status.c: split rebase detection out of wt_status_get_state()
  path.c: refactor and add worktree_git_path()
  worktree.c: mark current worktree
  worktree.c: make find_shared_symref() return struct worktree *
  worktree.c: store "id" instead of "git_dir"
  path.c: add git_common_path() and strbuf_git_common_path()
  dir.c: rename str(n)cmp_icase to fspath(n)cmp
2016-05-23 14:54:29 -07:00
Nguyễn Thái Ngọc Duy
ba0897e6ae dir.c: rename str(n)cmp_icase to fspath(n)cmp
These functions compare two paths that are taken from file system.
Depending on the running file system, paths may need to be compared
case-sensitively or not, and maybe even something else in future. The
current names do not convey that well.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-22 14:09:37 -07:00
Nguyễn Thái Ngọc Duy
423b592a06 dir.c: remove dead function fnmatch_icase()
It was largely replaced by fnmatch_icase_mem() and its last use was in
84b8b5d (remove match_pathspec() in favor of match_pathspec_depth() -
2013-07-14).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-22 14:07:45 -07:00
Junio C Hamano
9fabc70832 Merge branch 'ss/exc-flag-is-a-collection-of-bits' into maint
Code clean-up.

* ss/exc-flag-is-a-collection-of-bits:
  dir: store EXC_FLAG_* values in unsigned integers
2016-04-14 18:37:15 -07:00
Junio C Hamano
12508a8354 Merge branch 'ss/exc-flag-is-a-collection-of-bits'
Code clean-up.

* ss/exc-flag-is-a-collection-of-bits:
  dir: store EXC_FLAG_* values in unsigned integers
2016-04-06 11:38:59 -07:00
Junio C Hamano
5cee349370 Revert "Merge branch 'nd/exclusion-regression-fix'"
This reverts commit 5e57f9c3df, reversing
changes made to e79112d210.

We will be postponing nd/exclusion-regression-fix topic to later
cycle.
2016-03-18 11:06:15 -07:00
Saurav Sachidanand
f870899864 dir: store EXC_FLAG_* values in unsigned integers
The values defined by the macro EXC_FLAG_* (1, 4, 8, 16) are stored
in fields of the structs "pattern" and "exclude", some functions
arguments and a local variable.  None of these uses its most
significant bit in any special way and there is no good reason to
use a signed integer for them.

And while we're at it, document "flags" of "exclude" to explicitly
state the values it's supposed to take on.

Signed-off-by: Saurav Sachidanand <sauravsachidanand@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-01 10:20:22 -08:00
Junio C Hamano
11529ecec9 Merge branch 'jk/tighten-alloc'
Update various codepaths to avoid manually-counted malloc().

* jk/tighten-alloc: (22 commits)
  ewah: convert to REALLOC_ARRAY, etc
  convert ewah/bitmap code to use xmalloc
  diff_populate_gitlink: use a strbuf
  transport_anonymize_url: use xstrfmt
  git-compat-util: drop mempcpy compat code
  sequencer: simplify memory allocation of get_message
  test-path-utils: fix normalize_path_copy output buffer size
  fetch-pack: simplify add_sought_entry
  fast-import: simplify allocation in start_packfile
  write_untracked_extension: use FLEX_ALLOC helper
  prepare_{git,shell}_cmd: use argv_array
  use st_add and st_mult for allocation size computation
  convert trivial cases to FLEX_ARRAY macros
  use xmallocz to avoid size arithmetic
  convert trivial cases to ALLOC_ARRAY
  convert manual allocations to argv_array
  argv-array: add detach function
  add helpers for allocating flex-array structs
  harden REALLOC_ARRAY and xcalloc against size_t overflow
  tree-diff: catch integer overflow in combine_diff_path allocation
  ...
2016-02-26 13:37:16 -08:00
Jeff King
e0b8373510 write_untracked_extension: use FLEX_ALLOC helper
We perform unchecked additions when computing the size of a
"struct ondisk_untracked_cache". This is unlikely to have an
integer overflow in practice, but we'd like to avoid this
dangerous pattern to make further audits easier.

Note that there's one subtlety here, though.  We protect
ourselves against a NULL exclude_per_dir entry in our
source, and avoid calling strlen() on it, keeping "len" at
0. But later, we unconditionally memcpy "len + 1" bytes to
get the trailing NUL byte. If we did have a NULL
exclude_per_dir, we would read from bogus memory.

As it turns out, though, we always create this field
pointing to a string literal, so there's no bug. We can just
get rid of the pointless extra conditional.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:51:09 -08:00
Jeff King
50a6c8efa2 use st_add and st_mult for allocation size computation
If our size computation overflows size_t, we may allocate a
much smaller buffer than we expected and overflow it. It's
probably impossible to trigger an overflow in most of these
sites in practice, but it is easy enough convert their
additions and multiplications into overflow-checking
variants. This may be fixing real bugs, and it makes
auditing the code easier.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:51:09 -08:00
Jeff King
96ffc06f72 convert trivial cases to FLEX_ARRAY macros
Using FLEX_ARRAY macros reduces the amount of manual
computation size we have to do. It also ensures we don't
overflow size_t, and it makes sure we write the same number
of bytes that we allocated.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:51:09 -08:00
Jeff King
3733e69464 use xmallocz to avoid size arithmetic
We frequently allocate strings as xmalloc(len + 1), where
the extra 1 is for the NUL terminator. This can be done more
simply with xmallocz, which also checks for integer
overflow.

There's no case where switching xmalloc(n+1) to xmallocz(n)
is wrong; the result is the same length, and malloc made no
guarantees about what was in the buffer anyway. But in some
cases, we can stop manually placing NUL at the end of the
allocated buffer. But that's only safe if it's clear that
the contents will always fill the buffer.

In each case where this patch does so, I manually examined
the control flow, and I tried to err on the side of caution.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:51:09 -08:00
Jeff King
b32fa95fd8 convert trivial cases to ALLOC_ARRAY
Each of these cases can be converted to use ALLOC_ARRAY or
REALLOC_ARRAY, which has two advantages:

  1. It automatically checks the array-size multiplication
     for overflow.

  2. It always uses sizeof(*array) for the element-size,
     so that it can never go out of sync with the declared
     type of the array.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:51:09 -08:00
Nguyễn Thái Ngọc Duy
d589a67ece dir.c: don't exclude whole dir prematurely
If there is a pattern "!foo/bar", this patch makes it not exclude
"foo" right away. This gives us a chance to examine "foo" and
re-include "foo/bar".

Helped-by: brian m. carlson <sandals@crustytoothpaste.net>
Helped-by: Micha Wiedenmann <mw-u2@gmx.de>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-15 15:32:33 -08:00
Nguyễn Thái Ngọc Duy
c62a91736a dir.c: support marking some patterns already matched
Given path "a" and the pattern "a", it's matched. But if we throw path
"a/b" to pattern "a", the code fails to realize that if "a" matches
"a" then "a/b" should also be matched.

When the pattern is matched the first time, we can mark it "sticky", so
that all files and dirs inside the matched path also matches. This is a
simpler solution than modify all match scenarios to fix that.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-15 15:32:32 -08:00
Nguyễn Thái Ngọc Duy
bac65a2be5 dir.c: support tracing exclude
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-15 15:32:32 -08:00
Nguyễn Thái Ngọc Duy
a60ea8fb66 dir.c: fix match_pathname()
Given the pattern "1/2/3/4" and the path "1/2/3/4/f", the pattern
prefix is "1/2/3/4". We will compare and remove the prefix from both
pattern and path and come to this code

	/*
	 * If the whole pattern did not have a wildcard,
	 * then our prefix match is all we need; we
	 * do not need to call fnmatch at all.
	 */
	if (!patternlen && !namelen)
		return 1;

where patternlen is zero (full pattern consumed) and the remaining
path in "name" is "/f". We fail to realize it's matched in this case
and fall back to fnmatch(), which also fails to catch it. Fix it.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-15 15:32:32 -08:00
Junio C Hamano
0e35fcb412 Merge branch 'cc/untracked'
Update the untracked cache subsystem and change its primary UI from
"git update-index" to "git config".

* cc/untracked:
  t7063: add tests for core.untrackedCache
  test-dump-untracked-cache: don't modify the untracked cache
  config: add core.untrackedCache
  dir: simplify untracked cache "ident" field
  dir: add remove_untracked_cache()
  dir: add {new,add}_untracked_cache()
  update-index: move 'uc' var declaration
  update-index: add untracked cache notifications
  update-index: add --test-untracked-cache
  update-index: use enum for untracked cache options
  dir: free untracked cache when removing it
2016-02-10 14:20:06 -08:00
Junio C Hamano
9496acc144 Merge branch 'nd/exclusion-regression-fix' into maint
The ignore mechanism saw a few regressions around untracked file
listing and sparse checkout selection areas in 2.7.0; the change
that is responsible for the regression has been reverted.

* nd/exclusion-regression-fix:
  Revert "dir.c: don't exclude whole dir prematurely if neg pattern may match"
2016-02-05 14:54:11 -08:00
Junio C Hamano
af3e464a60 Merge branch 'nd/dir-exclude-cleanup' into maint
The "exclude_list" structure has the usual "alloc, nr" pair of
fields to be used by ALLOC_GROW(), but clear_exclude_list() forgot
to reset 'alloc' to 0 when it cleared 'nr' to discard the managed
array.

* nd/dir-exclude-cleanup:
  dir.c: clean the entire struct in clear_exclude_list()
2016-02-05 14:54:08 -08:00
Christian Couder
0e0f761842 dir: simplify untracked cache "ident" field
It is not a good idea to compare kernel versions and disable
the untracked cache if it changes, as people may upgrade and
still want the untracked cache to work. So let's just
compare work tree locations and kernel name to decide if we
should disable it.

Also storing many locations in the ident field and comparing
to any of them can be dangerous if GIT_WORK_TREE is used with
different values. So let's just store one location, the
location of the current work tree.

The downside is that untracked cache can only be used by one
type of OS for now. Exporting a git repo to different clients
via a network to e.g. Linux and Windows means that only one
can use the untracked cache.

If the location changed in the ident field and we still want
an untracked cache, let's delete the cache and recreate it.

Note that if an untracked cache has been created by a
previous Git version, then the kernel version is stored in
the ident field. As we now compare with just the kernel
name the comparison will fail and the untracked cache will
be disabled until it's recreated.

Helped-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-25 12:40:17 -08:00
Christian Couder
07b29bfd8d dir: add remove_untracked_cache()
Factor out code into remove_untracked_cache(), which will be used
in a later commit.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-25 12:40:11 -08:00
Christian Couder
4a4ca4796d dir: add {new,add}_untracked_cache()
Factor out code into new_untracked_cache() and
add_untracked_cache(), which will be used
in later commits.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-25 12:39:58 -08:00
Junio C Hamano
76b620d816 Merge branch 'nd/exclusion-regression-fix'
The ignore mechanism saw a few regressions around untracked file
listing and sparse checkout selection areas in 2.7.0; the change
that is responsible for the regression has been reverted.

* nd/exclusion-regression-fix:
  Revert "dir.c: don't exclude whole dir prematurely if neg pattern may match"
2016-01-20 11:43:33 -08:00
Junio C Hamano
7a450b48e7 Merge branch 'nd/dir-exclude-cleanup'
The "exclude_list" structure has the usual "alloc, nr" pair of
fields to be used by ALLOC_GROW(), but clear_exclude_list() forgot
to reset 'alloc' to 0 when it cleared 'nr'to discard the managed
array.

* nd/dir-exclude-cleanup:
  dir.c: clean the entire struct in clear_exclude_list()
2016-01-20 11:43:24 -08:00
Nguyễn Thái Ngọc Duy
8c722360d1 Revert "dir.c: don't exclude whole dir prematurely if neg pattern may match"
This reverts commit 57534ee77d. The
feature added in that commit requires that patterns behave the same way
from anywhere. But some patterns can behave differently depending on
current "working" directory. The conditions to catch and avoid these
patterns are too loose. The untracked listing[1] and sparse-checkout
selection[2] can become incorrect as a result.

  [1] http://article.gmane.org/gmane.comp.version-control.git/283520
  [2] http://article.gmane.org/gmane.comp.version-control.git/283532

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-08 11:24:14 -08:00