git-commit-vandalism

Author	SHA1	Message	Date
Nguyễn Thái Ngọc Duy	836ef2b69f	unpack-trees: reuse (still valid) cache-tree from src_index We do n-way merge by walking the source index and n trees at the same time and add merge results to a new temporary index called o->result. The merge result for any given path could be either - keep_entry(): same old index entry in o->src_index is reused - merged_entry(): either a new entry is added, or an existing one updated - deleted_entry(): one entry from o->src_index is removed For some reason [1] we keep making sure that the source index's cache-tree is still valid if used by o->result: for all those merged/deleted entries, we invalidate the same path in o->src_index, so only cache-trees covering the "keep_entry" parts remain good. Because of this, the cache-tree from o->src_index can be perfectly reused in o->result. And in fact we already rely on this logic to reuse untracked cache in `edf3b90553` (unpack-trees: preserve index extensions - 2017-05-08). Move the cache-tree to o->result before doing cache_tree_update() to reduce hashing cost. Since cache_tree_update() has risen up as one of the most expensive parts in unpack_trees() after the last few patches. This does help reduce unpack_trees() time significantly (on webkit.git): before after -------------------------------------------------------------------- 0.080394752 0.051258167 s: read cache .git/index 0.216010838 0.212106298 s: preload index 0.008534301 0.280521764 s: refresh index 0.251992198 0.218160442 s: traverse_trees 0.377031383 0.374948191 s: check_updates 0.372768105 0.037040114 s: cache_tree_update 1.045887251 0.672031609 s: unpack_trees 0.314983512 0.317456290 s: write index, changed mask = 2e 0.062572653 0.038382654 s: traverse_trees 0.000022544 0.000042731 s: check_updates 0.073795585 0.050930053 s: unpack_trees 0.073807557 0.051099735 s: diff-index 1.938191592 1.614241153 s: git command: git checkout - [1] I'm pretty sure the reason is an oversight in `34110cd4e3` (Make 'unpack_trees()' have a separate source and destination index - 2008-03-06). That patch aims to _not_ update the source index at all. The invalidation should have been done on o->result in that patch. But then there was no cache-tree on o->result even then so it's pointless to do so. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-18 09:47:46 -07:00
Nguyễn Thái Ngọc Duy	f1e11c6510	unpack-trees: reduce malloc in cache-tree walk This is a micro optimization that probably only shines on repos with deep directory structure. Instead of allocating and freeing a new cache_entry in every iteration, we reuse the last one and only update the parts that are new each iteration. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-18 09:47:46 -07:00
Nguyễn Thái Ngọc Duy	b4da37380b	unpack-trees: optimize walking same trees with cache-tree In order to merge one or many trees with the index, unpack-trees code walks multiple trees in parallel with the index and performs n-way merge. If we find out at start of a directory that all trees are the same (by comparing OID) and cache-tree happens to be available for that directory as well, we could avoid walking the trees because we already know what these trees contain: it's flattened in what's called "the index". The upside is of course a lot less I/O since we can potentially skip lots of trees (think subtrees). We also save CPU because we don't have to inflate and apply the deltas. The downside is of course more fragile code since the logic in some functions are now duplicated elsewhere. "checkout -" with this patch on webkit.git (275k files): baseline new -------------------------------------------------------------------- 0.056651714 0.080394752 s: read cache .git/index 0.183101080 0.216010838 s: preload index 0.008584433 0.008534301 s: refresh index 0.633767589 0.251992198 s: traverse_trees 0.340265448 0.377031383 s: check_updates 0.381884638 0.372768105 s: cache_tree_update 1.401562947 1.045887251 s: unpack_trees 0.338687914 0.314983512 s: write index, changed mask = 2e 0.411927922 0.062572653 s: traverse_trees 0.000023335 0.000022544 s: check_updates 0.423697246 0.073795585 s: unpack_trees 0.423708360 0.073807557 s: diff-index 2.559524127 1.938191592 s: git command: git checkout - Another measurement from Ben's running "git checkout" with over 500k trees (on the whole series): baseline new ---------------------------------------------------------------------- 0.535510167 0.556558733 s: read cache .git/index 0.3057373 0.3147105 s: initialize name hash 0.0184082 0.023558433 s: preload index 0.086910967 0.089085967 s: refresh index 7.889590767 2.191554433 s: unpack trees 0.120760833 0.131941267 s: update worktree after a merge 2.2583504 2.572663167 s: repair cache-tree 0.8916137 0.959495233 s: write index, changed mask = 28 3.405199233 0.2710663 s: unpack trees 0.000999667 0.0021554 s: update worktree after a merge 3.4063306 0.273318333 s: diff-index 16.9524923 9.462943133 s: git command: git.exe checkout This command calls unpack_trees() twice, the first time on 2way merge and the second 1way merge. In both times, "unpack trees" time is reduced to one third. Overall time reduction is not that impressive of course because index operations take a big chunk. And there's that repair cache-tree line. PS. A note about cache-tree invalidation and the use of it in this code. We do invalidate cache-tree in _source_ index when we add new entries to the (temporary) "result" index. But we also use the cache-tree from source index in this optimization. Does this mean we end up having no cache-tree in the source index to activate this optimization? The answer is twisted: the order of finding a good cache-tree and invalidating it matters. In this case we check for a good cache-tree first in all_trees_same_as_cache_tree(), then we start to merge things and potentially invalidate that same cache-tree in the process. Since cache-tree invalidation happens after the optimization kicks in, we're still good. But we may lose that cache-tree at the very first call_unpack_fn() call in traverse_by_cache_tree(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-18 09:47:46 -07:00
Nguyễn Thái Ngọc Duy	0d1ed5963d	unpack-trees: add performance tracing We're going to optimize unpack_trees() a bit in the following patches. Let's add some tracing to measure how long it takes before and after. This is the baseline ("git checkout -" on webkit.git, 275k files on worktree) performance: 0.056651714 s: read cache .git/index performance: 0.183101080 s: preload index performance: 0.008584433 s: refresh index performance: 0.633767589 s: traverse_trees performance: 0.340265448 s: check_updates performance: 0.381884638 s: cache_tree_update performance: 1.401562947 s: unpack_trees performance: 0.338687914 s: write index, changed mask = 2e performance: 0.411927922 s: traverse_trees performance: 0.000023335 s: check_updates performance: 0.423697246 s: unpack_trees performance: 0.423708360 s: diff-index performance: 2.559524127 s: git command: git checkout - Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-18 09:47:46 -07:00
Duy Nguyen	b878579ae7	clone: report duplicate entries on case-insensitive filesystems Paths that only differ in case work fine in a case-sensitive filesystems, but if those repos are cloned in a case-insensitive one, you'll get problems. The first thing to notice is "git status" will never be clean with no indication what exactly is "dirty". This patch helps the situation a bit by pointing out the problem at clone time. Even though this patch talks about case sensitivity, the patch makes no assumption about folding rules by the filesystem. It simply observes that if an entry has been already checked out at clone time when we're about to write a new path, some folding rules are behind this. In the case that we can't rely on filesystem (via inode number) to do this check, fall back to fspathcmp() which is not perfect but should not give false positives. This patch is tested with vim-colorschemes and Sublime-Gitignore repositories on a JFS partition with case insensitive support on Linux. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-17 12:10:37 -07:00
Nguyễn Thái Ngọc Duy	c4500e251f	attr: remove index from git_attr_set_direction() Since attr checking API now take the index, there's no need to set an index in advance with this call. Most call sites are straightforward because they either pass the_index or NULL (which defaults back to the_index previously). There's only one suspicious call site in unpack-trees.c where it sets a different index. This code in unpack-trees is about to check out entries from the new/temporary index after merging is done in it. The attributes will be used by entry.c code to do crlf conversion if needed. entry.c now respects struct checkout's istate field, and this field is correctly set in unpack-trees.c, there should be no regression from this change. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 14:14:43 -07:00
Nguyễn Thái Ngọc Duy	c7f3259d0d	unpack-trees: avoid the_index in verify_absent() Both functions that are updated in this commit are called by verify_absent(), which is part of the "unpack-trees" operation that is supposed to work on any index file specified by the caller. Thanks to Brandon [1] [2], an implicit dependency on the_index is exposed. This commit fixes it. In both functions, it makes sense to use src_index to check for exclusion because it's almost unchanged and should give us the same outcome as if running the exclude check before the unpack. It's "almost unchanged" because we do invalidate cache-tree and untracked cache in the source index. But this should not affect how exclude machinery uses the index: to see if a file is tracked, and to read a blob from the index instead of worktree if it's marked skip-worktree (i.e. it's not available in worktree) [1] `a0bba65b10` (dir: convert is_excluded to take an index - 2017-05-05 [2] `2c1eb10454` (dir: convert read_directory to take an index - 2017-05-05) Helped-by: Elijah Newren <newren@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 14:14:43 -07:00
Nguyễn Thái Ngọc Duy	27c82fb3b4	unpack-trees: convert clear_ce_flags* to avoid the_index Prior to `fba92be8f7`, this code implicitly (and incorrectly) assumes the_index when running the exclude machinery. `fba92be8f7` helps show this problem clearer because unpack-trees operation is supposed to work on whatever index the caller specifies... not specifically the_index. Update the code to use "istate" argument that's originally from mark_new_skip_worktree(). From the call sites, both in unpack_trees(), you can see that this function works on two separate indexes: o->src_index and o->result. The second mark_new_skip_worktree() so far has incorecctly applied exclude rules on o->src_index instead of o->result. It's unclear what is the consequences of this, but it's definitely wrong. [1] `fba92be8f7` (dir: convert is_excluded_from_list to take an index - 2017-05-05) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 14:14:43 -07:00
Nguyễn Thái Ngọc Duy	86016ec304	unpack-trees: don't shadow global var the_index This function mark_new_skip_worktree() has an argument named the_index which is also the name of a global variable. While they have different types (the global the_index is not a pointer) mistakes can easily happen and it's also confusing for readers. Rename the function argument to something other than the_index. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 14:14:43 -07:00
Nguyễn Thái Ngọc Duy	383480ba4f	unpack-trees: add a note about path invalidation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 14:14:43 -07:00
Junio C Hamano	ae533c4a92	Merge branch 'jm/cache-entry-from-mem-pool' For a large tree, the index needs to hold many cache entries allocated on heap. These cache entries are now allocated out of a dedicated memory pool to amortize malloc(3) overhead. * jm/cache-entry-from-mem-pool: block alloc: add validations around cache_entry lifecyle block alloc: allocate cache entries from mem_pool mem-pool: fill out functionality mem-pool: add life cycle management functions mem-pool: only search head block for available space block alloc: add lifecycle APIs for cache_entry structs read-cache: teach make_cache_entry to take object_id read-cache: teach refresh_cache_entry to take istate	2018-08-02 15:30:43 -07:00
Junio C Hamano	284b444932	Merge branch 'mk/merge-in-sparse-checkout' "git reset --merge" (hence "git merge ---abort") and "git reset --hard" had trouble working correctly in a sparsely checked out working tree after a conflict, which has been corrected. * mk/merge-in-sparse-checkout: unpack-trees: do not fail reset because of unmerged skipped entry	2018-07-24 14:50:48 -07:00
Junio C Hamano	00624d608c	Merge branch 'sb/object-store-grafts' The conversion to pass "the_repository" and then "a_repository" throughout the object access API continues. * sb/object-store-grafts: commit: allow lookup_commit_graft to handle arbitrary repositories commit: allow prepare_commit_graft to handle arbitrary repositories shallow: migrate shallow information into the object parser path.c: migrate global git_path_* to take a repository argument cache: convert get_graft_file to handle arbitrary repositories commit: convert read_graft_file to handle arbitrary repositories commit: convert register_commit_graft to handle arbitrary repositories commit: convert commit_graft_pos() to handle arbitrary repositories shallow: add repository argument to is_repository_shallow shallow: add repository argument to check_shallow_file_for_update shallow: add repository argument to register_shallow shallow: add repository argument to set_alternate_shallow_file commit: add repository argument to lookup_commit_graft commit: add repository argument to prepare_commit_graft commit: add repository argument to read_graft_file commit: add repository argument to register_commit_graft commit: add repository argument to commit_graft_pos object: move grafts to object parser object-store: move object access functions to object-store.h	2018-07-18 12:20:28 -07:00
Max Kirillov	b33fdfc34c	unpack-trees: do not fail reset because of unmerged skipped entry After modify/delete merge conflict happens in a file skipped by sparse checkout, "git reset --merge", which implements the "--abort" actions, and "git reset --hard" fail with message "Entry * not uptodate. Cannot update sparse checkout." As explained in [1], the up-to-date checker mistakenly treats conflicted entry which does not exist in HEAD as still skipped by sparse checkout. Use the fix suggested in [1]. Also, add test case which verifies the issue is fixed. [1] https://public-inbox.org/git/20180616051444.GA29754@duynguyen.home/ Signed-off-by: Duy Nguyen <pclouds@gmail.com> Signed-off-by: Max Kirillov <max@max630.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-11 09:35:41 -07:00
Jameson Miller	8e72d67529	block alloc: allocate cache entries from mem_pool When reading large indexes from disk, a portion of the time is dominated in malloc() calls. This can be mitigated by allocating a large block of memory and manage it ourselves via memory pools. This change moves the cache entry allocation to be on top of memory pools. Design: The index_state struct will gain a notion of an associated memory_pool from which cache_entries will be allocated from. When reading in the index from disk, we have information on the number of entries and their size, which can guide us in deciding how large our initial memory allocation should be. When an index is discarded, the associated memory_pool will be discarded as well - so the lifetime of a cache_entry is tied to the lifetime of the index_state that it was allocated for. In the case of a Split Index, the following rules are followed. 1st, some terminology is defined: Terminology: - 'the_index': represents the logical view of the index - 'split_index': represents the "base" cache entries. Read from the split index file. 'the_index' can reference a single split_index, as well as cache_entries from the split_index. `the_index` will be discarded before the `split_index` is. This means that when we are allocating cache_entries in the presence of a split index, we need to allocate the entries from the `split_index`'s memory pool. This allows us to follow the pattern that `the_index` can reference cache_entries from the `split_index`, and that the cache_entries will not be freed while they are still being referenced. Managing transient cache_entry structs: Cache entries are usually allocated for an index, but this is not always the case. Cache entries are sometimes allocated because this is the type that the existing checkout_entry function works with. Because of this, the existing code needs to handle cache entries associated with an index / memory pool, and those that only exist transiently. Several strategies were contemplated around how to handle this: Chosen approach: An extra field was added to the cache_entry type to track whether the cache_entry was allocated from a memory pool or not. This is currently an int field, as there are no more available bits in the existing ce_flags bit field. If / when more bits are needed, this new field can be turned into a proper bit field. Alternatives: 1) Do not include any information about how the cache_entry was allocated. Calling code would be responsible for tracking whether the cache_entry needed to be freed or not. Pro: No extra memory overhead to track this state Con: Extra complexity in callers to handle this correctly. The extra complexity and burden to not regress this behavior in the future was more than we wanted. 2) cache_entry would gain knowledge about which mem_pool allocated it Pro: Could (potentially) do extra logic to know when a mem_pool no longer had references to any cache_entry Con: cache_entry would grow heavier by a pointer, instead of int We didn't see a tangible benefit to this approach 3) Do not add any extra information to a cache_entry, but when freeing a cache entry, check if the memory exists in a region managed by existing mem_pools. Pro: No extra memory overhead to track state Con: Extra computation is performed when freeing cache entries We decided tracking and iterating over known memory pool regions was less desirable than adding an extra field to track this stae. Signed-off-by: Jameson Miller <jamill@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-03 10:58:27 -07:00
Jameson Miller	a849735bfb	block alloc: add lifecycle APIs for cache_entry structs It has been observed that the time spent loading an index with a large number of entries is partly dominated by malloc() calls. This change is in preparation for using memory pools to reduce the number of malloc() calls made to allocate cahce entries when loading an index. Add an API to allocate and discard cache entries, abstracting the details of managing the memory backing the cache entries. This commit does actually change how memory is managed - this will be done in a later commit in the series. This change makes the distinction between cache entries that are associated with an index and cache entries that are not associated with an index. A main use of cache entries is with an index, and we can optimize the memory management around this. We still have other cases where a cache entry is not persisted with an index, and so we need to handle the "transient" use case as well. To keep the congnitive overhead of managing the cache entries, there will only be a single discard function. This means there must be enough information kept with the cache entry so that we know how to discard them. A summary of the main functions in the API is: make_cache_entry: create cache entry for use in an index. Uses specified parameters to populate cache_entry fields. make_empty_cache_entry: Create an empty cache entry for use in an index. Returns cache entry with empty fields. make_transient_cache_entry: create cache entry that is not used in an index. Uses specified parameters to populate cache_entry fields. make_empty_transient_cache_entry: create cache entry that is not used in an index. Returns cache entry with empty fields. discard_cache_entry: A single function that knows how to discard a cache entry regardless of how it was allocated. Signed-off-by: Jameson Miller <jamill@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-03 10:58:27 -07:00
Junio C Hamano	e47dbece39	Merge branch 'ma/unpack-trees-free-msgs' Leak plugging. * ma/unpack-trees-free-msgs: unpack_trees_options: free messages when done argv-array: return the pushed string from argv_push*() merge-recursive: provide pair of `unpack_trees_{start,finish}()` merge: setup `opts` later in `checkout_fast_forward()`	2018-05-30 21:51:29 +09:00
Junio C Hamano	42c8ce1c49	Merge branch 'bc/object-id' Conversion from uchar[20] to struct object_id continues. * bc/object-id: (42 commits) merge-one-file: compute empty blob object ID add--interactive: compute the empty tree value Update shell scripts to compute empty tree object ID sha1_file: only expose empty object constants through git_hash_algo dir: use the_hash_algo for empty blob object ID sequencer: use the_hash_algo for empty tree object ID cache-tree: use is_empty_tree_oid sha1_file: convert cached object code to struct object_id builtin/reset: convert use of EMPTY_TREE_SHA1_BIN builtin/receive-pack: convert one use of EMPTY_TREE_SHA1_HEX wt-status: convert two uses of EMPTY_TREE_SHA1_HEX submodule: convert several uses of EMPTY_TREE_SHA1_HEX sequencer: convert one use of EMPTY_TREE_SHA1_HEX merge: convert empty tree constant to the_hash_algo builtin/merge: switch tree functions to use object_id builtin/am: convert uses of EMPTY_TREE_SHA1_BIN to the_hash_algo sha1-file: add functions for hex empty tree and blob OIDs builtin/receive-pack: avoid hard-coded constants for push certs diff: specify abbreviation size in terms of the_hash_algo upload-pack: replace use of several hard-coded constants ...	2018-05-30 14:04:10 +09:00
Junio C Hamano	50f08db594	Merge branch 'js/use-bug-macro' Developer support update, by using BUG() macro instead of die() to mark codepaths that should not happen more clearly. * js/use-bug-macro: BUG_exit_code: fix sparse "symbol not declared" warning Convert remaining die*(BUG) messages Replace all die("BUG: ...") calls by BUG() ones run-command: use BUG() to report bugs, not die() test-tool: help verifying BUG() code paths	2018-05-30 14:04:07 +09:00
Junio C Hamano	d658196f3c	Merge branch 'en/unpack-trees-split-index-fix' The split-index feature had a long-standing and dormant bug in certain use of the in-core merge machinery, which has been fixed. * en/unpack-trees-split-index-fix: unpack_trees: fix breakage when o->src_index != o->dst_index	2018-05-23 14:38:22 +09:00
Junio C Hamano	c67de747f4	Merge branch 'en/rename-directory-detection-reboot' Rename detection logic in "diff" family that is used in "merge" has learned to guess when all of x/a, x/b and x/c have moved to z/a, z/b and z/c, it is likely that x/d added in the meantime would also want to move to z/d by taking the hint that the entire directory 'x' moved to 'z'. A bug causing dirty files involved in a rename to be overwritten during merge has also been fixed as part of this work. Incidentally, this also avoids updating a file in the working tree after a (non-trivial) merge whose result matches what our side originally had. * en/rename-directory-detection-reboot: (36 commits) merge-recursive: fix check for skipability of working tree updates merge-recursive: make "Auto-merging" comment show for other merges merge-recursive: fix remainder of was_dirty() to use original index merge-recursive: fix was_tracked() to quit lying with some renamed paths t6046: testcases checking whether updates can be skipped in a merge merge-recursive: avoid triggering add_cacheinfo error with dirty mod merge-recursive: move more is_dirty handling to merge_content merge-recursive: improve add_cacheinfo error handling merge-recursive: avoid spurious rename/rename conflict from dir renames directory rename detection: new testcases showcasing a pair of bugs merge-recursive: fix remaining directory rename + dirty overwrite cases merge-recursive: fix overwriting dirty files involved in renames merge-recursive: avoid clobbering untracked files with directory renames merge-recursive: apply necessary modifications for directory renames merge-recursive: when comparing files, don't include trees merge-recursive: check for file level conflicts then get new name merge-recursive: add computation of collisions due to dir rename & merging merge-recursive: check for directory level conflicts merge-recursive: add get_directory_renames() merge-recursive: make a helper function for cleanup for handle_renames ...	2018-05-23 14:38:19 +09:00
Martin Ågren	1c41d2805e	unpack_trees_options: free messages when done The strings allocated in `setup_unpack_trees_porcelain()` are never freed. Provide a function `clear_unpack_trees_porcelain()` to do so and call it where we use `setup_unpack_trees_porcelain()`. The only non-trivial user is `unpack_trees_start()`, where we should place the new call in `unpack_trees_finish()`. We keep the string pointers in an array, mixing pointers to static memory and memory that we allocate on the heap. We also keep several copies of the individual pointers. So we need to make sure that we do not free what we must not free and that we do not double-free. Let a separate argv_array take ownership of all the strings we create so that we can easily free them. Zero the whole array of string pointers to make sure that we do not leave any dangling pointers. Note that we only take responsibility for the memory allocated in `setup_unpack_trees_porcelain()` and not any other members of the `struct unpack_trees_options`. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-22 11:59:31 +09:00
Stefan Beller	cbd53a2193	object-store: move object access functions to object-store.h This should make these functions easier to find and cache.h less overwhelming to read. In particular, this moves: - read_object_file - oid_object_info - write_object_file As a result, most of the codebase needs to #include object-store.h. In this patch the #include is only added to files that would fail to compile otherwise. It would be better to #include wherever identifiers from the header are used. That can happen later when we have better tooling for it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-16 11:42:03 +09:00
Elijah Newren	64b1abe962	merge-recursive: fix overwriting dirty files involved in renames This fixes an issue that existed before my directory rename detection patches that affects both normal renames and renames implied by directory rename detection. Additional codepaths that only affect overwriting of dirty files that are involved in directory rename detection will be added in a subsequent commit. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-08 16:11:00 +09:00
Junio C Hamano	0c7ecb7c31	Merge branch 'sb/submodule-move-nested' Moving a submodule that itself has submodule in it with "git mv" forgot to make necessary adjustment to the nested sub-submodules; now the codepath learned to recurse into the submodules. * sb/submodule-move-nested: submodule: fixup nested submodules after moving the submodule submodule-config: remove submodule_from_cache submodule-config: add repository argument to submodule_from_{name, path} submodule-config: allow submodule_free to handle arbitrary repositories grep: remove "repo" arg from non-supporting funcs submodule.h: drop declaration of connect_work_tree_and_git_dir	2018-05-08 15:59:17 +09:00
Johannes Schindelin	033abf97fc	Replace all die("BUG: ...") calls by BUG() ones In `d8193743e0` (usage.c: add BUG() function, 2017-05-12), a new macro was introduced to use for reporting bugs instead of die(). It was then subsequently used to convert one single caller in `588a538ae5` (setup_git_env: convert die("BUG") to BUG(), 2017-05-12). The cover letter of the patch series containing this patch (cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not terribly clear why only one call site was converted, or what the plan is for other, similar calls to die() to report bugs. Let's just convert all remaining ones in one fell swoop. This trick was performed by this invocation: sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-06 19:06:13 +09:00
brian m. carlson	75691ea345	Update struct index_state to use struct object_id Adjust struct index_state to use struct object_id instead of unsigned char [20]. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-02 13:59:50 +09:00
Elijah Newren	7db118303a	unpack_trees: fix breakage when o->src_index != o->dst_index Currently, all callers of unpack_trees() set o->src_index == o->dst_index. The code in unpack_trees() does not correctly handle them being different. There are two separate issues: First, there is the possibility of memory corruption. Since unpack_trees() creates a temporary index in o->result and then discards o->dst_index and overwrites it with o->result, in the special case that o->src_index == o->dst_index, it is safe to just reuse o->src_index's split_index for o->result. However, when src and dst are different, reusing o->src_index's split_index for o->result will cause the split_index to be shared. If either index then has entries replaced or removed, it will result in the other index referring to free()'d memory. Second, we can drop the index extensions. Previously, we were moving index extensions from o->dst_index to o->result. Since o->src_index is the one that will have the necessary extensions (o->dst_index is likely to be a new index temporary index created to store the results), we should be moving the index extensions from there. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-02 10:21:28 +09:00
Junio C Hamano	8b026edac3	Revert "Merge branch 'en/rename-directory-detection'" This reverts commit `e4bb62fa1e`, reversing changes made to `468165c1d8`. The topic appears to inflict severe regression in renaming merges, even though the promise of it was that it would improve them. We do not yet know which exact change in the topic was wrong, but in the meantime, let's play it safe and revert it out of 'master' before real Git-using projects are harmed. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-11 18:07:11 +09:00
Junio C Hamano	e4bb62fa1e	Merge branch 'en/rename-directory-detection' Rename detection logic in "diff" family that is used in "merge" has learned to guess when all of x/a, x/b and x/c have moved to z/a, z/b and z/c, it is likely that x/d added in the meantime would also want to move to z/d by taking the hint that the entire directory 'x' moved to 'z'. A bug causing dirty files involved in a rename to be overwritten during merge has also been fixed as part of this work. * en/rename-directory-detection: (29 commits) merge-recursive: ensure we write updates for directory-renamed file merge-recursive: avoid spurious rename/rename conflict from dir renames directory rename detection: new testcases showcasing a pair of bugs merge-recursive: fix remaining directory rename + dirty overwrite cases merge-recursive: fix overwriting dirty files involved in renames merge-recursive: avoid clobbering untracked files with directory renames merge-recursive: apply necessary modifications for directory renames merge-recursive: when comparing files, don't include trees merge-recursive: check for file level conflicts then get new name merge-recursive: add computation of collisions due to dir rename & merging merge-recursive: check for directory level conflicts merge-recursive: add get_directory_renames() merge-recursive: make a helper function for cleanup for handle_renames merge-recursive: split out code for determining diff_filepairs merge-recursive: make !o->detect_rename codepath more obvious merge-recursive: fix leaks of allocated renames and diff_filepairs merge-recursive: introduce new functions to handle rename logic merge-recursive: move the get_renames() function directory rename detection: tests for handling overwriting dirty files directory rename detection: tests for handling overwriting untracked files ...	2018-04-10 08:25:43 +09:00
Junio C Hamano	c2a499e6c3	Merge branch 'jh/partial-clone' Hotfix. * jh/partial-clone: upload-pack: disable object filtering when disabled by config unpack-trees: release oid_array after use in check_updates()	2018-03-29 15:39:59 -07:00
Stefan Beller	f793b895fd	submodule-config: allow submodule_free to handle arbitrary repositories At some point we may want to rename the function so that it describes what it actually does as 'submodule_free' doesn't quite describe that this clears a repository's submodule cache. But that's beyond the scope of this series. While at it remove the extern key word from its declaration. Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-29 09:44:50 -07:00
René Scharfe	9f242a1336	unpack-trees: release oid_array after use in check_updates() Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-25 10:51:46 -07:00
Junio C Hamano	169c9c0169	Merge branch 'bw/c-plus-plus' Avoid using identifiers that clash with C++ keywords. Even though it is not a goal to compile Git with C++ compilers, changes like this help use of code analysis tools that targets C++ on our codebase. * bw/c-plus-plus: (37 commits) replace: rename 'new' variables trailer: rename 'template' variables tempfile: rename 'template' variables wrapper: rename 'template' variables environment: rename 'namespace' variables diff: rename 'template' variables environment: rename 'template' variables init-db: rename 'template' variables unpack-trees: rename 'new' variables trailer: rename 'new' variables submodule: rename 'new' variables split-index: rename 'new' variables remote: rename 'new' variables ref-filter: rename 'new' variables read-cache: rename 'new' variables line-log: rename 'new' variables imap-send: rename 'new' variables http: rename 'new' variables entry: rename 'new' variables diffcore-delta: rename 'new' variables ...	2018-03-06 14:54:07 -08:00
Elijah Newren	e0052f4613	merge-recursive: fix overwriting dirty files involved in renames This fixes an issue that existed before my directory rename detection patches that affects both normal renames and renames implied by directory rename detection. Additional codepaths that only affect overwriting of dirty files that are involved in directory rename detection will be added in a subsequent commit. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-27 14:11:58 -08:00
Junio C Hamano	cf44c1e0a2	Merge branch 'nd/fix-untracked-cache-invalidation' Some bugs around "untracked cache" feature have been fixed. * nd/fix-untracked-cache-invalidation: dir.c: ignore paths containing .git when invalidating untracked cache dir.c: stop ignoring opendir() error in open_cached_dir() dir.c: fix missing dir invalidation in untracked code dir.c: avoid stat() in valid_cached_dir() status: add a failing test showing a core.untrackedCache bug	2018-02-27 10:33:50 -08:00
Brandon Williams	69caed593e	unpack-trees: rename 'new' variables Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-22 10:08:05 -08:00
Junio C Hamano	6bed209a20	Merge branch 'jh/partial-clone' The machinery to clone & fetch, which in turn involves packing and unpacking objects, have been told how to omit certain objects using the filtering mechanism introduced by the jh/object-filtering topic, and also mark the resulting pack as a promisor pack to tolerate missing objects, taking advantage of the mechanism introduced by the jh/fsck-promisors topic. * jh/partial-clone: t5616: test bulk prefetch after partial fetch fetch: inherit filter-spec from partial clone t5616: end-to-end tests for partial clone fetch-pack: restore save_commit_buffer after use unpack-trees: batch fetching of missing blobs clone: partial clone partial-clone: define partial clone settings in config fetch: support filters fetch: refactor calculation of remote list fetch-pack: test support excluding large blobs fetch-pack: add --no-filter fetch-pack, index-pack, transport: partial clone upload-pack: add object filtering for partial clone	2018-02-13 13:39:04 -08:00
Nguyễn Thái Ngọc Duy	0cacebf099	dir.c: ignore paths containing .git when invalidating untracked cache read_directory() code ignores all paths named ".git" even if it's not a valid git repository. See treat_path() for details. Since ".git" is basically invisible to read_directory(), when we are asked to invalidate a path that contains ".git", we can safely ignore it because the slow path would not consider it anyway. This helps when fsmonitor is used and we have a real ".git" repo at worktree top. Occasionally .git/index will be updated and if the fsmonitor hook does not filter it, untracked cache is asked to invalidate the path ".git/index". Without this patch, we invalidate the root directory unncessarily, which: - makes read_directory() fall back to slow path for root directory (slower) - makes the index dirty (because UNTR extension is updated). Depending on the index size, writing it down could also be slow. A note about the new "safe_path" knob. Since this new check could be relatively expensive, avoid it when we know it's not needed. If the path comes from the index, it can't contain ".git". If it does contain, we may be screwed up at many more levels, not just this one. Noticed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-07 12:27:02 -08:00
Stefan Beller	ad17312e11	unpack-trees: oneway_merge to update submodules When there is a one way merge, each submodule needs to be one way merged as well, if we're asked to recurse into submodules. In case of a submodule, check if it is up-to-date, otherwise set the flag CE_UPDATE, which will trigger an update of it in the phase updating the tree later. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-05 12:35:35 -08:00
Jonathan Tan	c0c578b33c	unpack-trees: batch fetching of missing blobs When running checkout, first prefetch all blobs that are to be updated but are missing. This means that only one pack is downloaded during such operations, instead of one per missing blob. This operates only on the blob level - if a repository has a missing tree, they are still fetched one at a time. This does not use the delayed checkout mechanism introduced in commit `2841e8f` ("convert: add "status=delayed" to filter process protocol", 2017-06-30) due to significant conceptual differences - in particular, for partial clones, we already know what needs to be fetched based on the contents of the local repo alone, whereas for status=delayed, it is the filter process that tells us what needs to be checked in the end. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-08 09:58:51 -08:00
Junio C Hamano	e05336bdda	Merge branch 'bp/fsmonitor' We learned to talk to watchman to speed up "git status" and other operations that need to see which paths have been modified. * bp/fsmonitor: fsmonitor: preserve utf8 filenames in fsmonitor-watchman log fsmonitor: read entirety of watchman output fsmonitor: MINGW support for watchman integration fsmonitor: add a performance test fsmonitor: add a sample integration script for Watchman fsmonitor: add test cases for fsmonitor extension split-index: disable the fsmonitor extension when running the split index test fsmonitor: add a test tool to dump the index extension update-index: add fsmonitor support to update-index ls-files: Add support in ls-files to display the fsmonitor valid bit fsmonitor: add documentation for the fsmonitor extension. fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files. update-index: add a new --force-write-index option preload-index: add override to enable testing preload-index bswap: add 64 bit endianness helper get_be64	2017-11-21 14:07:50 +09:00
brian m. carlson	a98e6101f0	refs: convert resolve_gitlink_ref to struct object_id Convert the declaration and definition of resolve_gitlink_ref to use struct object_id and apply the following semantic patch: @@ expression E1, E2, E3; @@ - resolve_gitlink_ref(E1, E2, E3.hash) + resolve_gitlink_ref(E1, E2, &E3) @@ expression E1, E2, E3; @@ - resolve_gitlink_ref(E1, E2, E3->hash) + resolve_gitlink_ref(E1, E2, E3) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-16 11:05:51 +09:00
brian m. carlson	1053fe829c	Convert remaining callers of resolve_gitlink_ref to object_id Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-16 11:05:51 +09:00
Ben Peart	883e248b8a	fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files. When the index is read from disk, the fsmonitor index extension is used to flag the last known potentially dirty index entries. The registered core.fsmonitor command is called with the time the index was last updated and returns the list of files changed since that time. This list is used to flag any additional dirty cache entries and untracked cache directories. We can then use this valid state to speed up preload_index(), ie_match_stat(), and refresh_cache_ent() as they do not need to lstat() files to detect potential changes for those entries marked CE_FSMONITOR_VALID. In addition, if the untracked cache is turned on valid_cached_dir() can skip checking directories for new or changed files as fsmonitor will invalidate the cache only for those directories that have been identified as having potential changes. To keep the CE_FSMONITOR_VALID state accurate during git operations; when git updates a cache entry to match the current state on disk, it will now set the CE_FSMONITOR_VALID bit. Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit is cleared and the corresponding untracked cache directory is marked invalid. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-01 17:23:01 +09:00
Junio C Hamano	7fbbd3ec0f	Merge branch 'ls/convert-filter-progress' The codepath to call external process filter for smudge/clean operation learned to show the progress meter. * ls/convert-filter-progress: convert: display progress for filtered objects that have been delayed	2017-09-10 17:08:22 +09:00
Junio C Hamano	8e36002add	Merge branch 'ma/up-to-date' Message and doc updates. * ma/up-to-date: treewide: correct several "up-to-date" to "up to date" Documentation/user-manual: update outdated example output	2017-09-10 17:08:22 +09:00
Junio C Hamano	614ea03a71	Merge branch 'bw/submodule-config-cleanup' Code clean-up to avoid mixing values read from the .gitmodules file and values read from the .git/config file. * bw/submodule-config-cleanup: submodule: remove gitmodules_config unpack-trees: improve loading of .gitmodules submodule-config: lazy-load a repository's .gitmodules file submodule-config: move submodule-config functions to submodule-config.c submodule-config: remove support for overlaying repository config diff: stop allowing diff to have submodules configured in .git/config submodule: remove submodule_config callback routine unpack-trees: don't respect submodule.update submodule: don't rely on overlayed config when setting diffopts fetch: don't overlay config with submodule-config submodule--helper: don't overlay config in update-clone submodule--helper: don't overlay config in remote_submodule_branch add, reset: ensure submodules can be added or reset submodule: don't use submodule_from_name t7411: check configuration parsing errors	2017-08-26 22:55:08 -07:00
Lars Schneider	52f1d62eb4	convert: display progress for filtered objects that have been delayed In `2841e8f` ("convert: add "status=delayed" to filter process protocol", 2017-06-30) we taught the filter process protocol to delayed responses. These responses are processed after the "Checking out files" phase. If the processing takes noticeable time, then the user might think Git is stuck. Display the progress of the delayed responses to let the user know that Git is still processing objects. This works very well for objects that can be filtered quickly. If filtering of an individual object takes noticeable time, then the user might still think that Git is stuck. However, in that case the user would at least know what Git is doing. It would be technical more correct to display "Checking out files whose content filtering has been delayed". For brevity we only print "Filtering content". The finish_delayed_checkout() call was moved below the stop_progress() call in unpack-trees.c to ensure that the "Checking out files" progress is properly stopped before the "Filtering content" progress starts in finish_delayed_checkout(). Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Suggested-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-24 12:41:20 -07:00
Junio C Hamano	d33a433236	Merge branch 'jc/simplify-progress' The API to start showing progress meter after a short delay has been simplified. * jc/simplify-progress: progress: simplify "delayed" progress API	2017-08-24 10:20:02 -07:00
Junio C Hamano	11bd95604a	Merge branch 'rs/object-id' Conversion from uchar[20] to struct object_id continues. * rs/object-id: tree-walk: convert fill_tree_descriptor() to object_id	2017-08-24 10:20:02 -07:00
Martin Ågren	7560f547e6	treewide: correct several "up-to-date" to "up to date" Follow the Oxford style, which says to use "up-to-date" before the noun, but "up to date" after it. Don't change plumbing (specifically send-pack.c, but transport.c (git push) also has the same string). This was produced by grepping for "up-to-date" and "up to date". It turned out we only had to edit in one direction, removing the hyphens. Fix a typo in Documentation/git-diff-index.txt while we're there. Reported-by: Jeffrey Manian <jeffrey.manian@gmail.com> Reported-by: STEVEN WHITE <stevencharleswhitevoices@gmail.com> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 12:17:22 -07:00
Junio C Hamano	5aa0b6c506	Merge branch 'bw/grep-recurse-submodules' "git grep --recurse-submodules" has been reworked to give a more consistent output across submodule boundary (and do its thing without having to fork a separate process). * bw/grep-recurse-submodules: grep: recurse in-process using 'struct repository' submodule: merge repo_read_gitmodules and gitmodules_config submodule: check for unmerged .gitmodules outside of config parsing submodule: check for unstaged .gitmodules outside of config parsing submodule: remove fetch.recursesubmodules from submodule-config parsing submodule: remove submodule.fetchjobs from submodule-config parsing config: add config_from_gitmodules cache.h: add GITMODULES_FILE macro repository: have the_repository use the_index repo_read_index: don't discard the index	2017-08-22 10:29:01 -07:00
Junio C Hamano	8aade107dd	progress: simplify "delayed" progress API We used to expose the full power of the delayed progress API to the callers, so that they can specify, not just the message to show and expected total amount of work that is used to compute the percentage of work performed so far, the percent-threshold parameter P and the delay-seconds parameter N. The progress meter starts to show at N seconds into the operation only if we have not yet completed P per-cent of the total work. Most callers used either (0%, 2s) or (50%, 1s) as (P, N), but there are oddballs that chose more random-looking values like 95%. For a smoother workload, (50%, 1s) would allow us to start showing the progress meter earlier than (0%, 2s), while keeping the chance of not showing progress meter for long running operation the same as the latter. For a task that would take 2s or more to complete, it is likely that less than half of it would complete within the first second, if the workload is smooth. But for a spiky workload whose earlier part is easier, such a setting is likely to fail to show the progress meter entirely and (0%, 2s) is more appropriate. But that is merely a theory. Realistically, it is of dubious value to ask each codepath to carefully consider smoothness of their workload and specify their own setting by passing two extra parameters. Let's simplify the API by dropping both parameters and have everybody use (0%, 2s). Oh, by the way, the percent-threshold parameter and the structure member were consistently misspelled, which also is now fixed ;-) Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-19 14:01:34 -07:00
René Scharfe	5c377d3d59	tree-walk: convert fill_tree_descriptor() to object_id All callers of fill_tree_descriptor() have been converted to object_id already, so convert that function as well. As a nice side-effect we get rid of NULL checks in tree-diff.c, as fill_tree_descriptor() already does them for us. Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-14 12:38:54 -07:00
Junio C Hamano	51b8aecabe	Merge branch 'ls/filter-process-delayed' The filter-process interface learned to allow a process with long latency give a "delayed" response. * ls/filter-process-delayed: convert: add "status=delayed" to filter process protocol convert: refactor capabilities negotiation convert: move multiple file filter error handling to separate function convert: put the flags field before the flag itself for consistent style t0021: write "OUT <size>" only on success t0021: make debug log file name configurable t0021: keep filter log files on comparison	2017-08-11 13:27:00 -07:00
Brandon Williams	3302871320	unpack-trees: improve loading of .gitmodules When recursing submodules 'check_updates()' needs to have strict control over the submodule-config subsystem to ensure that the gitmodules file has been read before checking cache entries which are marked for removal as well ensuring the proper gitmodules file is read before updating cache entries. Because of this let's not rely on callers of 'check_updates()' to read the gitmodules file before calling 'check_updates()' and handle the reading explicitly. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-03 13:11:02 -07:00
Brandon Williams	7463e2ec3e	unpack-trees: don't respect submodule.update The 'submodule.update' config was historically used and respected by the 'submodule update' command because update handled a variety of different ways it updated a submodule. As we begin teaching other commands about submodules it makes more sense for the different settings of 'submodule.update' to be handled by the individual commands themselves (checkout, rebase, merge, etc) so it shouldn't be respected by the native checkout command. Also remove the overlaying of the repository's config (via using 'submodule_config()') from the commands which use the unpack-trees logic (checkout, read-tree, reset). Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-03 13:11:01 -07:00
Brandon Williams	4c0eeafe47	cache.h: add GITMODULES_FILE macro Add a macro to be used when specifying the '.gitmodules' file and convert any existing hard coded '.gitmodules' file strings to use the new macro. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-02 14:26:46 -07:00
Lars Schneider	2841e8f81c	convert: add "status=delayed" to filter process protocol Some `clean` / `smudge` filters may require a significant amount of time to process a single blob (e.g. the Git LFS smudge filter might perform network requests). During this process the Git checkout operation is blocked and Git needs to wait until the filter is done to continue with the checkout. Teach the filter process protocol, introduced in `edcc8581` ("convert: add filter.<driver>.process option", 2016-10-16), to accept the status "delayed" as response to a filter request. Upon this response Git continues with the checkout operation. After the checkout operation Git calls "finish_delayed_checkout" which queries the filter for remaining blobs. If the filter is still working on the completion, then the filter is expected to block. If the filter has completed all remaining blobs then an empty response is expected. Git has a multiple code paths that checkout a blob. Support delayed checkouts only in `clone` (in unpack-trees.c) and `checkout` operations for now. The optimization is most effective in these code paths as all files of the tree are processed. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:50:41 -07:00
Junio C Hamano	f31d23a399	Merge branch 'bw/config-h' Fix configuration codepath to pay proper attention to commondir that is used in multi-worktree situation, and isolate config API into its own header file. * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h	2017-06-24 14:28:41 -07:00
Brandon Williams	b2141fc1d2	config: don't include config.h by default Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-15 12:56:22 -07:00
Junio C Hamano	fa0624f79f	Merge branch 'dt/unpack-save-untracked-cache-extension' When "git checkout", "git merge", etc. manipulates the in-core index, various pieces of information in the index extensions are discarded from the original state, as it is usually not the case that they are kept up-to-date and in-sync with the operation on the main index. The untracked cache extension is copied across these operations now, which would speed up "git status" (as long as the cache is properly invalidated). * dt/unpack-save-untracked-cache-extension: unpack-trees: preserve index extensions	2017-05-30 11:16:45 +09:00
Junio C Hamano	4eeed27e16	Merge branch 'bw/dir-c-stops-relying-on-the-index' API update. * bw/dir-c-stops-relying-on-the-index: dir: convert fill_directory to take an index dir: convert read_directory to take an index dir: convert read_directory_recursive to take an index dir: convert open_cached_dir to take an index dir: convert is_excluded to take an index dir: convert prep_exclude to take an index dir: convert add_excludes to take an index dir: convert is_excluded_from_list to take an index dir: convert last_exclude_matching_from_list to take an index dir: convert dir_add* to take an index dir: convert get_dtype to take index dir: convert directory_exists_in_index to take index dir: convert read_skip_worktree_file_from_index to take an index dir: stop using the index compatibility macros	2017-05-29 12:34:41 +09:00
Junio C Hamano	5f074ca7e8	Merge branch 'sb/reset-recurse-submodules' "git reset" learned "--recurse-submodules" option. * sb/reset-recurse-submodules: builtin/reset: add --recurse-submodules switch submodule.c: submodule_move_head works with broken submodules submodule.c: uninitialized submodules are ignored in recursive commands entry.c: submodule recursing: respect force flag correctly	2017-05-29 12:34:40 +09:00
David Turner	edf3b90553	unpack-trees: preserve index extensions Make git checkout (and other unpack_tree operations) preserve the untracked cache. This is valuable for two reasons: 1. Often, an unpack_tree operation will not touch large parts of the working tree, and thus most of the untracked cache will continue to be valid. 2. Even if the untracked cache were entirely invalidated by such an operation, the user has signaled their intention to have such a cache, and we don't want to throw it away. [jes: backed out the watchman-specific parts] Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-20 18:26:45 +09:00
Brandon Williams	2c1eb10454	dir: convert read_directory to take an index Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-06 19:15:39 +09:00
Brandon Williams	a0bba65b10	dir: convert is_excluded to take an index Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-06 19:15:39 +09:00
Brandon Williams	473e39307d	dir: convert add_excludes to take an index Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-06 19:15:39 +09:00
Brandon Williams	fba92be8f7	dir: convert is_excluded_from_list to take an index Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-06 19:15:39 +09:00
Junio C Hamano	8868ba1962	Merge branch 'jh/unpack-trees-micro-optim' In a 2- and 3-way merge of trees, more than one source trees often end up sharing an identical subtree; optimize by not reading the same tree multiple times in such a case. * jh/unpack-trees-micro-optim: unpack-trees: avoid duplicate ODB lookups during checkout	2017-04-23 22:07:48 -07:00
Stefan Beller	cd279e2e1b	entry.c: submodule recursing: respect force flag correctly In case of a non-forced worktree update, the submodule movement is tested in a dry run first, such that it doesn't matter if the actual update is done via the force flag. However for correctness, we want to give the flag as specified by the user. All callers have been inspected and updated if needed. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-18 21:18:29 -07:00
Jeff Hostetler	d12a8cf0af	unpack-trees: avoid duplicate ODB lookups during checkout Teach traverse_trees_recursive() to not do redundant ODB lookups when both directories refer to the same OID. In operations such as read-tree and checkout, there will likely be many peer directories that have the same OID when the differences between the commits are relatively small. In these cases we can avoid hitting the ODB multiple times for the same OID. This patch handles n=2 and n=3 cases and simply copies the data rather than repeating the fill_tree_descriptor(). ================ On the Windows repo (500K trees, 3.1M files, 450MB index), this reduced the overall time by 0.75 seconds when cycling between 2 commits with a single file difference. (avg) before: 22.699 (avg) after: 21.955 =============== ================ On Linux using p0006-read-tree-checkout.sh with linux.git: Test HEAD^ HEAD ------------------------------------------------------------------------------------------------------- 0006.2: read-tree br_base br_ballast (57994) 0.24(0.20+0.03) 0.24(0.22+0.01) +0.0% 0006.3: switch between br_base br_ballast (57994) 10.58(6.23+2.86) 10.67(5.94+2.87) +0.9% 0006.4: switch between br_ballast br_ballast_plus_1 (57994) 0.60(0.44+0.17) 0.57(0.44+0.14) -5.0% 0006.5: switch between aliases (57994) 0.59(0.48+0.13) 0.57(0.44+0.15) -3.4% ================ Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-15 02:50:44 -07:00
Stefan Beller	6362ed0b29	unpack-trees.c: align submodule error message to the other error messages As the place holder in the error message is for multiple submodules, we don't want to encapsulate the string place holder in single quotes. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-29 15:47:36 -07:00
Stefan Beller	a7bc845a9a	unpack-trees: check if we can perform the operation for submodules In a later patch we'll support submodule entries to be in sync with the tree in working tree changing commands, such as checkout or read-tree. When a new submodule entry changes in the tree, we need to check if there are conflicts (directory/file conflicts) for the tree. Add this check for submodules to be performed before the working tree is touched. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-16 14:07:16 -07:00
Stefan Beller	d6b1230067	unpack-trees: pass old oid to verify_clean_submodule The check (which uses the old oid) is yet to be implemented, but this part is just a refactor, so it can go separately first. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-16 14:07:16 -07:00
Junio C Hamano	2243d229f7	Merge branch 'sb/unpack-trees-super-prefix' "git read-tree" and its underlying unpack_trees() machinery learned to report problematic paths prefixed with the --super-prefix option. * sb/unpack-trees-super-prefix: unpack-trees: support super-prefix option t1001: modernize style t1000: modernize style read-tree: use OPT_BOOL instead of OPT_SET_INT	2017-02-03 11:25:18 -08:00
Stefan Beller	3d415425c7	unpack-trees: support super-prefix option In the future we want to support working tree operations within submodules, e.g. "git checkout --recurse-submodules", which will update the submodule to the commit as recorded in its superproject. In the submodule the unpack-tree operation is carried out as usual, but the reporting to the user needs to prefix any path with the superproject. The mechanism for this is the super-prefix. (see `74866d757`, git: make super-prefix option) Add support for the super-prefix option for commands that unpack trees by wrapping any path output in unpacking trees in the newly introduced super_prefixed function. This new function prefixes any path with the super-prefix if there is one. Assuming the submodule case doesn't happen in the majority of the cases, we'd want to have a fast behavior for no super prefix, i.e. no reallocation/copying, but just returning path. Another aspect of introducing the `super_prefixed` function is to consider who owns the memory and if this is the right place where the path gets modified. As the super prefix ought to change the output behavior only and not the actual unpack tree part, it is fine to be that late in the line. As we get passed in 'const char *path', we cannot change the path itself, which means in case of a super prefix we have to copy over the path. We need two static buffers in that function as the error messages contain at most two paths. For testing purposes enable it in read-tree, which has no output of paths other than an unpack-trees.c. These are all converted in this patch. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-01-25 12:33:33 -08:00
Junio C Hamano	0c0e0fd0ca	Merge branch 'sb/unpack-trees-cleanup' Code cleanup. * sb/unpack-trees-cleanup: unpack-trees: factor progress setup out of check_updates unpack-trees: remove unneeded continue unpack-trees: move checkout state into check_updates	2017-01-18 15:12:17 -08:00
Stefan Beller	384f1a167b	unpack-trees: factor progress setup out of check_updates This makes check_updates shorter and easier to understand. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-01-10 11:53:33 -08:00
Stefan Beller	c4bfc7728b	unpack-trees: remove unneeded continue The continue is the last statement in the loop, so not needed. This situation arose in `700e66d66` (2010-07-30, unpack-trees: let read-tree -u remove index entries outside sparse area) when statements after the continue were removed. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-01-10 11:51:19 -08:00
Stefan Beller	30ac275b1c	unpack-trees: move checkout state into check_updates The checkout state was introduced via `16da134b1f` (read-trees: refactor the unpack_trees() part, 2006-07-30). An attempt to refactor the checkout state was done in `b56aa5b268` (unpack-trees: pass checkout state explicitly to check_updates(), 2016-09-13), but we can go even further. The `struct checkout state` is not used in unpack_trees apart from initializing it, so move it into the function that makes use of it, which is `check_updates`. Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-01-10 11:51:14 -08:00
Junio C Hamano	0f30315ba1	Merge branch 'sb/unpack-trees-grammofix' * sb/unpack-trees-grammofix: unpack-trees: fix grammar for untracked files in directories	2016-12-19 14:45:31 -08:00
Stefan Beller	584f99c87b	unpack-trees: fix grammar for untracked files in directories Noticed-by: David Turner <dturner@twosigma.com> Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: David Turner <dturner@twosigma.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-12-05 12:17:02 -08:00
René Scharfe	68e3d6292f	introduce CHECKOUT_INIT Add a static initializer for struct checkout and use it throughout the code base. It's shorter, avoids a memset(3) call and makes sure the base_dir member is initialized to a valid (empty) string. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-22 13:42:18 -07:00
Junio C Hamano	e9c6d3dcc2	Merge branch 'rs/unpack-trees-reduce-file-scope-global' Code cleanup. * rs/unpack-trees-reduce-file-scope-global: unpack-trees: pass checkout state explicitly to check_updates()	2016-09-21 15:15:26 -07:00
Junio C Hamano	4af9a7d344	Merge branch 'bc/object-id' The "unsigned char sha1[20]" to "struct object_id" conversion continues. Notable changes in this round includes that ce->sha1, i.e. the object name recorded in the cache_entry, turns into an object_id. It had merge conflicts with a few topics in flight (Christian's "apply.c split", Dscho's "cat-file --filters" and Jeff Hostetler's "status --porcelain-v2"). Extra sets of eyes double-checking for mismerges are highly appreciated. * bc/object-id: builtin/reset: convert to use struct object_id builtin/commit-tree: convert to struct object_id builtin/am: convert to struct object_id refs: add an update_ref_oid function. sha1_name: convert get_sha1_mb to struct object_id builtin/update-index: convert file to struct object_id notes: convert init_notes to use struct object_id builtin/rm: convert to use struct object_id builtin/blame: convert file to use struct object_id Convert read_mmblob to take struct object_id. notes-merge: convert struct notes_merge_pair to struct object_id builtin/checkout: convert some static functions to struct object_id streaming: make stream_blob_to_fd take struct object_id builtin: convert textconv_object to use struct object_id builtin/cat-file: convert some static functions to struct object_id builtin/cat-file: convert struct expand_data to use struct object_id builtin/log: convert some static functions to use struct object_id builtin/blame: convert struct origin to use struct object_id builtin/apply: convert static functions to struct object_id cache: convert struct cache_entry to use struct object_id	2016-09-19 13:47:19 -07:00
René Scharfe	b56aa5b268	unpack-trees: pass checkout state explicitly to check_updates() Add a parameter for the struct checkout variable to check_updates() instead of using a static global variable. Passing it explicitly makes object ownership and usage more easily apparent. And we get rid of a static variable; those can be problematic in library-like code. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-13 16:26:12 -07:00
Alex Henrie	a1c8044662	unpack-trees: do not capitalize "working" In English, only proper nouns are capitalized. Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-08 12:17:23 -07:00
brian m. carlson	99d1a9861a	cache: convert struct cache_entry to use struct object_id Convert struct cache_entry to use struct object_id by applying the following semantic patch and the object_id transforms from contrib, plus the actual change to the struct: @@ struct cache_entry E1; @@ - E1.sha1 + E1.oid.hash @@ struct cache_entry *E1; @@ - E1->sha1 + E1->oid.hash Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-07 12:59:42 -07:00
Alex Henrie	c2691e2add	unpack-trees: fix English grammar in do-this-before-that messages Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-27 08:29:36 -07:00
Junio C Hamano	40cfc95856	Merge branch 'nd/error-errno' The code for warning_errno/die_errno has been refactored and a new error_errno() reporting helper is introduced. * nd/error-errno: (41 commits) wrapper.c: use warning_errno() vcs-svn: use error_errno() upload-pack.c: use error_errno() unpack-trees.c: use error_errno() transport-helper.c: use error_errno() sha1_file.c: use {error,die,warning}_errno() server-info.c: use error_errno() sequencer.c: use error_errno() run-command.c: use error_errno() rerere.c: use error_errno() and warning_errno() reachable.c: use error_errno() mailmap.c: use error_errno() ident.c: use warning_errno() http.c: use error_errno() and warning_errno() grep.c: use error_errno() gpg-interface.c: use error_errno() fast-import.c: use error_errno() entry.c: use error_errno() editor.c: use error_errno() diff-no-index.c: use error_errno() ...	2016-05-17 14:38:28 -07:00
Junio C Hamano	e5e7a9115d	Merge branch 'va/i18n-misc-updates' Mark several messages for translation. * va/i18n-misc-updates: i18n: unpack-trees: avoid substituting only a verb in sentences i18n: builtin/pull.c: split strings marked for translation i18n: builtin/pull.c: mark placeholders for translation i18n: git-parse-remote.sh: mark strings for translation i18n: branch: move comment for translators i18n: branch: unmark string for translation i18n: builtin/rm.c: remove a comma ',' from string i18n: unpack-trees: mark strings for translation i18n: builtin/branch.c: mark option for translation i18n: index-pack: use plural string instead of normal one	2016-05-17 14:38:23 -07:00
Vasco Almeida	2e3926b948	i18n: unpack-trees: avoid substituting only a verb in sentences Instead of reusing the same set of message templates for checkout and other actions and substituting the verb with "%s", prepare separate message templates for each known action. That would make it easier for translation into languages where the same verb may conjugate differently depending on the message we are giving. See gettext documentation for details: http://www.gnu.org/software/gettext/manual/html_node/Preparing-Strings.html Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-05-12 16:28:43 -07:00
Nguyễn Thái Ngọc Duy	43c728e2c2	unpack-trees.c: use error_errno() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-05-09 12:29:08 -07:00
brian m. carlson	7d924c9139	struct name_entry: use struct object_id instead of unsigned char sha1[20] Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-25 14:23:42 -07:00
Vasco Almeida	ed47fdf7fa	i18n: unpack-trees: mark strings for translation Mark strings seen by the user inside setup_unpack_trees_porcelain() and display_error_msgs() functions for translation. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-12 10:29:16 -07:00
David Turner	a6720955f1	unpack-trees: fix accidentally quadratic behavior While unpacking trees (e.g. during git checkout), when we hit a cache entry that's past and outside our path, we cut off iteration. This provides about a 45% speedup on git checkout between master and master^20000 on Twitter's monorepo. Speedup in general will depend on repostitory structure, number of changes, and packfile packing decisions. Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-01-22 13:03:10 -08:00
David Turner	d9c2bd560e	do_compare_entry: use already-computed path In traverse_trees, we generate the complete traverse path for a traverse_info. Later, in do_compare_entry, we used to go do a bunch of work to compare the traverse_info to a cache_entry's name without computing that path. But since we already have that path, we don't need to do all that work. Instead, we can just put the generated path into the traverse_info, and do the comparison more directly. We copy the path because prune_traversal might mutate `base`. This doesn't happen in any codepaths where do_compare_entry is called, but it's better to be safe. This makes git checkout much faster -- about 25% on Twitter's monorepo. Deeper directory trees are likely to benefit more than shallower ones. Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-01-05 13:39:46 -08:00
Jeff King	75faa45ae0	replace trivial malloc + sprintf / strcpy calls with xstrfmt It's a common pattern to do: foo = xmalloc(strlen(one) + strlen(two) + 1 + 1); sprintf(foo, "%s %s", one, two); (or possibly some variant with strcpy()s or a more complicated length computation). We can switch these to use xstrfmt, which is shorter, involves less error-prone manual computation, and removes many sprintf and strcpy calls which make it harder to audit the code for real buffer overflows. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-09-25 10:18:18 -07:00

1 2 3 4 5 ...

405 Commits