git-commit-vandalism

Author	SHA1	Message	Date
Nguyễn Thái Ngọc Duy	8c69c1f92e	Document limited recursion pathspec matching with wildcards It's actually unlimited recursion if wildcards are active regardless --max-depth Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-01-14 18:39:04 -08:00
Nguyễn Thái Ngọc Duy	9c6bebd142	tree_entry_interesting: make use of local pointer "item" Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-10-27 11:38:24 -07:00
Nguyễn Thái Ngọc Duy	d688cf07b1	tree_entry_interesting(): give meaningful names to return values It is a basic code hygiene to avoid magic constants that are unnamed. Besides, this helps extending the value later on for "interesting, but cannot decide if the entry truely matches yet" (ie. prefix matches) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-10-27 11:38:24 -07:00
Junio C Hamano	5fb8c05f2e	get_tree_entry(): do not call find_tree_entry() on an empty tree We know we will find nothing. This incidentally squelches false warning from gcc about potentially uninitialized usage of t.entry fields. For an empty tree, it is true that init_tree_desc() does not call decode_tree_entry() and the tree_desc is left uninitialized, but find_tree_entry() only calls tree_entry_extract() that uses the tree_desc while it has more things to read from the tree, so the uninitialized t.entry fields are never used in such a case anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-10-27 11:24:01 -07:00
Nguyễn Thái Ngọc Duy	0de1633783	tree-walk.c: do not leak internal structure in tree_entry_len() tree_entry_len() does not simply take two random arguments and return a tree length. The two pointers must point to a tree item structure, or struct name_entry. Passing random pointers will return incorrect value. Force callers to pass struct name_entry instead of two pointers (with hope that they don't manually construct struct name_entry themselves) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-10-27 11:08:26 -07:00
Junio C Hamano	4909bbe40a	Merge branch 'dm/tree-walk' * dm/tree-walk: tree-walk: micro-optimization in tree_entry_interesting tree-walk: drop unused parameter from match_dir_prefix	2011-10-10 15:56:20 -07:00
Ramsay Jones	27c0f76884	Fix some "variable might be used uninitialized" warnings In particular, gcc complains as follows: CC tree-walk.o tree-walk.c: In function `traverse_trees': tree-walk.c:347: warning: 'e' might be used uninitialized in this \ function CC builtin/revert.o builtin/revert.c: In function `verify_opt_mutually_compatible': builtin/revert.c:113: warning: 'opt2' might be used uninitialized in \ this function Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-10-09 13:28:04 -07:00
Dan McGee	1b74092373	tree-walk: micro-optimization in tree_entry_interesting In the case of a wide breadth top-level tree (~2400 entries, all trees in this case), we can see a noticeable cost in the profiler calling strncmp() here. Most of the time we are at the base level of the repository, so base is "" and baselen == 0, which means we will always test true. Break out this one tiny case so we can short circuit the strncmp() call. Test cases are as follows. packages.git is the Arch Linux git-svn clone of the packages repository which has the characteristics above. Commands: [1] packages.git, /usr/bin/time git log >/dev/null [2] packages.git, /usr/bin/time git log -- autogen/trunk pacman/trunk wget/trunk >/dev/null [3] linux.git, /usr/bin/time git log >/dev/null [4] linux.git, /usr/bin/time git log -- drivers/ata drivers/uio tools >/dev/null Results: before after %faster [1] 2.56 2.55 0.4% [2] 51.82 48.66 6.5% [3] 5.58 5.61 -0.5% [4] 1.55 1.51 0.2% The takeaway here is this doesn't matter in many operations, but it does for a certain style of repository and operation where it nets a 6.5% measured improvement. The other changes are likely not significant by reasonable statistics methods. Note: the measured improvement when originally submitted was ~11% (43 to 38 secs) for operation [2]. At the time, the repository had 117220 commits; it now has 137537 commits. Signed-off-by: Dan McGee <dpmcgee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-09-28 18:26:12 -07:00
Dan McGee	6de96915c0	tree-walk: drop unused parameter from match_dir_prefix Signed-off-by: Dan McGee <dpmcgee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-09-28 18:24:36 -07:00
Junio C Hamano	2842c0f914	traverse_trees(): allow pruning with pathspec The traverse_trees() machinery is primarily meant for merging two (or more) trees, and because a merge is a full tree operation, it doesn't support any pruning with pathspec. Since `d1f2d7e` (Make run_diff_index() use unpack_trees(), not read_tree(), 2008-01-19), however, we use unpack_trees() to traverse_trees() callchain to perform "diff-index", which could waste a lot of work traversing trees outside the user-supplied pathspec, only to discard at the blob comparison level in diff-lib.c::oneway_diff() which is way too late. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-08-29 14:32:04 -07:00
Junio C Hamano	33e0f62ba9	pathspec: rename per-item field has_wildcard to use_wildcard As the point of the last change is to allow use of strings as literals no matter what characters are in them, "has_wildcard" does not match what we use this field for anymore. It is used to decide if the wildcard matching should be used, so rename it to match the usage better. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-04-05 09:30:36 -07:00
Nguyễn Thái Ngọc Duy	1376e50723	grep: drop pathspec_matches() in favor of tree_entry_interesting() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-02-03 14:08:31 -08:00
Nguyễn Thái Ngọc Duy	f1a2ddbbc2	tree_entry_interesting(): optimize wildcard matching when base is matched If base is already matched, skip that part when calling fnmatch(). This happens quite often if users start a command from worktree's subdirectory and prefix is usually prepended to all pathspecs. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-02-03 14:08:30 -08:00
Nguyễn Thái Ngọc Duy	d38f28093e	tree_entry_interesting(): support wildcard matching never_interesting optimization is disabled if there is any wildcard pathspec, even if it only matches exactly on trees. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-02-03 14:08:30 -08:00
Nguyễn Thái Ngọc Duy	86e4ca69e3	tree_entry_interesting(): fix depth limit with overlapping pathspecs Suppose we have two pathspecs 'a' and 'a/b' (both are dirs) and depth limit 1. In current code, pathspecs are checked in input order. When 'a/b' is checked against pathspec 'a', it fails depth limit and therefore is excluded, although it should match 'a/b' pathspec. This patch reorders all pathspecs alphabetically, then teaches tree_entry_interesting() to check against the deepest pathspec first, so depth limit of a shallower pathspec won't affect a deeper one. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-02-03 14:08:30 -08:00
Nguyễn Thái Ngọc Duy	bc96cc87db	tree_entry_interesting(): support depth limit This is needed to replace pathspec_matches() in builtin/grep.c. max_depth == -1 means infinite depth. Depth limit is only effective when pathspec.recursive == 1. When pathspec.recursive == 0, the behavior depends on match functions: non-recursive for tree_entry_interesting() and recursive for match_pathspec{,_depth} Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-02-03 14:08:30 -08:00
Nguyễn Thái Ngọc Duy	58c4d66619	tree_entry_interesting(): refactor into separate smaller functions Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-02-03 14:08:30 -08:00
Nguyễn Thái Ngọc Duy	48932677d6	diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-02-03 14:08:16 -08:00
Nguyễn Thái Ngọc Duy	2c389fc8ec	Move tree_entry_interesting() to tree-walk.c and export it Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-02-03 13:22:58 -08:00
Matthieu Moy	e6c111b4c0	unpack_trees: group error messages by type When an error is encountered, it calls add_rejected_file() which either - directly displays the error message and stops if in plumbing mode (i.e. if show_all_errors is not initialized at 1) - or stores it so that it will be displayed at the end with display_error_msgs(), Storing the files by error type permits to have a list of files for which there is the same error instead of having a serie of almost identical errors. As each bind_overlap error combines a file and an old file, a list cannot be done, therefore, theses errors are not stored but directly displayed. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-08-11 10:36:06 -07:00
Junio C Hamano	b599672316	Merge branch 'maint-1.6.6' into maint * maint-1.6.6: fix minor memory leak in get_tree_entry()	2010-02-14 18:59:14 -08:00
René Scharfe	ef0065034a	fix minor memory leak in get_tree_entry() Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-02-14 03:04:20 -08:00
Junio C Hamano	1ee26571e9	traverse_trees(): handle D/F conflict case sanely traverse_trees() is supposed to call its callback with all the matching entries from the given trees. The current algorithm keeps a pointer to each of the tree being traversed, and feeds the entry with the earliest name to the callback. This breaks down if the trees being traversed looks like this: A B t-1 t t-2 u t/a v When we are currently looking at an entry "t-1" in tree A, and tree B has returned "t", feeding "t" from the B and not feeding anything from A, only because "t-1" sorts later than "t", will miss an entry for a subtree "t" behind the current entry in tree A. This introduces extended_entry_extract() helper function that gives what name is expected from the tree, and implements a mechanism to look-ahead in the tree object using it, to make sure such a case is handled sanely. Traversal in tree A in the above example will first return "t" to match that of B, and then the next request for an entry to A then returns "t-1". This roughly corresponds to what Linus's "prepare for one-entry lookahead" wanted to do, but because this does implement look ahead, t6035 and one more test in t1012 reveal that the approach would not work without adjusting the side that walks the index in unpack_trees() as well. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-01-03 23:21:32 -08:00
Linus Torvalds	bcbe5a515e	Fix tree-walking compare_entry() in the presense of --prefix When we make the "root" tree-walk info entry have a pathname in it, we need to have a ->prev pointer so that compare_entry will actually notice and traverse into the root. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-03-09 00:43:47 -08:00
Linus Torvalds	91e4f03604	Make 'traverse_trees()' traverse conflicting DF entries in parallel This makes the traverse_trees() entry comparator routine use the more relaxed form of name comparison that considers files and directories with the same name identical. We pass in a separate mask for just the directory entries, so that the callback routine can decide (if it wants to) to only handle one or the other type, but generally most (all?) users are expected to really want to see the case of a name 'foo' showing up in one tree as a file and in another as a directory at the same time. In particular, moving 'unpack_trees()' over to use this tree traversal mechanism requires this. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-03-09 00:43:47 -08:00
Linus Torvalds	5803c6f8a2	Add return value to 'traverse_tree()' callback This allows the callback to return an error value, but it can also specify which of the tree entries that it actually used up by returning a positive mask value. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-03-09 00:43:47 -08:00
Linus Torvalds	40d934df72	Make 'traverse_tree()' use linked structure rather than 'const char *base' This makes the calling convention a bit less obvious, but a lot more flexible. Instead of allocating and extending a new 'base' string, we just link the top-most name into a linked list of the 'info' structure when traversing a subdirectory, and we can generate the basename by following the list. Perhaps even more importantly, the linked list of info structures also gives us a place to naturally save off other information than just the directory name. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-03-09 00:43:47 -08:00
Martin Koegler	64cc1c0909	tree-walk: don't parse incorrect entries The current code can access memory outside of the tree buffer in the case of malformed tree entries. This patch prevents this by: * The rest of the buffer must be at least 24 bytes (at least 1 byte mode, 1 blank, at least one byte path name, 1 NUL, 20 bytes sha1). * Check that the last NUL (21 bytes before the end) is present. This ensures that strlen() and get_mode() calls stay within the buffer. * The mode may not be empty. We have only to reject a blank at the begin, as the rest is handled by if (c < '0' \|\| c > '7'). * The blank is ensured by get_mode(). * The path must contain at least one character. Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-01-06 18:41:44 -08:00
Junio C Hamano	a6080a0a44	War on whitespace This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-06-07 00:04:01 -07:00
Linus Torvalds	4651ece854	Switch over tree descriptors to contain a pre-parsed entry This makes the tree descriptor contain a "struct name_entry" as part of it, and it gets filled in so that it always contains a valid entry. On some benchmarks, it improves performance by up to 15%. That makes tree entry "extract" trivial, and means that we only actually need to decode each tree entry just once: we decode the first one when we initialize the tree descriptor, and each subsequent one when doing "update_tree_entry()". In particular, this means that we don't need to do strlen() both at extract time _and_ at update time. Finally, it also allows more sharing of code (entry_extract(), that wanted a "struct name_entry", just got totally trivial, along with the "tree_entry()" function). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-03-21 11:15:26 -07:00
Linus Torvalds	6fda5e5180	Initialize tree descriptors with a helper function rather than by hand. This removes slightly more lines than it adds, but the real reason for doing this is that future optimizations will require more setup of the tree descriptor, and so we want to do it in one place. Also renamed the "desc.buf" field to "desc.buffer" just to trigger compiler errors for old-style manual initializations, making sure I didn't miss anything. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-03-21 10:21:57 -07:00
Linus Torvalds	a8c40471ab	Remove "pathlen" from "struct name_entry" Since we have the "tree_entry_len()" helper function these days, and don't need to do a full strlen(), there's no point in saving the path length - it's just redundant information. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-03-21 10:21:56 -07:00
Linus Torvalds	304de2d2d6	Avoid unnecessary strlen() calls This is a micro-optimization that grew out of the mailing list discussion about "strlen()" showing up in profiles. We used to pass regular C strings around to the low-level tree walking routines, and while this worked fine, it meant that we needed to call strlen() on strings that the caller always actually knew the size of anyway. So pass the length of the string down wih the string, and avoid unnecessary calls to strlen(). Also, when extracting a pathname from a tree entry, use "tree_entry_len()" instead of strlen(), since the length of the pathname is directly calculable from the decoded tree entry itself without having to actually do another strlen(). This shaves off another ~5-10% from some loads that are very tree intensive (notably doing commit filtering by a pathspec). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>" Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-03-18 15:36:59 -07:00
Jeff King	d93b7d1c30	get_tree_entry: map blank requested entry to tree root This means that git show HEAD: will now return HEAD^{tree}, which is logically consistent with git show HEAD:Documentation Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-01-09 14:08:41 -08:00
René Scharfe	4fe2cc0c89	Remove shadowing variable from traverse_trees() The variable named entry is allocated using malloc() and then forgotten, it being shadowed by an automatic variable of the same name. Fixing the array size at 3 worked so far because the only caller of traverse_trees() needed only as much entries. Simply remove the shadowing varaible and we're able to traverse more than three trees and save stack space at the same time! Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-01-04 22:28:46 -08:00
Shawn Pearce	e702496e43	Convert memcpy(a,b,20) to hashcpy(a,b). This abstracts away the size of the hash values when copying them from memory location to memory location, much as the introduction of hashcmp abstracted away hash value comparsion. A few call sites were using char* rather than unsigned char* so I added the cast rather than open hashcpy to be void. This is a reasonable tradeoff as most call sites already use unsigned char and the existing hashcmp is also declared to be unsigned char*. [jc: Splitted the patch to "master" part, to be followed by a patch for merge-recursive.c which is not in "master" yet. Fixed the cast in the latter hunk to combine-diff.c which was wrong in the original. Also converted ones left-over in combine-diff.c, diff-lib.c and upload-pack.c ] Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-08-23 13:53:10 -07:00
Florian Forster	1d7f171c3a	Remove all void-pointer arithmetic. ANSI C99 doesn't allow void-pointer arithmetic. This patch fixes this in various ways. Usually the strategy that required the least changes was used. Signed-off-by: Florian Forster <octo@verplant.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-06-20 01:59:46 -07:00
Linus Torvalds	4c068a9831	tree_entry(): new tree-walking helper function This adds a "tree_entry()" function that combines the common operation of doing a "tree_entry_extract()" + "update_tree_entry()". It also has a simplified calling convention, designed for simple loops that traverse over a whole tree: the arguments are pointers to the tree descriptor and a name_entry structure to fill in, and it returns a boolean "true" if there was an entry left to be gotten in the tree. This allows tree traversal with struct tree_desc desc; struct name_entry entry; desc.buf = tree->buffer; desc.size = tree->size; while (tree_entry(&desc, &entry) { ... use "entry.{path, sha1, mode, pathlen}" ... } which is not only shorter than writing it out in full, it's hopefully less error prone too. [ It's actually a tad faster too - we don't need to recalculate the entry pathlength in both extract and update, but need to do it only once. Also, some callers can avoid doing a "strlen()" on the result, since it's returned as part of the name_entry structure. However, by now we're talking just 1% speedup on "git-rev-list --objects --all", and we're definitely at the point where tree walking is no longer the issue any more. ] NOTE! Not everybody wants to use this new helper function, since some of the tree walkers very much on purpose do the descriptor update separately from the entry extraction. So the "extract + update" sequence still remains as the core sequence, this is just a simplified interface. We should probably add a silly two-line inline helper function for initializing the descriptor from the "struct tree" too, just to cut down on the noise from that common "desc" initializer. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-05-30 23:03:01 -07:00
Linus Torvalds	7d65848afd	Don't use "sscanf()" for tree mode scanning Doing an oprofile run on the result of my git rev-list memory leak fixes and tree parsing cleanups, I was surprised by the third-highest entry being samples % image name app name symbol name 179751 2.7163 libc-2.4.so libc-2.4.so _IO_vfscanf@@GLIBC_2.4 where that 2.7% is actually more than 5% of one CPU, because this was run on a dual CPU setup with the other CPU just being idle. That seems to all be from the use of 'sscanf(tree, "%o", &mode)' for the tree buffer parsing. So do the trivial octal parsing by hand, which also gives us where the first space in the string is (and thus where the pathname starts) so we can get rid of the "strchr(tree, ' ')" call too. This brings the "git rev-list --all --objects" time down from 63 seconds to 55 seconds on the historical kernel archive for me, so it's quite noticeable - tree parsing is a lot of what we end up doing when following all the objects. [ I also see a 5% speedup on a full "git fsck-objects" on the current kernel archive, so that sscanf() really does seem to have hurt our performance by a surprising amount ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-05-28 19:39:49 -07:00
Junio C Hamano	4dcff634e6	get_tree_entry(): make it available from tree-walk Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-04-19 14:05:47 -07:00
Peter Eriksen	8e44025925	Use blob_, commit_, tag_, and tree_type throughout. This replaces occurences of "blob", "commit", "tag", and "tree", where they're really used as type specifiers, which we already have defined global constants for. Signed-off-by: Peter Eriksen <s022018@student.dtu.dk> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-04-04 00:11:19 -07:00
Junio C Hamano	1b0c7174a1	tree/diff header cleanup. Introduce tree-walk.[ch] and move "struct tree_desc" and associated functions from various places. Rename DIFF_FILE_CANON_MODE(mode) macro to canon_mode(mode) and move it to cache.h. This macro returns the canonicalized st_mode value in the host byte order for files, symlinks and directories -- to be compared with a tree_desc entry. create_ce_mode(mode) in cache.h is similar but is intended to be used for index entries (so it does not work for directories) and returns the value in the network byte order. Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-03-29 23:54:13 -08:00

42 Commits