In the original implementation of want_object_in_pack(), we
always looked for the object in every pack, so the order did
not matter for performance.
As of the last few patches, however, we can now often break
out of the loop early after finding the first instance, and
avoid looking in the other packs at all. In this case, pack
order can make a big difference, because we'd like to find
the objects by looking at as few packs as possible.
This patch switches us to the same packed_git_mru list that
is now used by normal object lookups.
Here are timings for p5303 on linux.git:
Test                      HEAD^                 HEAD
------------------------------------------------------------------------
5303.3: rev-list (1)      31.31(31.07+0.23)     31.28(31.00+0.27) -0.1%
5303.4: repack (1)        40.35(38.84+2.60)     40.53(39.31+2.32) +0.4%
5303.6: rev-list (50)     31.37(31.15+0.21)     31.41(31.16+0.24) +0.1%
5303.7: repack (50)       58.25(68.54+2.03)     47.28(57.66+1.89) -18.8%
5303.9: rev-list (1000)   31.91(31.57+0.33)     31.93(31.64+0.28) +0.1%
5303.10: repack (1000)    304.80(376.00+3.92)   87.21(159.54+2.84) -71.4%
The rev-list numbers are unchanged, which makes sense (they
are not exercising this code at all). The 50- and 1000-pack
repack cases show considerable improvement.
The single-pack repack case doesn't, of course; there's
nothing to improve. In fact, it gives us a baseline for how
fast we could possibly go. You can see that though rev-list
can approach the single-pack case even with 1000 packs,
repack doesn't. The reason is simple: the loop we are
optimizing is only part of what the repack is doing. After
the "counting" phase, we do delta compression, which is much
more expensive when there are multiple packs, because we
have fewer deltas we can reuse (you can also see that these
numbers come from a multicore machine; the CPU times are
much higher than the wall-clock times due to the delta
phase).
So the good news is that in cases with many packs, we used
to be dominated by the "counting" phase, and now we are
dominated by the delta compression (which is faster, and
which we have already parallelized).
Here are similar numbers for git.git:
Test                      HEAD^               HEAD
---------------------------------------------------------------------
5303.3: rev-list (1)      1.55(1.51+0.02)     1.54(1.53+0.00) -0.6%
5303.4: repack (1)        1.82(1.80+0.08)     1.82(1.78+0.09) +0.0%
5303.6: rev-list (50)     1.58(1.57+0.00)     1.58(1.56+0.01) +0.0%
5303.7: repack (50)       2.50(3.12+0.07)     2.31(2.95+0.06) -7.6%
5303.9: rev-list (1000)   2.22(2.20+0.02)     2.23(2.19+0.03) +0.5%
5303.10: repack (1000)    10.47(16.78+0.22)   7.50(13.76+0.22) -28.4%
Not as impressive in terms of percentage, but still
measurable wins. If you look at the wall-clock time
improvements in the 1000-pack case, you can see that linux
improved by roughly 10x as many seconds as git. That's
because it has roughly 10x as many objects, and we'd expect
this improvement to scale linearly with the number of
objects (since the number of packs is kept constant). It's
just that the "counting" phase is a smaller percentage of
the total time spent for a git.git repack, and hence the
percentage win is smaller.
The implementation itself is a straightforward use of the
MRU code. We only bother marking a pack as used when we know
that we are able to break early out of the loop, for two
reasons:
1. If we can't break out early, it does no good; we have
to visit each pack anyway, so we might as well avoid
even the minor overhead of managing the cache order.
2. The mru_mark() function reorders the list, which would
screw up our traversal. So it is only safe to mark when
we are about to break out of the loop. We could record
the found pack and mark it after the loop finishes, of
course, but that's more complicated and it doesn't buy
us anything due to (1).
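For illustration, the reworked loop looks roughly like this (a
simplified sketch: mru_mark(), packed_git_mru, and
find_pack_entry_one() are the real names, while the record-keeping
and the "local"/"ignore_pack_keep" flags are abbreviated from
pack-objects):

    struct mru_entry *entry;

    for (entry = packed_git_mru->head; entry; entry = entry->next) {
            struct packed_git *p = entry->item;
            off_t offset = find_pack_entry_one(sha1, p);

            if (!offset)
                    continue;
            /* ... remember the found pack/offset as before ... */
            if (!local && !ignore_pack_keep) {
                    /* we are about to break, so reordering is safe */
                    mru_mark(packed_git_mru, entry);
                    break;
            }
    }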
Note that this reordering does have a potential impact on
the final pack, as we store only a single "found" pack for
each object, even if it is present in multiple packs. In
principle, any copy is acceptable, as they all refer to the
same content. But in practice, they may differ in whether
they are stored as deltas, against which base, etc. This may
have an impact on delta reuse, and even the delta search
(since we skip pairs that were already in the same pack).
It's not clear whether this change of order would hurt or
even help average cases, though. The most likely reason to
have duplicate objects is from the completion of thin packs
(e.g., you have some objects in a "base" pack, then receive
several pushes; the packs you receive may be thin on the
wire, with deltas that refer to bases outside the pack, but
we complete them with duplicate base objects when indexing
them).
In such a case the current code would always find the thin
duplicates (because we currently walk the packs in reverse
chronological order), whereas with this patch some of those
duplicates would be found in the base pack instead.
In my tests repacking a real-world case of linux.git with
3600 thin-pack pushes (on top of a large "base" pack), the
resulting pack was about 0.04% larger with this patch. On
the other hand, because we were more likely to hit the base
pack, there were more opportunities for delta reuse, and we
had 50,000 fewer objects to examine in the delta search.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not allow cycles in the delta graph of a pack (i.e., A
is a delta of B which is a delta of A) for the obvious
reason that you cannot actually access any of the objects in
such a case.
There's a last-ditch attempt to notice cycles during the
write phase, during which we issue a warning to the user and
write one of the objects out in full. However, this is
"last-ditch" for two reasons:
1. By this time, it's too late to find another delta for
the object, so the resulting pack is larger than it
otherwise could be.
2. The warning is there because this is something that
_shouldn't_ ever happen. If it does, then either:
a. a pack we are reusing deltas from had its own
cycle
b. we are reusing deltas from multiple packs, and
we found a cycle among them (i.e., A is a delta of
B in one pack, but B is a delta of A in another,
and we choose to use both deltas).
c. there is a bug in the delta-search code
So this code serves as a final check that none of these
things has happened, warns the user, and prevents us
from writing a bogus pack.
Right now, (2b) should never happen because of the static
ordering of packs in want_object_in_pack(). If two objects
have a delta relationship, then they must be in the same
pack, and therefore we will find them from that same pack.
However, a future patch would like to change that static
ordering, which will make (2b) a common occurrence. In
preparation, we should be able to handle those kinds of
cycles better. This patch does so by introducing a
cycle-breaking step during the get_object_details() phase,
when we are deciding which deltas can be reused. That gives
us the chance to feed the objects into the delta search as
if the cycle did not exist.
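A sketch of that step: a depth-first walk over the reused-delta
edges, dropping the delta that would close a cycle. It assumes
pack-objects' "struct object_entry" with its "delta" pointer; the
dfs_state field and the exact function shape are illustrative:

    enum dfs_state { DFS_NONE = 0, DFS_ACTIVE, DFS_DONE };

    static void break_delta_chains(struct object_entry *entry)
    {
            if (entry->dfs_state == DFS_DONE)
                    return;
            if (!entry->delta) {
                    entry->dfs_state = DFS_DONE;
                    return;
            }
            switch (entry->delta->dfs_state) {
            case DFS_NONE:
                    entry->dfs_state = DFS_ACTIVE;
                    break_delta_chains(entry->delta);
                    entry->dfs_state = DFS_DONE;
                    break;
            case DFS_DONE:
                    /* the chain below us is known to be acyclic */
                    entry->dfs_state = DFS_DONE;
                    break;
            case DFS_ACTIVE:
                    /*
                     * Our base is an ancestor in the current walk;
                     * reusing this delta would close a cycle. Drop it
                     * and let the delta search find another candidate.
                     */
                    entry->delta = NULL;
                    entry->dfs_state = DFS_DONE;
                    break;
            }
    }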
We'll leave the detection and warning in the write_object()
phase in place, as it still serves as a check for case (2c).
This does mean we will stop warning for (2a). That case is
caused by bogus input packs, and we ideally would warn the
user about it. However, since those cycles show up after
picking reusable deltas, they look the same as (2b) to us;
our new code will break the cycles early and the last-ditch
check will never see them.
We could do analysis on any cycles that we find to
distinguish the two cases (i.e., it is a bogus pack if and
only if every delta in the cycle is in the same pack), but
we don't need to. If there is a cycle inside a pack, we'll
run into problems not only reusing the delta, but accessing
the object data at all. So when we try to dig up the actual
size of the object, we'll hit that same cycle and kick in
our usual complain-and-try-another-source code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some code may have a pack/offset pair for an object, but
would like to look up more information. Using
sha1_object_info() is too heavy-weight; it starts from the
sha1 and has to find the pack again (so not only does it waste
time, it might not even find the same instance).
In some cases, this problem is solved by helpers like
get_size_from_delta(), which is used by pack-objects to take
a shortcut for objects whose packed representation has
already been found. But there's no similar function for
getting the object type, for instance. Rather than introduce
one, let's just make the whole packed_object_info() available.
It is smart enough to spend effort only on the items the
caller wants.
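For example, a caller that has already located an object at a
pack/offset pair (say, via find_pack_entry_one()) and wants only
its type and size might do something like:

    struct object_info oi = {NULL};
    enum object_type type;
    unsigned long size;

    oi.typep = &type;
    oi.sizep = &size;
    if (packed_object_info(pack, offset, &oi) < 0)
            die("unable to read packed object");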
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An all-zero initializer is fine for this struct, but because
the first element is a pointer, call sites need to know to
use "NULL" instead of "0". Otherwise some static checkers
like "sparse" will complain; see d099b71 (Fix some sparse
warnings, 2013-07-18) for example. So let's provide an
initializer to make this easier to get right.
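Since the first field is a pointer, the initializer boils down
to:

    #define OBJECT_INFO_INIT {NULL}

    /* so call sites can simply write: */
    struct object_info oi = OBJECT_INFO_INIT;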
But let's also comment that memset() to zero is explicitly
OK[1]. One of the callers embeds object_info in another
struct which is initialized via memset (expand_data in
builtin/cat-file.c). Since our subset of C doesn't allow
assignment from a compound literal, handling this in any
other way is awkward, so we'd like to keep the ability to
initialize by memset(). By documenting this property, it
should make anybody who wants to change the initializer
think twice before doing so.
There's one other caller of interest. In parse_sha1_header(),
we did not initialize the struct fully in the first place.
This turned out not to be a bug because the sub-function it
calls does not look at any other fields except the ones we
did initialize. But that assumption might not hold in the
future, so it's a dangerous construct. This patch switches
it to initializing the whole struct, which protects us
against unexpected reads of the other fields.
[1] Obviously using memset() to initialize a pointer
violates the C standard, but we long ago decided that it
was an acceptable tradeoff in the real world.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In want_object_in_pack(), we can exit early from our loop if
neither "local" nor "ignore_pack_keep" are set. If they are,
however, we must examine each pack to see if it has the
object and is non-local or has a ".keep".
It's quite common for there to be no non-local or .keep
packs at all, in which case we know ahead of time that
looking further will be pointless. We can pre-compute this
by simply iterating over the list of packs ahead of time,
and dropping the flags if there are no packs that could
match.
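A sketch of that pre-computation (pack_local and pack_keep are
real fields of "struct packed_git"; the function name and the
direct dropping of pack-objects' flags are illustrative):

    static void precompute_pack_flags(void)
    {
            struct packed_git *p;

            for (p = packed_git; p; p = p->next) {
                    if (!p->pack_local || p->pack_keep)
                            return; /* a pack could match; keep flags */
            }
            /* no non-local or kept packs; the checks cannot trigger */
            local = 0;
            ignore_pack_keep = 0;
    }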
Another similar strategy would be to modify the loop in
want_object_in_pack() to notice that we have already found
the object once, and that we are looping only to check for
"local" and "keep" attributes. If a pack has neither of
those, we can skip the call to find_pack_entry_one(), which
is the expensive part of the loop.
This has two advantages:
- it isn't all-or-nothing; we still get some improvement
when there's a small number of kept or non-local packs,
and a large number of non-kept local packs
- it eliminates any possible race where we add new
non-local or kept packs after our initial scan. In
practice, I don't think this race matters; we already
cache the packed_git information, so somebody who adds a
new pack or .keep file after we've started will not be
noticed at all, unless we happen to need to call
reprepare_packed_git() because a lookup fails.
In other words, we're already racy, and the race is not
a big deal (losing the race means we might include an
object in the pack that would not otherwise be, which is
an acceptable outcome).
However, it also has a disadvantage: we still loop over the
rest of the packs for each object to check their flags. This
is much less expensive than doing the object lookup, but
still not free. So if we wanted to implement that strategy
to cover the non-all-or-nothing cases, we could do so in
addition to this one (so you get the most speedup in the
all-or-nothing case, and the best we can do in the other
cases). But given that the all-or-nothing case is likely the
most common, it is probably not worth the trouble, and we
can revisit this later if evidence points otherwise.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pack-objects collects the list of objects to pack
(either from stdin, or via its internal rev-list), it
filters each one through want_object_in_pack().
This function loops through each existing packfile, looking
for the object. When we find it, we mark the pack/offset
combo for later use. However, we can't just return "yes, we
want it" at that point. If --honor-pack-keep is in effect,
we must keep looking to find it in _all_ packs, to make sure
none of them has a .keep. Likewise, if --local is in effect,
we must make sure it is not present in any non-local pack.
As a result, the sum effort of these calls is effectively
O(nr_objects * nr_packs). In an ordinary repository, we have
only a handful of packs, and this doesn't make a big
difference. But in pathological cases, it can slow the
counting phase to a crawl.
This patch notices the case that we have neither "--local"
nor "--honor-pack-keep" in effect and breaks out of the loop
early, after finding the first instance. Note that our worst
case is still "objects * packs" (i.e., we might find each
object in the last pack we look in), but in practice we will
often break out early. On an "average" repo, my git.git with
8 packs, this shows a modest 2% (a few dozen milliseconds)
improvement in the counting-objects phase of "git
pack-objects --all <foo" (hackily instrumented by sticking
exit(0) right after list_objects).
But in a much more pathological case, it makes a bigger
difference. I ran the same command on a real-world example
with ~9 million objects across 1300 packs. The counting time
dropped from 413s to 45s, an improvement of about 89%.
Note that this patch won't do anything by itself for a
normal "git gc", as it uses both --honor-pack-keep and
--local.
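For illustration, the early-out amounts to something like this (a
simplified sketch of the loop; find_pack_entry_one() is the real
helper, while the function shape and flag names follow the
description above rather than the exact code):

    static int want_object(const unsigned char *sha1,
                           struct packed_git **found_pack,
                           off_t *found_offset)
    {
            struct packed_git *p;

            for (p = packed_git; p; p = p->next) {
                    off_t offset = find_pack_entry_one(sha1, p);

                    if (!offset)
                            continue;
                    if (!*found_pack) {
                            *found_pack = p;
                            *found_offset = offset;
                    }
                    if (local && !p->pack_local)
                            return 0; /* a non-local pack has it */
                    if (ignore_pack_keep && p->pack_local && p->pack_keep)
                            return 0; /* a kept pack has it */
                    if (!local && !ignore_pack_keep)
                            break; /* first find is all we need */
            }
            return 1;
    }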
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each pack has an index for looking up entries in O(log n)
time, but if we have multiple packs, we have to scan through
them linearly. This can produce a measurable overhead for
some operations.
We dealt with this long ago in f7c22cc (always start looking
up objects in the last used pack first, 2007-05-30), which
keeps what is essentially a 1-element most-recently-used
cache. In theory, we should be able to do better by keeping
a similar but longer cache, that is the same length as the
pack-list itself.
Since we now have a convenient generic MRU structure, we can
plug it in and measure. Here are the numbers for running
p5303 against linux.git:
Test                      HEAD^                 HEAD
------------------------------------------------------------------------
5303.3: rev-list (1)      31.56(31.28+0.27)     31.30(31.08+0.20) -0.8%
5303.4: repack (1)        40.62(39.35+2.36)     40.60(39.27+2.44) -0.0%
5303.6: rev-list (50)     31.31(31.06+0.23)     31.23(31.00+0.22) -0.3%
5303.7: repack (50)       58.65(69.12+1.94)     58.27(68.64+2.05) -0.6%
5303.9: rev-list (1000)   38.74(38.40+0.33)     31.87(31.62+0.24) -17.7%
5303.10: repack (1000)    367.20(441.80+4.62)   342.00(414.04+3.72) -6.9%
The main numbers of interest here are the rev-list ones
(since that is exercising the normal object lookup code
path). The single-pack case shouldn't improve at all; the
260ms speedup there is just part of the run-to-run noise
(but it's important to note that we didn't make anything
worse with the overhead of maintaining our cache). In the
50-pack case, we see similar results. There may be a slight
improvement, but it's mostly within the noise.
The 1000-pack case does show a big improvement, though. That
carries over to the repack case, as well. Even though we
haven't touched its pack-search loop yet, it does still do a
lot of normal object lookups (e.g., for the internal
revision walk), and so improves.
As a point of reference, I also ran the 1000-pack test
against a version of HEAD^ with the last_found_pack
optimization disabled. It takes ~60s, so that gives an
indication of how much even the single-element cache is
helping.
For comparison, here's a smaller repository, git.git:
Test                      HEAD^               HEAD
---------------------------------------------------------------------
5303.3: rev-list (1)      1.56(1.54+0.01)     1.54(1.51+0.02) -1.3%
5303.4: repack (1)        1.84(1.80+0.10)     1.82(1.80+0.09) -1.1%
5303.6: rev-list (50)     1.58(1.55+0.02)     1.59(1.57+0.01) +0.6%
5303.7: repack (50)       2.50(3.18+0.04)     2.50(3.14+0.04) +0.0%
5303.9: rev-list (1000)   2.76(2.71+0.04)     2.24(2.21+0.02) -18.8%
5303.10: repack (1000)    13.21(19.56+0.25)   11.66(18.01+0.21) -11.7%
You can see that the percentage improvement is similar.
That's because the lookup we are optimizing is roughly
O(nr_objects * nr_packs). Since the number of packs is
constant in both tests, we'd expect the improvement to be
linear in the number of objects. But the whole process is
also linear in the number of objects, so the improvement
is a constant factor.
The exact improvement does also depend on the contents of
the packs. In p5303, the extra packs all have 5 first-parent
commits in them, which is a reasonable simulation of a
pushed-to repository. But it also means that only 250
first-parent commits are in those packs (compared to almost
50,000 total in linux.git), and the rest are in the huge
"base" pack. So once we start looking at history in taht big
pack, that's where we'll find most everything, and even the
1-element cache gets close to 100% cache hits. You could
almost certainly show better numbers with a more
pathological case (e.g., distributing the objects more
evenly across the packs). But that's simply not that
realistic a scenario, so it makes more sense to focus on
these numbers.
The implementation itself is a straightforward application
of the MRU code. We provide an MRU-ordered list of packs
that shadows the packed_git list. This is easy to do because
we only create and revise the pack list in one place. The
"reprepare" code path actually drops the whole MRU and
replaces it for simplicity. It would be more efficient to
just add new entries, but there's not much point in
optimizing here; repreparing happens rarely, and only after
doing a lot of other expensive work. The key things to keep
optimized are traversal (which is just a normal linked list,
albeit with one extra level of indirection over the regular
packed_git list), and marking (which is a constant number of
pointer assignments, slightly more than the old last_found_pack
required; it doesn't seem to create a measurable slowdown,
though).
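For reference, maintaining the shadow list is only a few lines;
this roughly mirrors the patch (mru_clear() and mru_append() are
the real API, though the function name may differ):

    static void prepare_packed_git_mru(void)
    {
            struct packed_git *p;

            mru_clear(packed_git_mru);
            for (p = packed_git; p; p = p->next)
                    mru_append(packed_git_mru, p);
    }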
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few places in Git that would benefit from a fast
most-recently-used cache (e.g., the list of packs, which we
search linearly but would like to order based on locality).
This patch introduces a generic list that can be used to
store arbitrary pointers in most-recently-used order.
The implementation is just a doubly-linked list, where
"marking" an item as used moves it to the front of the list.
Insertion and marking are O(1), and iteration is O(n).
There's no lookup support provided; if you need fast
lookups, you are better off with a different data structure
in the first place.
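Concretely, the interface is tiny; mru.h amounts to roughly:

    struct mru_entry {
            void *item;
            struct mru_entry *prev, *next;
    };

    struct mru {
            struct mru_entry *head, *tail;
    };

    void mru_append(struct mru *mru, void *item);   /* add at the tail */
    void mru_mark(struct mru *mru, struct mru_entry *entry);
                                                    /* move to the front */
    void mru_clear(struct mru *mru);                /* free the whole list */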
There is also no deletion support. This would not be hard to
do, but it's not necessary for handling pack structs, which
are created and never removed.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The point of this function is to drop an entry from the
"packed_git" cache that points to a file we might be
overwriting, because our contents may not be the same (and
hence the only caller was pack-objects as it moved a
temporary packfile into place).
In older versions of git, this could happen because the
names of packfiles were derived from the set of objects they
contained, not the actual bits on disk. But since 1190a1a
(pack-objects: name pack files after trailer hash,
2013-12-05), the name reflects the actual bits on disk, and
any two packfiles with the same name can be used
interchangeably.
Dropping this function not only saves a few lines of code,
it makes the lifetime of "struct packed_git" much easier to
reason about: namely, we now do not ever free these structs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's pack storage does efficient (log n) lookups in a
single packfile's index, but if we have multiple packfiles,
we have to linearly search each for a given object. This
patch introduces some timing tests for cases where we have a
large number of packs, so that we can measure any
improvements we make in the following patches.
The main thing we want to time is object lookup. To do this,
we measure "git rev-list --objects --all", which does a
fairly large number of object lookups (essentially one per
object in the repository).
However, we also measure the time to do a full repack, which
is interesting for two reasons. One is that in addition to
the usual pack lookup, it has its own linear iteration over
the list of packs. And two is that because it is the tool
one uses to go from an inefficient many-pack situation back
to a single pack, we care about its performance not only at
marginal numbers of packs, but at the extreme cases (e.g.,
if you somehow end up with 5,000 packs, it is the only way
to get back to 1 pack, so we need to make sure it performs
well).
We measure the performance of each command in three
scenarios: 1 pack, 50 packs, and 1,000 packs.
The 1-pack case is a baseline; any optimizations we do to
handle multiple packs cannot possibly perform better than
this.
The 50-pack case is as far as Git should generally allow
your repository to go, if you have auto-gc enabled with the
default settings. So this represents the maximum performance
improvement we would expect under normal circumstances.
The 1,000-pack case is hopefully rare, though I have seen it
in the wild where automatic maintenance was broken for some
time (and the repository continued to receive pushes). This
represents cases where we care less about general
performance, but want to make sure that a full repack
command does not take excessively long.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git p4" used a location outside $GIT_DIR/refs/ to place its
temporary branches, which has been moved to refs/git-p4-tmp/.
* ls/p4-tmp-refs:
git-p4: place temporary refs used for branch import under refs/git-p4-tmp
One part of "git am" had an oddball helper function that called
stuff from outside "his" as opposed to calling what we have "ours",
which was not gender-neutral and also inconsistent with the rest of
the system, where outside stuff is usually called "theirs" in
contrast to "ours".
* js/am-call-theirs-theirs-in-fallback-3way:
am: counteract gender bias
General code clean-up around a helper function to write a
single-liner to a file.
* jk/write-file:
branch: use write_file_buf instead of write_file
use write_file_buf where applicable
write_file: add format attribute
write_file: add pointer+len variant
write_file: use xopen
write_file: drop "gently" form
branch: use non-gentle write_file for branch description
am: ignore return value of write_file()
config: fix bogus fd check when setting up default config
Code clean-up to avoid using a variable string that compilers may
feel untrustable as printf-style format given to write_file()
helper function.
* jk/printf-format:
commit.c: remove print_commit_list()
avoid using sha1_to_hex output as printf format
walker: let walker_say take arbitrary formats
The .c/.h sources are marked as such in our .gitattributes file so
that "git diff -W" and friends would work better.
* rs/help-c-source-with-gitattributes:
.gitattributes: set file type for C files
Improve the look of the way "git fetch" reports what happened to
each ref that was fetched.
* nd/fetch-ref-summary:
fetch: reduce duplicate in ref update status lines with placeholder
fetch: align all "remote -> local" output
fetch: change flag code for displaying tag update and deleted ref
fetch: refactor ref update status formatting code
git-fetch.txt: document fetch output
The test framework learned a new helper test_match_signal to
check an exit code from getting killed by an expected signal.
* jk/test-match-signal:
t/lib-git-daemon: use test_match_signal
test_must_fail: use test_match_signal
t0005: use test_match_signal as appropriate
tests: factor portable signal check out of t0005
There are certain house-keeping tasks that need to be performed at
the very beginning of any Git program, and programs that are not
built-in commands had to do them exactly the same way as "git"
potty does. It was easy to make mistakes in one-off standalone
programs (like test helpers). A common "main()" function that
calls cmd_main() of each individual program has been introduced to
make it harder to make mistakes.
* jk/common-main:
mingw: declare main()'s argv as const
common-main: call git_setup_gettext()
common-main: call restore_sigpipe_to_default()
common-main: call sanitize_stdfds()
common-main: call git_extract_argv0_path()
add an extra level of indirection to main()
A test that unconditionally used "mktemp" learned that the command
is not necessarily available everywhere.
* ak/lazy-prereq-mktemp:
t7610: test for mktemp before test execution
"git grep -i" has been taught to fold case in non-ascii locales
correctly.
* nd/icase:
grep.c: reuse "icase" variable
diffcore-pickaxe: support case insensitive match on non-ascii
diffcore-pickaxe: Add regcomp_or_die()
grep/pcre: support utf-8
gettext: add is_utf8_locale()
grep/pcre: prepare locale-dependent tables for icase matching
grep: rewrite an if/else condition to avoid duplicate expression
grep/icase: avoid kwsset when -F is specified
grep/icase: avoid kwsset on literal non-ascii strings
test-regex: expose full regcomp() to the command line
test-regex: isolate the bug test code
grep: break down an "if" stmt in preparation for next changes
The commands in the "log/diff" family have had an FILE* pointer in the
data structure they pass around for a long time, but some codepaths
used to always write to the standard output. As a preparatory step
to make "git format-patch" available to the internal callers, these
codepaths have been updated to consistently write into that FILE*
instead.
* js/log-to-diffopt-file:
mingw: fix the shortlog --output=<file> test
diff: do not color output when --color=auto and --output=<file> is given
t4211: ensure that log respects --output=<file>
shortlog: respect the --output=<file> setting
format-patch: use stdout directly
format-patch: avoid freopen()
format-patch: explicitly switch off color when writing to files
shortlog: support outputting to streams other than stdout
graph: respect the diffopt.file setting
line-log: respect diffopt's configured output file stream
log-tree: respect diffopt's configured output file stream
log: prepare log/log-tree to reuse the diffopt.close_file attribute
Fix recently introduced codepaths that are involved in parallel
submodule operations, which gave up on reading too early, and
could have wasted CPU while attempting to write under a corner
case condition.
* sb/submodule-parallel-fetch:
hoist out handle_nonblock function for xread and xwrite
xwrite: poll on non-blocking FDs
xread: retry after poll on EAGAIN/EWOULDBLOCK
"git blame -M" missed a single line that was moved within the file.
* dk/blame-move-no-reason-for-1-line-context:
blame: require 0 context lines while finding moved lines with -M
A new configuration variable core.sshCommand has been added to
specify what value for GIT_SSH_COMMAND to use per repository.
* nd/connect-ssh-command-config:
connect: read $GIT_SSH_COMMAND from config file
As we are not yet moving everything to size_t but still using ulong
internally when talking about the size of an object, platforms with
32-bit long will not be able to produce a tar archive with a 4GB+ file,
and cannot grok 077777777777UL as a constant. Disable the extended
header feature and do not test it on them.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Skip tests that are unrunnable on platforms without 64-bit long
to avoid unnecessary test failures.
* jk/tzoffset-fix:
t0006: skip "far in the future" test when unsigned long is not long enough
Git's source code refers to timestamps as unsigned longs. On 32-bit
platforms, as well as on Windows, unsigned long is not large enough
to capture dates that are "absurdly far in the future".
While we can fix this issue properly by replacing unsigned long with
a larger type, we want to be a bit more conservative and just skip
those tests on the maint track.
Signed-off-by: Jeff King <peff@peff.net>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 58461bd (t1308: do not get fooled by symbolic links to the
source tree, 2016-06-02) fixed an obscure case where the user
cd's into Git's source code via a symbolic link, it introduced a
regression that affects all test runs on Windows.
The original patch introducing the test case in question was careful to
use `$(pwd)` instead of `$PWD`.
This was done to account for the fact that Git's test suite uses shell
scripting even on Windows, where the shell's Unix-y paths are
incompatible with the main Git executable's idea of paths: it only
accepts Windows paths.
It is an awkward but necessary thing, then, to use `$(pwd)` (which gives
us a Windows path) when interacting with the Git executable and `$PWD`
(which gives the shell's idea of the current working directory in Unix-y
form) for shell scripts, including the test suite itself.
Obviously this broke the use case of the Git maintainer when changing
the working directory into Git's source code directory via a symlink,
i.e. when `$(pwd)` does not agree with `$PWD`.
However, we must not fix that use case at the expense of regressing
another use case.
Let's special-case Windows here, even if it is ugly, for lack of a more
elegant solution.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git archive" learned to handle files that are larger than 8GB and
commits far in the future than expressible by the traditional US-TAR
format.
* jk/big-and-future-archive-tar:
archive-tar: drop return value
archive-tar: write extended headers for far-future mtime
archive-tar: write extended headers for file sizes >= 8GB
t5000: test tar files that overflow ustar headers
t9300: factor out portable "head -c" replacement
Git does not know what the contents in the index should be for a
path added with "git add -N" yet, so "git grep --cached" should not
show hits (or show lack of hits, with -L) in such a path, but that
logic does not apply to "git grep", i.e. searching in the working
tree files. We applied it there by mistake, which has been corrected.
* nd/ita-cleanup:
grep: fix grepping for "intent to add" files
t7810-grep.sh: fix a whitespace inconsistency
t7810-grep.sh: fix duplicated test name
"git rebase -i --autostash" did not restore the auto-stashed change
when the operation was aborted.
* ps/rebase-i-auto-unstash-upon-abort:
rebase -i: restore autostash on abort
"git commit --amend --allow-empty-message -S" for a commit without
any message body could have misidentified where the header of the
commit object ends.
* js/sign-empty-commit-fix:
commit -S: avoid invalid pointer with empty message
More mark-up updates to typeset strings that are expected to be
typed literally by the end user in fixed-width font.
* mm/doc-tt:
doc: typeset HEAD and variants as literal
CodingGuidelines: formatting HEAD in documentation
doc: typeset long options with argument as literal
doc: typeset '--' as literal
doc: typeset long command-line options as literal
doc: typeset short command-line options as literal
Documentation/git-mv.txt: fix whitespace indentation
Add a test to specify the desired behaviour that currently is not
available in "git rebase -Xsubtree=...".
* dg/subtree-rebase-test:
contrib/subtree: Add a test for subtree rebase that loses commits
"gc.autoPackLimit" when set to 1 should not trigger a repacking
when there is only one pack, but the code counted poorly and did
so.
* ew/gc-auto-pack-limit-fix:
gc: fix off-by-one error with gc.autoPackLimit
More markings of messages for i18n, with updates to various tests
to pass GETTEXT_POISON tests.
One patch from the original submission dropped due to conflicts
with jk/upload-pack-hook, which is still in flux.
* va/i18n-even-more: (38 commits)
t5541: become resilient to GETTEXT_POISON
i18n: branch: mark comment when editing branch description for translation
i18n: unmark die messages for translation
i18n: submodule: escape shell variables inside eval_gettext
i18n: submodule: join strings marked for translation
i18n: init-db: join message pieces
i18n: remote: allow translations to reorder message
i18n: remote: mark URL fallback text for translation
i18n: standardise messages
i18n: sequencer: add period to error message
i18n: merge: change command option help to lowercase
i18n: merge: mark messages for translation
i18n: notes: mark options for translation
i18n: notes: mark strings for translation
i18n: transport-helper.c: change N_() call to _()
i18n: bisect: mark strings for translation
t5523: use test_i18ngrep for negation
t4153: fix negated test_i18ngrep call
t9003: become resilient to GETTEXT_POISON
tests: unpack-trees: update to use test_i18n* functions
...
Adjust t4201 to pass on Windows; a couple of test cases need to be
skipped on Windows, which leads to a different shortlog than on Linux.
Let's just fix that by limiting the shortlog's commit range to traverse
only one commit: that guarantees that it does not matter how many test
cases were skipped.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>