git-commit-vandalism

Author	SHA1	Message	Date
Martin Ågren	9e9f132f53	doc: convert \--option to --option Rather than using a backslash in \--foo, with or without ''-quoting, write `--foo` for better rendering. As explained in commit `1c262bb7b` (doc: convert \--option to --option, 2015-05-13), the backslash is not needed for the versions of AsciiDoc that we support, but is rendered literally by Asciidoctor. Signed-off-by: Martin Ågren <martin.agren@gmail.com>	2018-04-18 12:49:26 +09:00
SZEDER Gábor	bed21a8ad6	docs/git-gc: fix minor rendering issue An unwanted single quote character in the paragraph documenting the 'gc.aggressiveWindow' config variable prevented the name of that config variable from being rendered correctly, ever since that piece of docs was added in `0d7566a5ba` (Add --aggressive option to 'git gc', 2007-05-09). Remove that single quote. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-18 12:07:23 +09:00
Stefan Beller	d228eea514	worktree: accept -f as short for --force for removal Many commands support a "--force" option, frequently abbreviated as "-f", however, "git worktree remove"'s hand-rolled OPT_BOOL forgets to recognize the short form, despite git-worktree.txt documenting "-f" as supported. Replace OPT_BOOL with OPT__FORCE, which provides "-f" for free, and makes 'remove' consistent with 'add' option parsing (which also specifies the PARSE_OPT_NOCOMPLETE flag). Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-18 09:19:05 +09:00
SZEDER Gábor	94408dc71c	completion: reduce overhead of clearing cached --options To get the names of all '$__git_builtin_*' variables caching --options of builtin commands in order to unset them, `8b0eaa41f2` (completion: clear cached --options when sourcing the completion script, 2018-03-22) runs a 'set \|sed s///' pipeline. This works both in Bash and in ZSH, but has a higher than necessary overhead with the extra processes. In Bash we can do better: run the 'compgen -v __gitcomp_builtin_' builtin command, which lists the same variables, but without a pipeline and 'sed' it can do so with lower overhead. ZSH will still continue to run that pipeline. This change also happens to work around an issue in the default Bash version shipped in macOS (3.2.57), reported by users of the Powerline shell prompt, which was triggered by the same commit `8b0eaa41f2` as well. Powerline uses several Unicode Private Use Area code points to represent some of its pretty text UI elements (arrows and what not), and these are stored in the $PS1 variable. Apparently the 'set' builtin of said Bash version on macOS has issues with these code points, and produces garbled output where Powerline's special symbols should be in the $PS1 variable. This, in turn, triggers the following error message in the downstream 'sed' process: sed: RE error: illegal byte sequence Other Bash versions, notably 4.4.19 on macOS via homebrew (i.e. a newer version on the same platform) and 3.2.25 on CentOS (i.e. a slightly earlier version, though on a different platform) are not affected. ZSH in macOS (the versions shipped by default or installed via homebrew) or on other platforms isn't affected either. With this patch neither the 'set' builtin is invoked to print garbage, nor 'sed' to choke on it. Issue-on-macOS-reported-by: Stephon Harris <theonestep4@gmail.com> Issue-on-macOS-explained-by: Matthew Coleman <matt@1eanda.com> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-18 08:43:31 +09:00
SZEDER Gábor	7b00342068	completion: fill COMPREPLY directly when completing paths During git-aware path completion, when a lot of path components have to be listed, a significant amount of time is spent in __gitcomp_file(), or more accurately in the shell loop of __gitcompappend(), iterating over all the path components filtering path components matching the current word to be completed, adding prefix path components, and placing the resulting matching paths into the COMPREPLY array. Now, a previous patch in this series made 'git ls-files' and 'git diff-index' list only paths matching the current word to be completed, so an additional filtering in __gitcomp_file() is not necessary anymore. Adding the prefix path components could be done much more efficiently in __git_index_files()'s 'awk' script while stripping trailing path components and removing duplicates and quoting. And then the resulting paths won't require any more filtering or processing before being handed over to Bash, so we could fill the COMPREPLY array directly. Unfortunately, we can't simply use the __gitcomp_direct() helper function to do that, because __gitcomp_file() does one additional thing: it tells Bash that we are doing filename completion, so the shell will kindly do four important things for us: 1. Append a trailing space to all filenames. 2. Append a trailing '/' to all directory names. 3. Escape any meta, globbing, separator, etc. characters. 4. List only the current path component when listing possible completions (i.e. 'dir/subdir/f<TAB>' will list 'file1', 'file2', etc. instead of the whole 'dir/subdir/file1', 'dir/subdir/file2'). While we could let __git_index_files()'s 'awk' script take care of the first two points, the third one gets tricky, and we absolutely need the shell's support for the fourth. Add the helper function __gitcomp_file_direct(), which, just like __gitcomp_direct(), fills the COMPREPLY array with prefiltered and preprocessed paths without any additional processing, without a shell loop, with just one single compound assignment, and, similar to __gitcomp_file(), tells Bash and ZSH that we are doing filename completion. Extend __git_index_files()'s 'awk' script a bit to prepend any prefix path components to all listed paths. Finally, modify __git_complete_index_file() to feed __git_index_files()'s output to ___gitcomp_file_direct() instead of __gitcomp_file(). After this patch there is no shell loop left in the path completion code path. This speeds up path completion when there are a lot of paths matching the current word to be completed. In a pathological repository with 100k files in a single directory, listing all those files: Before this patch, best of five, using GNU awk on Linux: $ time cur=dir/ __git_complete_index_file real 0m0.983s user 0m1.004s sys 0m0.033s After: real 0m0.313s user 0m0.341s sys 0m0.029s Difference: -68.2% Speedup: 3.1x To see the benefits of the whole patch series, the same command with v2.17.0: real 0m2.736s user 0m2.472s sys 0m0.610s Difference: -88.6% Speedup: 8.7x Note that this patch changes the output of the __git_index_files() helper function by unconditionally prepending the prefix path components to every listed path. This would break users' completion scriptlets that directly run: __gitcomp_file "$(__git_index_files ...)" "$pfx" "$cur_" because that would add the prefix path components once more. However, __git_index_files() is kind of a "helper function of a helper function", and users' completion scriptlets should have been using __git_complete_index_file() for git-aware path completion in the first place, so this is likely doesn't worth worrying about. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-17 12:49:37 +09:00
SZEDER Gábor	193757f806	completion: improve handling quoted paths in 'git ls-files's output If any pathname contains backslash, double quote, tab, newline, or any control characters, 'git ls-files' and 'git diff-index' will enclose that pathname in double quotes and escape those special characters using C-style one-character escape sequences or \nnn octal values. This prevents those files from being listed during git-aware path completion, because due to the quoting they will never match the current word to be completed. Extend __git_index_files()'s 'awk' script to remove all that quoting and escaping from unique path components, so even paths containing (almost all) such special characters can be completed. Paths containing newline characters are still an issue, though. We use newlines as separator character when filling the COMPREPLY array, so a path with one or more newline will end up split to two or more elements in COMPREPLY, basically breaking completion. There is nothing we can do about it without a significant performance hit, so let's just ignore such paths for now. As far as paths with newlines are concerned, this isn't any different from the previous behavior, because those paths were always omitted, though in the past they were omitted because due to the quoting they didn't match the current word to be completed. Anyway, Bash's own filename completion (Meta-/) can complete even those paths, if need be. Note: - We don't dequote path components right away as they are coming in, because then we would have to dequote each directory name repeatedly, as many times as it appears in the input, i.e. as many times as the number of listed paths it contains. Instead, we dequote them at the end, as we print unique path components. - Even when a directory name itself does not contain any special characters, it will still be quoted if any of its trailing path components do. If a directory contains paths both with and without special characters, then the name of that directory will appear both quoted and unquoted in the output of 'git ls-files' and 'git diff-index'. Consequently, we will add such a directory name to the deduplicating associative array twice: once quoted and once unquoted. This means that we have to be careful after dequoting a directory name, and only print it if we haven't seen the same directory name unquoted. - It would be wonderful if we could just pass '-z' to those git commands to output \0-separated unquoted paths, and use \0 as record separator in the 'awk' script processing their output... this patch would be so much simpler, almost trivial even. Unfortunately, however, POSIX and most 'awk' implementations don't support \0 as record separator (GNU awk does support it). - This patch makes the earlier change to list paths with 'core.quotePath=false' basically redundant, because this could decode any \nnn-escaped non-ASCII character just fine, as well. However, I suspect that 'git ls-files' can deal with those non-ASCII characters faster than this updated 'awk' script; just in case someone is burdened with tons of pathnames containing non-ASCII characters. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-17 12:49:36 +09:00
SZEDER Gábor	c1bc0a0e92	completion: remove repeated dirnames with 'awk' during path completion During git-aware path completion, after all the trailing path components have been removed from the output of 'git ls-files' and 'git diff-index' (see previous patch), each directory name is repeated as many times as the number of listed paths it contains. This can be a lot of repetitions, especially when invoking path completion close to the root of a big worktree, which would cause a considerable overhead downstream of __git_index_files(), in particular in the shell loop that fills the COMPREPLY array. To reduce this overhead, __git_index_files() runs the classic '... \|sort \|uniq' pattern to remove those repetitions from the function's output. While removing repeated directory names is effective in reducing the number of iterations in that shell loop, it still imposes the overhead of fork()+exec()ing two external processes, and two additional stages in the pipeline, where potentially relatively large amount of data can be passed between two subsequent pipeline stages. Extend __git_index_files()'s 'awk' script to remove repeated path components by first creating and filling an associative array indexed by all encountered path components (after the trailing path components have been removed), and then iterating over this array and printing the indices, i.e. unique path components. This way we can remove the '\|sort \|uniq' pipeline stages, and their eliminated overhead results in faster path completion. Listing all tracked files (12) and directories (23) at the top of the worktree in linux.git (over 62k files), i.e. what's doing all the hard work behind 'git rm <TAB>': Before this patch, best of five, using GNU awk on Linux: real 0m0.069s user 0m0.089s sys 0m0.026s After: real 0m0.052s user 0m0.072s sys 0m0.014s Difference: -24.6% Note that this changes order of elements in __git_index_files()'s output. This is not an issue, because this function was only ever intended to feed paths into the COMPREPLY array, and Bash will sort its elements (according to the users locale) anyway. Note also that using 'awk' to remove repeated path components is also beneficial for the performance of the next two patches: - The first will extend this 'awk' script to dequote quoted paths in the output of 'git ls-files' and 'git diff-index'. With this patch it will only have to dequote unique path components, not all. - The second will, among other things, extend this 'awk' script to prepend prefix path components from the command line to the currently completed path component. Consequently, each line in 'awk's output will grow longer. Without this patch that '\|sort \|uniq' would have to exchange and process that much more data. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-17 12:49:36 +09:00
SZEDER Gábor	9703797c9d	t9902-completion: ignore COMPREPLY element order in some tests The order or possible completion words in the COMPREPLY array doesn't actually matter, as long as all the right words are in there, because Bash will sort them anyway. Yet, our tests looking at the elements of COMPREPLY always expect them to be in a specific order. Now, this hasn't been an issue before, but the next patch is about to optimize a bit more our git-aware path completion, and as a harmless side effect the order of elements in COMPREPLY will change. Worse, the order will be downright undefined, because after the next patch path components will come directly from iterating through an associative array in 'awk', and the order of iteration over the elements in those arrays is undefined, and indeed different 'awk' implementations produce different order. Consequently, we can't get away with simply adjusting the expected results in the affected tests. Modify the 'test_completion' helper function to sort both the expected and the actual results, i.e. the elements in COMPREPLY, before comparing them, so the tests using this helper function will work regardless of the order of elements. Note that this change still leaves a bunch of tests depending on the order of elements in COMPREPLY, tests that focus on a specific helper function and therefore don't use the 'test_completion' helper. I would rather deal with those later, when (if ever) the need actually arises, than create unnecessary code churn now. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-17 12:49:36 +09:00
SZEDER Gábor	105c0efff3	completion: use 'awk' to strip trailing path components During git-aware path completion we complete one path component at a time, i.e. 'git add <TAB>' offers only 'dir/' at first, not 'dir/subdir/file' right away, just like Bash's own filename completion. However, since both 'git ls-files' and 'git diff-index' dive deep into subdirectories, we have to strip all trailing path components from the listed paths, keeping only the leading path component. This stripping is currently done in a shell loop in __git_index_files(), which can take a significant amount of time when it has to iterate through a large number of paths. Replace this shell loop with a little 'awk' script using '/' as input field separator and printing the first field, which produces the same output much faster. Listing all tracked files (12) and directories (23) at the top of the worktree in linux.git (over 62k files), i.e. what's doing all the hard work behind 'git rm <TAB>': Before this patch, best of five, using GNU awk on Linux: $ time cur= __git_complete_index_file real 0m2.149s user 0m1.307s sys 0m1.086s After: real 0m0.067s user 0m0.089s sys 0m0.023s Difference: -96.9% Speedup: 32.1x Note that this could be done with 'sed', or even with 'cut', just as well, but the upcoming patches require 'awk's scriptability. Note also that this change means one more fork()+exec()ed process during path completion, adding more overhead especially on Windows, but a later patch will more than make up for it by eliminating two other processes in the same function. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-17 12:49:36 +09:00
SZEDER Gábor	a364e984d1	completion: let 'ls-files' and 'diff-index' filter matching paths During git-aware path completion, e.g. 'git rm dir/fil<TAB>', both 'git ls-files' and 'git diff-index' list all paths in the given 'dir/' matching certain criteria (cached, modified, untracked, etc.) appropriate for the given git command, even paths whose names don't begin with 'fil'. This comes with a considerable performance penalty when the directory in question contains a lot of paths, but the current word can be uniquely completed or when only a handful of those paths match the current word. Reduce the number of iterations in this codepath from the number of paths to the number of matching paths by specifying an appropriate globbing pattern to 'git ls-files' and 'git diff-index' to list only paths that match the current word to be completed. Note that both commands treat backslashes as escape characters in their file arguments, e.g. to preserve the literal meaning of globbing characters, so we have to double every backslash in the globbing pattern. This is why one of the path completion tests specifically checks the completion of a path containing a literal backslash character (that test still fails, though, because both commands output such paths enclosed in double quotes and the special characters escaped; a later patch in this series will deal with those). This speeds up path completion considerably when there are a lot of non-matching paths to be filtered out. Uniquely completing a tracked filename at the top of the worktree in linux.git (over 62k files), i.e. what's doing all the hard work behind 'git rm Mak<TAB>' to complete 'Makefile': Before this patch, best of five, on Linux: $ time cur=Mak __git_complete_index_file real 0m2.159s user 0m1.299s sys 0m1.089s After: real 0m0.033s user 0m0.023s sys 0m0.015s Difference: -98.5% Speedup: 65.4x Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-17 12:49:36 +09:00
SZEDER Gábor	f12785a3a7	completion: improve handling quoted paths on the command line Our git-aware path completion doesn't work when it has to complete a word already containing quoted and/or backslash-escaped characters on the command line. The root cause of the issue is that completion functions see all words on the command line verbatim, i.e. including all backslash, single and double quote characters that the shell would eventually remove when executing the finished command. These quoting/escaping characters cause different issues depending on which path component of the word to be completed contains them: - The quoting/escaping is in the prefix path component(s). Let's suppose we have a directory called 'New Dir', containing two untracked files 'file.c' and 'file.o', and we have a gitignore rule ignoring object files. In this case all of these: git add New\ Dir/<TAB> git add "New Dir/<TAB> git add 'New Dir/<TAB> should uniquely complete 'file.c' right away, but Bash offers both 'file.c' and 'file.o' instead. The reason for this behavior is that our completion script uses the prefix directory name like 'git -C "New\ Dir/" ls-files ...", i.e. with the backslash inside double quotes. Git then tries to enter a directory called 'New\ Dir', which (most likely) fails because such a directory doesn't exists. As a result our completion script doesn't list any files, leaves the COMPREPLY array empty, which in turn causes Bash to fall back to its simple filename completion and lists all files in that directory, i.e. both 'file.c' and 'file.o'. - The quoting/escaping is in the path component to be completed. Let's suppose we have two untracked files 'New File.c' and 'New File.o', and we have a gitignore rule ignoring object files. In this case all of these: git add New\ Fi<TAB> git add "New Fi<TAB> git add 'New Fi<TAB> should uniquely complete 'New File.c' right away, but Bash offers both 'New File.c' and 'New File.o' instead. The reason for this behavior is that our completion script uses this 'New\ Fi' or '"New Fi' etc. word to filter matching paths, and of course none of the potential filenames will match because of the included backslash or double quote. The end result is the same as above: the completion script doesn't list any files, Bash falls back to its filename completion, which then lists the matching object file as well. Add the new helper function __git_dequote() [1], which removes (most of[2]) the quoting and escaping from the word it gets as argument. To minimize the overhead of calling this function, store its result in the variable $dequoted_word, supposed to be declared local in the caller; simply printing the result would require a command substitution imposing the overhead of fork()ing a subshell. Use this function in __git_complete_index_file() to dequote the current word, i.e. the path, to be completed, to avoid the above described quoting-related issues, thereby fixing two of the failing quoted path completion tests. [1] The bash-completion project already has a dequote() function, which I hoped I could borrow to deal with this, but unfortunately it doesn't work quite well for this purpose (perhaps that's why even the bash-completion project only rarely uses it). The main issue is that their dequote() is implemented as: eval printf %s "$1" 2> /dev/null where $1 would contain the word to be completed. While it's a short and sweet one-liner, the use of 'eval' requires that $1 is a syntactically valid string, which is not the case when quoting the path like 'git add "New Dir/<TAB>'. This causes 'eval' to fail, because it can't find the matching closing double quote, and the function returns nothing. The result is totally broken behavior, as if the current word were empty, and the completion script would then list all files from the current directory. This is why one of the quoted path completion tests specifically checks the completion of a path with an opening but without a corresponding closing double quote character. Furthermore, the 'eval' performs all kinds of expansions, which may or may not be desired; I think it's the latter. Finally, using this function would require a command substitution. [2] Bash understands the $'string' quoting as well, which "expands to 'string', with backslash-escaped characters replaced as specified by the ANSI C standard" (quoted from Bash manpage). Since shell metacharacters, field separators, globbing, etc. can all be easily entered using standard shell escaping or quoting, this type of quoting comes in handly when dealing with control characters that are otherwise difficult both to "type" and to see on the command line. Because of this difficulty I would assume that people do avoid pathnames with such control characters anyway, so I didn't bother implementing it. This function is already way too long as it is. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-17 12:49:36 +09:00
SZEDER Gábor	3dfe23ba51	completion: support completing non-ASCII pathnames Unless the user has 'core.quotePath=false' somewhere in the configuration, both 'git ls-files' and 'git diff-index' will by default quote any pathnames that contain bytes with values higher than 0x80, and escape those bytes as '\nnn' octal values. This prevents completing paths when the current path component to be completed contains any non-ASCII, most notably UTF-8, characters, because none of the listed quoted paths will match the current word on the command line. Set 'core.quotePath=false' for those 'git ls-files' and 'git diff-index' invocations, so they won't consider bytes higher than 0x80 as "unusual", and won't quote pathnames containing such characters. Note that pathnames containing backslash, double quote, or control characters will still be quoted; a later patch in this series will deal with those. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-17 12:49:36 +09:00
SZEDER Gábor	6bf0ced4e2	completion: simplify prefix path component handling during path completion Once upon a time 'git -C "" cmd' errored out with "Cannot change to '': No such file or directory", therefore the completion script took extra steps to run 'git -C "." cmd' instead; see `fca416a41e` (completion: use "git -C $there" instead of (cd $there && git ...), 2014-10-09). Those extra steps are not needed since `6a536e2076` (git: treat "git -C '<path>'" as a no-op when <path> is empty, 2015-03-06), so remove them. While at it, also simplify how the trailing '/' is appended to the variable holding the prefix path components. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-17 12:49:36 +09:00
SZEDER Gábor	722e31c713	completion: move __git_complete_index_file() next to its helpers It's much easier to read, understand and modify the functions related to git-aware path completion when they are right next to each other. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-17 12:49:36 +09:00
SZEDER Gábor	5bb534a620	t9902-completion: add tests demonstrating issues with quoted pathnames Completion functions see all words on the command line verbatim, including any backslash-escapes, single and double quotes that might be there. Furthermore, git commands quote pathnames if they contain certain special characters. All these create various issues when doing git-aware path completion. Add a couple of failing tests to demonstrate these issues. Later patches in this series will discuss these issues in detail as they fix them. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-17 12:49:36 +09:00
Sergey Organov	a279b74c68	glossary: substitute "ancestor" for "direct ancestor" in 'push' description. Even though "direct ancestor" is not defined in the glossary, the common meaning of the term is simply "parent", parents being the only direct ancestors, and the rest of ancestors being indirect ancestors. As "parent" is obviously wrong in this place in the description, we should simply say "ancestor", as everywhere else. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-17 11:23:23 +09:00
Tao Qingyun	adc887221f	t1510-repo-setup.sh: remove useless mkdir Signed-off-by: Tao Qingyun <845767657@qq.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-17 10:55:17 +09:00
Ævar Arnfjörð Bjarmason	6d5ed4836d	git{,-blame}.el: remove old bitrotting Emacs code The git-blame.el mode has been superseded by Emacs's own vc-annotate (invoked by C-x v g). Users of the git.el mode are now much better off using either Magit or the Git backend for Emacs's own VC mode. These modes were added over 10 years ago when Emacs's own Git support was much less mature, and there weren't other mature modes in the wild or shipped with Emacs itself. These days these modes have few if any users, and users of git aren't well served by us shipping these (some OS's install them alongside git by default, which is confusing and leads users astray). So let's remove these per Alexandre Julliard's message to the ML[1]. If someone still wants these for some reason they're better served by hosting these elsewhere (e.g. on ELPA), instead of us distributing them with git. However, since downstream packagers such as Debian are packaging this as git-el it's less disruptive to still carry these files as Elisp code that'll error out with a message suggesting alternatives, rather than drop the files entirely[2]. Then rather than receive a cryptic load error when they upgrade existing users will get an error directing them to the README file, or to just stop requiring these modes. I think it makes sense to link to GitHub's hosting of contrib/emacs/README (which'll be updated by the time users see this) so they don't have to hunt down the packaged README on their local system. 1. "Re: [PATCH] git.el: handle default excludesfile properly" (87muzlwhb0.fsf@winehq.org) -- https://public-inbox.org/git/87muzlwhb0.fsf@winehq.org/ 2. "Re: [PATCH v3] git{,-blame}.el: remove old bitrotting Emacs code" (20180327165751.GA4343@aiede.svl.corp.google.com) -- https://public-inbox.org/git/20180327165751.GA4343@aiede.svl.corp.google.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 17:25:49 +09:00
Jeff King	8b44b2be89	gpg-interface: find the last gpg signature line A signed tag has a detached signature like this: object ... [...more header...] This is the tag body. -----BEGIN PGP SIGNATURE----- [opaque gpg data] -----END PGP SIGNATURE----- Our parser finds the _first_ line that appears to start a PGP signature block, meaning we may be confused by a signature (or a signature-like line) in the actual body. Let's keep parsing and always find the final block, which should be the detached signature over all of the preceding content. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Ben Toews <mastahyeti@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 14:15:03 +09:00
Jeff King	f68f2dd57f	gpg-interface: extract gpg line matching helper Let's separate the actual line-by-line parsing of signatures from the notion of "is this a gpg signature line". That will make it easier to do more refactoring of this loop in future patches. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Ben Toews <mastahyeti@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 14:15:03 +09:00
Jeff King	17ef3a421e	gpg-interface: fix const-correctness of "eol" pointer We accidentally shed the "const" of our buffer by passing it through memchr. Let's fix that, and while we're at it, move our variable declaration inside the loop, which is the only place that uses it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Ben Toews <mastahyeti@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 14:15:03 +09:00
Jeff King	e6fa6cde5b	gpg-interface: use size_t for signature buffer size Even though our object sizes (from which these buffers would come) are typically "unsigned long", this is something we'd like to eventually fix (since it's only 32-bits even on 64-bit Windows). It makes more sense to use size_t when taking an in-memory buffer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Ben Toews <mastahyeti@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 14:15:03 +09:00
Jeff King	f80bee27e3	gpg-interface: modernize function declarations Let's drop "extern" from our declarations, which brings us in line with our modern style guidelines. While we're here, let's wrap some of the overly long lines, and move docstrings for public functions to their declarations, since they document the interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Ben Toews <mastahyeti@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 14:15:03 +09:00
Jeff King	1b0eeec3f3	gpg-interface: handle bool user.signingkey The config handler for user.signingkey does not check for a boolean value, and thus: git -c user.signingkey tag will segfault. We could fix this and even shorten the code by using git_config_string(). But our set_signing_key() helper is used by other code outside of gpg-interface.c, so we must keep it (and we may as well use it, because unlike git_config_string() it does not leak when we overwrite an old value). Ironically, the handler for gpg.program just below _could_ use git_config_string() but doesn't. But since we're going to touch that in a future patch, we'll leave it alone for now. We will add some whitespace and returns in preparation for adding more config keys, though. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Ben Toews <mastahyeti@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 14:15:02 +09:00
Jeff King	cf98a52ba4	t7004: fix mistaken tag name We have a series of tests which create signed tags with various properties, but one test accidentally verifies a tag from much earlier in the series. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Ben Toews <mastahyeti@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 14:15:02 +09:00
Ævar Arnfjörð Bjarmason	26d2e4fb22	Makefile: add a DEVOPTS to get all of -Wextra Change DEVOPTS to understand a "extra-all" option. When the DEVELOPER flag is enabled we turn on -Wextra, but manually switch some of the warnings it turns on off. This is because we have many existing occurrences of them in the code base. This mode will stop the suppression, let the developer see and decide whether to fix them. This change is a slight alteration of Nguyễn Thái Ngọc Duy EAGER_DEVELOPER mode patch[1] 1. "[PATCH v3 3/3] Makefile: add EAGER_DEVELOPER mode" (<20180329150322.10722-4-pclouds@gmail.com>; https://public-inbox.org/git/20180329150322.10722-4-pclouds@gmail.com/) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 13:54:53 +09:00
Ævar Arnfjörð Bjarmason	99f763baf5	Makefile: add a DEVOPTS to suppress -Werror under DEVELOPER Add a DEVOPTS variable that'll be used to tweak the behavior of DEVELOPER. I've long wanted to use DEVELOPER=1 in my production builds, but on some old systems I still get warnings, and thus the build would fail. However if the build/tests fail for some other reason, it would still be useful to scroll up and see what the relevant code is warning about. This change allows for that. Now setting DEVELOPER will set -Werror as before, but if DEVOPTS=no-error is provided is set you'll get the same warnings, but without -Werror. Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 13:54:53 +09:00
Nguyễn Thái Ngọc Duy	1da1580e4c	Makefile: detect compiler and enable more warnings in DEVELOPER=1 The set of extra warnings we enable when DEVELOPER has to be conservative because we can't assume any compiler version the developer may use. Detect the compiler version so we know when it's safe to enable -Wextra and maybe more. These warning settings are mostly from my custom config.mak a long time ago when I tried to enable as many warnings as possible that can still build without showing warnings. Some of those warnings are probably worth fixing instead of just suppressing in future. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 13:54:53 +09:00
Nguyễn Thái Ngọc Duy	d2bff22c23	connect.c: mark die_initial_contact() NORETURN There is a series running in parallel with this one that adds code like this switch (...) { case ...: die_initial_contact(); case ...: There is nothing wrong with this. There is no actual falling through. But since gcc is not that smart and gcc 7.x introduces -Wimplicit-fallthrough, it raises a false alarm in this case. This class of warnings may be useful elsewhere, so instead of suppressing the whole class, let's try to fix just this code. gcc is smart enough to realize that no execution can continue after a NORETURN function call and no longer raises the warning. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 13:54:53 +09:00
Nguyễn Thái Ngọc Duy	5af050437a	pack-objects: show some progress when counting kept objects We only show progress when there are new objects to be packed. But when --keep-pack is specified on the base pack, we will exclude most of objects. This makes 'pack-objects' stay silent for a long time while the counting phase is going. Let's show some progress whenever we visit an object instead. The old "Counting objects" is renamed to "Enumerating objects" and a new progress "Counting objects" line is added. This new "Counting objects" line should progress pretty quick when the system is beefy. But when the system is under pressure, the reading object header done in this phase could be slow and showing progress is an improvement over staying silent in the current code. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 13:52:29 +09:00
Nguyễn Thái Ngọc Duy	9806f5a7bf	gc --auto: exclude base pack if not enough mem to "repack -ad" pack-objects could be a big memory hog especially on large repos, everybody knows that. The suggestion to stick a .keep file on the giant base pack to avoid this problem is also known for a long time. Recent patches add an option to do just this, but it has to be either configured or activated manually. This patch lets `git gc --auto` activate this mode automatically when it thinks `repack -ad` will use a lot of memory and start affecting the system due to swapping or flushing OS cache. gc --auto decides to do this based on an estimation of pack-objects memory usage, which is quite accurate at least for the heap part, and whether that fits in half of system memory (the assumption here is for desktop environment where there are many other applications running). This mechanism only kicks in if gc.bigBasePackThreshold is not configured. If it is, it is assumed that the user already knows what they want. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 13:52:29 +09:00
Nguyễn Thái Ngọc Duy	8fc6776247	gc: handle a corner case in gc.bigPackThreshold This config allows us to keep <N> packs back if their size is larger than a limit. But if this N >= gc.autoPackLimit, we may have a problem. We are supposed to reduce the number of packs after a threshold because it affects performance. We could tell the user that they have incompatible gc.bigPackThreshold and gc.autoPackLimit, but it's kinda hard when 'git gc --auto' runs in background. Instead let's fall back to the next best stategy: try to reduce the number of packs anyway, but keep the base pack out. This reduces the number of packs to two and hopefully won't take up too much resources to repack (the assumption still is the base pack takes most resources to handle). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 13:52:29 +09:00
Nguyễn Thái Ngọc Duy	55dfe13df9	gc: add gc.bigPackThreshold config The --keep-largest-pack option is not very convenient to use because you need to tell gc to do this explicitly (and probably on just a few large repos). Add a config key that enables this mode when packs larger than a limit are found. Note that there's a slight behavior difference compared to --keep-largest-pack: all packs larger than the threshold are kept, not just the largest one. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 13:52:29 +09:00
Nguyễn Thái Ngọc Duy	ae4e89e549	gc: add --keep-largest-pack option This adds a new repack mode that combines everything into a secondary pack, leaving the largest pack alone. This could help reduce memory pressure. On linux-2.6.git, valgrind massif reports 1.6GB heap in "pack all" case, and 535MB in "pack all except the base pack" case. We save roughly 1GB memory by excluding the base pack. This should also lower I/O because we don't have to rewrite a giant pack every time (e.g. for linux-2.6.git that's a 1.4GB pack file).. PS. The use of string_list here seems overkill, but we'll need it in the next patch... Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 13:52:29 +09:00
Nguyễn Thái Ngọc Duy	ed7e5fc3a2	repack: add --keep-pack option We allow to keep existing packs by having companion .keep files. This is helpful when a pack is permanently kept. In the next patch, git-gc just wants to keep a pack temporarily, for one pack-objects run. git-gc can use --keep-pack for this use case. A note about why the pack_keep field cannot be reused and pack_keep_in_core has to be added. This is about the case when --keep-pack is specified together with either --keep-unreachable or --unpack-unreachable, but --honor-pack-keep is NOT specified. In this case, we want to exclude objects from the packs specified on command line, not from ones with .keep files. If only one bit flag is used, we have to clear pack_keep on pack files with the .keep file. But we can't make any assumption about unreachable objects in .keep packs. If "pack_keep" field is false for .keep packs, we could potentially pull lots of unreachable objects into the new pack, or unpack them loose. The safer approach is ignore all packs with either .keep file or --keep-pack. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 13:52:29 +09:00
Nguyễn Thái Ngọc Duy	e9e33ab0fb	t7700: have closing quote of a test at the beginning of line The closing quote of a test body by convention is always at the start of line. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 13:52:29 +09:00
Nguyễn Thái Ngọc Duy	f6a5576d52	ci: exercise the whole test suite with uncommon code in pack-objects Some recent optimizations have been added to pack-objects to reduce memory usage and some code paths are split into two: one for common use cases and one for rare ones. Make sure the rare cases are tested with Travis since it requires manual test configuration that is unlikely to be done by developers. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 12:38:59 +09:00
Nguyễn Thái Ngọc Duy	3b13a5f263	pack-objects: reorder members to shrink struct object_entry Previous patches leave lots of holes and padding in this struct. This patch reorders the members and shrinks the struct down to 80 bytes (from 136 bytes on 64-bit systems, before any field shrinking is done) with 16 bits to spare (and a couple more in in_pack_header_size when we really run out of bits). This is the last in a series of memory reduction patches (see "pack-objects: a bit of document about struct object_entry" for the first one). Overall they've reduced repack memory size on linux-2.6.git from 3.747G to 3.424G, or by around 320M, a decrease of 8.5%. The runtime of repack has stayed the same throughout this series. Ævar's testing on a big monorepo he has access to (bigger than linux-2.6.git) has shown a 7.9% reduction, so the overall expected improvement should be somewhere around 8%. See 87po42cwql.fsf@evledraar.gmail.com on-list (https://public-inbox.org/git/87po42cwql.fsf@evledraar.gmail.com/) for more detailed numbers and a test script used to produce the numbers cited above. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 12:38:59 +09:00
Nguyễn Thái Ngọc Duy	0aca34e826	pack-objects: shrink delta_size field in struct object_entry Allowing a delta size of 64 bits is crazy. Shrink this field down to 20 bits with one overflow bit. If we find an existing delta larger than 1MB, we do not cache delta_size at all and will get the value from oe_size(), potentially from disk if it's larger than 4GB. Note, since DELTA_SIZE() is used in try_delta() code, it must be thread-safe. Luckily oe_size() does guarantee this so we it is thread-safe. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 12:38:59 +09:00
Nguyễn Thái Ngọc Duy	ac77d0c370	pack-objects: shrink size field in struct object_entry It's very very rare that an uncompressed object is larger than 4GB (partly because Git does not handle those large files very well to begin with). Let's optimize it for the common case where object size is smaller than this limit. Shrink size field down to 31 bits and one overflow bit. If the size is too large, we read it back from disk. As noted in the previous patch, we need to return the delta size instead of canonical size when the to-be-reused object entry type is a delta instead of a canonical one. Add two compare helpers that can take advantage of the overflow bit (e.g. if the file is 4GB+, chances are it's already larger than core.bigFileThreshold and there's no point in comparing the actual value). Another note about oe_get_size_slow(). This function MUST be thread safe because SIZE() macro is used inside try_delta() which may run in parallel. Outside parallel code, no-contention locking should be dirt cheap (or insignificant compared to i/o access anyway). To exercise this code, it's best to run the test suite with something like make test GIT_TEST_OE_SIZE=4 which forces this code on all objects larger than 3 bytes. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 12:38:59 +09:00
Nguyễn Thái Ngọc Duy	27a7d0679f	pack-objects: clarify the use of object_entry::size While this field most of the time contains the canonical object size, there is one case it does not: when we have found that the base object of the delta in question is also to be packed, we will very happily reuse the delta by copying it over instead of regenerating the new delta. "size" in this case will record the delta size, not canonical object size. Later on in write_reuse_object(), we reconstruct the delta header and "size" is used for this purpose. When this happens, the "type" field contains a delta type instead of a canonical type. Highlight this in the code since it could be tricky to see. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 12:38:59 +09:00
Nguyễn Thái Ngọc Duy	660b373542	pack-objects: don't check size when the object is bad sha1_object_info() in check_objects() may fail to locate an object in the pack and return type OBJ_BAD. In that case, it will likely leave the "size" field untouched. We delay error handling until later in prepare_pack() though. Until then, do not touch "size" field. This field should contain the default value zero, but we can't say sha1_object_info() cannot damage it. This becomes more important later when the object size may have to be retrieved back from the (non-existing) pack. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 12:38:58 +09:00
Nguyễn Thái Ngọc Duy	0cb3c1427a	pack-objects: shrink z_delta_size field in struct object_entry We only cache deltas when it's smaller than a certain limit. This limit defaults to 1000 but save its compressed length in a 64-bit field. Shrink that field down to 20 bits, so you can only cache 1MB deltas. Larger deltas must be recomputed at when the pack is written down. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 12:38:58 +09:00
Nguyễn Thái Ngọc Duy	898eba5e63	pack-objects: refer to delta objects by index instead of pointer These delta pointers always point to elements in the objects[] array in packing_data struct. We can only hold maximum 4G of those objects because the array size in nr_objects is uint32_t. We could use uint32_t indexes to address these elements instead of pointers. On 64-bit architecture (8 bytes per pointer) this would save 4 bytes per pointer. Convert these delta pointers to indexes. Since we need to handle NULL pointers as well, the index is shifted by one [1]. [1] This means we can only index 2^32-2 objects even though nr_objects could contain 2^32-1 objects. It should not be a problem in practice because when we grow objects[], nr_alloc would probably blow up long before nr_objects hits the wall. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 12:38:58 +09:00
Nguyễn Thái Ngọc Duy	43fa44fa3b	pack-objects: move in_pack out of struct object_entry Instead of using 8 bytes (on 64 bit arch) to store a pointer to a pack. Use an index instead since the number of packs should be relatively small. This limits the number of packs we can handle to 1k. Since we can't be sure people can never run into the situation where they have more than 1k pack files. Provide a fall back route for it. If we find out they have too many packs, the new in_pack_by_idx[] array (which has at most 1k elements) will not be used. Instead we allocate in_pack[] array that holds nr_objects elements. This is similar to how the optional in_pack_pos field is handled. The new simple test is just to make sure the too-many-packs code path is at least executed. The true test is running make test GIT_TEST_FULL_IN_PACK_ARRAY=1 to take advantage of other special case tests. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 12:38:58 +09:00
Nguyễn Thái Ngọc Duy	06af3bba41	pack-objects: move in_pack_pos out of struct object_entry This field is only need for pack-bitmap, which is an optional feature. Move it to a separate array that is only allocated when pack-bitmap is used (like objects[], it is not freed, since we need it until the end of the process) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 12:38:58 +09:00
Nguyễn Thái Ngọc Duy	b5c0cbd808	pack-objects: use bitfield for object_entry::depth Because of struct packing from now on we can only handle max depth 4095 (or even lower when new booleans are added in this struct). This should be ok since long delta chain will cause significant slow down anyway. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 12:38:58 +09:00
Nguyễn Thái Ngọc Duy	0c6804ab4e	pack-objects: use bitfield for object_entry::dfs_state Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 12:38:58 +09:00
Nguyễn Thái Ngọc Duy	fd9b1baef8	pack-objects: turn type and in_pack_type to bitfields An extra field type_valid is added to carry the equivalent of OBJ_BAD in the original "type" field. in_pack_type always contains a valid type so we only need 3 bits for it. A note about accepting OBJ_NONE as "valid" type. The function read_object_list_from_stdin() can pass this value [1] and it eventually calls create_object_entry() where current code skip setting "type" field if the incoming type is zero. This does not have any bad side effects because "type" field should be memset()'d anyway. But since we also need to set type_valid now, skipping oe_set_type() leaves type_valid zero/false, which will make oe_type() return OBJ_BAD, not OBJ_NONE anymore. Apparently we do care about OBJ_NONE in prepare_pack(). This switch from OBJ_NONE to OBJ_BAD may trigger fatal: unable to get type of object ... Accepting OBJ_NONE [2] does sound wrong, but this is how it is has been for a very long time and I haven't time to dig in further. [1] See `5c49c11686` (pack-objects: better check_object() performances - 2007-04-16) [2] `21666f1aae` (convert object type handling from a string to a number - 2007-02-26) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 12:38:58 +09:00
Nguyễn Thái Ngọc Duy	8d6ccce14f	pack-objects: a bit of document about struct object_entry The role of this comment block becomes more important after we shuffle fields around to shrink this struct. It will be much harder to see what field is related to what. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 12:38:58 +09:00

... 7 8 9 10 11 ...

51548 Commits