git-commit-vandalism

Author	SHA1	Message	Date
Thomas Ackermann	6eda9ac9e5	doc: use https links Use only https links for lore.kernel.org. Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 11:57:10 -08:00
Johannes Schindelin	06d531486e	t[01]: adjust the references to the default branch name "main" Carefully excluding t1309, which sees independent development elsewhere at the time of writing, we transition above-mentioned tests to the default branch name `main`. This trick was performed via $ (cd t && sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \ -e 's/Master/Main/g' -e 's/naster/nain/g' -- t[01].sh && git checkout HEAD -- t1309\*) Note that t5533 contains a variation of the name `master` (`naster`) that we rename here, too. This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main` for those tests. Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:18 -08:00
Johannes Schindelin	334afbc76f	tests: mark tests relying on the current default for `init.defaultBranch` In addition to the manual adjustment to let the `linux-gcc` CI job run the test suite with `master` and then with `main`, this patch makes sure that GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME is set in all test scripts that currently rely on the initial branch name being `master by default. To determine which test scripts to mark up, the first step was to force-set the default branch name to `master` in - all test scripts that contain the keyword `master`, - t4211, which expects `t/t4211/history.export` with a hard-coded ref to initialize the default branch, - t5560 because it sources `t/t556x_common` which uses `master`, - t8002 and t8012 because both source `t/annotate-tests.sh` which also uses `master`) This trick was performed by this command: $ sed -i '/^ \. \.\/$test-lib\\|lib-\(bash\\|cvs\\|git-svn$\\|gitweb-lib\)\.sh$/i\ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\ ' $(git grep -l master t/t[0-9].sh) \ t/t4211.sh t/t5560.sh t/t8002.sh t/t8012.sh After that, careful, manual inspection revealed that some of the test scripts containing the needle `master` do not actually rely on a specific default branch name: either they mention `master` only in a comment, or they initialize that branch specificially, or they do not actually refer to the current default branch. Therefore, the aforementioned modification was undone in those test scripts thusly: $ git checkout HEAD -- \ t/t0027-auto-crlf.sh t/t0060-path-utils.sh \ t/t1011-read-tree-sparse-checkout.sh \ t/t1305-config-include.sh t/t1309-early-config.sh \ t/t1402-check-ref-format.sh t/t1450-fsck.sh \ t/t2024-checkout-dwim.sh \ t/t2106-update-index-assume-unchanged.sh \ t/t3040-subprojects-basic.sh t/t3301-notes.sh \ t/t3308-notes-merge.sh t/t3423-rebase-reword.sh \ t/t3436-rebase-more-options.sh \ t/t4015-diff-whitespace.sh t/t4257-am-interactive.sh \ t/t5323-pack-redundant.sh t/t5401-update-hooks.sh \ t/t5511-refspec.sh t/t5526-fetch-submodules.sh \ t/t5529-push-errors.sh t/t5530-upload-pack-error.sh \ t/t5548-push-porcelain.sh \ t/t5552-skipping-fetch-negotiator.sh \ t/t5572-pull-submodule.sh t/t5608-clone-2gb.sh \ t/t5614-clone-submodules-shallow.sh \ t/t7508-status.sh t/t7606-merge-custom.sh \ t/t9302-fast-import-unpack-limit.sh We excluded one set of test scripts in these commands, though: the range of `git p4` tests. The reason? `git p4` stores the (foreign) remote branch in the branch called `p4/master`, which is obviously not the default branch. Manual analysis revealed that only five of these tests actually require a specific default branch name to pass; They were modified thusly: $ sed -i '/^ \. \.\/lib-git-p4\.sh$/i\ GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\ ' t/t980[0167].sh t/t9811*.sh Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 15:44:17 -08:00
Johannes Schindelin	53b67a801b	tests: consolidate the `file_size` function into `test-lib-functions.sh` In `8de7eeb54b` (compression: unify pack.compression configuration parsing, 2016-11-15), we introduced identical copies of the `file_size` helper into three test scripts, with the plan to eventually consolidate them into a single copy. Let's do that, and adjust the function name to adhere to the `test_*` naming convention. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-06 22:05:08 -08:00
brian m. carlson	0c0f8a7f28	t0021: test filter metadata for additional cases Check that we get the expected data when performing a merges or generating archives. Note that we don't expect a ref for merges, because we won't be checking out any particular ref, but instead a tree of the merged data. For archives, however, we expect a ref as normal if we have one. Signed-off-by: brian m. carlson <bk2204@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-16 11:37:02 -07:00
brian m. carlson	4cf76f6bbf	builtin/reset: compute checkout metadata for reset Pass the commit, and if we have it, the ref to the filters when we perform a checkout. This should only be the case when we invoke git reset --hard; the metadata will be unused otherwise. Signed-off-by: brian m. carlson <bk2204@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-16 11:37:02 -07:00
brian m. carlson	3f26785624	builtin/rebase: compute checkout metadata for rebases Signed-off-by: brian m. carlson <bk2204@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-16 11:37:02 -07:00
brian m. carlson	dfc8cdc677	builtin/clone: compute checkout metadata for clones When checking out a commit, provide metadata to the filter process including the ref we're using. Signed-off-by: brian m. carlson <bk2204@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-16 11:37:02 -07:00
brian m. carlson	13e7ed6a3a	builtin/checkout: compute checkout metadata for checkouts Provide commit metadata for checkout code paths that use unpack_trees and friends. When we're checking out a commit, use the commit information, but don't provide commit information if we're checking out from the index, since there need not be any particular commit associated with the index, and even if there is one, we can't know what it is. Signed-off-by: brian m. carlson <bk2204@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-16 11:37:02 -07:00
Martin Ågren	3c29e21eb0	t: drop debug `cat` calls We `cat` files, but don't inspect or grab the contents in any way. Unlike in an earlier commit, there is no reason to suspect that these files could be missing, so `cat`-ing them is just wasted effort. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-24 11:18:25 -08:00
Junio C Hamano	3b3d9ea6a8	Merge branch 'jk/lore-is-the-archive' Doc update for the mailing list archiving and nntp service. * jk/lore-is-the-archive: doc: replace public-inbox links with lore.kernel.org doc: recommend lore.kernel.org over public-inbox.org	2019-12-06 15:09:23 -08:00
Jeff King	3eae30e464	doc: replace public-inbox links with lore.kernel.org Since we're now recommending lore.kernel.org (and because the public-inbox.org domain might eventually go away), let's update our internal references to use it, too. That future-proofs our references, and sets the example we want people to follow. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-11-30 09:12:26 -08:00
Thomas Gummerer	58166c2e9d	t0021: make sure clean filter runs In t0021.15 one of the things we are checking is that the clean filter is run when checking out empty-branch. The clean filter needs to be run to make sure there are no modifications on the file system for the test.r file, and thus it isn't dangerous to overwrite it. However in the current test setup it is not always necessary to run the clean filter, and thus the test sometimes fails, as debug.log isn't written. This happens when test.r has an older mtime than the index itself. That mtime is also recorded as stat data for test.r in the index, and based on the heuristic we're using for index entries, git correctly assumes this file is up-to-date. Usually this test succeeds because the mtime of test.r is the same as the mtime of the index. In this case test.r is racily clean, so git actually checks the contents, for which the clean filter is run. Fix the test by updating the mtime of test.r, so git is forced to check the contents of the file, and the clean filter is run as the test expects. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-08-22 12:32:59 -07:00
Johannes Schindelin	5868bd8620	tests: avoid calling Perl just to determine file sizes It is a bit ridiculous to spin up a full-blown Perl instance (especially on Windows, where that means spinning up a full POSIX emulation layer, AKA the MSYS2 runtime) just to tell how large a given file is. So let's just use the test-tool to do that job instead. This command will also be used over the next commits, to allow for cutting out individual test cases' verbose log from the file generated via --verbose-log. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-29 09:26:47 -08:00
Matthew DeVore	dcbaa0b361	t/*: fix ordering of expected/observed arguments Fix various places where the ordering was obviously wrong, meaning it was easy to find with grep. Signed-off-by: Matthew DeVore <matvore@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 08:51:18 +09:00
Junio C Hamano	4bea8485e3	Merge branch 'nd/i18n' Many more strings are prepared for l10n. * nd/i18n: (23 commits) transport-helper.c: mark more strings for translation transport.c: mark more strings for translation sha1-file.c: mark more strings for translation sequencer.c: mark more strings for translation replace-object.c: mark more strings for translation refspec.c: mark more strings for translation refs.c: mark more strings for translation pkt-line.c: mark more strings for translation object.c: mark more strings for translation exec-cmd.c: mark more strings for translation environment.c: mark more strings for translation dir.c: mark more strings for translation convert.c: mark more strings for translation connect.c: mark more strings for translation config.c: mark more strings for translation commit-graph.c: mark more strings for translation builtin/replace.c: mark more strings for translation builtin/pack-objects.c: mark more strings for translation builtin/grep.c: mark strings for translation builtin/config.c: mark more strings for translation ...	2018-08-15 15:08:23 -07:00
Nguyễn Thái Ngọc Duy	d26a328eaf	convert.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:10 -07:00
Eric Sunshine	75651fd783	t0000-t0999: fix broken &&-chains Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-16 14:38:47 -07:00
Nguyễn Thái Ngọc Duy	c680668d1a	t/helper: merge test-genrandom into test-tool Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-27 08:45:47 -07:00
Jonathan Tan	fa64a2fdbe	sub-process: refactor handshake to common function Refactor, into a common function, the version and capability negotiation done when invoking a long-running process as a clean or smudge filter. This will be useful for other Git code that needs to interact similarly with a long-running process. As you can see in the change to t0021, this commit changes the error message reported when the long-running process does not introduce itself with the expected "server"-terminated line. Originally, the error message reports that the filter "does not support filter protocol version 2", differentiating between the old single-file filter protocol and the new multi-file filter protocol - I have updated it to something more generic and useful. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-07-26 13:00:40 -07:00
Lars Schneider	2841e8f81c	convert: add "status=delayed" to filter process protocol Some `clean` / `smudge` filters may require a significant amount of time to process a single blob (e.g. the Git LFS smudge filter might perform network requests). During this process the Git checkout operation is blocked and Git needs to wait until the filter is done to continue with the checkout. Teach the filter process protocol, introduced in `edcc8581` ("convert: add filter.<driver>.process option", 2016-10-16), to accept the status "delayed" as response to a filter request. Upon this response Git continues with the checkout operation. After the checkout operation Git calls "finish_delayed_checkout" which queries the filter for remaining blobs. If the filter is still working on the completion, then the filter is expected to block. If the filter has completed all remaining blobs then an empty response is expected. Git has a multiple code paths that checkout a blob. Support delayed checkouts only in `clone` (in unpack-trees.c) and `checkout` operations for now. The optimization is most effective in these code paths as all files of the tree are processed. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:50:41 -07:00
Lars Schneider	1757333410	t0021: write "OUT <size>" only on success "rot13-filter.pl" always writes "OUT <size>" to the debug log at the end of a response. This works perfectly for the existing responses "abort", "error", and "success". A new response "delayed", that will be introduced in a subsequent patch, accepts the input without giving the filtered result right away. At this point we cannot know the size of the response. Therefore, we do not write "OUT <size>" for "delayed" responses. To simplify the code we do not write "OUT <size>" for "abort" and "error" responses either as their size is always zero. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-29 11:23:47 -07:00
Lars Schneider	e1ec4721d6	t0021: make debug log file name configurable The "rot13-filter.pl" helper wrote its debug logs always to "rot13-filter.log". Make this configurable by defining the log file as first parameter of "rot13-filter.pl". This is useful if "rot13-filter.pl" is configured multiple times similar to the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-26 10:36:00 -07:00
Lars Schneider	58ec9cb35b	t0021: keep filter log files on comparison The filter log files are modified on comparison. That might be unexpected by the caller. It would be even undesirable if the caller wants to reuse the original log files. Address these issues by using temp files for modifications. This is useful for the subsequent patch 'convert: add "status=delayed" to filter process protocol'. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-26 10:34:42 -07:00
Junio C Hamano	08721a056b	Merge branch 'ls/filter-process' Doc update. * ls/filter-process: t0021: fix flaky test docs: warn about possible '=' in clean/smudge filter process values	2016-12-27 00:11:42 -08:00
Junio C Hamano	63d6a9c962	Merge branch 'ls/t0021-fixup' * ls/t0021-fixup: t0021: minor filter process test cleanup	2016-12-19 14:45:30 -08:00
Lars Schneider	7eeda8b821	t0021: fix flaky test t0021.15 creates files, adds them to the index, and commits them. All this usually happens in a test run within the same second and Git cannot know if the files have been changed between `add` and `commit`. Thus, Git has to run the clean filter in both operations. Sometimes these invocations spread over two different seconds and Git can infer that the files were not changed between `add` and `commit` based on their modification timestamp. The test would fail as it expects the filter invocation. Remove this expectation to make the test stable. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-12-18 13:01:20 -08:00
Lars Schneider	c6b0831c9c	docs: warn about possible '=' in clean/smudge filter process values A pathname value in a clean/smudge filter process "key=value" pair can contain the '=' character (introduced in `edcc858`). Make the user aware of this issue in the docs, add a corresponding test case, and fix the issue in filter process value parser of the example implementation in contrib. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-12-06 11:29:52 -08:00
Lars Schneider	9c48b4fb23	t0021: minor filter process test cleanup Remove superfluous .gitignore pattern and invalid '.' in `git commit` calls. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-12-05 14:59:03 -08:00
Junio C Hamano	b18f6a0066	Merge branch 'ls/filter-process' Test portability improvements and optimization for an already-graduated topic. * ls/filter-process: t0021: remove debugging cruft	2016-11-11 13:56:30 -08:00
Junio C Hamano	7f2a3921fb	Merge branch 'js/pwd-var-vs-pwd-cmd-fix' Last minute fixes to two fixups merged to 'master' recently. * js/pwd-var-vs-pwd-cmd-fix: t0021, t5615: use $PWD instead of $(pwd) in PATH-like shell variables	2016-11-11 13:56:30 -08:00
Junio C Hamano	a0d8b60da8	t0021: remove debugging cruft The redirection of the standard error stream to a temporary file is a leftover cruft during debugging. Remove it. Besides, it is reported by folks on the Windows that the test is flaky with this redirection; somebody gets confused and this merely-redirected-to file gets marked as delete-pending by git.exe and makes it finish with a non-zero exit status when "git checkout" finishes. Windows folks may want to figure that one out, but for the purpose of this test, it shouldn't become a show-stopper. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-11-11 13:09:24 -08:00
Johannes Sixt	71dd50472d	t0021, t5615: use $PWD instead of $(pwd) in PATH-like shell variables We have to use $PWD instead of $(pwd) because on Windows the latter would add a C: style path to bash's Unix-style $PATH variable, which becomes confused by the colon after the drive letter. ($PWD is a Unix-style path.) In the case of GIT_ALTERNATE_OBJECT_DIRECTORIES, bash on Windows assembles a Unix-style path list with the colon as separators. It converts the value to a Windows-style path list with the semicolon as path separator when it forwards the variable to git.exe. The same confusion happens when bash's original value is contaminated with Windows style paths. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-11-11 10:54:46 -08:00
Junio C Hamano	7b2c338cae	Merge branch 'jk/filter-process-fix' Test portability improvements and cleanups for t0021. * jk/filter-process-fix: t0021: fix filehandle usage on older perl t0021: use $PERL_PATH for rot13-filter.pl t0021: put $TEST_ROOT in $PATH t0021: use write_script to create rot13 shell script	2016-11-10 13:17:30 -08:00
Johannes Sixt	ec2e8b3da2	t0021: compute file size with a single process instead of a pipeline Avoid unwanted coding patterns (prodigal use of pipelines), and in particular a useless use of cat. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Jeff King <peff@peff.net>	2016-11-08 15:26:40 -05:00
Johannes Sixt	038212c4c4	t0021: expect more variations in the output of uniq -c Some versions of uniq -c write the count left-justified, other version write it right-justified. Be prepared for both kinds. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Jeff King <peff@peff.net>	2016-11-08 15:26:40 -05:00
Jeff King	f272696a35	t0021: use $PERL_PATH for rot13-filter.pl The rot13-filter.pl script hardcodes "#!/usr/bin/perl", and does not respect $PERL_PATH at all. That is a problem if the system does not have perl at that path, or if it has a perl that is too old to run a complicated script like the rot13-filter (but PERL_PATH points to a more modern one). We can fix this by using write_script() to create a new copy of the script with the correct #!-line. In theory we could move the whole script inside t0021-conversion.sh rather than having it as an auxiliary file, but it's long enough that it just makes things harder to read. As a bonus, we can stop using the full path to the script in the filter-process config we add (because the trash directory is in our PATH). Not only is this shorter, but it sidesteps any shell-quoting issues. The original was broken when $TEST_DIRECTORY contained a space, because it was interpolated in the outer script. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-11-02 19:36:29 -07:00
Jeff King	30030a36b6	t0021: put $TEST_ROOT in $PATH We create a rot13.sh script in the trash directory, but need to call it by its full path when we have moved our cwd to another directory. Let's just put $TEST_ROOT in our $PATH so that the script is always found. This is a minor convenience for rot13.sh, but will be a major one when we switch rot13-filter.pl to a script in the same directory, as it means we will not have to deal with shell quoting inside the filter-process config. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-11-02 19:36:29 -07:00
Jeff King	cbb6707b11	t0021: use write_script to create rot13 shell script This avoids us fooling around with $SHELL_PATH and the executable bit ourselves. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-11-02 19:36:29 -07:00
Lars Schneider	edcc85814c	convert: add filter.<driver>.process option Git's clean/smudge mechanism invokes an external filter process for every single blob that is affected by a filter. If Git filters a lot of blobs then the startup time of the external filter processes can become a significant part of the overall Git execution time. In a preliminary performance test this developer used a clean/smudge filter written in golang to filter 12,000 files. This process took 364s with the existing filter mechanism and 5s with the new mechanism. See details here: https://github.com/github/git-lfs/pull/1382 This patch adds the `filter.<driver>.process` string option which, if used, keeps the external filter process running and processes all blobs with the packet format (pkt-line) based protocol over standard input and standard output. The full protocol is explained in detail in `Documentation/gitattributes.txt`. A few key decisions: * The long running filter process is referred to as filter protocol version 2 because the existing single shot filter invocation is considered version 1. * Git sends a welcome message and expects a response right after the external filter process has started. This ensures that Git will not hang if a version 1 filter is incorrectly used with the filter.<driver>.process option for version 2 filters. In addition, Git can detect this kind of error and warn the user. * The status of a filter operation (e.g. "success" or "error) is set before the actual response and (if necessary!) re-set after the response. The advantage of this two step status response is that if the filter detects an error early, then the filter can communicate this and Git does not even need to create structures to read the response. * All status responses are pkt-line lists terminated with a flush packet. This allows us to send other status fields with the same protocol in the future. Helped-by: Martin-Louis Bright <mlbright@gmail.com> Reviewed-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-17 11:45:52 -07:00
Lars Schneider	ed54970324	convert: modernize tests Use `test_config` to set the config, check that files are empty with `test_must_be_empty`, compare files with `test_cmp`, and remove spaces after ">" and "<". Please note that the "rot13" filter configured in "setup" keeps using `git config` instead of `test_config` because subsequent tests might depend on it. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-17 11:36:49 -07:00
Jeff King	06dec439a3	diff: do not reuse worktree files that need "clean" conversion When accessing a blob for a diff, we may try to reuse file contents in the working tree, under the theory that it is faster to mmap those file contents than it would be to extract the content from the object database. When we have to filter those contents, though, that assumption does not hold. Even for our internal conversions like CRLF, we have to allocate and fill a new buffer anyway. But much worse, for external clean filters we have to exec an arbitrary script, and we have no idea how expensive it may be to run. So let's skip this optimization when conversion into git's "clean" form is required. This applies whenever the "want_file" flag is false. When it's true, the caller actually wants the smudged worktree contents, which the reused file by definition already has (in fact, this is a key optimization going the other direction, since reusing the worktree file there lets us skip smudge filters). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-22 12:31:24 -07:00
Lars Schneider	1a8630dc3b	convert: treat an empty string for clean/smudge filters as "cat" Once a lower-priority configuration file defines a clean or smudge filter, there is no convenient way to override it to produce as-is output. Even though the configuration mechanism implements "the last one wins" semantics, you cannot set them to an empty string and expect them to work, as apply_filter() would try to run the empty string as an external command and fail. The conversion is not done, but the function would still report a failure to convert. Even though resetting the variable to "cat" (i.e. pass the data back as-is and report success) is an obvious and a viable way to solve this, it is wasteful to spawn an external process just as a workaround. Instead, teach apply_filter() to treat an empty string as a no-op filter that always returns successfully its input as-is without conversion. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-01-29 11:04:27 -08:00
Junio C Hamano	152722f155	Merge branch 'jh/filter-empty-contents' The clean/smudge interface did not work well when filtering an empty contents (failed and then passed the empty input through). It can be argued that a filter that produces anything but empty for an empty input is nonsense, but if the user wants to do strange things, then why not? * jh/filter-empty-contents: sha1_file: pass empty buffer to index empty file	2015-06-01 12:45:11 -07:00
Junio C Hamano	0c4dd67a04	filter_buffer_or_fd(): ignore EPIPE We are explicitly ignoring SIGPIPE, as we fully expect that the filter program may not read our output fully. Ignore EPIPE that may come from writing to it as well. A new test was stolen from Jeff's suggestion. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-20 10:19:12 -07:00
Jim Hill	f6a1e1e288	sha1_file: pass empty buffer to index empty file `git add` of an empty file with a filter pops complaints from `copy_fd` about a bad file descriptor. This traces back to these lines in sha1_file.c:index_core: if (!size) { ret = index_mem(sha1, NULL, size, type, path, flags); The problem here is that content to be added to the index can be supplied from an fd, or from a memory buffer, or from a pathname. This call is supplying a NULL buffer pointer and a zero size. Downstream logic takes the complete absence of a buffer to mean the data is to be found elsewhere -- for instance, these, from convert.c: if (params->src) { write_err = (write_in_full(child_process.in, params->src, params->size) < 0); } else { write_err = copy_fd(params->fd, child_process.in); } ~If there's a buffer, write from that, otherwise the data must be coming from an open fd.~ Perfectly reasonable logic in a routine that's going to write from either a buffer or an fd. So change `index_core` to supply an empty buffer when indexing an empty file. There's a patch out there that instead changes the logic quoted above to take a `-1` fd to mean "use the buffer", but it seems to me that the distinction between a missing buffer and an empty one carries intrinsic semantics, where the logic change is adapting the code to handle incorrect arguments. Signed-off-by: Jim Hill <gjthill@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-18 10:15:20 -07:00
Steffen Prohaska	9035d75a2b	convert: stream from fd to required clean filter to reduce used address space The data is streamed to the filter process anyway. Better avoid mapping the file if possible. This is especially useful if a clean filter reduces the size, for example if it computes a sha1 for binary data, like git media. The file size that the previous implementation could handle was limited by the available address space; large files for example could not be handled with (32-bit) msysgit. The new implementation can filter files of any size as long as the filter output is small enough. The new code path is only taken if the filter is required. The filter consumes data directly from the fd. If it fails, the original data is not immediately available. The condition can easily be handled as a fatal error, which is expected for a required filter anyway. If the filter was not required, the condition would need to be handled in a different way, like seeking to 0 and reading the data. But this would require more restructuring of the code and is probably not worth it. The obvious approach of falling back to reading all data would not help achieving the main purpose of this patch, which is to handle large files with limited address space. If reading all data is an option, we can simply take the old code path right away and mmap the entire file. The environment variable GIT_MMAP_LIMIT, which has been introduced in a previous commit is used to test that the expected code path is taken. A related test that exercises required filters is modified to verify that the data actually has been modified on its way from the file system to the object store. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-28 10:25:15 -07:00
Junio C Hamano	6219bb22ba	test: turn EXPENSIVE into a lazy prerequisite Two test scripts (t0021 and t5551) had copy & paste code to set EXPENSIVE prerequisite. Use the test_lazy_prereq helper to define them in the common t/test-lib.sh. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-09 14:18:55 -07:00
Steffen Prohaska	0b6806b9e4	xread, xwrite: limit size of IO to 8MB Checking out 2GB or more through an external filter (see test) fails on Mac OS X 10.8.4 (12E55) for a 64-bit executable with: error: read from external filter cat failed error: cannot feed the input to external filter cat error: cat died of signal 13 error: external filter cat failed 141 error: external filter cat failed The reason is that read() immediately returns with EINVAL when asked to read more than 2GB. According to POSIX [1], if the value of nbyte passed to read() is greater than SSIZE_MAX, the result is implementation-defined. The write function has the same restriction [2]. Since OS X still supports running 32-bit executables, the 32-bit limit (SSIZE_MAX = INT_MAX = 2GB - 1) seems to be also imposed on 64-bit executables under certain conditions. For write, the problem has been addressed earlier [6c642a]. Address the problem for read() and write() differently, by limiting size of IO chunks unconditionally on all platforms in xread() and xwrite(). Large chunks only cause problems, like causing latencies when killing the process, even if OS X was not buggy. Doing IO in reasonably sized smaller chunks should have no negative impact on performance. The compat wrapper clipped_write() introduced earlier [6c642a] is not needed anymore. It will be reverted in a separate commit. The new test catches read and write problems. Note that 'git add' exits with 0 even if it prints filtering errors to stderr. The test, therefore, checks stderr. 'git add' should probably be changed (sometime in another commit) to exit with nonzero if filtering fails. The test could then be changed to use test_must_fail. Thanks to the following people for suggestions and testing: Johannes Sixt <j6t@kdbg.org> John Keeping <john@keeping.me.uk> Jonathan Nieder <jrnieder@gmail.com> Kyle J. McKay <mackyle@gmail.com> Linus Torvalds <torvalds@linux-foundation.org> Torsten Bögershausen <tboegi@web.de> [1] http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html [2] http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html [6c642a] commit `6c642a8786` compate/clipped-write.c: large write(2) fails on Mac OS X/XNU Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-08-20 11:10:59 -07:00
Jehan Bing	36daaaca00	Add a setting to require a filter to be successful By default, a missing filter driver or a failure from the filter driver is not an error, but merely makes the filter operation a no-op pass through. This is useful to massage the content into a shape that is more convenient for the platform, filesystem, and the user to use, and the content filter mechanism is not used to turn something unusable into usable. However, we could also use of the content filtering mechanism and store the content that cannot be directly used in the repository (e.g. a UUID that refers to the true content stored outside git, or an encrypted content) and turn it into a usable form upon checkout (e.g. download the external content, or decrypt the encrypted content). For such a use case, the content cannot be used when filter driver fails, and we need a way to tell Git to abort the whole operation for such a failing or missing filter driver. Add a new "filter.<driver>.required" configuration variable to mark the second use case. When it is set, git will abort the operation when the filter driver does not exist or exits with a non-zero status code. Signed-off-by: Jehan Bing <jehan@orb.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-02-17 07:37:08 -08:00

1 2

63 Commits