git-commit-vandalism

Author	SHA1	Message	Date
Junio C Hamano	51b8aecabe	Merge branch 'ls/filter-process-delayed' The filter-process interface learned to allow a process with long latency give a "delayed" response. * ls/filter-process-delayed: convert: add "status=delayed" to filter process protocol convert: refactor capabilities negotiation convert: move multiple file filter error handling to separate function convert: put the flags field before the flag itself for consistent style t0021: write "OUT <size>" only on success t0021: make debug log file name configurable t0021: keep filter log files on comparison	2017-08-11 13:27:00 -07:00
Lars Schneider	2841e8f81c	convert: add "status=delayed" to filter process protocol Some `clean` / `smudge` filters may require a significant amount of time to process a single blob (e.g. the Git LFS smudge filter might perform network requests). During this process the Git checkout operation is blocked and Git needs to wait until the filter is done to continue with the checkout. Teach the filter process protocol, introduced in `edcc8581` ("convert: add filter.<driver>.process option", 2016-10-16), to accept the status "delayed" as response to a filter request. Upon this response Git continues with the checkout operation. After the checkout operation Git calls "finish_delayed_checkout" which queries the filter for remaining blobs. If the filter is still working on the completion, then the filter is expected to block. If the filter has completed all remaining blobs then an empty response is expected. Git has a multiple code paths that checkout a blob. Support delayed checkouts only in `clone` (in unpack-trees.c) and `checkout` operations for now. The optimization is most effective in these code paths as all files of the tree are processed. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-30 13:50:41 -07:00
Brandon Williams	a33e0b2a77	convert: convert renormalize_buffer to take an index Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-13 11:40:51 -07:00
Brandon Williams	82b474e025	convert: convert convert_to_git to take an index Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-13 11:40:51 -07:00
Brandon Williams	d6c41c20e6	convert: convert convert_to_git_filter_fd to take an index Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-13 11:40:51 -07:00
Brandon Williams	a7609c54b3	convert: convert get_cached_convert_stats_ascii to take an index Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-13 11:40:51 -07:00
Torsten Bögershausen	6523728499	convert: unify the "auto" handling of CRLF Before this change, $ echo "* text=auto" >.gitattributes $ echo "* eol=crlf" >>.gitattributes would have the same effect as $ echo "* text" >.gitattributes $ git config core.eol crlf Since the 'eol' attribute had higher priority than 'text=auto', this may corrupt binary files and is not what most users expect to happen. Make the 'eol' attribute to obey 'text=auto' and now $ echo "* text=auto" >.gitattributes $ echo "* eol=crlf" >>.gitattributes behaves the same as $ echo "* text=auto" >.gitattributes $ git config core.eol crlf In other words, $ echo "* text=auto eol=crlf" >.gitattributes has the same effect as $ git config core.autocrlf true and $ echo "* text=auto eol=lf" >.gitattributes has the same effect as $ git config core.autocrlf input Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-06 11:53:51 -07:00
Torsten Bögershausen	a7630bd427	ls-files: add eol diagnostics When working in a cross-platform environment, a user may want to check if text files are stored normalized in the repository and if .gitattributes are set appropriately. Make it possible to let Git show the line endings in the index and in the working tree and the effective text/eol attributes. The end of line ("eolinfo") are shown like this: "-text" binary (or with bare CR) file "none" text file without any EOL "lf" text file with LF "crlf" text file with CRLF "mixed" text file with mixed line endings. The effective text/eol attribute is one of these: "", "-text", "text", "text=auto", "text eol=lf", "text eol=crlf" git ls-files --eol gives an output like this: i/none w/none attr/text=auto t/t5100/empty i/-text w/-text attr/-text t/test-binary-2.png i/lf w/lf attr/text eol=lf t/t5100/rfc2047-info-0007 i/lf w/crlf attr/text eol=crlf doit.bat i/mixed w/mixed attr/ locale/XX.po to show what eol convention is used in the data in the index ('i'), and in the working tree ('w'), and what attribute is in effect, for each path that is shown. Add test cases in t0027. Helped-By: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-01-18 19:48:43 -08:00
Steffen Prohaska	9035d75a2b	convert: stream from fd to required clean filter to reduce used address space The data is streamed to the filter process anyway. Better avoid mapping the file if possible. This is especially useful if a clean filter reduces the size, for example if it computes a sha1 for binary data, like git media. The file size that the previous implementation could handle was limited by the available address space; large files for example could not be handled with (32-bit) msysgit. The new implementation can filter files of any size as long as the filter output is small enough. The new code path is only taken if the filter is required. The filter consumes data directly from the fd. If it fails, the original data is not immediately available. The condition can easily be handled as a fatal error, which is expected for a required filter anyway. If the filter was not required, the condition would need to be handled in a different way, like seeking to 0 and reading the data. But this would require more restructuring of the code and is probably not worth it. The obvious approach of falling back to reading all data would not help achieving the main purpose of this patch, which is to handle large files with limited address space. If reading all data is an option, we can simply take the old code path right away and mmap the entire file. The environment variable GIT_MMAP_LIMIT, which has been introduced in a previous commit is used to test that the expected code path is taken. A related test that exercises required filters is modified to verify that the data actually has been modified on its way from the file system to the object store. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-28 10:25:15 -07:00
Steffen Prohaska	7ce7c7607b	convert: drop arguments other than 'path' from would_convert_to_git() It is only the path that matters in the decision whether to filter or not. Clarify this by making path the only argument of would_convert_to_git(). Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-21 15:27:20 -07:00
Ondřej Bílka	749f763dbb	typofix: in-code comments Signed-off-by: Ondřej Bílka <neleai@seznam.cz> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-22 16:06:49 -07:00
Jeff King	92ac3197e4	teach convert_to_git a "dry run" mode Some callers may want to know whether convert_to_git will actually do anything before performing the conversion itself (e.g., to decide whether to stream or handle blobs in-core). This patch lets callers specify the dry run mode by passing a NULL destination buffer. The return value, instead of indicating whether conversion happened, will indicate whether conversion would occur. For readability, we also include a wrapper function which makes it more obvious we are not actually performing the conversion. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-02-24 14:11:27 -08:00
Junio C Hamano	4ae6670444	stream filter: add "no more input" to the filters Some filters may need to buffer the input and look-ahead inside it to decide what to output, and they may consume more than zero bytes of input and still not produce any output. After feeding all the input, pass NULL as input as keep calling stream_filter() to let such filters know there is no more input coming, and it is time for them to produce the remaining output based on the buffered input. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-05-26 16:47:15 -07:00
Junio C Hamano	b6691092d7	Add streaming filter API This introduces an API to plug custom filters to an input stream. The caller gets get_stream_filter("path") to obtain an appropriate filter for the path, and then uses it when opening an input stream via open_istream(). After that, the caller can read from the stream with read_istream(), and close it with close_istream(), just like an unfiltered stream. This only adds a "null" filter that is a pass-thru filter, but later changes can add LF-to-CRLF and other filters, and the callers of the streaming API do not have to change. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-05-26 16:47:15 -07:00
Junio C Hamano	d1bf0e0831	convert.h: move declarations for conversion from cache.h Before adding the streaming filter API to the conversion layer, move the existing declarations related to the conversion to its own header file. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-05-26 16:47:15 -07:00

15 Commits