git-commit-vandalism

Author	SHA1	Message	Date
René Scharfe	ca56dadb4b	use CALLOC_ARRAY Add and apply a semantic patch for converting code that open-codes CALLOC_ARRAY to use it instead. It shortens the code and infers the element size automatically. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-13 16:00:09 -08:00
Jeff King	185e865226	convert: drop unused crlf_action from check_global_conv_flags_eol() The crlf_action parameter hasn't been used since `a0ad53c181` (convert: Correct NNO tests and missing `LF will be replaced by CRLF`, 2016-08-13), where that part of the function was hoisted out to a separate will_convert_lf_to_crlf() helper. Let's drop the useless parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-30 12:53:47 -07:00
Junio C Hamano	afbdba391e	run_command: teach API users to use embedded 'args' more The child_process structure has an embedded strvec for formulating the command line argument list these days, but code that predates the wide use of it prepared a separate char *argv[] array and manually set the child_process.argv pointer point at it. Teach these old-style code to lose the separate argv[] array. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-26 15:32:37 -07:00
Jeff King	f5914f4b6b	parse_config_key(): return subsection len as size_t We return the length to a subset of a string using an "int *" out-parameter. This is fine most of the time, as we'd expect config keys to be relatively short, but it could behave oddly if we had a gigantic config key. A more appropriate type is size_t. Let's switch over, which lets our callers use size_t as appropriate (they are bound by our type because they must pass the out-parameter as a pointer). This is mostly just a cleanup to make it clear this code handles long strings correctly. In practice, our config parser already chokes on long key names (because of a similar int/size_t mixup!). When doing an int/size_t conversion, we have to be careful that nobody was trying to assign a negative value to the variable. I manually confirmed that for each case here. They tend to just feed the result to xmemdupz() or similar; in a few cases I adjusted the parameter types for helper functions to make sure the size_t is preserved. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-10 14:44:29 -07:00
brian m. carlson	c397aac02f	convert: provide additional metadata to filters Now that we have the codebase wired up to pass any additional metadata to filters, let's collect the additional metadata that we'd like to pass. The two main places we pass this metadata are checkouts and archives. In these two situations, reading HEAD isn't a valid option, since HEAD isn't updated for checkouts until after the working tree is written and archives can accept an arbitrary tree. In other situations, HEAD will usually reflect the refname of the branch in current use. We pass a smaller amount of data in other cases, such as git cat-file, where we can really only logically know about the blob. This commit updates only the parts of the checkout code where we don't use unpack_trees. That function and callers of it will be handled in a future commit. In the archive code, we leak a small amount of memory, since nothing we pass in the archiver argument structure is freed. Signed-off-by: brian m. carlson <bk2204@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-16 11:37:02 -07:00
brian m. carlson	ab90ecae99	convert: permit passing additional metadata to filter processes There are a variety of situations where a filter process can make use of some additional metadata. For example, some people find the ident filter too limiting and would like to include the commit or the branch in their smudged files. This information isn't available during checkout as HEAD hasn't been updated at that point, and it wouldn't be available in archives either. Let's add a way to pass this metadata down to the filter. We pass the blob we're operating on, the treeish (preferring the commit over the tree if one exists), and the ref we're operating on. Note that we won't pass this information in all cases, such as when renormalizing or when we're performing diffs, since it doesn't make sense in those cases. The data we currently get from the filter process looks like the following: command=smudge pathname=git.c 0000 With this change, we'll get data more like this: command=smudge pathname=git.c refname=refs/tags/v2.25.1 treeish=c522f061d551c9bb8684a7c3859b2ece4499b56b blob=7be7ad34bd053884ec48923706e70c81719a8660 0000 There are a couple things to note about this approach. For operations like checkout, treeish will always be a commit, since we cannot check out individual trees, but for other operations, like archive, we can end up operating on only a particular tree, so we'll provide only a tree as the treeish. Similar comments apply for refname, since there are a variety of cases in which we won't have a ref. This commit wires up the code to print this information, but doesn't pass any of it at this point. In a future commit, we'll have various code paths pass the actual useful data down. Signed-off-by: brian m. carlson <bk2204@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-16 11:37:02 -07:00
Junio C Hamano	78e67cda42	Merge branch 'mt/use-passed-repo-more-in-funcs' Some codepaths were given a repository instance as a parameter to work in the repository, but passed the_repository instance to its callees, which has been cleaned up (somewhat). * mt/use-passed-repo-more-in-funcs: sha1-file: allow check_object_signature() to handle any repo sha1-file: pass git_hash_algo to hash_object_file() sha1-file: pass git_hash_algo to write_object_file_prepare() streaming: allow open_istream() to handle any repo pack-check: use given repo's hash_algo at verify_packfile() cache-tree: use given repo's hash_algo at verify_one() diff: make diff_populate_filespec() honor its repo argument	2020-02-14 12:54:22 -08:00
Junio C Hamano	a3dcf84df0	Merge branch 'js/convert-typofix' Typofix. * js/convert-typofix: convert: fix typo	2020-02-12 12:41:39 -08:00
Johannes Schindelin	2b0f19fa7a	convert: fix typo Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-11 12:03:09 -08:00
Matheus Tavares	2dcde20e1c	sha1-file: pass git_hash_algo to hash_object_file() Allow hash_object_file() to work on arbitrary repos by introducing a git_hash_algo parameter. Change callers which have a struct repository pointer in their scope to pass on the git_hash_algo from the said repo. For all other callers, pass on the_hash_algo, which was already being used internally at hash_object_file(). This functionality will be used in the following patch to make check_object_signature() be able to work on arbitrary repos (which, in turn, will be used to fix an inconsistency at object.c:parse_object()). Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-31 10:45:39 -08:00
Junio C Hamano	d2489ce92c	Merge branch 'rs/skip-iprefix' Code simplification. * rs/skip-iprefix: convert: use skip_iprefix() in validate_encoding() utf8: use skip_iprefix() in same_utf_encoding()	2019-12-01 09:04:36 -08:00
René Scharfe	ed28358833	convert: use skip_iprefix() in validate_encoding() Use skip_iprefix() to parse "UTF" case-insensitively instead of checking with istarts_with(), building an upper-case version and then using skip_prefix() on it. This gets rid of duplicate code and of a small allocation. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-11-10 16:04:48 +09:00
Elijah Newren	15beaaa3d1	Fix spelling errors in code comments Reported-by: Jens Schleusener <Jens.Schleusener@fossies.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-11-10 16:00:54 +09:00
Junio C Hamano	d17f54947d	Merge branch 'rs/convert-fix-utf-without-dash' The code to skip "UTF" and "UTF-" prefix, when computing an advice message, did not work correctly when the prefix was "UTF", which has been fixed. * rs/convert-fix-utf-without-dash: convert: fix handling of dashless UTF prefix in validate_encoding()	2019-10-09 14:01:00 +09:00
René Scharfe	b181676ce9	convert: fix handling of dashless UTF prefix in validate_encoding() Strip "UTF" and an optional dash from the start of 'upper' without passing a NULL pointer to skip_prefix() in the second call, as it cannot handle that. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-06 09:43:01 +09:00
brian m. carlson	2c65d90f75	am: reload .gitattributes after patching it When applying multiple patches with git am, or when rebasing using the am backend, it's possible that one of our patches has updated a gitattributes file. Currently, we cache this information, so if a file in a subsequent patch has attributes applied, the file will be written out with the attributes in place as of the time we started the rebase or am operation, not with the attributes applied by the previous patch. This problem does not occur when using the -m or -i flags to rebase. To ensure we write the correct data into the working tree, expire the cache after each patch that touches a path ending in ".gitattributes". Since we load these attributes in multiple separate files, we must expire them accordingly. Verify that both the am and rebase code paths work correctly, including the conflict marker size with am -3. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-03 15:16:18 -07:00
Junio C Hamano	1828e52efc	Merge branch 'jh/resize-convert-scratch-buffer' When the "clean" filter can reduce the size of a huge file in the working tree down to a small "token" (a la Git LFS), there is no point in allocating a huge scratch area upfront, but the buffer is sized based on the original file size. The convert mechanism now allocates very minimum and reallocates as it receives the output from the clean filter process. * jh/resize-convert-scratch-buffer: convert: avoid malloc of original file size	2019-04-10 02:14:22 +09:00
Joey Hess	02156ab031	convert: avoid malloc of original file size We write the output of a "clean" filter into a strbuf. Rather than growing the strbuf dynamically as we read its output, we make the initial allocation as large as the original input file. This is a good guess when the filter is just tweaking a few bytes, but it's disastrous when the point of the filter is to condense a very large file into a short identifier (e.g., the way git-lfs and git-annex do). We may ask to allocate many gigabytes, causing the allocation to fail and Git to die(). Instead, let's just let strbuf do its usual growth. When the clean filter does output something around the same size as the worktree file, the buffer will need to be reallocated until it fits, starting at 8192 and doubling in size. Benchmarking indicates that reallocation is not a significant overhead for outputs up to a few MB in size. Signed-off-by: Joey Hess <id@joeyh.name> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-03-08 10:13:00 +09:00
Junio C Hamano	b2fc9d2fb0	Merge branch 'jk/unused-parameter-cleanup' Code cleanup. * jk/unused-parameter-cleanup: convert: drop path parameter from actual conversion functions convert: drop len parameter from conversion checks config: drop unused parameter from maybe_remove_section() show_date_relative(): drop unused "tz" parameter column: drop unused "opts" parameter in item_length() create_bundle(): drop unused "header" parameter apply: drop unused "def" parameter from find_name_gnu() match-trees: drop unused path parameter from score functions	2019-02-06 22:05:23 -08:00
Junio C Hamano	7589e63648	Merge branch 'nd/the-index-final' The assumption to work on the single "in-core index" instance has been reduced from the library-ish part of the codebase. * nd/the-index-final: cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch read-cache.c: remove the_* from index_has_changes() merge-recursive.c: remove implicit dependency on the_repository merge-recursive.c: remove implicit dependency on the_index sha1-name.c: remove implicit dependency on the_index read-cache.c: replace update_index_if_able with repo_& read-cache.c: kill read_index() checkout: avoid the_index when possible repository.c: replace hold_locked_index() with repo_hold_locked_index() notes-utils.c: remove the_repository references grep: use grep_opt->repo instead of explict repo argument	2019-02-06 22:05:23 -08:00
Jeff King	55ad152cfb	convert: drop path parameter from actual conversion functions The caller is responsible for looking up the attributes, after which point we no longer care about the path at which the content is found. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-24 12:35:45 -08:00
Jeff King	129beeee9a	convert: drop len parameter from conversion checks We've already extracted the useful information into our text_stat struct, so the length is no longer needed. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-24 12:35:45 -08:00
Nguyễn Thái Ngọc Duy	f8adbec9fe	cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch By default, index compat macros are off from now on, because they could hide the_index dependency. Only those in builtin can use it. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-24 11:55:06 -08:00
Junio C Hamano	3434569fc2	Merge branch 'nd/style-opening-brace' Code clean-up. * nd/style-opening-brace: style: the opening '{' of a function is in a separate line	2019-01-18 13:49:52 -08:00
Nguyễn Thái Ngọc Duy	3b3357626e	style: the opening '{' of a function is in a separate line Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-12-10 15:41:09 +09:00
Nguyễn Thái Ngọc Duy	ec36c42a63	Indent code with TABs We indent with TABs and sometimes for fine alignment, TABs followed by spaces, but never all spaces (unless the indentation is less than 8 columns). Indenting with spaces slips through in some places. Fix them. Imported code and compat/ are left alone on purpose. The former should remain as close as upstream as possible. The latter pretty much has separate maintainers, it's up to them to decide. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-12-09 12:37:32 +09:00
Torsten Bögershausen	d64324cb60	Make git_check_attr() a void function git_check_attr() returns always 0. Remove all the error handling code of the callers, which is never executed. Change git_check_attr() to be a void function. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-12 15:15:34 -07:00
Junio C Hamano	dc0f6f9e1d	Merge branch 'nd/no-the-index' The more library-ish parts of the codebase learned to work on the in-core index-state instance that is passed in by their callers, instead of always working on the singleton "the_index" instance. * nd/no-the-index: (24 commits) blame.c: remove implicit dependency on the_index apply.c: remove implicit dependency on the_index apply.c: make init_apply_state() take a struct repository apply.c: pass struct apply_state to more functions resolve-undo.c: use the right index instead of the_index archive-.c: use the right repository archive.c: avoid access to the_index grep: use the right index instead of the_index attr: remove index from git_attr_set_direction() entry.c: use the right index instead of the_index submodule.c: use the right index instead of the_index pathspec.c: use the right index instead of the_index unpack-trees: avoid the_index in verify_absent() unpack-trees: convert clear_ce_flags to avoid the_index unpack-trees: don't shadow global var the_index unpack-trees: add a note about path invalidation unpack-trees: remove 'extern' on function declaration ls-files: correct index argument to get_convert_attr_ascii() preload-index.c: use the right index instead of the_index dir.c: remove an implicit dependency on the_index in pathspec code ...	2018-08-20 11:33:53 -07:00
Junio C Hamano	7d020f5a78	Merge branch 'jk/size-t' Code clean-up to use size_t/ssize_t when they are the right type. * jk/size-t: strbuf_humanise: use unsigned variables pass st.st_size as hint for strbuf_readlink() strbuf_readlink: use ssize_t strbuf: use size_t for length in intermediate variables reencode_string: use size_t for string lengths reencode_string: use st_add/st_mult helpers	2018-08-15 15:08:25 -07:00
Junio C Hamano	4bea8485e3	Merge branch 'nd/i18n' Many more strings are prepared for l10n. * nd/i18n: (23 commits) transport-helper.c: mark more strings for translation transport.c: mark more strings for translation sha1-file.c: mark more strings for translation sequencer.c: mark more strings for translation replace-object.c: mark more strings for translation refspec.c: mark more strings for translation refs.c: mark more strings for translation pkt-line.c: mark more strings for translation object.c: mark more strings for translation exec-cmd.c: mark more strings for translation environment.c: mark more strings for translation dir.c: mark more strings for translation convert.c: mark more strings for translation connect.c: mark more strings for translation config.c: mark more strings for translation commit-graph.c: mark more strings for translation builtin/replace.c: mark more strings for translation builtin/pack-objects.c: mark more strings for translation builtin/grep.c: mark strings for translation builtin/config.c: mark more strings for translation ...	2018-08-15 15:08:23 -07:00
Nguyễn Thái Ngọc Duy	7f944e264e	convert.c: remove an implicit dependency on the_index Make the convert API take an index_state instead of assuming the_index in convert.c. All external call sites are converted blindly to keep the patch simple and retain current behavior. Individual call sites may receive further updates to use the right index instead of the_index. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 14:14:42 -07:00
Nguyễn Thái Ngọc Duy	7a400a2c02	attr: remove an implicit dependency on the_index Make the attr API take an index_state instead of assuming the_index in attr code. All call sites are converted blindly to keep the patch simple and retain current behavior. Individual call sites may receive further updates to use the right index instead of the_index. There is one ugly temporary workaround added in attr.c that needs some more explanation. Commit `c24f3abace` (apply: file commited with CRLF should roundtrip diff and apply - 2017-08-19) forces one convert_to_git() call to NOT read the index at all. But what do you know, we read it anyway by falling back to the_index. When "istate" from convert_to_git is now propagated down to read_attr_from_array() we will hit segfault somewhere inside read_blob_data_from_index. The right way of dealing with this is to kill "use_index" variable and only follow "istate" but at this stage we are not ready for that: while most git_attr_set_direction() calls just passes the_index to be assigned to use_index, unpack-trees passes a different one which is used by entry.c code, which has no way to know what index to use if we delete use_index. So this has to be done later. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 14:14:42 -07:00
Junio C Hamano	00da9b2091	Merge branch 'bb/pedantic' The codebase has been updated to compile cleanly with -pedantic option. * bb/pedantic: utf8.c: avoid char overflow string-list.c: avoid conversion from void * to function pointer sequencer.c: avoid empty statements at top level convert.c: replace "\e" escapes with "\033". fixup! refs/refs-internal.h: avoid forward declaration of an enum refs/refs-internal.h: avoid forward declaration of an enum fixup! connect.h: avoid forward declaration of an enum connect.h: avoid forward declaration of an enum	2018-07-24 14:50:47 -07:00
Jeff King	c7d017d7e1	reencode_string: use size_t for string lengths The iconv interface takes a size_t, which is the appropriate type for an in-memory buffer. But our reencode_string_* functions use integers, meaning we may get confusing results when the sizes exceed INT_MAX. Let's use size_t consistently. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-24 10:19:29 -07:00
Nguyễn Thái Ngọc Duy	d26a328eaf	convert.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:10 -07:00
Nguyễn Thái Ngọc Duy	1a07e59c3e	Update messages in preparation for i18n Many messages will be marked for translation in the following commits. This commit updates some of them to be more consistent and reduce diff noise in those commits. Changes are - keep the first letter of die(), error() and warning() in lowercase - no full stop in die(), error() or warning() if it's single sentence messages - indentation - some messages are turned to BUG(), or prefixed with "BUG:" and will not be marked for i18n - some messages are improved to give more information - some messages are broken down by sentence to be i18n friendly (on the same token, combine multiple warning() into one big string) - the trailing \n is converted to printf_ln if possible, or deleted if not redundant - errno_errno() is used instead of explicit strerror() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:09 -07:00
Junio C Hamano	00624d608c	Merge branch 'sb/object-store-grafts' The conversion to pass "the_repository" and then "a_repository" throughout the object access API continues. * sb/object-store-grafts: commit: allow lookup_commit_graft to handle arbitrary repositories commit: allow prepare_commit_graft to handle arbitrary repositories shallow: migrate shallow information into the object parser path.c: migrate global git_path_* to take a repository argument cache: convert get_graft_file to handle arbitrary repositories commit: convert read_graft_file to handle arbitrary repositories commit: convert register_commit_graft to handle arbitrary repositories commit: convert commit_graft_pos() to handle arbitrary repositories shallow: add repository argument to is_repository_shallow shallow: add repository argument to check_shallow_file_for_update shallow: add repository argument to register_shallow shallow: add repository argument to set_alternate_shallow_file commit: add repository argument to lookup_commit_graft commit: add repository argument to prepare_commit_graft commit: add repository argument to read_graft_file commit: add repository argument to register_commit_graft commit: add repository argument to commit_graft_pos object: move grafts to object parser object-store: move object access functions to object-store.h	2018-07-18 12:20:28 -07:00
Beat Bolli	8302f50e8c	convert.c: replace "\e" escapes with "\033". The "\e" escape is not defined in ISO C. While on this line, add a missing space after the comma. Signed-off-by: Beat Bolli <dev+git@drbeat.li> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-09 14:36:24 -07:00
Stefan Beller	cbd53a2193	object-store: move object access functions to object-store.h This should make these functions easier to find and cache.h less overwhelming to read. In particular, this moves: - read_object_file - oid_object_info - write_object_file As a result, most of the codebase needs to #include object-store.h. In this patch the #include is only added to files that would fail to compile otherwise. It would be better to #include wherever identifiers from the header are used. That can happen later when we have better tooling for it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-16 11:42:03 +09:00
Junio C Hamano	1ac0ce4d32	Merge branch 'ls/checkout-encoding' The new "checkout-encoding" attribute can ask Git to convert the contents to the specified encoding when checking out to the working tree (and the other way around when checking in). * ls/checkout-encoding: convert: add round trip check based on 'core.checkRoundtripEncoding' convert: add tracing for 'working-tree-encoding' attribute convert: check for detectable errors in UTF encodings convert: add 'working-tree-encoding' attribute utf8: add function to detect a missing UTF-16/32 BOM utf8: add function to detect prohibited UTF-16/32 BOM utf8: teach same_encoding() alternative UTF encoding names strbuf: add a case insensitive starts_with() strbuf: add xstrdup_toupper() strbuf: remove unnecessary NUL assignment in xstrdup_tolower()	2018-05-08 15:59:22 +09:00
Lars Schneider	e92d622536	convert: add round trip check based on 'core.checkRoundtripEncoding' UTF supports lossless conversion round tripping and conversions between UTF and other encodings are mostly round trip safe as Unicode aims to be a superset of all other character encodings. However, certain encodings (e.g. SHIFT-JIS) are known to have round trip issues [1]. Add 'core.checkRoundtripEncoding', which contains a comma separated list of encodings, to define for what encodings Git should check the conversion round trip if they are used in the 'working-tree-encoding' attribute. Set SHIFT-JIS as default value for 'core.checkRoundtripEncoding'. [1] https://support.microsoft.com/en-us/help/170559/prb-conversion-problem-between-shift-jis-and-unicode Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 11:40:56 +09:00
Lars Schneider	541d059cd9	convert: add tracing for 'working-tree-encoding' attribute Add the GIT_TRACE_WORKING_TREE_ENCODING environment variable to enable tracing for content that is reencoded with the 'working-tree-encoding' attribute. This is useful to debug encoding issues. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 11:40:56 +09:00
Lars Schneider	7a17918c34	convert: check for detectable errors in UTF encodings Check that new content is valid with respect to the user defined 'working-tree-encoding' attribute. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 11:40:56 +09:00
Lars Schneider	107642fe26	convert: add 'working-tree-encoding' attribute Git recognizes files encoded with ASCII or one of its supersets (e.g. UTF-8 or ISO-8859-1) as text files. All other encodings are usually interpreted as binary and consequently built-in Git text processing tools (e.g. 'git diff') as well as most Git web front ends do not visualize the content. Add an attribute to tell Git what encoding the user has defined for a given file. If the content is added to the index, then Git reencodes the content to a canonical UTF-8 representation. On checkout Git will reverse this operation. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 11:40:56 +09:00
brian m. carlson	1a750441a7	convert: convert to struct object_id Convert convert.c to struct object_id. Add a use of the_hash_algo to replace hard-coded constants and change a strbuf_add to a strbuf_addstr to avoid another hard-coded constant. Note that a strict conversion using the hexsz constant would cause problems in the future if the internal and user-visible hash algorithms differed, as anticipated by the hash function transition plan. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:50 -07:00
Junio C Hamano	8be8342b4c	Merge branch 'po/object-id' Conversion from uchar[20] to struct object_id continues. * po/object-id: sha1_file: rename hash_sha1_file_literally sha1_file: convert write_loose_object to object_id sha1_file: convert force_object_loose to object_id sha1_file: convert write_sha1_file to object_id notes: convert write_notes_tree to object_id notes: convert combine_notes_* to object_id commit: convert commit_tree* to object_id match-trees: convert splice_tree to object_id cache: clear whole hash buffer with oidclr sha1_file: convert hash_sha1_file to object_id dir: convert struct sha1_stat to use object_id sha1_file: convert pretend_sha1_file to object_id	2018-02-15 14:55:43 -08:00
Patryk Obara	f070faccc1	sha1_file: convert hash_sha1_file to object_id Convert the declaration and definition of hash_sha1_file to use struct object_id and adjust all function calls. Rename this function to hash_object_file. Signed-off-by: Patryk Obara <patryk.obara@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 10:42:36 -08:00
Torsten Bögershausen	8462ff43e4	convert_to_git(): safe_crlf/checksafe becomes int conv_flags When calling convert_to_git(), the checksafe parameter defined what should happen if the EOL conversion (CRLF --> LF --> CRLF) does not roundtrip cleanly. In addition, it also defined if line endings should be renormalized (CRLF --> LF) or kept as they are. checksafe was an safe_crlf enum with these values: SAFE_CRLF_FALSE: do nothing in case of EOL roundtrip errors SAFE_CRLF_FAIL: die in case of EOL roundtrip errors SAFE_CRLF_WARN: print a warning in case of EOL roundtrip errors SAFE_CRLF_RENORMALIZE: change CRLF to LF SAFE_CRLF_KEEP_CRLF: keep all line endings as they are In some cases the integer value 0 was passed as checksafe parameter instead of the correct enum value SAFE_CRLF_FALSE. That was no problem because SAFE_CRLF_FALSE is defined as 0. FALSE/FAIL/WARN are different from RENORMALIZE and KEEP_CRLF. Therefore, an enum is not ideal. Let's use a integer bit pattern instead and rename the parameter to conv_flags to make it more generically usable. This allows us to extend the bit pattern in a subsequent commit. Reported-By: Randall S. Becker <rsbecker@nexbridge.com> Helped-By: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-16 12:35:56 -08:00
Junio C Hamano	720b1764de	Merge branch 'tb/check-crlf-for-safe-crlf' The "safe crlf" check incorrectly triggered for contents that does not use CRLF as line endings, which has been corrected. * tb/check-crlf-for-safe-crlf: t0027: Adapt the new MIX tests to Windows convert: tighten the safe autocrlf handling	2017-12-27 11:16:21 -08:00
Torsten Bögershausen	86ff70a0f0	convert: tighten the safe autocrlf handling When a text file had been commited with CRLF and the file is commited again, the CRLF are kept if .gitattributs has "text=auto". This is done by analyzing the content of the blob stored in the index: If a '\r' is found, Git assumes that the blob was commited with CRLF. The simple search for a '\r' does not always work as expected: A file is encoded in UTF-16 with CRLF and commited. Git treats it as binary. Now the content is converted into UTF-8. At the next commit Git treats the file as text, the CRLF should be converted into LF, but isn't. Replace has_cr_in_index() with has_crlf_in_index(). When no '\r' is found, 0 is returned directly, this is the most common case. If a '\r' is found, the content is analyzed more deeply. Reported-By: Ashish Negi <ashishnegi33@gmail.com> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-27 10:17:24 +09:00

1 2 3 4

200 Commits