git-commit-vandalism

Author	SHA1	Message	Date
Jeff King	c0566d78aa	hash.h: move object_id definition from cache.h Our hashmap.h helpfully defines a sha1hash() function. But it cannot define a similar oidhash() without including all of cache.h, which itself wants to include hashmap.h! Let's break this circular dependency by moving the definition to hash.h, along with the remaining RAWSZ macros, etc. That will put them with the existing git_hash_algo definition. One alternative would be to move oidhash() into cache.h, but it's already quite bloated. We're better off moving things out than in. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-06-20 10:40:42 -07:00
Johannes Schindelin	d4c0a3ac78	fill_stat_cache_info(): prepare for an fsmonitor fix We will need to pass down the `struct index_state` to `mark_fsmonitor_valid()` for an upcoming bug fix, and this here function calls that there function, so we need to extend the signature of `fill_stat_cache_info()` first. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-05-28 12:43:42 -07:00
Junio C Hamano	cfd635c742	Merge branch 'js/fsmonitor-refresh-after-discarding-index' The fsmonitor interface got out of sync after the in-core index file gets discarded, which has been corrected. * js/fsmonitor-refresh-after-discarding-index: fsmonitor: force a refresh after the index was discarded fsmonitor: demonstrate that it is not refreshed after discard_index()	2019-05-19 16:45:33 +09:00
Junio C Hamano	4aeeef3773	Merge branch 'dl/no-extern-in-func-decl' Mechanically and systematically drop "extern" from function declarlation. * dl/no-extern-in-func-decl: .[ch]: manually align parameter lists .[ch]: remove extern from function declarations using sed *.[ch]: remove extern from function declarations using spatch	2019-05-13 23:50:32 +09:00
Junio C Hamano	0b179f3175	Merge branch 'nd/sha1-name-c-wo-the-repository' Further code clean-up to allow the lowest level of name-to-object mapping layer to work with a passed-in repository other than the default one. * nd/sha1-name-c-wo-the-repository: (34 commits) sha1-name.c: remove the_repo from get_oid_mb() sha1-name.c: remove the_repo from other get_oid_* sha1-name.c: remove the_repo from maybe_die_on_misspelt_object_name submodule-config.c: use repo_get_oid for reading .gitmodules sha1-name.c: add repo_get_oid() sha1-name.c: remove the_repo from get_oid_with_context_1() sha1-name.c: remove the_repo from resolve_relative_path() sha1-name.c: remove the_repo from diagnose_invalid_index_path() sha1-name.c: remove the_repo from handle_one_ref() sha1-name.c: remove the_repo from get_oid_1() sha1-name.c: remove the_repo from get_oid_basic() sha1-name.c: remove the_repo from get_describe_name() sha1-name.c: remove the_repo from get_oid_oneline() sha1-name.c: add repo_interpret_branch_name() sha1-name.c: remove the_repo from interpret_branch_mark() sha1-name.c: remove the_repo from interpret_nth_prior_checkout() sha1-name.c: remove the_repo from get_short_oid() sha1-name.c: add repo_for_each_abbrev() sha1-name.c: store and use repo in struct disambiguate_state sha1-name.c: add repo_find_unique_abbrev_r() ...	2019-05-09 00:37:25 +09:00
Junio C Hamano	96379f043f	Merge branch 'en/merge-directory-renames' "git merge-recursive" backend recently learned a new heuristics to infer file movement based on how other files in the same directory moved. As this is inherently less robust heuristics than the one based on the content similarity of the file itself (rather than based on what its neighbours are doing), it sometimes gives an outcome unexpected by the end users. This has been toned down to leave the renamed paths in higher/conflicted stages in the index so that the user can examine and confirm the result. * en/merge-directory-renames: merge-recursive: switch directory rename detection default merge-recursive: give callers of handle_content_merge() access to contents merge-recursive: track information associated with directory renames t6043: fix copied test description to match its purpose merge-recursive: switch from (oid,mode) pairs to a diff_filespec merge-recursive: cleanup handle_rename_* function signatures merge-recursive: track branch where rename occurred in rename struct merge-recursive: remove ren[12]_other fields from rename_conflict_info merge-recursive: shrink rename_conflict_info merge-recursive: move some struct declarations together merge-recursive: use 'ci' for rename_conflict_info variable name merge-recursive: rename locals 'o' and 'a' to 'obuf' and 'abuf' merge-recursive: rename diff_filespec 'one' to 'o' merge-recursive: rename merge_options argument from 'o' to 'opt' Use 'unsigned short' for mode, like diff_filespec does	2019-05-09 00:37:22 +09:00
Johannes Schindelin	398a3b0899	fsmonitor: force a refresh after the index was discarded With this change, the `index_state` struct becomes the new home for the flag that says whether the fsmonitor hook has been run, i.e. it is now per-index. It also gets re-set when the index is discarded, fixing the bug demonstrated by the "test_expect_failure" test added in the preceding commit. In that case fsmonitor-enabled Git would miss updates under certain circumstances, see that preceding commit for details. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-05-08 12:03:48 +09:00
Denton Liu	ad6dad0996	*.[ch]: manually align parameter lists In previous patches, extern was mechanically removed from function declarations without care to formatting, causing parameter lists to be misaligned. Manually format changed sections such that the parameter lists should be realigned. Viewing this patch with 'git diff -w' should produce no output. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-05-05 15:20:10 +09:00
Denton Liu	b199d7147a	.[ch]: remove extern from function declarations using sed There has been a push to remove extern from function declarations. Finish the job by removing all instances of "extern" for function declarations in headers using sed. This was done by running the following on my system with sed 4.2.2: $ git ls-files \.{c,h} \| grep -v ^compat/ \| xargs sed -i'' -e 's/^$\s$extern $[^(]([^]$/\1\2/' Files under `compat/` are intentionally excluded as some are directly copied from external sources and we should avoid churning them as much as possible. Then, leftover instances of extern were found by running $ git grep -w -C3 extern \.{c,h} and manually checking the output. No other instances were found. Note that the regex used specifically excludes function variables which _should_ be left as extern. Not the most elegant way to do it but it gets the job done. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-05-05 15:20:08 +09:00
Denton Liu	554544276a	.[ch]: remove extern from function declarations using spatch There has been a push to remove extern from function declarations. Remove some instances of "extern" for function declarations which are caught by Coccinelle. Note that Coccinelle has some difficulty with processing functions with `__attribute__` or varargs so some `extern` declarations are left behind to be dealt with in a future patch. This was the Coccinelle patch used: @@ type T; identifier f; @@ - extern T f(...); and it was run with: $ git ls-files \.{c,h} \| grep -v ^compat/ \| xargs spatch --sp-file contrib/coccinelle/noextern.cocci --in-place Files under `compat/` are intentionally excluded as some are directly copied from external sources and we should avoid churning them as much as possible. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-05-05 15:20:06 +09:00
Junio C Hamano	5795a75f9b	Merge branch 'bp/post-index-change-hook' A new hook "post-index-change" is called when the on-disk index file changes, which can help e.g. a virtualized working tree implementation. * bp/post-index-change-hook: read-cache: add post-index-change hook	2019-04-25 16:41:11 +09:00
Junio C Hamano	e36adf7122	Merge branch 'ps/stash-in-c' "git stash" rewritten in C. * ps/stash-in-c: (28 commits) tests: add a special setup where stash.useBuiltin is off stash: optionally use the scripted version again stash: add back the original, scripted `git stash` stash: convert `stash--helper.c` into `stash.c` stash: replace all `write-tree` child processes with API calls stash: optimize `get_untracked_files()` and `check_changes()` stash: convert save to builtin stash: make push -q quiet stash: convert push to builtin stash: convert create to builtin stash: convert store to builtin stash: convert show to builtin stash: convert list to builtin stash: convert pop to builtin stash: convert branch to builtin stash: convert drop and clear to builtin stash: convert apply to builtin stash: mention options in `show` synopsis stash: add tests for `git stash show` config stash: rename test cases to be more descriptive ...	2019-04-22 11:14:43 +09:00
Nguyễn Thái Ngọc Duy	0daf7ff6c0	sha1-name.c: remove the_repo from get_oid_mb() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-16 18:56:53 +09:00
Nguyễn Thái Ngọc Duy	65e5046400	sha1-name.c: remove the_repo from other get_oid_* Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-16 18:56:53 +09:00
Nguyễn Thái Ngọc Duy	e270f42c4d	sha1-name.c: remove the_repo from maybe_die_on_misspelt_object_name Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-16 18:56:53 +09:00
Nguyễn Thái Ngọc Duy	ec580eaaa3	sha1-name.c: add repo_get_oid() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-16 18:56:53 +09:00
Nguyễn Thái Ngọc Duy	2b1790f5ab	sha1-name.c: remove the_repo from get_oid_1() There is a cyclic dependency between one of these functions so they cannot be converted one by one, so all related functions are converted at once. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-16 18:56:53 +09:00
Nguyễn Thái Ngọc Duy	4e99f2dbea	sha1-name.c: add repo_for_each_abbrev() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-16 18:56:52 +09:00
Nguyễn Thái Ngọc Duy	8bb95572b0	sha1-name.c: add repo_find_unique_abbrev_r() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-16 18:56:52 +09:00
Nguyễn Thái Ngọc Duy	8f56e9d4ba	refs.c: remove the_repo from substitute_branch_name() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-08 17:26:32 +09:00
Elijah Newren	5ec1e72823	Use 'unsigned short' for mode, like diff_filespec does struct diff_filespec defines mode to be an 'unsigned short'. Several other places in the API which we'd like to interact with using a diff_filespec used a plain unsigned (or unsigned int). This caused problems when taking addresses, so switch to unsigned short. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-08 16:02:07 +09:00
Junio C Hamano	6b5688b760	Merge branch 'ma/clear-repository-format' The setup code has been cleaned up to avoid leaks around the repository_format structure. * ma/clear-repository-format: setup: fix memory leaks with `struct repository_format` setup: free old value before setting `work_tree`	2019-03-20 15:16:07 +09:00
Junio C Hamano	32038fef00	Merge branch 'jh/trace2' A more structured way to obtain execution trace has been added. * jh/trace2: trace2: add for_each macros to clang-format trace2: t/helper/test-trace2, t0210.sh, t0211.sh, t0212.sh trace2:data: add subverb for rebase trace2:data: add subverb to reset command trace2:data: add subverb to checkout command trace2:data: pack-objects: add trace2 regions trace2:data: add trace2 instrumentation to index read/write trace2:data: add trace2 hook classification trace2:data: add trace2 transport child classification trace2:data: add trace2 sub-process classification trace2:data: add editor/pager child classification trace2:data: add trace2 regions to wt-status trace2: collect Windows-specific process information trace2: create new combined trace facility trace2: Documentation/technical/api-trace2.txt	2019-03-07 09:59:56 +09:00
Junio C Hamano	4e021dc28e	Merge branch 'wh/author-committer-ident-config' Four new configuration variables {author,committer}.{name,email} have been introduced to override user.{name,email} in more specific cases. * wh/author-committer-ident-config: config: allow giving separate author and committer idents	2019-03-07 09:59:53 +09:00
Junio C Hamano	7d0c1f4556	Merge branch 'tg/checkout-no-overlay' "git checkout --no-overlay" can be used to trigger a new mode of checking out paths out of the tree-ish, that allows paths that match the pathspec that are in the current index and working tree and are not in the tree-ish. * tg/checkout-no-overlay: revert "checkout: introduce checkout.overlayMode config" checkout: introduce checkout.overlayMode config checkout: introduce --{,no-}overlay option checkout: factor out mark_cache_entry_for_checkout function checkout: clarify comment read-cache: add invalidate parameter to remove_marked_cache_entries entry: support CE_WT_REMOVE flag in checkout_entry entry: factor out unlink_entry function move worktree tests to t24*	2019-03-07 09:59:51 +09:00
Thomas Gummerer	0640897dc5	ident: don't require calling prepare_fallback_ident first In `fd5a58477c` ("ident: add the ability to provide a "fallback identity"", 2019-02-25) I made it a requirement to call prepare_fallback_ident as the first function in the ident API. However in stash we didn't actually end up following that. This leads to a BUG if user.email and user.name are set. It was not caught in the test suite because we only rely on environment variables for setting the user name and email instead of the config. Instead of making it a bug to call other functions in the ident API first, just return silently if the identity of a user was already set up. Reported-by: Denton Liu <liu.denton@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-03-07 09:41:40 +09:00
Martin Ågren	e8805af1c3	setup: fix memory leaks with `struct repository_format` After we set up a `struct repository_format`, it owns various pieces of allocated memory. We then either use those members, because we decide we want to use the "candidate" repository format, or we discard the candidate / scratch space. In the first case, we transfer ownership of the memory to a few global variables. In the latter case, we just silently drop the struct and end up leaking memory. Introduce an initialization macro `REPOSITORY_FORMAT_INIT` and a function `clear_repository_format()`, to be used on each side of `read_repository_format()`. To have a clear and simple memory ownership, let all users of `struct repository_format` duplicate the strings that they take from it, rather than stealing the pointers. Call `clear_...()` at the start of `read_...()` instead of just zeroing the struct, since we sometimes enter the function multiple times. Thus, it is important to initialize the struct before calling `read_...()`, so document that. It's also important because we might not even call `read_...()` before we call `clear_...()`, see, e.g., builtin/init-db.c. Teach `read_...()` to clear the struct on error, so that it is reset to a safe state, and document this. (In `setup_git_directory_gently()`, we look at `repo_fmt.hash_algo` even if `repo_fmt.version` is -1, which we weren't actually supposed to do per the API. After this commit, that's ok.) We inherit the existing code's combining "error" and "no version found". Both are signalled through `version == -1` and now both cause us to clear any partial configuration we have picked up. For "extensions.*", that's fine, since they require a positive version number. For "core.bare" and "core.worktree", we're already verifying that we have a non-negative version number before using them. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-03-01 08:52:00 +09:00
Johannes Schindelin	fd5a58477c	ident: add the ability to provide a "fallback identity" In `3bc2111fc2` (stash: tolerate missing user identity, 2018-11-18), `git stash` learned to provide a fallback identity for the case that no proper name/email was given (and `git stash` does not really care about a correct identity anyway, but it does want to create a commit object). In preparation for the same functionality in the upcoming built-in version of `git stash`, let's offer the same functionality as an API function. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> [tg: add docs; make it a bug to call the function before other functions in the ident API] Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-03-01 08:03:46 +09:00
Paul-Sebastian Ungureanu	f5116f43f6	sha1-name.c: add `get_oidf()` which acts like `get_oid()` Compared to `get_oid()`, `get_oidf()` has as parameters a pointer to `object_id`, a printf format string and additional arguments. This will help simplify the code in subsequent commits. Original-idea-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Paul-Sebastian Ungureanu <ungureanupaulsebastian@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-03-01 08:03:46 +09:00
Jeff Hostetler	ee4512ed48	trace2: create new combined trace facility Create a new unified tracing facility for git. The eventual intent is to replace the current trace_printf* and trace_performance* routines with a unified set of git_trace2* routines. In addition to the usual printf-style API, trace2 provides higer-level event verbs with fixed-fields allowing structured data to be written. This makes post-processing and analysis easier for external tools. Trace2 defines 3 output targets. These are set using the environment variables "GIT_TR2", "GIT_TR2_PERF", and "GIT_TR2_EVENT". These may be set to "1" or to an absolute pathname (just like the current GIT_TRACE). * GIT_TR2 is intended to be a replacement for GIT_TRACE and logs command summary data. * GIT_TR2_PERF is intended as a replacement for GIT_TRACE_PERFORMANCE. It extends the output with columns for the command process, thread, repo, absolute and relative elapsed times. It reports events for child process start/stop, thread start/stop, and per-thread function nesting. * GIT_TR2_EVENT is a new structured format. It writes event data as a series of JSON records. Calls to trace2 functions log to any of the 3 output targets enabled without the need to call different trace_printf* or trace_performance* routines. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-02-22 15:27:59 -08:00
Ben Peart	1956ecd0ab	read-cache: add post-index-change hook Add a post-index-change hook that is invoked after the index is written in do_write_locked_index(). This hook is meant primarily for notification, and cannot affect the outcome of git commands that trigger the index write. The hook is passed a flag to indicate whether the working directory was updated or not and a flag indicating if a skip-worktree bit could have changed. These flags enable the hook to optimize its response to the index change notification. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-02-15 11:00:33 -08:00
Junio C Hamano	cba595ab1a	Merge branch 'jk/loose-object-cache-oid' Code clean-up. * jk/loose-object-cache-oid: prefer "hash mismatch" to "sha1 mismatch" sha1-file: avoid "sha1 file" for generic use in messages sha1-file: prefer "loose object file" to "sha1 file" in messages sha1-file: drop has_sha1_file() convert has_sha1_file() callers to has_object_file() sha1-file: convert pass-through functions to object_id sha1-file: modernize loose header/stream functions sha1-file: modernize loose object file functions http: use struct object_id instead of bare sha1 update comment references to sha1_object_info() sha1-file: fix outdated sha1 comment references	2019-02-06 22:05:27 -08:00
Junio C Hamano	ecbe1beb8e	Merge branch 'lt/date-human' A new date format "--date=human" that morphs its output depending on how far the time is from the current time has been introduced. "--date=auto" can be used to use this new format when the output is going to the pager or to the terminal and otherwise the default format. * lt/date-human: Add `human` date format tests. Add `human` format to test-tool Add 'human' date format documentation Replace the proposed 'auto' mode with 'auto:' Add 'human' date format	2019-02-06 22:05:24 -08:00
Junio C Hamano	b2fc9d2fb0	Merge branch 'jk/unused-parameter-cleanup' Code cleanup. * jk/unused-parameter-cleanup: convert: drop path parameter from actual conversion functions convert: drop len parameter from conversion checks config: drop unused parameter from maybe_remove_section() show_date_relative(): drop unused "tz" parameter column: drop unused "opts" parameter in item_length() create_bundle(): drop unused "header" parameter apply: drop unused "def" parameter from find_name_gnu() match-trees: drop unused path parameter from score functions	2019-02-06 22:05:23 -08:00
Junio C Hamano	7589e63648	Merge branch 'nd/the-index-final' The assumption to work on the single "in-core index" instance has been reduced from the library-ish part of the codebase. * nd/the-index-final: cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch read-cache.c: remove the_* from index_has_changes() merge-recursive.c: remove implicit dependency on the_repository merge-recursive.c: remove implicit dependency on the_index sha1-name.c: remove implicit dependency on the_index read-cache.c: replace update_index_if_able with repo_& read-cache.c: kill read_index() checkout: avoid the_index when possible repository.c: replace hold_locked_index() with repo_hold_locked_index() notes-utils.c: remove the_repository references grep: use grep_opt->repo instead of explict repo argument	2019-02-06 22:05:23 -08:00
Junio C Hamano	09a9c1f427	Merge branch 'tt/bisect-in-c' More code in "git bisect" has been rewritten in C. * tt/bisect-in-c: bisect--helper: `bisect_start` shell function partially in C bisect--helper: `get_terms` & `bisect_terms` shell function in C bisect--helper: `bisect_next_check` shell function in C bisect--helper: `check_and_set_terms` shell function in C wrapper: move is_empty_file() and rename it as is_empty_or_missing_file() bisect--helper: `bisect_write` shell function in C bisect--helper: `bisect_reset` shell function in C	2019-02-06 22:05:22 -08:00
Junio C Hamano	cfd9167c15	Merge branch 'dt/cat-file-batch-ambiguous' "git cat-file --batch" reported a dangling symbolic link by mistake, when it wanted to report that a given name is ambiguous. * dt/cat-file-batch-ambiguous: t1512: test ambiguous cat-file --batch and --batch-output Do not print 'dangling' for cat-file in case of ambiguity	2019-02-06 22:05:21 -08:00
Junio C Hamano	1c418243a5	Merge branch 'jk/add-ignore-errors-bit-assignment-fix' "git add --ignore-errors" did not work as advertised and instead worked as an unintended synonym for "git add --renormalize", which has been fixed. * jk/add-ignore-errors-bit-assignment-fix: add: use separate ADD_CACHE_RENORMALIZE flag	2019-02-05 14:26:13 -08:00
William Hubbs	39ab4d0951	config: allow giving separate author and committer idents The author.email, author.name, committer.email and committer.name settings are analogous to the GIT_AUTHOR_* and GIT_COMMITTER_* environment variables, but for the git config system. This allows them to be set separately for each repository. Git supports setting different authorship and committer information with environment variables. However, environment variables are set in the shell, so if different authorship and committer information is needed for different repositories an external tool is required. This adds support to git config for author.email, author.name, committer.email and committer.name settings so this information can be set per repository. Also, it generalizes the fmt_ident function so it can handle author vs committer identification. Signed-off-by: William Hubbs <williamh@gentoo.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-02-04 12:18:13 -08:00
Junio C Hamano	371820d5f1	Merge branch 'bc/tree-walk-oid' The code to walk tree objects has been taught that we may be working with object names that are not computed with SHA-1. * bc/tree-walk-oid: cache: make oidcpy always copy GIT_MAX_RAWSZ bytes tree-walk: store object_id in a separate member match-trees: use hashcpy to splice trees match-trees: compute buffer offset correctly when splicing tree-walk: copy object ID before use	2019-01-29 12:47:56 -08:00
Junio C Hamano	33e4ae9c50	Merge branch 'bc/sha-256' Add sha-256 hash and plug it through the code to allow building Git with the "NewHash". * bc/sha-256: hash: add an SHA-256 implementation using OpenSSL sha256: add an SHA-256 implementation using libgcrypt Add a base implementation of SHA-256 support commit-graph: convert to using the_hash_algo t/helper: add a test helper to compute hash speed sha1-file: add a constant for hash block size t: make the sha1 test-tool helper generic t: add basic tests for our SHA-1 implementation cache: make hashcmp and hasheq work with larger hashes hex: introduce functions to print arbitrary hashes sha1-file: provide functions to look up hash algorithms sha1-file: rename algorithm to "sha1"	2019-01-29 12:47:55 -08:00
Stephen P. Smith	b841d4ff43	Add `human` format to test-tool Add the human format support to the test tool so that GIT_TEST_DATE_NOW can be used to specify the current time. The get_time() helper function was created and and checks the GIT_TEST_DATE_NOW environment variable. If GIT_TEST_DATE_NOW is set, then that date is used instead of the date returned by by gettimeofday(). All calls to gettimeofday() were replaced by calls to get_time(). Renamed occurances of TEST_DATE_NOW to GIT_TEST_DATE_NOW since the variable is now used in the get binary and not just in the test-tool. Signed-off-by: Stephen P. Smith <ischis2@cox.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-29 09:40:08 -08:00
Jeff King	3d42034a18	show_date_relative(): drop unused "tz" parameter The timestamp we receive is in epoch time, so there's no need for a timezone parameter to interpret it. The matching show_date() uses "tz" to show dates in author local time, but relative dates show only the absolute time difference. The author's location is irrelevant, barring relativistic effects from using Git close to the speed of light. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-24 12:35:44 -08:00
Nguyễn Thái Ngọc Duy	f8adbec9fe	cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch By default, index compat macros are off from now on, because they could hide the_index dependency. Only those in builtin can use it. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-24 11:55:06 -08:00
David Turner	d1dd94b308	Do not print 'dangling' for cat-file in case of ambiguity The return values -1 and -2 from get_oid could mean two different things, depending on whether they were from an enum returned by get_tree_entry_follow_symlinks, or from a different code path. This caused 'dangling' to be printed from a git cat-file in the case of an ambiguous (-2) result. Unify the results of get_oid* and get_tree_entry_follow_symlinks to be one common type, with unambiguous values. Signed-off-by: David Turner <novalis@novalis.org> Reported-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-18 15:22:02 -08:00
Linus Torvalds	acdd37769d	Add 'human' date format This adds --date=human, which skips the timezone if it matches the current time-zone, and doesn't print the whole date if that matches (ie skip printing year for dates that are "this year", but also skip the whole date itself if it's in the last few days and we can just say what weekday it was). For really recent dates (same day), use the relative date stamp, while for old dates (year doesn't match), don't bother with time and timezone. Also add 'auto' date mode, which defaults to human if we're using the pager. So you can do git config --add log.date auto and your "git log" commands will show the human-legible format unless you're scripting things. Note that this time format still shows the timezone for recent enough events (but not so recent that they show up as relative dates). You can combine it with the "-local" suffix to never show timezones for an even more simplified view. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Stephen P. Smith <ischis2@cox.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-18 10:31:23 -08:00
Jeff King	9e5da3d055	add: use separate ADD_CACHE_RENORMALIZE flag Commit `9472935d81` (add: introduce "--renormalize", 2017-11-16) taught git-add to pass HASH_RENORMALIZE to add_to_index(), which then passes the flag along to index_path(). However, the flags taken by add_to_index() and the ones taken by index_path() are distinct namespaces. We cannot take HASH_* flags in add_to_index(), because they overlap with the ADD_CACHE_* flags we already take (in this case, HASH_RENORMALIZE conflicts with ADD_CACHE_IGNORE_ERRORS). We can solve this by adding a new ADD_CACHE_RENORMALIZE flag, and using it to set HASH_RENORMALIZE within add_to_index(). In order to make it clear that these two flags come from distinct sets, let's also change the name "newflags" in the function to "hash_flags". Reported-by: Dmitriy Smirnov <dmitriy.smirnov@jetbrains.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-17 13:40:21 -08:00
brian m. carlson	974e4a85e3	cache: make oidcpy always copy GIT_MAX_RAWSZ bytes There are some situations in which we want to store an object ID into struct object_id without the_hash_algo necessarily being set correctly. One such case is when cloning a repository, where we must read refs from the remote side without having a repository from which to read the preferred algorithm. In this cases, we may have the_hash_algo set to SHA-1, which is the default, but read refs into struct object_id that are SHA-256. When copying these values, we will want to copy them completely, not just the first 20 bytes. Consequently, make sure that oidcpy copies the maximum number of bytes at all times, regardless of the setting of the_hash_algo. Since oidcpy and hashcpy are no longer functionally identical, remove the Cocinelle object_id transformations that convert from one into the other. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-15 09:57:41 -08:00
Nguyễn Thái Ngọc Duy	150fe065f7	read-cache.c: remove the_* from index_has_changes() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-14 12:13:05 -08:00
Nguyễn Thái Ngọc Duy	3a7a698e93	sha1-name.c: remove implicit dependency on the_index This kills the_index dependency in get_oid_with_context() but for get_oid() and friends, they still assume the_repository (which also means the_index). Unfortunately the widespread use of get_oid() will make it hard to make the conversion now. We probably will add repo_get_oid() at some point and limit the use of get_oid() in builtin/ instead of forcing all get_oid() call sites to carry struct repository. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-14 12:13:04 -08:00
Nguyễn Thái Ngọc Duy	1b0d968b34	read-cache.c: replace update_index_if_able with repo_& Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-14 12:13:04 -08:00
Nguyễn Thái Ngọc Duy	e1ff0a32e4	read-cache.c: kill read_index() read_index() shares the same problem as hold_locked_index(): it assumes $GIT_DIR/index. Move all call sites to repo_read_index() instead. read_index_preload() and read_index_unmerged() are also killed as a consequence. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-14 12:13:04 -08:00
Nguyễn Thái Ngọc Duy	3a95f31d1c	repository.c: replace hold_locked_index() with repo_hold_locked_index() hold_locked_index() assumes the index path at $GIT_DIR/index. This is not good for places that take an arbitrary index_state instead of the_index, which is basically everywhere except builtin/. Replace it with repo_hold_locked_index(). hold_locked_index() remains as a wrapper around repo_hold_locked_index() to reduce changes in builtin/ Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-14 12:13:04 -08:00
Jeff King	00a7760e81	sha1-file: modernize loose header/stream functions As with the open/map/close functions for loose objects that were recently converted, the functions for parsing the loose object stream use the name "sha1" and a bare "unsigned char *". Let's fix that so that unpack_sha1_header() becomes unpack_loose_header(), etc. These conversions are less clear-cut than the file access functions. You could argue that the they are parsing Git's canonical object format (i.e., "type size\0contents", over which we compute the hash), which is not strictly tied to loose storage. But in practice these functions are used only for loose objects, and using the term "loose_header" (instead of "object_header") distinguishes it from the object header found in packfiles (which contains the same information in a different format). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:40:19 -08:00
Jeff King	c93206b412	update comment references to sha1_object_info() Commit `abef9020e3` (sha1_file: convert sha1_object_info* to object_id, 2018-03-12) renamed the function to oid_object_info(), but missed some comments which mention it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:40:19 -08:00
Thomas Gummerer	6fdc205722	read-cache: add invalidate parameter to remove_marked_cache_entries When marking cache entries for removal, and later removing them all at once using 'remove_marked_cache_entries()', cache entries currently have to be invalidated manually in the cache tree and in the untracked cache. Add an invalidate flag to the function. With the flag set, the function will take care of invalidating the path in the cache tree and in the untracked cache. Note that the current callsites already do the invalidation properly in other places, so we're just passing 0 from there to keep the status quo. This will be useful in a subsequent commit. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-02 15:28:05 -08:00
Thomas Gummerer	b702dd12d5	entry: factor out unlink_entry function Factor out the 'unlink_entry()' function from unpack-trees.c to entry.c. It will be used in other places as well in subsequent steps. As it's no longer a static function, also move the documentation to the header file to make it more discoverable. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-02 15:28:05 -08:00
Pranit Bauva	e3b1e3bdc0	wrapper: move is_empty_file() and rename it as is_empty_or_missing_file() is_empty_file() can help to refactor a lot of code. This will be very helpful in porting "git bisect" to C. Suggested-by: Torsten Bögershausen <tboegi@web.de> Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-02 10:23:02 -08:00
brian m. carlson	13eeedb5d1	Add a base implementation of SHA-256 support SHA-1 is weak and we need to transition to a new hash function. For some time, we have referred to this new function as NewHash. Recently, we decided to pick SHA-256 as NewHash. The reasons behind the choice of SHA-256 are outlined in the thread starting at [1] and in the commit history for the hash function transition document. Add a basic implementation of SHA-256 based off libtomcrypt, which is in the public domain. Optimize it and restructure it to meet our coding standards. Pull in the update and final functions from the SHA-1 block implementation, as we know these function correctly with all compilers. This implementation is slower than SHA-1, but more performant implementations will be introduced in future commits. Wire up SHA-256 in the list of hash algorithms, and add a test that the algorithm works correctly. Note that with this patch, it is still not possible to switch to using SHA-256 in Git. Additional patches are needed to prepare the code to handle a larger hash algorithm and further test fixes are needed. [1] https://public-inbox.org/git/20180609224913.GC38834@genre.crustytoothpaste.net/ Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-14 16:54:53 +09:00
brian m. carlson	a2ce0a7526	sha1-file: add a constant for hash block size There is one place we need the hash algorithm block size: the HMAC code for push certs. Expose this constant in struct git_hash_algo and expose values for SHA-1 and for the largest value of any hash. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-14 16:54:52 +09:00
brian m. carlson	0dab7129ab	cache: make hashcmp and hasheq work with larger hashes In `183a638b7d` ("hashcmp: assert constant hash size", 2018-08-23), we modified hashcmp to assert that the hash size was always 20 to help it optimize and inline calls to memcmp. In a future series, we replaced many calls to hashcmp and oidcmp with calls to hasheq and oideq to improve inlining further. However, we want to support hash algorithms other than SHA-1, namely SHA-256. When doing so, we must handle the case where these values are 32 bytes long as well as 20. Adjust hashcmp to handle two cases: 20-byte matches, and maximum-size matches. Therefore, when we include SHA-256, we'll automatically handle it properly, while at the same time teaching the compiler that there are only two possible options to consider. This will allow the compiler to write the most efficient possible code. Copy similar code into hasheq and perform an identical transformation. At least with GCC 8.2.0, making hasheq defer to hashcmp when there are two branches prevents the compiler from inlining the comparison, while the code in this patch is inlined properly. Add a comment to avoid an accidental performance regression from well-intentioned refactoring. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-14 16:54:52 +09:00
brian m. carlson	47edb64997	hex: introduce functions to print arbitrary hashes Currently, we have functions that turn an arbitrary SHA-1 value or an object ID into hex format, either using a static buffer or with a user-provided buffer. Add variants of these functions that can handle an arbitrary hash algorithm, specified by constant. Update the documentation as well. While we're at it, remove the "extern" declaration from this family of functions, since it's not needed and our style now recommends against it. We use the variant taking the algorithm structure pointer as the internal variant, since taking an algorithm pointer is the easiest way to handle all of the variants in use. Note that we maintain these functions because there are hashes which must change based on the hash algorithm in use but are not object IDs (such as pack checksums). Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-14 16:54:52 +09:00
Nguyễn Thái Ngọc Duy	0f086e6dca	checkout: print something when checking out paths One of the problems with "git checkout" is that it does so many different things and could confuse people specially when we fail to handle ambiguation correctly. One way to help with that is tell the user what sort of operation is actually carried out. When switching branches, we always print something unless --quiet, either - "HEAD is now at ..." - "Reset branch ..." - "Already on ..." - "Switched to and reset ..." - "Switched to a new branch ..." - "Switched to branch ..." Checking out paths however is silent. Print something so that if we got the user intention wrong, they won't waste too much time to find that out. For the remaining cases of checkout we now print either - "Checked out ... paths out of the index" - "Checked out ... paths out of <abbrev hash>" Since the purpose of printing this is to help disambiguate. Only do it when "--" is missing. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-14 15:10:35 +09:00
Junio C Hamano	11aa560de9	Merge branch 'bp/refresh-index-using-preload' The helper function to refresh the cached stat information in the in-core index has learned to perform the lstat() part of the operation in parallel on multi-core platforms. * bp/refresh-index-using-preload: refresh_index: remove unnecessary calls to preload_index() speed up refresh_index() by utilizing preload_index()	2018-11-13 22:37:26 +09:00
Junio C Hamano	abb4824d13	Merge branch 'ao/submodule-wo-gitmodules-checked-out' The submodule support has been updated to read from the blob at HEAD:.gitmodules when the .gitmodules file is missing from the working tree. * ao/submodule-wo-gitmodules-checked-out: t/helper: add test-submodule-nested-repo-config submodule: support reading .gitmodules when it's not in the working tree submodule: add a helper to check if it is safe to write to .gitmodules t7506: clean up .gitmodules properly before setting up new scenario submodule: use the 'submodule--helper config' command submodule--helper: add a new 'config' subcommand t7411: be nicer to future tests and really clean things up t7411: merge tests 5 and 6 submodule: factor out a config_set_in_gitmodules_file_gently function submodule: add a print_config_from_gitmodules() helper	2018-11-13 22:37:22 +09:00
Junio C Hamano	6c268fdda9	Merge branch 'js/mingw-perl5lib' Windows fix. * js/mingw-perl5lib: mingw: unset PERL5LIB by default config: move Windows-specific config settings into compat/mingw.c config: allow for platform-specific core.* config settings config: rename `dummy` parameter to `cb` in git_default_config()	2018-11-13 22:37:20 +09:00
Junio C Hamano	8c758f9a67	Merge branch 'nd/per-worktree-config' A fourth class of configuration files (in addition to the traditional "system wide", "per user in the $HOME directory" and "per repository in the $GIT_DIR/config") has been introduced so that different worktrees that share the same repository (hence the same $GIT_DIR/config file) can use different customization. * nd/per-worktree-config: worktree: add per-worktree config files t1300: extract and use test_cmp_config()	2018-11-13 22:37:18 +09:00
Junio C Hamano	b49ef560ed	Merge branch 'ag/rebase-i-in-c' Rewrite of the remaining "rebase -i" machinery in C. * ag/rebase-i-in-c: rebase -i: move rebase--helper modes to rebase--interactive rebase -i: remove git-rebase--interactive.sh rebase--interactive2: rewrite the submodes of interactive rebase in C rebase -i: implement the main part of interactive rebase as a builtin rebase -i: rewrite init_basic_state() in C rebase -i: rewrite write_basic_state() in C rebase -i: rewrite the rest of init_revisions_and_shortrevisions() in C rebase -i: implement the logic to initialize $revisions in C rebase -i: remove unused modes and functions rebase -i: rewrite complete_action() in C t3404: todo list with commented-out commands only aborts sequencer: change the way skip_unnecessary_picks() returns its result sequencer: refactor append_todo_help() to write its message to a buffer rebase -i: rewrite checkout_onto() in C rebase -i: rewrite setup_reflog_action() in C sequencer: add a new function to silence a command, except if it fails rebase -i: rewrite the edit-todo functionality in C editor: add a function to launch the sequence editor rebase -i: rewrite append_todo_help() in C sequencer: make three functions and an enum from sequencer.c public	2018-11-02 11:04:53 +09:00
Johannes Schindelin	bdfbb0ea93	config: move Windows-specific config settings into compat/mingw.c Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-31 12:46:27 +09:00
Ben Peart	99ce720c33	speed up refresh_index() by utilizing preload_index() Speed up refresh_index() by utilizing preload_index() to do most of the work spread across multiple threads. This works because most cache entries will get marked CE_UPTODATE so that refresh_cache_ent() can bail out early when called from within refresh_index(). On a Windows repo with ~200K files, this drops refresh times from 6.64 seconds to 2.87 seconds for a savings of 57%. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-30 11:28:39 +09:00
Junio C Hamano	7a43ab6fb2	Merge branch 'sg/split-index-racefix' The codepath to support the experimental split-index mode had remaining "racily clean" issues fixed. * sg/split-index-racefix: split-index: BUG() when cache entry refers to non-existing shared entry split-index: smudge and add racily clean cache entries to split index split-index: don't compare cached data of entries already marked for split index split-index: count the number of deleted entries t1700-split-index: date back files to avoid racy situations split-index: add tests to demonstrate the racy split index problem t1700-split-index: document why FSMONITOR is disabled in this test script	2018-10-26 14:22:10 +09:00
Nguyễn Thái Ngọc Duy	58b284a2e9	worktree: add per-worktree config files A new repo extension is added, worktreeConfig. When it is present: - Repository config reading by default includes $GIT_DIR/config _and_ $GIT_DIR/config.worktree. "config" file remains shared in multiple worktree setup. - The special treatment for core.bare and core.worktree, to stay effective only in main worktree, is gone. These config settings are supposed to be in config.worktree. This extension is most useful in multiple worktree setup because you now have an option to store per-worktree config (which is either .git/config.worktree for main worktree, or .git/worktrees/xx/config.worktree for linked ones). This extension can be used in single worktree mode, even though it's pretty much useless (but this can happen after you remove all linked worktrees and move back to single worktree). "git config" reads from both "config" and "config.worktree" by default (i.e. without either --user, --file...) when this extension is present. Default writes still go to "config", not "config.worktree". A new option --worktree is added for that (). Since a new repo extension is introduced, existing git binaries should refuse to access to the repo (both from main and linked worktrees). So they will not misread the config file (i.e. skip the config.worktree part). They may still accidentally write to the config file anyway if they use with "git config --file <path>". This design places a bet on the assumption that the majority of config variables are shared so it is the default mode. A safer move would be default writes go to per-worktree file, so that accidental changes are isolated. () "git config --worktree" points back to "config" file when this extension is not present and there is only one worktree so that it works in any both single and multiple worktree setups. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-22 13:17:04 +09:00
Junio C Hamano	4d87b38e6c	Merge branch 'nd/status-refresh-progress' "git status" learns to show progress bar when refreshing the index takes a long time. * nd/status-refresh-progress: status: show progress bar if refreshing the index takes too long	2018-10-19 13:34:03 +09:00
Junio C Hamano	11877b9ebe	Merge branch 'nd/the-index' Various codepaths in the core-ish part learn to work on an arbitrary in-core index structure, not necessarily the default instance "the_index". * nd/the-index: (23 commits) revision.c: reduce implicit dependency the_repository revision.c: remove implicit dependency on the_index ws.c: remove implicit dependency on the_index tree-diff.c: remove implicit dependency on the_index submodule.c: remove implicit dependency on the_index line-range.c: remove implicit dependency on the_index userdiff.c: remove implicit dependency on the_index rerere.c: remove implicit dependency on the_index sha1-file.c: remove implicit dependency on the_index patch-ids.c: remove implicit dependency on the_index merge.c: remove implicit dependency on the_index merge-blobs.c: remove implicit dependency on the_index ll-merge.c: remove implicit dependency on the_index diff-lib.c: remove implicit dependency on the_index read-cache.c: remove implicit dependency on the_index diff.c: remove implicit dependency on the_index grep.c: remove implicit dependency on the_index diff.c: remove the_index dependency in textconv() functions blame.c: rename "repo" argument to "r" combine-diff.c: remove implicit dependency on the_index ...	2018-10-19 13:34:02 +09:00
SZEDER Gábor	5581a019ba	split-index: smudge and add racily clean cache entries to split index Ever since the split index feature was introduced [1], refreshing a split index is prone to a variant of the classic racy git problem. Consider the following sequence of commands updating the split index when the shared index contains a racily clean cache entry, i.e. an entry whose cached stat data matches with the corresponding file in the worktree and the cached mtime matches that of the index: echo "cached content" >file git update-index --split-index --add file echo "dirty worktree" >file # size stays the same! # ... wait ... git update-index --add other-file Normally, when a non-split index is updated, then do_write_index() (the function responsible for writing all kinds of indexes, "regular", split, and shared) recognizes racily clean cache entries, and writes them with smudged stat data, i.e. with file size set to 0. When subsequent git commands read the index, they will notice that the smudged stat data doesn't match with the file in the worktree, and then go on to check the file's content and notice its dirtiness. In the above example, however, in the second 'git update-index' prepare_to_write_split_index() decides which cache entries stored only in the shared index should be replaced in the new split index. Alas, this function never looks out for racily clean cache entries, and since the file's stat data in the worktree hasn't changed since the shared index was written, it won't be replaced in the new split index. Consequently, do_write_index() doesn't even get this racily clean cache entry, and can't smudge its stat data. Subsequent git commands will then see that the index has more recent mtime than the file and that the (not smudged) cached stat data still matches with the file in the worktree, and, ultimately, will erroneously consider the file clean. Modify prepare_to_write_split_index() to recognize racily clean cache entries, and mark them to be added to the split index. Note that there are two places where it should check raciness: first those cache entries that are only stored in the shared index, and then those that have been copied by unpack_trees() from the shared index while it constructed a new index. This way do_write_index() will get these racily clean cache entries as well, and will then write them with smudged stat data to the new split index. This change makes all tests in 't1701-racy-split-index.sh' pass, so flip the two 'test_expect_failure' tests to success. Also add the '#' (as in nr. of trial) to those tests' description that were omitted when the tests expected failure. Note that after this change if the index is split when it contains a racily clean cache entry, then a smudged cache entry will be written both to the new shared and to the new split indexes. This doesn't affect regular git commands: as far as they are concerned this is just an entry in the split index replacing an outdated entry in the shared index. It did affect a few tests in 't1700-split-index.sh', though, because they actually check which entries are stored in the split index; a previous patch in this series has already made the necessary adjustments in 't1700'. And racily clean cache entries and index splitting are rare enough to not worry about the resulting duplicated smudged cache entries, and the additional complexity required to prevent them is not worth it. Several tests failed occasionally when the test suite was run with 'GIT_TEST_SPLIT_INDEX=yes'. Here are those that I managed to trace back to this racy split index problem, starting with those failing more frequently, with a link to a failing Travis CI build job for each. The highlighted line [2] shows when the racy file was written, which is not always in the failing test but in a preceeding setup test. t3903-stash.sh: https://travis-ci.org/git/git/jobs/385542084#L5858 t4024-diff-optimize-common.sh: https://travis-ci.org/git/git/jobs/386531969#L3174 t4015-diff-whitespace.sh: https://travis-ci.org/git/git/jobs/360797600#L8215 t2200-add-update.sh: https://travis-ci.org/git/git/jobs/382543426#L3051 t0090-cache-tree.sh: https://travis-ci.org/git/git/jobs/416583010#L3679 There might be others, e.g. perhaps 't1000-read-tree-m-3way.sh' and others using 'lib-read-tree-m-3way.sh', but I couldn't confirm yet. [1] In the branch leading to the merge commit v2.1.0-rc0~45 (Merge branch 'nd/split-index', 2014-07-16). [2] Note that those highlighted lines are in the 'after failure' fold, and your browser might unhelpfully fold it up before you could take a good look. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-12 07:23:29 +09:00
Antonio Ospite	b5c259f226	submodule: add a helper to check if it is safe to write to .gitmodules Introduce a helper function named is_writing_gitmodules_ok() to verify that the .gitmodules file is safe to write. The function name follows the scheme of is_staging_gitmodules_ok(). The two symbolic constants GITMODULES_INDEX and GITMODULES_HEAD are used to get help from the C preprocessor in preventing typos, especially for future users. This is in preparation for a future change which teaches git how to read .gitmodules from the index or from the current branch if the file is not available in the working tree. The rationale behind the check is that writing to .gitmodules requires the file to be present in the working tree, unless a brand new .gitmodules is being created (in which case the .gitmodules file would not exist at all: neither in the working tree nor in the index or in the current branch). Expose the functionality also via a "submodule-helper config --check-writeable" command, as git scripts may want to perform the check before modifying submodules configuration. Signed-off-by: Antonio Ospite <ao2@ao2.it> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-09 12:40:21 +09:00
Nguyễn Thái Ngọc Duy	26d024ecf0	ws.c: remove implicit dependency on the_index Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-21 09:51:18 -07:00
Nguyễn Thái Ngọc Duy	58bf2a4cc7	sha1-file.c: remove implicit dependency on the_index Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-21 09:48:11 -07:00
Nguyễn Thái Ngọc Duy	7e196c3a28	merge.c: remove implicit dependency on the_index Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-21 09:48:10 -07:00
Nguyễn Thái Ngọc Duy	92a1bf5a58	read-cache.c: remove 'const' from index_has_changes() This function calls do_diff_cache() which eventually needs to set this "istate" to unpack_options->src_index [1]. This is an unfortunate fact that unpack_trees() _will_ update [2] src_index so we can't really pass a const index_state there. Just remove 'const'. [1] Right now diff_cache() in diff-lib.c assigns the_index to src_index. But the plan is to get rid of the_index, so it should be 'istate' from here that gets assigned to src_index. [2] Some transient bits in the source index are touched. Optional extensions can also be removed. But other than that the source tree should still be valid. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-21 09:48:10 -07:00
Junio C Hamano	769af0fd9e	Merge branch 'jk/cocci' spatch transformation to replace boolean uses of !hashcmp() to newly introduced oideq() is added, and applied, to regain performance lost due to support of multiple hash algorithms. * jk/cocci: show_dirstat: simplify same-content check read-cache: use oideq() in ce_compare functions convert hashmap comparison functions to oideq() convert "hashcmp() != 0" to "!hasheq()" convert "oidcmp() != 0" to "!oideq()" convert "hashcmp() == 0" to hasheq() convert "oidcmp() == 0" to oideq() introduce hasheq() and oideq() coccinelle: use <...> for function exclusion	2018-09-17 13:53:57 -07:00
Junio C Hamano	c2407322b6	Merge branch 'nd/clone-case-smashing-warning' Running "git clone" against a project that contain two files with pathnames that differ only in cases on a case insensitive filesystem would result in one of the files lost because the underlying filesystem is incapable of holding both at the same time. An attempt is made to detect such a case and warn. * nd/clone-case-smashing-warning: clone: report duplicate entries on case-insensitive filesystems	2018-09-17 13:53:47 -07:00
Nguyễn Thái Ngọc Duy	ae9af12287	status: show progress bar if refreshing the index takes too long Refreshing the index is usually very fast, but it can still take a long time sometimes. Cold cache is one. Or copying a repo to a new place (). It's good to show something to let the user know "git status" is not hanging, it's just busy doing something. () In this case, all stat info in the index becomes invalid and git falls back to rehashing all file content to see if there's any difference between updating stat info in the index. This is quite expensive. Even with a repo as small as git.git, it takes 3 seconds. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 09:38:50 -07:00
Jeff King	e3ff0683e2	convert "hashcmp() == 0" to hasheq() This is the partner patch to the previous one, but covering the "hash" variants instead of "oid". Note that our coccinelle rule is slightly more complex to avoid triggering the call in hasheq(). I didn't bother to add a new rule to convert: - hasheq(E1->hash, E2->hash) + oideq(E1, E2) Since these are new functions, there won't be any such existing callers. And since most of the code is already using oideq, we're not likely to introduce new ones. We might still see "!hashcmp(E1->hash, E2->hash)" from topics in flight. But because our new rule comes after the existing ones, that should first get converted to "!oidcmp(E1, E2)" and then to "oideq(E1, E2)". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-29 11:32:49 -07:00
Jeff King	4a7e27e957	convert "oidcmp() == 0" to oideq() Using the more restrictive oideq() should, in the long run, give the compiler more opportunities to optimize these callsites. For now, this conversion should be a complete noop with respect to the generated code. The result is also perhaps a little more readable, as it avoids the "zero is equal" idiom. Since it's so prevalent in C, I think seasoned programmers tend not to even notice it anymore, but it can sometimes make for awkward double negations (e.g., we can drop a few !!oidcmp() instances here). This patch was generated almost entirely by the included coccinelle patch. This mechanical conversion should be completely safe, because we check explicitly for cases where oidcmp() is compared to 0, which is what oideq() is doing under the hood. Note that we don't have to catch "!oidcmp()" separately; coccinelle's standard isomorphisms make sure the two are treated equivalently. I say "almost" because I did hand-edit the coccinelle output to fix up a few style violations (it mostly keeps the original formatting, but sometimes unwraps long lines). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-29 11:32:49 -07:00
Jeff King	14438c4497	introduce hasheq() and oideq() The main comparison functions we provide for comparing object ids are hashcmp() and oidcmp(). These are more flexible than a strict equality check, since they also express ordering. That makes them useful for sorting and binary searching. However, it also makes them potentially slower than a strict equality check. Consider this C code, which is traditionally what our hashcmp has looked like: #include <string.h> int hashcmp(const unsigned char a, const unsigned char b) { return memcmp(a, b, 20); } Compiling with "gcc -O2 -S -fverbose-asm", the generated assembly shows that we actually call memcmp(). But if we change this to a strict equality check: return !memcmp(a, b, 20); we get a faster inline version: movq (%rdi), %rax # MEM[(void )a_4(D)], MEM[(void )a_4(D)] movq 8(%rdi), %rdx # MEM[(void )a_4(D)], tmp101 xorq (%rsi), %rax # MEM[(void )b_5(D)], tmp94 xorq 8(%rsi), %rdx # MEM[(void )b_5(D)], tmp93 orq %rax, %rdx # tmp94, tmp93 jne .L2 #, movl 16(%rsi), %eax # MEM[(void )b_5(D)], tmp104 cmpl %eax, 16(%rdi) # tmp104, MEM[(void *)a_4(D)] je .L5 #, Obviously our hashcmp() doesn't include the "!". But because it's an inline function, optimizing compilers are able to see "!hashcmp(a,b)" in calling code and take advantage of this case. So there has been no value thus far in introducing a more restricted interface for doing strict equality checks. But as Git learns about more values for the_hash_algo, our hashcmp() will grow more complicated and may even delegate at runtime to functions optimized specifically for that hash size. That breaks the inline connection we have, and the compiler will have to assume that the caller really cares about the sign and magnitude of the memcmp() result, even though the vast majority don't. We can solve that by introducing a hasheq() function (and matching oideq() wrapper), which callers can use to make it clear that they only care about equality. For now, the implementation will literally be "!hashcmp()", but it frees us up later to introduce code optimized specifically for the equality check. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-29 11:32:49 -07:00
Jeff King	183a638b7d	hashcmp: assert constant hash size Prior to `509f6f62a4` (cache: update object ID functions for the_hash_algo, 2018-07-16), hashcmp() called memcmp() with a constant size of 20 bytes. Some compilers were able to turn that into a few quad-word comparisons, which is faster than actually calling memcmp(). In `509f6f62a4`, we started using the_hash_algo->rawsz instead. Even though this will always be 20, the compiler doesn't know that while inlining hashcmp() and ends up just generating a call to memcmp(). Eventually we'll have to deal with multiple hash sizes, but for the upcoming v2.19, we can restore some of the original performance by asserting on the size. That gives the compiler enough information to know that the memcmp will always be called with a length of 20, and it performs the same optimization. Here are numbers for p0001.2 run against linux.git on a few versions. This is using -O2 with gcc 8.2.0. Test v2.18.0 v2.19.0-rc0 HEAD ------------------------------------------------------------------------------ 0001.2: 34.24(33.81+0.43) 34.83(34.42+0.40) +1.7% 33.90(33.47+0.42) -1.0% You can see that v2.19 is a little slower than v2.18. This commit ended up slightly faster than v2.18, but there's a fair bit of run-to-run noise (the generated code in the two cases is basically the same). This patch does seem to be consistently 1-2% faster than v2.19. I tried changing hashcpy(), which was also touched by `509f6f62a4`, in the same way, but couldn't measure any speedup. Which makes sense, at least for this workload. A traversal of the whole commit graph requires looking up every entry of every tree via lookup_object(). That's many multiples of the numbers of objects in the repository (most of the lookups just return "yes, we already saw that object"). [jn: verified using "make object.s" that the memcmp call goes away.] Reported-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Jeff King <peff@peff.net Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-23 06:20:58 -07:00
Junio C Hamano	5ade034464	Merge branch 'en/incl-forward-decl' Code hygiene improvement for the header files. * en/incl-forward-decl: Remove forward declaration of an enum compat/precompose_utf8.h: use more common include guard style urlmatch.h: fix include guard Move definition of enum branch_track from cache.h to branch.h alloc: make allocate_alloc_state and clear_alloc_state more consistent Add missing includes and forward declarations	2018-08-20 12:41:32 -07:00
Junio C Hamano	0c54cdaf65	Merge branch 'jk/for-each-object-iteration' The API to iterate over all objects learned to optionally list objects in the order they appear in packfiles, which helps locality of access if the caller accesses these objects while as objects are enumerated. * jk/for-each-object-iteration: for_each__object: move declarations to object-store.h cat-file: use a single strbuf for all output cat-file: split batch "buf" into two variables cat-file: use oidset check-and-insert cat-file: support "unordered" output for --batch-all-objects cat-file: rename batch_{loose,packed}_object callbacks t1006: test cat-file --batch-all-objects with duplicates for_each_packed_object: support iterating in pack-order for_each__object: give more comprehensive docstrings for_each__object: take flag arguments as enum for_each__object: store flag definitions in a single location	2018-08-20 11:33:52 -07:00
Duy Nguyen	b878579ae7	clone: report duplicate entries on case-insensitive filesystems Paths that only differ in case work fine in a case-sensitive filesystems, but if those repos are cloned in a case-insensitive one, you'll get problems. The first thing to notice is "git status" will never be clean with no indication what exactly is "dirty". This patch helps the situation a bit by pointing out the problem at clone time. Even though this patch talks about case sensitivity, the patch makes no assumption about folding rules by the filesystem. It simply observes that if an entry has been already checked out at clone time when we're about to write a new path, some folding rules are behind this. In the case that we can't rely on filesystem (via inode number) to do this check, fall back to fspathcmp() which is not perfect but should not give false positives. This patch is tested with vim-colorschemes and Sublime-Gitignore repositories on a JFS partition with case insensitive support on Linux. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-17 12:10:37 -07:00
Junio C Hamano	30cf1911e2	Merge branch 'js/vscode' Add a script (in contrib/) to help users of VSCode work better with our codebase. * js/vscode: vscode: let cSpell work on commit messages, too vscode: add a dictionary for cSpell vscode: use 8-space tabs, no trailing ws, etc for Git's source code vscode: wrap commit messages at column 72 by default vscode: only overwrite C/C++ settings mingw: define WIN32 explicitly cache.h: extract enum declaration from inside a struct declaration vscode: hard-code a couple defines contrib: add a script to initialize VS Code configuration	2018-08-15 15:08:26 -07:00
Junio C Hamano	1689c22c1c	Merge branch 'jk/core-use-replace-refs' A new configuration variable core.usereplacerefs has been added, primarily to help server installations that want to ignore the replace mechanism altogether. * jk/core-use-replace-refs: add core.usereplacerefs config option check_replace_refs: rename to read_replace_refs check_replace_refs: fix outdated comment	2018-08-15 15:08:23 -07:00
Elijah Newren	e730b81df6	Move definition of enum branch_track from cache.h to branch.h 'branch_track' feels more closely related to branching, and it is needed later in branch.h; rather than #include'ing cache.h in branch.h for this small enum, just move the enum and the external declaration for git_branch_track to branch.h. Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-15 11:52:09 -07:00
Jeff King	0889aae1cd	for_each__object: move declarations to object-store.h The for_each_loose_object() and for_each_packed_object() functions are meant to be part of a unified interface: they use the same set of for_each_object_flags, and it's not inconceivable that we might one day add a single for_each_object() wrapper around them. Let's put them together in a single file, so we can avoid awkwardness like saying "the flags for this function are over in cache.h". Moving the loose functions to packfile.h is silly. Moving the packed functions to cache.h works, but makes the "cache.h is a kitchen sink" problem worse. The best place is the recently-created object-store.h, since these are quite obviously related to object storage. The for_each__in_objdir() functions do not use the same flags, but they are logically part of the same interface as for_each_loose_object(), and share callback signatures. So we'll move those, as well, as they also make sense in object-store.h. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-14 12:29:57 -07:00
Jeff King	736eb88fdc	for_each_packed_object: support iterating in pack-order We currently iterate over objects within a pack in .idx order, which uses the object hashes. That means that it is effectively random with respect to the location of the object within the pack. If you're going to access the actual object data, there are two reasons to move linearly through the pack itself: 1. It improves the locality of access in the packfile. In the cold-cache case, this may mean fewer disk seeks, or better usage of disk cache. 2. We store related deltas together in the packfile. Which means that the delta base cache can operate much more efficiently if we visit all of those related deltas in sequence, as the earlier items are likely to still be in the cache. Whereas if we visit the objects in random order, our cache entries are much more likely to have been evicted by unrelated deltas in the meantime. So in general, if you're going to access the object contents pack order is generally going to end up more efficient. But if you're simply generating a list of object names, or if you're going to end up sorting the result anyway, you're better off just using the .idx order, as finding the pack order means generating the in-memory pack-revindex. According to the numbers in `8b8dfd5132` (pack-revindex: radix-sort the revindex, 2013-07-11), that takes about 200ms for linux.git, and 20ms for git.git (those numbers are a few years old but are still a good ballpark). That makes it a good optimization for some cases (we can save tens of seconds in git.git by having good locality of delta access, for a 20ms cost), but a bad one for others (e.g., right now "cat-file --batch-all-objects --batch-check="%(objectname)" is 170ms in git.git, so adding 20ms to that is noticeable). Hence this patch makes it an optional flag. You can't actually do any interesting timings yet, as it's not plumbed through to any user-facing tools like cat-file. That will come in a later patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 13:48:28 -07:00
Jeff King	8b36155190	for_each_*_object: give more comprehensive docstrings We already mention the local/alternate behavior of these functions, but we can help clarify a few other behaviors: - there's no need to mention LOCAL_ONLY specifically, since we already reference the flags by type (and as we add more flags, we don't want to have to mention each) - clarify that reachability doesn't matter here; this is all accessible objects - what ordering/uniqueness guarantees we give - how pack-specific flags are handled for the loose case Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 13:48:26 -07:00
Jeff King	a7ff6f5a0f	for_each_*_object: take flag arguments as enum It's not wrong to pass our flags in an "unsigned", as we know it will be at least as large as the enum. However, using the enum in the declaration makes it more obvious where to find the list of flags. While we're here, let's also drop the "extern" noise-words from the declarations, per our modern coding style. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 13:48:25 -07:00
Jeff King	202e7f1e16	for_each_*_object: store flag definitions in a single location These flags were split between cache.h and packfile.h, because some of the flags apply only to packs. However, they share a single numeric namespace, since both are respected for the packed variant. Let's make sure they're defined together so that nobody accidentally adds a new flag in one location that duplicates the other. While we're here, let's also put them in an enum (which helps debugger visibility) and use "(1<<n)" rather than counting powers of 2 manually. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 13:47:50 -07:00
Alban Gruin	2aed01811d	editor: add a function to launch the sequence editor As part of the rewrite of interactive rebase, the sequencer will need to open the sequence editor to allow the user to edit the todo list. Instead of duplicating the existing launch_editor() function, this refactors it to a new function, launch_specified_editor(), which takes the editor as a parameter, in addition to the path, the buffer and the environment variables. launch_sequence_editor() is then added to launch the sequence editor. Signed-off-by: Alban Gruin <alban.gruin@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-10 11:56:22 -07:00
Junio C Hamano	78a72ad4f8	Merge branch 'jt/commit-graph-per-object-store' The singleton commit-graph in-core instance is made per in-core repository instance. * jt/commit-graph-per-object-store: commit-graph: add repo arg to graph readers commit-graph: store graph in struct object_store commit-graph: add free_commit_graph commit-graph: add missing forward declaration object-store: add missing include commit-graph: refactor preparing commit graph	2018-08-02 15:30:47 -07:00

1 2 3 4 5 ...

1957 Commits