Commit Graph

1707 Commits

Author SHA1 Message Date
Junio C Hamano
023020401d Sync with Git 2.15.2
* maint-2.15:
  Git 2.15.2
  Git 2.14.4
  Git 2.13.7
  verify_path: disallow symlinks in .gitmodules
  update-index: stat updated files earlier
  verify_dotfile: mention case-insensitivity in comment
  verify_path: drop clever fallthrough
  skip_prefix: add case-insensitive variant
  is_{hfs,ntfs}_dotgitmodules: add tests
  is_ntfs_dotgit: match other .git files
  is_hfs_dotgit: match other .git files
  is_ntfs_dotgit: use a size_t for traversing string
  submodule-config: verify submodule names as paths
2018-05-22 14:18:06 +09:00
Junio C Hamano
9e0f06d55d Sync with Git 2.14.4
* maint-2.14:
  Git 2.14.4
  Git 2.13.7
  verify_path: disallow symlinks in .gitmodules
  update-index: stat updated files earlier
  verify_dotfile: mention case-insensitivity in comment
  verify_path: drop clever fallthrough
  skip_prefix: add case-insensitive variant
  is_{hfs,ntfs}_dotgitmodules: add tests
  is_ntfs_dotgit: match other .git files
  is_hfs_dotgit: match other .git files
  is_ntfs_dotgit: use a size_t for traversing string
  submodule-config: verify submodule names as paths
2018-05-22 14:15:14 +09:00
Junio C Hamano
7b01c71b64 Sync with Git 2.13.7
* maint-2.13:
  Git 2.13.7
  verify_path: disallow symlinks in .gitmodules
  update-index: stat updated files earlier
  verify_dotfile: mention case-insensitivity in comment
  verify_path: drop clever fallthrough
  skip_prefix: add case-insensitive variant
  is_{hfs,ntfs}_dotgitmodules: add tests
  is_ntfs_dotgit: match other .git files
  is_hfs_dotgit: match other .git files
  is_ntfs_dotgit: use a size_t for traversing string
  submodule-config: verify submodule names as paths
2018-05-22 14:10:49 +09:00
Jeff King
10ecfa7649 verify_path: disallow symlinks in .gitmodules
There are a few reasons it's not a good idea to make
.gitmodules a symlink, including:

  1. It won't be portable to systems without symlinks.

  2. It may behave inconsistently, since Git may look at
     this file in the index or a tree without bothering to
     resolve any symbolic links. We don't do this _yet_, but
     the config infrastructure is there and it's planned for
     the future.

With some clever code, we could make (2) work. And some
people may not care about (1) if they only work on one
platform. But there are a few security reasons to simply
disallow it:

  a. A symlinked .gitmodules file may circumvent any fsck
     checks of the content.

  b. Git may read and write from the on-disk file without
     sanity checking the symlink target. So for example, if
     you link ".gitmodules" to "../oops" and run "git
     submodule add", we'll write to the file "oops" outside
     the repository.

Again, both of those are problems that _could_ be solved
with sufficient code, but given the complications in (1) and
(2), we're better off just outlawing it explicitly.

Note the slightly tricky call to verify_path() in
update-index's update_one(). There we may not have a mode if
we're not updating from the filesystem (e.g., we might just
be removing the file). Passing "0" as the mode there works
fine; since it's not a symlink, we'll just skip the extra
checks.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-21 23:50:11 -04:00
Johannes Schindelin
e7cb0b4455 is_ntfs_dotgit: match other .git files
When we started to catch NTFS short names that clash with .git, we only
looked for GIT~1. This is sufficient because we only ever clone into an
empty directory, so .git is guaranteed to be the first subdirectory or
file in that directory.

However, even with a fresh clone, .gitmodules is *not* necessarily the
first file to be written that would want the NTFS short name GITMOD~1: a
malicious repository can add .gitmodul0000 and friends, which sorts
before `.gitmodules` and is therefore checked out *first*. For that
reason, we have to test not only for ~1 short names, but for others,
too.

It's hard to just adapt the existing checks in is_ntfs_dotgit(): since
Windows 2000 (i.e., in all Windows versions still supported by Git),
NTFS short names are only generated in the <prefix>~<number> form up to
number 4. After that, a *different* prefix is used, calculated from the
long file name using an undocumented, but stable algorithm.

For example, the short name of .gitmodules would be GITMOD~1, but if it
is taken, and all of ~2, ~3 and ~4 are taken, too, the short name
GI7EBA~1 will be used. From there, collisions are handled by
incrementing the number, shortening the prefix as needed (until ~9999999
is reached, in which case NTFS will not allow the file to be created).

We'd also want to handle .gitignore and .gitattributes, which suffer
from a similar problem, using the fall-back short names GI250A~1 and
GI7D29~1, respectively.

To accommodate for that, we could reimplement the hashing algorithm, but
it is just safer and simpler to provide the known prefixes. This
algorithm has been reverse-engineered and described at
https://usn.pw/blog/gen/2015/06/09/filenames/, which is defunct but
still available via https://web.archive.org/.

These can be recomputed by running the following Perl script:

-- snip --
use warnings;
use strict;

sub compute_short_name_hash ($) {
        my $checksum = 0;
        foreach (split('', $_[0])) {
                $checksum = ($checksum * 0x25 + ord($_)) & 0xffff;
        }

        $checksum = ($checksum * 314159269) & 0xffffffff;
        $checksum = 1 + (~$checksum & 0x7fffffff) if ($checksum & 0x80000000);
        $checksum -= (($checksum * 1152921497) >> 60) * 1000000007;

        return scalar reverse sprintf("%x", $checksum & 0xffff);
}

print compute_short_name_hash($ARGV[0]);
-- snap --

E.g., running that with the argument ".gitignore" will
result in "250a" (which then becomes "gi250a" in the code).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
2018-05-21 23:50:11 -04:00
Junio C Hamano
b0e0fc267b Merge branch 'tg/split-index-fixes' into maint
The split-index mode had a few corner case bugs fixed.

* tg/split-index-fixes:
  travis: run tests with GIT_TEST_SPLIT_INDEX
  split-index: don't write cache tree with null oid entries
  read-cache: fix reading the shared index for other repos
2018-03-22 14:24:10 -07:00
Thomas Gummerer
4bddd98311 split-index: don't write cache tree with null oid entries
In a96d3cc3f6 ("cache-tree: reject entries with null sha1", 2017-04-21)
we made sure that broken cache entries do not get propagated to new
trees.  Part of that was making sure not to re-use an existing cache
tree that includes a null oid.

It did so by dropping the cache tree in 'do_write_index()' if one of
the entries contains a null oid.  In split index mode however, there
are two invocations to 'do_write_index()', one for the shared index
and one for the split index.  The cache tree is only written once, to
the split index.

As we only loop through the elements that are effectively being
written by the current invocation, that may not include the entry with
a null oid in the split index (when it is already written to the
shared index), where we write the cache tree.  Therefore in split
index mode we may still end up writing the cache tree, even though
there is an entry with a null oid in the index.

Fix this by checking for null oids in prepare_to_write_split_index,
where we loop the entries of the shared index as well as the entries for
the split index.

This fixes t7009 with GIT_TEST_SPLIT_INDEX.  Also add a new test that's
more specifically showing the problem.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-19 10:36:39 -08:00
Thomas Gummerer
a125a22334 read-cache: fix reading the shared index for other repos
read_index_from() takes a path argument for the location of the index
file.  For reading the shared index in split index mode however it just
ignores that path argument, and reads it from the gitdir of the current
repository.

This works as long as an index in the_repository is read.  Once that
changes, such as when we read the index of a submodule, or of a
different working tree than the current one, the gitdir of
the_repository will no longer contain the appropriate shared index,
and git will fail to read it.

For example t3007-ls-files-recurse-submodules.sh was broken with
GIT_TEST_SPLIT_INDEX set in 188dce131f ("ls-files: use repository
object", 2017-06-22), and t7814-grep-recurse-submodules.sh was also
broken in a similar manner, probably by introducing struct repository
there, although I didn't track down the exact commit for that.

be489d02d2 ("revision.c: --indexed-objects add objects from all
worktrees", 2017-08-23) breaks with split index mode in a similar
manner, not erroring out when it can't read the index, but instead
carrying on with pruning, without taking the index of the worktree into
account.

Fix this by passing an additional gitdir parameter to read_index_from,
to indicate where it should look for and read the shared index from.

read_cache_from() defaults to using the gitdir of the_repository.  As it
is mostly a convenience macro, having to pass get_git_dir() for every
call seems overkill, and if necessary users can have more control by
using read_index_from().

Helped-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-19 10:36:34 -08:00
Junio C Hamano
b6825b5c8e Merge branch 'ew/empty-merge-with-dirty-index-maint' into ew/empty-merge-with-dirty-index
* ew/empty-merge-with-dirty-index-maint:
  merge-recursive: avoid incorporating uncommitted changes in a merge
  move index_has_changes() from builtin/am.c to merge.c for reuse
  t6044: recursive can silently incorporate dirty changes in a merge
2017-12-22 12:48:38 -08:00
Elijah Newren
b101793c43 move index_has_changes() from builtin/am.c to merge.c for reuse
index_has_changes() is a function we want to reuse outside of just am,
making it also available for merge-recursive and merge-ort.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-22 12:20:29 -08:00
Junio C Hamano
0c69a132cb Merge branch 'ls/editor-waiting-message'
Git shows a message to tell the user that it is waiting for the
user to finish editing when spawning an editor, in case the editor
opens to a hidden window or somewhere obscure and the user gets
lost.

* ls/editor-waiting-message:
  launch_editor(): indicate that Git waits for user input
  refactor "dumb" terminal determination
2017-12-19 11:33:59 -08:00
Junio C Hamano
8d7fefaac4 Merge branch 'ar/unconfuse-three-dots'
Ancient part of codebase still shows dots after an abbreviated
object name just to show that it is not a full object name, but
these ellipses are confusing to people who newly discovered Git
who are used to seeing abbreviated object names and find them
confusing with the range syntax.

* ar/unconfuse-three-dots:
  t2020: test variations that matter
  t4013: test new output from diff --abbrev --raw
  diff: diff_aligned_abbrev: remove ellipsis after abbreviated SHA-1 value
  t4013: prepare for upcoming "diff --raw --abbrev" output format change
  checkout: describe_detached_head: remove ellipsis after committish
  print_sha1_ellipsis: introduce helper
  Documentation: user-manual: limit usage of ellipsis
  Documentation: revisions: fix typo: "three dot" ---> "three-dot" (in line with "two-dot").
2017-12-19 11:33:58 -08:00
Junio C Hamano
721cc4314c Merge branch 'bc/hash-algo'
An infrastructure to define what hash function is used in Git is
introduced, and an effort to plumb that throughout various
codepaths has been started.

* bc/hash-algo:
  repository: fix a sparse 'using integer as NULL pointer' warning
  Switch empty tree and blob lookups to use hash abstraction
  Integrate hash algorithm support with repo setup
  Add structure representing hash algorithm
  setup: expose enumerated repo info
2017-12-13 13:28:54 -08:00
Junio C Hamano
4c6dad0059 Merge branch 'bw/protocol-v1'
A new mechanism to upgrade the wire protocol in place is proposed
and demonstrated that it works with the older versions of Git
without harming them.

* bw/protocol-v1:
  Documentation: document Extra Parameters
  ssh: introduce a 'simple' ssh variant
  i5700: add interop test for protocol transition
  http: tell server that the client understands v1
  connect: tell server that the client understands v1
  connect: teach client to recognize v1 server response
  upload-pack, receive-pack: introduce protocol version 1
  daemon: recognize hidden request arguments
  protocol: introduce protocol extension mechanisms
  pkt-line: add packet_write function
  connect: in ref advertisement, shallows are last
2017-12-06 09:23:44 -08:00
Lars Schneider
a64f213d3f refactor "dumb" terminal determination
Move the code to detect "dumb" terminals into a single location. This
avoids duplicating the terminal detection code yet again in a subsequent
commit.

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-04 09:38:30 -08:00
Ann T Ropea
a2cd709de3 print_sha1_ellipsis: introduce helper
Introduce a helper print_sha1_ellipsis() that pays attention to the
GIT_PRINT_SHA1_ELLIPSIS environment variable, and prepare the tests to
unconditionally set it for the test pieces that will be broken once the code
stops showing the extra dots by default.

The removal of these dots is merely a plan at this step and has not happened
yet but soon will.

Document GIT_PRINT_SHA1_ELLIPSIS.

Signed-off-by: Ann T Ropea <bedhanger@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-04 08:25:35 -08:00
Junio C Hamano
af6e0fe3a5 Merge branch 'tb/add-renormalize'
"git add --renormalize ." is a new and safer way to record the fact
that you are correcting the end-of-line convention and other
"convert_to_git()" glitches in the in-repository data.

* tb/add-renormalize:
  add: introduce "--renormalize"
2017-11-27 11:06:37 +09:00
Junio C Hamano
c9fdbca92c Merge branch 'av/fsmonitor'
Various fixes to bp/fsmonitor topic.

* av/fsmonitor:
  fsmonitor: simplify determining the git worktree under Windows
  fsmonitor: store fsmonitor bitmap before splitting index
  fsmonitor: read from getcwd(), not the PWD environment variable
  fsmonitor: delay updating state until after split index is merged
  fsmonitor: document GIT_TRACE_FSMONITOR
  fsmonitor: don't bother pretty-printing JSON from watchman
  fsmonitor: set the PWD to the top of the working tree
2017-11-21 14:07:51 +09:00
Junio C Hamano
e05336bdda Merge branch 'bp/fsmonitor'
We learned to talk to watchman to speed up "git status" and other
operations that need to see which paths have been modified.

* bp/fsmonitor:
  fsmonitor: preserve utf8 filenames in fsmonitor-watchman log
  fsmonitor: read entirety of watchman output
  fsmonitor: MINGW support for watchman integration
  fsmonitor: add a performance test
  fsmonitor: add a sample integration script for Watchman
  fsmonitor: add test cases for fsmonitor extension
  split-index: disable the fsmonitor extension when running the split index test
  fsmonitor: add a test tool to dump the index extension
  update-index: add fsmonitor support to update-index
  ls-files: Add support in ls-files to display the fsmonitor valid bit
  fsmonitor: add documentation for the fsmonitor extension.
  fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  update-index: add a new --force-write-index option
  preload-index: add override to enable testing preload-index
  bswap: add 64 bit endianness helper get_be64
2017-11-21 14:07:50 +09:00
Torsten Bögershausen
9472935d81 add: introduce "--renormalize"
Make it safer to normalize the line endings in a repository.
Files that had been commited with CRLF will be commited with LF.

The old way to normalize a repo was like this:

 # Make sure that there are not untracked files
 $ echo "* text=auto" >.gitattributes
 $ git read-tree --empty
 $ git add .
 $ git commit -m "Introduce end-of-line normalization"

The user must make sure that there are no untracked files,
otherwise they would have been added and tracked from now on.

The new "add --renormalize" does not add untracked files:

 $ echo "* text=auto" >.gitattributes
 $ git add --renormalize .
 $ git commit -m "Introduce end-of-line normalization"

Note that "git add --renormalize <pathspec>" is the short form for
"git add -u --renormalize <pathspec>".

While at it, document that the same renormalization may be needed,
whenever a clean filter is added or changed.

Helped-By: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-17 10:31:05 +09:00
Junio C Hamano
e539a83455 Merge branch 'bp/read-index-from-skip-verification'
Drop (perhaps overly cautious) sanity check before using the index
read from the filesystem at runtime.

* bp/read-index-from-skip-verification:
  read_index_from(): speed index loading by skipping verification of the entry order
2017-11-15 12:14:37 +09:00
brian m. carlson
eb0ccfd7f5 Switch empty tree and blob lookups to use hash abstraction
Switch the uses of empty_tree_oid and empty_blob_oid to use the
current_hash abstraction that represents the current hash algorithm in
use.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-13 13:20:44 +09:00
brian m. carlson
78a6766802 Integrate hash algorithm support with repo setup
In future versions of Git, we plan to support an additional hash
algorithm.  Integrate the enumeration of hash algorithms with repository
setup, and store a pointer to the enumerated data in struct repository.
Of course, we currently only support SHA-1, so hard-code this value in
read_repository_format.  In the future, we'll enumerate this value from
the configuration.

Add a constant, the_hash_algo, which points to the hash_algo structure
pointer in the repository global.  Note that this is the hash which is
used to serialize data to disk, not the hash which is used to display
items to the user.  The transition plan anticipates that these may be
different.  We can add an additional element in the future (say,
ui_hash_algo) to provide for this case.

Include repository.h in cache.h since we now need to have access to
these struct and variable definitions.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-13 13:20:44 +09:00
Junio C Hamano
bde1370010 Merge branch 'rs/hex-to-bytes-cleanup'
Code cleanup.

* rs/hex-to-bytes-cleanup:
  sha1_file: use hex_to_bytes()
  http-push: use hex_to_bytes()
  notes: move hex_to_bytes() to hex.c and export it
2017-11-09 14:31:27 +09:00
Ben Peart
00ec50e56d read_index_from(): speed index loading by skipping verification of the entry order
There is code in post_read_index_from() to catch out of order
entries when reading an index file.  This order verification is ~13%
of the cost of every call to read_index_from().

Update check_ce_order() so that it skips this verification unless
the "verify_ce_order" global variable is set.

Teach fsck to force this verification.

The effect can be seen using t/perf/p0002-read-cache.sh:

Test                                          HEAD              HEAD~1
--------------------------------------------------------------------------------------
0002.1: read_cache/discard_cache 1000 times   0.41(0.04+0.04)   0.50(0.00+0.10) +22.0%

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-08 10:39:41 +09:00
Junio C Hamano
0b646bcac9 Merge branch 'ma/lockfile-fixes'
An earlier update made it possible to use an on-stack in-core
lockfile structure (as opposed to having to deliberately leak an
on-heap one).  Many codepaths have been updated to take advantage
of this new facility.

* ma/lockfile-fixes:
  read_cache: roll back lock in `update_index_if_able()`
  read-cache: leave lock in right state in `write_locked_index()`
  read-cache: drop explicit `CLOSE_LOCK`-flag
  cache.h: document `write_locked_index()`
  apply: remove `newfd` from `struct apply_state`
  apply: move lockfile into `apply_state`
  cache-tree: simplify locking logic
  checkout-index: simplify locking logic
  tempfile: fix documentation on `delete_tempfile()`
  lockfile: fix documentation on `close_lock_file_gently()`
  treewide: prefer lockfiles on the stack
  sha1_file: do not leak `lock_file`
2017-11-06 13:11:21 +09:00
Alex Vandiver
ba1b9caca6 fsmonitor: delay updating state until after split index is merged
If the fsmonitor extension is used in conjunction with the split index
extension, the set of entries in the index when it is first loaded is
only a subset of the real index.  This leads to only the non-"base"
index being marked as CE_FSMONITOR_VALID.

Delay the expansion of the ewah bitmap until after tweak_split_index
has been called to merge in the base index as well.

The new fsmonitor_dirty is kept from being leaked by dint of being
cleaned up in post_read_index_from, which is guaranteed to be called
after do_read_index in read_index_from.

Signed-off-by: Alex Vandiver <alexmv@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-01 13:28:20 +09:00
René Scharfe
0ec218656a notes: move hex_to_bytes() to hex.c and export it
Make the function for converting pairs of hexadecimal digits to binary
available to other call sites.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-01 10:35:35 +09:00
Brandon Williams
19113a26b6 http: tell server that the client understands v1
Tell a server that protocol v1 can be used by sending the http header
'Git-Protocol' with 'version=1' indicating this.

Also teach the apache http server to pass through the 'Git-Protocol'
header as an environment variable 'GIT_PROTOCOL'.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-17 10:51:29 +09:00
Brandon Williams
373d70efb2 protocol: introduce protocol extension mechanisms
Create protocol.{c,h} and provide functions which future servers and
clients can use to determine which protocol to use or is being used.

Also introduce the 'GIT_PROTOCOL' environment variable which will be
used to communicate a colon separated list of keys with optional values
to a server.  Unknown keys and values must be tolerated.  This mechanism
is used to communicate which version of the wire protocol a client would
like to use with a server.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-17 10:51:29 +09:00
Martin Ågren
b74c90fb41 read_cache: roll back lock in update_index_if_able()
`update_index_if_able()` used to always commit the lock or roll it back.
Commit 03b866477 (read-cache: new API write_locked_index instead of
write_index/write_cache, 2014-06-13) stopped rolling it back in case a
write was not even attempted. This change in behavior is not motivated
in the commit message and appears to be accidental: the `else`-path was
removed, although that changed the behavior in case the `if` shortcuts.

Reintroduce the rollback and document this behavior. While at it, move
the documentation on this function from the function definition to the
function declaration in cache.h.

If `write_locked_index(..., COMMIT_LOCK)` fails, it will roll back the
lock for us (see the previous commit).

Noticed-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-07 10:20:56 +09:00
Martin Ågren
df60cf5789 read-cache: leave lock in right state in write_locked_index()
If the original version of `write_locked_index()` returned with an
error, it didn't roll back the lockfile unless the error occured at the
very end, during closing/committing. See commit 03b866477 (read-cache:
new API write_locked_index instead of write_index/write_cache,
2014-06-13).

In commit 9f41c7a6b (read-cache: close index.lock in do_write_index,
2017-04-26), we learned to close the lock slightly earlier in the
callstack. That was mostly a side-effect of lockfiles being implemented
using temporary files, but didn't cause any real harm.

Recently, commit 076aa2cbd (tempfile: auto-allocate tempfiles on heap,
2017-09-05) introduced a subtle bug. If the temporary file is deleted
(i.e., the lockfile is rolled back), the tempfile-pointer in the `struct
lock_file` will be left dangling. Thus, an attempt to reuse the
lockfile, or even just to roll it back, will induce undefined behavior
-- most likely a crash.

Besides not crashing, we clearly want to make things consistent. The
guarantees which the lockfile-machinery itself provides is A) if we ask
to commit and it fails, roll back, and B) if we ask to close and it
fails, do _not_ roll back. Let's do the same for consistency.

Do not delete the temporary file in `do_write_index()`. One of its
callers, `write_locked_index()` will thereby avoid rolling back the
lock. The other caller, `write_shared_index()`, will delete its
temporary file anyway. Both of these callers will avoid undefined
behavior (crashing).

Teach `write_locked_index(..., COMMIT_LOCK)` to roll back the lock
before returning. If we have already succeeded and committed, it will be
a noop. Simplify the existing callers where we now have a superfluous
call to `rollback_lockfile()`. That should keep future readers from
wondering why the callers are inconsistent.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-07 10:20:56 +09:00
Martin Ågren
812d6b0075 read-cache: drop explicit CLOSE_LOCK-flag
`write_locked_index()` takes two flags: `COMMIT_LOCK` and `CLOSE_LOCK`.
At most one is allowed. But it is also possible to use no flag, i.e.,
`0`. But when `write_locked_index()` calls `do_write_index()`, the
temporary file, a.k.a. the lockfile, will be closed. So passing `0` is
effectively the same as `CLOSE_LOCK`, which seems like a bug.

We might feel tempted to restructure the code in order to close the file
later, or conditionally. It also feels a bit unfortunate that we simply
"happen" to close the lock by way of an implementation detail of
lockfiles. But note that we need to close the temporary file before
`stat`-ing it, at least on Windows. See 9f41c7a6b (read-cache: close
index.lock in do_write_index, 2017-04-26).

Drop `CLOSE_LOCK` and make it explicit that `write_locked_index()`
always closes the lock. Whether it is also committed is governed by the
remaining flag, `COMMIT_LOCK`.

This means we neither have nor suggest that we have a mode to write the
index and leave the file open. Whatever extra contents we might
eventually want to write, we should probably write it from within
`write_locked_index()` itself anyway.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-07 10:20:56 +09:00
Martin Ågren
8dc3834610 cache.h: document write_locked_index()
The next patches will tweak the behavior of this function. Document it
in order to establish a basis for those patches.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-06 10:07:18 +09:00
Junio C Hamano
d4e93836a6 Merge branch 'jk/no-optional-locks'
Some commands (most notably "git status") makes an opportunistic
update when performing a read-only operation to help optimize later
operations in the same repository.  The new "--no-optional-locks"
option can be passed to Git to disable them.

* jk/no-optional-locks:
  git: add --no-optional-locks option
2017-10-03 15:42:49 +09:00
Ben Peart
883e248b8a fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
When the index is read from disk, the fsmonitor index extension is used
to flag the last known potentially dirty index entries. The registered
core.fsmonitor command is called with the time the index was last
updated and returns the list of files changed since that time. This list
is used to flag any additional dirty cache entries and untracked cache
directories.

We can then use this valid state to speed up preload_index(),
ie_match_stat(), and refresh_cache_ent() as they do not need to lstat()
files to detect potential changes for those entries marked
CE_FSMONITOR_VALID.

In addition, if the untracked cache is turned on valid_cached_dir() can
skip checking directories for new or changed files as fsmonitor will
invalidate the cache only for those directories that have been
identified as having potential changes.

To keep the CE_FSMONITOR_VALID state accurate during git operations;
when git updates a cache entry to match the current state on disk,
it will now set the CE_FSMONITOR_VALID bit.

Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit
is cleared and the corresponding untracked cache directory is marked
invalid.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:23:01 +09:00
Junio C Hamano
14a8168e2f Merge branch 'rj/no-sign-compare'
Many codepaths have been updated to squelch -Wsign-compare
warnings.

* rj/no-sign-compare:
  ALLOC_GROW: avoid -Wsign-compare warnings
  cache.h: hex2chr() - avoid -Wsign-compare warnings
  commit-slab.h: avoid -Wsign-compare warnings
  git-compat-util.h: xsize_t() - avoid -Wsign-compare warnings
2017-09-29 11:23:42 +09:00
Jeff King
27344d6a6c git: add --no-optional-locks option
Some tools like IDEs or fancy editors may periodically run
commands like "git status" in the background to keep track
of the state of the repository. Some of these commands may
refresh the index and write out the result in an
opportunistic way: if they can get the index lock, then they
update the on-disk index with any updates they find. And if
not, then their in-core refresh is lost and just has to be
recomputed by the next caller.

But taking the index lock may conflict with other operations
in the repository. Especially ones that the user is doing
themselves, which _aren't_ opportunistic. In other words,
"git status" knows how to back off when somebody else is
holding the lock, but other commands don't know that status
would be happy to drop the lock if somebody else wanted it.

There are a couple possible solutions:

  1. Have some kind of "pseudo-lock" that allows other
     commands to tell status that they want the lock.

     This is likely to be complicated and error-prone to
     implement (and maybe even impossible with just
     dotlocks to work from, as it requires some
     inter-process communication).

  2. Avoid background runs of commands like "git status"
     that want to do opportunistic updates, preferring
     instead plumbing like diff-files, etc.

     This is awkward for a couple of reasons. One is that
     "status --porcelain" reports a lot more about the
     repository state than is available from individual
     plumbing commands. And two is that we actually _do_
     want to see the refreshed index. We just don't want to
     take a lock or write out the result. Whereas commands
     like diff-files expect us to refresh the index
     separately and write it to disk so that they can depend
     on the result. But that write is exactly what we're
     trying to avoid.

  3. Ask "status" not to lock or write the index.

     This is easy to implement. The big downside is that any
     work done in refreshing the index for such a call is
     lost when the process exits. So a background process
     may end up re-hashing a changed file multiple times
     until the user runs a command that does an index
     refresh themselves.

This patch implements the option 3. The idea (and the test)
is largely stolen from a Git for Windows patch by Johannes
Schindelin, 67e5ce7f63 (status: offer *not* to lock the
index and update it, 2016-08-12). The twist here is that
instead of making this an option to "git status", it becomes
a "git" option and matching environment variable.

The reason there is two-fold:

  1. An environment variable is carried through to
     sub-processes. And whether an invocation is a
     background process or not should apply to the whole
     process tree. So you could do "git --no-optional-locks
     foo", and if "foo" is a script or alias that calls
     "status", you'll still get the effect.

  2. There may be other programs that want the same
     treatment.

     I've punted here on finding more callers to convert,
     since "status" is the obvious one to call as a repeated
     background job. But "git diff"'s opportunistic refresh
     of the index may be a good candidate.

The test is taken from 67e5ce7f63, and it's worth repeating
Johannes's explanation:

  Note that the regression test added in this commit does
  not *really* verify that no index.lock file was written;
  that test is not possible in a portable way. Instead, we
  verify that .git/index is rewritten *only* when `git
  status` is run without `--no-optional-locks`.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-27 16:11:01 +09:00
Ramsay Jones
356a293f39 cache.h: hex2chr() - avoid -Wsign-compare warnings
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-22 13:00:38 +09:00
Jonathan Nieder
607bd8315c pack: make packed_git_mru global a value instead of a pointer
The MRU cache that keeps track of recently used packs is represented
using two global variables:

	struct mru packed_git_mru_storage;
	struct mru *packed_git_mru = &packed_git_mru_storage;

Callers never assign to the packed_git_mru pointer, though, so we can
simplify by eliminating it and using &packed_git_mru_storage (renamed
to &packed_git_mru) directly.  This variable is only used by the
packfile subsystem, making this a relatively uninvasive change (and
any new unadapted callers would trigger a compile error).

Noticed while moving these globals to the object_store struct.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:05:48 +09:00
Junio C Hamano
f04f860dfa Merge branch 'sb/sha1-file-cleanup' into maint
Code clean-up.

* sb/sha1-file-cleanup:
  sha1_file: make read_info_alternates static
2017-09-10 17:03:04 +09:00
Junio C Hamano
822a4d4178 Merge branch 'jk/hashcmp-memcmp' into maint
Code clean-up.

* jk/hashcmp-memcmp:
  hashcmp: use memcmp instead of open-coded loop
2017-09-10 17:02:59 +09:00
Junio C Hamano
eabdcd4ab4 Merge branch 'jt/packmigrate'
Code movement to make it easier to hack later.

* jt/packmigrate: (23 commits)
  pack: move for_each_packed_object()
  pack: move has_pack_index()
  pack: move has_sha1_pack()
  pack: move find_pack_entry() and make it global
  pack: move find_sha1_pack()
  pack: move find_pack_entry_one(), is_pack_valid()
  pack: move check_pack_index_ptr(), nth_packed_object_offset()
  pack: move nth_packed_object_{sha1,oid}
  pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry()
  pack: move unpack_object_header()
  pack: move get_size_from_delta()
  pack: move unpack_object_header_buffer()
  pack: move {,re}prepare_packed_git and approximate_object_count
  pack: move install_packed_git()
  pack: move add_packed_git()
  pack: move unuse_pack()
  pack: move use_pack()
  pack: move pack-closing functions
  pack: move release_pack_memory()
  pack: move open_pack_index(), parse_pack_index()
  ...
2017-08-26 22:55:09 -07:00
Junio C Hamano
6b8aa3294e Merge branch 'po/object-id'
* po/object-id:
  sha1_file: convert index_stream to struct object_id
  sha1_file: convert hash_sha1_file_literally to struct object_id
  sha1_file: convert index_fd to struct object_id
  sha1_file: convert index_path to struct object_id
  read-cache: convert to struct object_id
  builtin/hash-object: convert to struct object_id
2017-08-26 22:55:07 -07:00
Junio C Hamano
b6c4058f97 Merge branch 'sb/diff-color-move'
"git diff" has been taught to optionally paint new lines that are
the same as deleted lines elsewhere differently from genuinely new
lines.

* sb/diff-color-move: (25 commits)
  diff: document the new --color-moved setting
  diff.c: add dimming to moved line detection
  diff.c: color moved lines differently, plain mode
  diff.c: color moved lines differently
  diff.c: buffer all output if asked to
  diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY
  diff.c: emit_diff_symbol learns about DIFF_SYMBOL_STAT_SEP
  diff.c: convert word diffing to use emit_diff_symbol
  diff.c: convert show_stats to use emit_diff_symbol
  diff.c: convert emit_binary_diff_body to use emit_diff_symbol
  submodule.c: migrate diff output to use emit_diff_symbol
  diff.c: emit_diff_symbol learns DIFF_SYMBOL_REWRITE_DIFF
  diff.c: emit_diff_symbol learns about DIFF_SYMBOL_BINARY_FILES
  diff.c: emit_diff_symbol learns DIFF_SYMBOL_HEADER
  diff.c: emit_diff_symbol learns DIFF_SYMBOL_FILEPAIR_{PLUS, MINUS}
  diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_INCOMPLETE
  diff.c: emit_diff_symbol learns DIFF_SYMBOL_WORDS[_PORCELAIN]
  diff.c: migrate emit_line_checked to use emit_diff_symbol
  diff.c: emit_diff_symbol learns DIFF_SYMBOL_NO_LF_EOF
  diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_FRAGINFO
  ...
2017-08-26 22:55:03 -07:00
Jonathan Tan
7709f468fd pack: move for_each_packed_object()
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23 15:12:07 -07:00
Jonathan Tan
f9a8672a81 pack: move has_pack_index()
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23 15:12:07 -07:00
Jonathan Tan
150e3001d0 pack: move has_sha1_pack()
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23 15:12:07 -07:00
Jonathan Tan
d6fe0036fd pack: move find_sha1_pack()
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23 15:12:07 -07:00
Jonathan Tan
a2551953b9 pack: move find_pack_entry_one(), is_pack_valid()
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23 15:12:07 -07:00