* js/c-merge-recursive: (21 commits)
discard_cache(): discard index, even if no file was mmap()ed
merge-recur: do not die unnecessarily
merge-recur: try to merge older merge bases first
merge-recur: if there is no common ancestor, fake empty one
merge-recur: do not setenv("GIT_INDEX_FILE")
merge-recur: do not call git-write-tree
merge-recursive: fix rename handling
.gitignore: git-merge-recur is a built file.
merge-recur: virtual commits shall never be parsed
merge-recur: use the unpack_trees() interface instead of exec()ing read-tree
merge-recur: fix thinko in unique_path()
Makefile: git-merge-recur depends on xdiff libraries.
merge-recur: Explain why sha_eq() and struct stage_data cannot go
merge-recur: Cleanup last mixedCase variables...
merge-recur: Fix compiler warning with -pedantic
merge-recur: Remove dead code
merge-recur: Get rid of debug code
merge-recur: Convert variable names to lower_case
Cumulative update of merge-recursive in C
recur vs recursive: help testing without touching too many stuff.
...
This is an evil merge that removes TEST script from the toplevel.
Introduces global inline:
hashcmp(const unsigned char *sha1, const unsigned char *sha2)
Uses memcmp for comparison and returns the result based on the length of
the hash name (a future runtime decision).
Acked-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* jc/racy:
Remove the "delay writing to avoid runtime penalty of racy-git avoidance"
Add check program "git-check-racy"
Documentation/technical/racy-git.txt
avoid nanosleep(2)
The work-around should not be needed. Even if it turns out we
would want it later, git will remember the patch for us ;-).
Signed-off-by: Junio C Hamano <junkio@cox.net>
This will help counting the racily clean paths, but it should be
useless for daily use. Do not even enable it in the makefile.
Signed-off-by: Junio C Hamano <junkio@cox.net>
[jc: I needed to hand merge the changes to the updated codebase,
so the result needs to be checked.]
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
On Solaris nanosleep(2) is not available in libc; you need to
link with -lrt to get it.
The purpose of the loop is to wait until the next filesystem
timestamp granularity, and the code uses subsecond sleep in the
hope that it can shorten the delay to 0.5 seconds on average
instead of a full second. It is probably not worth depending on
an extra library for this.
We might want to yank out the whole "racy-git avoidance is
costly later at runtime, so let's delay writing the index out"
codepath later, but that is a separate issue and needs some
testing on large trees to figure it out. After playing with the
kernel tree, I have a feeling that the whole thing may not be
worth it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Since add_cacheinfo() can be called without a mapped index file,
discard_cache() _has_ to discard the entries, even when
cache_mmap == NULL.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Instead of looping over the entries and writing out, use a
separate loop after all entries have been written out to check
how many entries are racily clean. Make sure that the newly
created index file gets the right timestamp when we check by
flushing the buffered data by ce_write().
Signed-off-by: Junio C Hamano <junkio@cox.net>
Immediately after a bulk checkout, most of the paths in the
working tree would have the same timestamp as the index file,
and this would force ce_match_stat() to take slow path for all
of them. When writing an index file out, if many of the paths
have very new (read: the same timestamp as the index file being
written out) timestamp, we are better off delaying the return
from the command, to make sure that later command to touch the
working tree files will leave newer timestamps than recorded in
the index, thereby avoiding to take the slow path.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Doing an "strace" on "git diff" shows that we close() a file descriptor
twice (getting EBADFD on the second one) when we end up in ce_compare_data
if the index does not match the checked-out stat information.
The "index_fd()" function will already have closed the fd for us, so we
should not close it again.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* js/read-tree: (107 commits)
read-tree: move merge functions to the library
read-trees: refactor the unpack_trees() part
tar-tree: illustrate an obscure feature better
git.c: allow alias expansion without a git directory
setup_git_directory_gently: do not barf when GIT_DIR is given.
Build on Debian GNU/kFreeBSD
Call setup_git_directory() much earlier
Call setup_git_directory() early
Display an error from update-ref if target ref name is invalid.
Fix http-fetch
t4103: fix binary patch application test.
git-apply -R: binary patches are irreversible for now.
Teach git-apply about '-R'
Makefile: ssh-pull.o depends on ssh-fetch.c
log and diff family: honor config even from subdirectories
git-reset: detect update-ref error and report it.
lost-found: use fsck-objects --full
Teach git-http-fetch the --stdin switch
Teach git-local-fetch the --stdin switch
Make pull() support fetching multiple targets at once
...
This also moves add_file_to_index() to read-cache.c. Oh, and while
touching builtin-add.c, it also removes a duplicate git_config() call.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This backports the pieces that are not uncooked from the merge-recursive
WIP we have seen earlier, to be used in git-mv rewritten in C.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is just an update for people being interested. Alex and me were
busy with that project for a few days now. While it has progressed nicely,
there are quite a couple TODOs in merge-recursive.c, just search for "TODO".
For impatient people: yes, it passes all the tests, and yes, according
to the evil test Alex did, it is faster than the Python script.
But no, it is not yet finished. Biggest points are:
- there are still three external calls
- in the end, it should not be necessary to write the index more than once
(just before exiting)
- a lot of things can be refactored to make the code easier and shorter
BTW we cannot just plug in git-merge-tree yet, because git-merge-tree
does not handle renames at all.
This patch is meant for testing, and as such,
- it compile the program to git-merge-recur
- it adjusts the scripts and tests to use git-merge-recur instead of
git-merge-recursive
- it provides "TEST", a script to execute the tests regarding -recursive
- it inlines the changes to read-cache.c (read_cache_from(), discard_cache()
and refresh_cache_entry())
Brought to you by Alex Riesen and Dscho
Signed-off-by: Junio C Hamano <junkio@cox.net>
This doesn't make the code uglier or harder to read, yet it makes the
code more portable. This also simplifies checking for other potential
incompatibilities. "gcc -std=c89 -pedantic" can flag many incompatible
constructs as warnings, but C99 comments will cause it to emit an error.
Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
ANSI C99 doesn't allow void-pointer arithmetic. This patch fixes this in
various ways. Usually the strategy that required the least changes was used.
Signed-off-by: Florian Forster <octo@verplant.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* jc/dirwalk-n-cache-tree: (212 commits)
builtin-rm: squelch compiler warnings.
Add builtin "git rm" command
Move pathspec matching from builtin-add.c into dir.c
Prevent bogus paths from being added to the index.
builtin-add: fix unmatched pathspec warnings.
Remove old "git-add.sh" remnants
builtin-add: warn on unmatched pathspecs
Do "git add" as a builtin
Clean up git-ls-file directory walking library interface
libify git-ls-files directory traversal
Add a conversion tool to migrate remote information into the config
fetch, pull: ask config for remote information
Fix build procedure for builtin-init-db
read-tree -m -u: do not overwrite or remove untracked working tree files.
apply --cached: do not check newly added file in the working tree
Implement a --dry-run option to git-quiltimport
Implement git-quiltimport
Revert "builtin-grep: workaround for non GNU grep."
builtin-grep: workaround for non GNU grep.
builtin-grep: workaround for non GNU grep.
...
In the "next" branch, write_index_ext_header() writes garbage on a
64-bit big-endian machine; the written index file will be unreadable.
I noticed this on NetBSD/sparc64. Reproducible with:
$ git init-db
$ :>file
$ git-update-index --add file
$ git-write-tree
$ git-update-index
error: index uses extension, which we do not understand
fatal: index file corrupt
Signed-off-by: Dennis Stosberg <dennis@stosberg.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This commit is what this branch is all about. It records the
evil merge needed to adjust built-in git-add and git-rm for
the cache-tree extension.
* lt/dirwalk:
Add builtin "git rm" command
Move pathspec matching from builtin-add.c into dir.c
Prevent bogus paths from being added to the index.
builtin-add: fix unmatched pathspec warnings.
Remove old "git-add.sh" remnants
builtin-add: warn on unmatched pathspecs
Do "git add" as a builtin
Clean up git-ls-file directory walking library interface
libify git-ls-files directory traversal
Conflicts:
Makefile
builtin.h
git.c
update-index.c
* jc/cache-tree: (24 commits)
Fix crash when reading the empty tree
fsck-objects: do not segfault on missing tree in cache-tree
cache-tree: a bit more debugging support.
read-tree: invalidate cache-tree entry when a new index entry is added.
Fix test-dump-cache-tree in one-tree disappeared case.
fsck-objects: mark objects reachable from cache-tree
cache-tree: replace a sscanf() by two strtol() calls
cache-tree.c: typefix
test-dump-cache-tree: validate the cached data as well.
cache_tree_update: give an option to update cache-tree only.
read-tree: teach 1-way merege and plain read to prime cache-tree.
read-tree: teach 1 and 2 way merges about cache-tree.
update-index: when --unresolve, smudge the relevant cache-tree entries.
test-dump-cache-tree: report number of subtrees.
cache-tree: sort the subtree entries.
Teach fsck-objects about cache-tree.
index: make the index file format extensible.
cache-tree: protect against "git prune".
Add test-dump-cache-tree
Use cache-tree in update-index.
...
This cleans up and libifies the "git update-index --[really-]refresh"
functionality. This will be eventually required for eventually doing the
"commit" and "status" commands as built-ins.
It really just moves "refresh_index()" from update-index.c to
read-cache.c, but it also has to change the calling convention so that the
function uses a "unsigned int flags" argument instead of various static
flags variables for passing down the information about whether to be quiet
or not, and allow unmerged entries etc.
That actually cleans up update-index.c too, since it turns out that all
those flags were really specific to that one function of the index update,
so they shouldn't have had file-scope visibility even before.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
With this one, it's now a fatal error to try to add a pathname
that cannot be added with "git add", i.e.
[torvalds@g5 git]$ git add .git/config
fatal: unable to add .git/config to index
and
[torvalds@g5 git]$ git add foo/../bar
fatal: unable to add foo/../bar to index
instead of the old "Ignoring path xyz" warning that would end up
silently succeeding on any other paths.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Elsewhere we use xcalloc(); we should consistently do so.
Signed-off-by: Yakov Lerner <iler.ml@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
read_cache_1() and write_cache_1() takes an extra parameter
*sha1 that returns the checksum of the index file when non-NULL.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds "assume unchanged" logic, started by this message in the list
discussion recently:
<Pine.LNX.4.64.0601311807470.7301@g5.osdl.org>
This is a workaround for filesystems that do not have lstat()
that is quick enough for the index mechanism to take advantage
of. On the paths marked as "assumed to be unchanged", the user
needs to explicitly use update-index to register the object name
to be in the next commit.
You can use two new options to update-index to set and reset the
CE_VALID bit:
git-update-index --assume-unchanged path...
git-update-index --no-assume-unchanged path...
These forms manipulate only the CE_VALID bit; it does not change
the object name recorded in the index file. Nor they add a new
entry to the index.
When the configuration variable "core.ignorestat = true" is set,
the index entries are marked with CE_VALID bit automatically
after:
- update-index to explicitly register the current object name to the
index file.
- when update-index --refresh finds the path to be up-to-date.
- when tools like read-tree -u and apply --index update the working
tree file and register the current object name to the index file.
The flag is dropped upon read-tree that does not check out the index
entry. This happens regardless of the core.ignorestat settings.
Index entries marked with CE_VALID bit are assumed to be
unchanged most of the time. However, there are cases that
CE_VALID bit is ignored for the sake of safety and usability:
- while "git-read-tree -m" or git-apply need to make sure
that the paths involved in the merge do not have local
modifications. This sacrifices performance for safety.
- when git-checkout-index -f -q -u -a tries to see if it needs
to checkout the paths. Otherwise you can never check
anything out ;-).
- when git-update-index --really-refresh (a new flag) tries to
see if the index entry is up to date. You can start with
everything marked as CE_VALID and run this once to drop
CE_VALID bit for paths that are modified.
Most notably, "update-index --refresh" honours CE_VALID and does
not actively stat, so after you modified a file in the working
tree, update-index --refresh would not notice until you tell the
index about it with "git-update-index path" or "git-update-index
--no-assume-unchanged path".
This version is not expected to be perfect. I think diff
between index and/or tree and working files may need some
adjustment, and there probably needs other cases we should
automatically unmark paths that are marked to be CE_VALID.
But the basics seem to work, and ready to be tested by people
who asked for this feature.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a tricky code and warrants extra commenting. I wasted
30 minutes trying to break it until I realized why it works.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The previous round caught the most trivial case well, but broke
down once index file is updated again. Smudge problematic
entries (they should be very few if any under normal interactive
workflow) before writing a new index file out.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This fixes the longstanding "Racy GIT" problem, which was pretty
much there from the beginning of time, but was first
demonstrated by Pasky in this message on October 24, 2005:
http://marc.theaimsgroup.com/?l=git&m=113014629716878
If you run the following sequence of commands:
echo frotz >infocom
git update-index --add infocom
echo xyzzy >infocom
so that the second update to file "infocom" does not change
st_mtime, what is recorded as the stat information for the cache
entry "infocom" exactly matches what is on the filesystem
(owner, group, inum, mtime, ctime, mode, length). After this
sequence, we incorrectly think "infocom" file still has string
"frotz" in it, and get really confused. E.g. git-diff-files
would say there is no change, git-update-index --refresh would
not even look at the filesystem to correct the situation.
Some ways of working around this issue were already suggested by
Linus in the same thread on the same day, including waiting
until the next second before returning from update-index if a
cache entry written out has the current timestamp, but that
means we can make at most one commit per second, and given that
the e-mail patch workflow used by Linus needs to process at
least 5 commits per second, it is not an acceptable solution.
Linus notes that git-apply is primarily used to update the index
while processing e-mailed patches, which is true, and
git-apply's up-to-date check is fooled by the same problem but
luckily in the other direction, so it is not really a big issue,
but still it is disturbing.
The function ce_match_stat() is called to bypass the comparison
against filesystem data when the stat data recorded in the cache
entry matches what stat() returns from the filesystem. This
patch tackles the problem by changing it to actually go to the
filesystem data for cache entries that have the same mtime as
the index file itself. This works as long as the index file and
working tree files are on the filesystems that share the same
monotonic clock. Files on network mounted filesystems sometimes
get skewed timestamps compared to "date" output, but as long as
working tree files' timestamps are skewed the same way as the
index file's, this approach still works. The only problematic
files are the ones that have the same timestamp as the index
file's, because two file updates that sandwitch the index file
update must happen within the same second to trigger the
problem.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This starts using the "user.name" and "user.email" config variables if
they exist as the default name and email when committing. This means
that you don't have to use the GIT_COMMITTER_EMAIL environment variable
to override your email - you can just edit the config file instead.
The patch looks bigger than it is because it makes the default name and
email information non-static and renames it appropriately. And it moves
the common git environment variables into a new library file, so that
you can link against libgit.a and get the git environment without having
to link in zlib and libcrypt.
In short, most of it is renaming and moving, the real change core is
just a few new lines in "git_default_config()" that copies the user
config values to the new base.
It also changes "git-var -l" to list the config variables.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
With "[core] filemode = false", you can tell git to ignore
differences in the working tree file only in executable bit.
* "git-update-index --refresh" does not say "needs update" if index
entry and working tree file differs only in executable bit.
* "git-update-index" on an existing path takes executable bit
from the existing index entry, if the path and index entry are
both regular files.
* "git-diff-files" and "git-diff-index" without --cached flag
pretend the path on the filesystem has the same executable
bit as the existing index entry, if the path and index entry
are both regular files.
If you are on a filesystem with unreliable mode bits, you may need to
force the executable bit after registering the path in the index.
* "git-update-index --chmod=+x foo" flips the executable bit of the
index file entry for path "foo" on. Use "--chmod=-x" to flip it
off.
Note that --chmod only works in index file and does not look at nor
update the working tree.
So if you are on a filesystem and do not have working executable bit,
you would do:
1. set the appropriate .git/config option;
2. "git-update-index --add new-file.c"
3. "git-ls-files --stage new-file.c" to see if it has the desired
mode bits. If not, e.g. to drop executable bit picked up from the
filesystem, say "git-update-index --chmod=-x new-file.c".
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a first cut at a very simple parser for a git config file.
The format of the file is a simple ini-file like thing, with simple
variable/value pairs. You can (and should) make the variables have a
simple single-level scope, ie a valid file looks something like this:
#
# This is the config file, and
# a '#' or ';' character indicates
# a comment
#
; core variables
[core]
; Don't trust file modes
filemode = false
; Our diff algorithm
[diff]
external = "/usr/local/bin/gnu-diff -u"
renames = true
which parses into three variables: "core.filemode" is associated with the
string "false", and "diff.external" gets the appropriate quoted value.
Right now we only react to one variable: "core.filemode" is a boolean that
decides if we should care about the 0100 (user-execute) bit of the stat
information. Even that is just a parsing demonstration - this doesn't
actually implement that st_mode compare logic itself.
Different programs can react to different config options, although they
should always fall back to calling "git_default_config()" on any config
option name that they don't recognize.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Instead of "git status" ignoring (and hiding) potential errors from the
"git-update-index" call, make it exit if it fails, and show the error.
In order to do this, use the "-q" flag (to ignore not-up-to-date files)
and add a new "--unmerged" flag that allows unmerged entries in the index
without any errors.
This also avoids marking the index "changed" if an entry isn't actually
modified, and makes sure that we exit with an understandable error message
if the index is corrupt or unreadable. "read_cache()" no longer returns an
error for the caller to check.
Finally, make die() and usage() exit with recognizable error codes, if we
ever want to check the failure reason in scripts.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a long overdue clean-up to the code for parsing and passing
diff options. It also tightens some constness issues.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Add -m/--modified to show files that have been modified wrt. the index.
[jc: The original came from Brian Gerst on Sep 1st but it only checked
if the paths were cache dirty without actually checking the files were
modified. I also added the usage string and a new test.]
Signed-off-by: Junio C Hamano <junkio@cox.net>
This fixes up usage of ".." (without an ending slash) and "." (with or
without the ending slash) in the git diff family.
It also fixes pathspec matching for the case of an empty pathspec, since a
"." in the top-level directory (or enough ".." under subdirectories) will
result in an empty pathspec. We used to not match it against anything, but
it should in fact match everything.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
I have reviewed all occurrences of mmap() in git and fixed three types
of errors/defects:
1) The result is not checked.
2) The file descriptor is closed if mmap() succeeds, but not when it
fails.
3) Various casts applied to -1 are used instead of MAP_FAILED, which is
specifically defined to check mmap() return value.
[jc: This is a second round of Pavel's patch. He fixed up the problem
that close() potentially clobbering the errno from mmap, which
the first round had.]
Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
... and make git-diff-files use it too. This all _should_ make the
diffcore-pathspec.c phase unnecessary, since the diff'ers now all do the
path matching early interally.
An earlier change to optimize directory-file conflict check
broke what "read-tree --emu23" expects. This is fixed by this
commit.
(1) Introduces an explicit flag to tell add_cache_entry() not to
check for conflicts and use it when reading an existing tree
into an empty stage --- by definition this case can never
introduce such conflicts.
(2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name()
aware of the cache stages, and flag conflict only with paths
in the same stage.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is (imho) more readable, and is also a lot faster. The expense of
looking up sub-directory beginnings was killing us on things like
"git-diff-cache", even though that one didn't even care at all about the
file vs directory conflicts.
We really only care when somebody tries to add a conflicting name to
stage 0.
We should go through the conflict rules more carefully some day.
When we choose to omit deleted entries, we should subtract
numbers of such entries from the total number in the header.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Oops.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
check_file_directory_conflict can give the wrong answers. This is
because the wrong length is passed to cache_name_pos. The length
passed should be the length of the whole path from the root, not
the length of each path subcomponent.
$ git-init-db
defaulting to local storage area
$ mkdir path && touch path/file
$ git-update-cache --add path/file
$ rm path/file
$ mkdir path/file && touch path/file/f
$ git-update-cache --add path/file/f <-- Conflict ignored
$
Signed-off-by: David Meybohm <dmeybohmlkml@bellsouth.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>