Earlier in commit 0781b8a9b2
(add_file_to_index: skip rehashing if the cached stat already
matches), add_file_to_index() were taught not to re-add the path
if it already matches the index.
The change meant well, but was not executed quite right. It
used ie_modified() to see if the file on the work tree is really
different from the index, and skipped adding the contents if the
function says "not modified".
This was wrong. There are three possible comparison results
between the index and the file in the work tree:
- with lstat(2) we _know_ they are different. E.g. if the
length or the owner in the cached stat information is
different from the length we just obtained from lstat(2), we
can tell the file is modified without looking at the actual
contents.
- with lstat(2) we _know_ they are the same. The same length,
the same owner, the same everything (but this has a twist, as
described below).
- we cannot tell from lstat(2) information alone and need to go
to the filesystem to actually compare.
The last case arises from what we call 'racy git' situation,
that can be caused with this sequence:
$ echo hello >file
$ git add file
$ echo aeiou >file ;# the same length
If the second "echo" is done within the same filesystem
timestamp granularity as the first "echo", then the timestamp
recorded by "git add" and the timestamp we get from lstat(2)
will be the same, and we can mistakenly say the file is not
modified. The path is called 'racily clean'. We need to
reliably detect racily clean paths are in fact modified.
To solve this problem, when we write out the index, we mark the
index entry that has the same timestamp as the index file itself
(that is the time from the point of view of the filesystem) to
tell any later code that does the lstat(2) comparison not to
trust the cached stat info, and ie_modified() then actually goes
to the filesystem to compare the contents for such a path.
That's all good, but it should not be used for this "git add"
optimization, as the goal of "git add" is to actually update the
path in the index and make it stat-clean. With the false
optimization, we did _not_ cause any data loss (after all, what
we failed to do was only to update the cached stat information),
but it made the following sequence leave the file stat dirty:
$ echo hello >file
$ git add file
$ echo hello >file ;# the same contents
$ git add file
The solution is not to use ie_modified() which goes to the
filesystem to see if it is really clean, but instead use
ie_match_stat() with "assume racily clean paths are dirty"
option, to force re-adding of such a path.
There was another problem with "git add -u". The codepath
shares the same issue when adding the paths that are found to be
modified, but in addition, it asked "git diff-files" machinery
run_diff_files() function (which is "git diff-files") to list
the paths that are modified. But "git diff-files" machinery
uses the same ie_modified() call so that it does not report
racily clean _and_ actually clean paths as modified, which is
not what we want.
The patch allows the callers of run_diff_files() to pass the
same "assume racily clean paths are dirty" option, and makes
"git-add -u" codepath to use that option, to discover and re-add
racily clean _and_ actually clean paths.
We could further optimize on top of this patch to differentiate
the case where the path really needs re-adding (i.e. the content
of the racily clean entry was indeed different) and the case
where only the cached stat information needs to be refreshed
(i.e. the racily clean entry was actually clean), but I do not
think it is worth it.
This patch applies to maint and all the way up.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ph/parseopt: (24 commits)
gc: use parse_options
Fixed a command line option type for builtin-fsck.c
Make builtin-pack-refs.c use parse_options.
Make builtin-name-rev.c use parse_options.
Make builtin-count-objects.c use parse_options.
Make builtin-fsck.c use parse_options.
Update manpages to reflect new short and long option aliases
Make builtin-for-each-ref.c use parse-opts.
Make builtin-symbolic-ref.c use parse_options.
Make builtin-update-ref.c use parse_options
Make builtin-revert.c use parse_options.
Make builtin-describe.c use parse_options
Make builtin-branch.c use parse_options.
Make builtin-mv.c use parse-options
Make builtin-rm.c use parse_options.
Port builtin-add.c to use the new option parser.
parse-options: allow callbacks to take no arguments at all.
parse-options: Allow abbreviated options when unambiguous
Add shortcuts for very often used options.
parse-options: make some arguments optional, add callbacks.
...
Conflicts:
Makefile
builtin-add.c
* kh/commit:
Export rerere() and launch_editor().
Introduce entry point add_interactive and add_files_to_cache
Enable wt-status to run against non-standard index file.
Enable wt-status output to a given FILE pointer.
* maint:
RelNotes-1.5.3.5: describe recent fixes
merge-recursive.c: mrtree in merge() is not used before set
sha1_file.c: avoid gcc signed overflow warnings
Fix a small memory leak in builtin-add
honor the http.sslVerify option in shell scripts
prune_directory and fill_directory allocated one byte per pathspec and never
freed it.
Signed-off-by: Benoit Sigoure <tsuna@lrde.epita.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This refactors builtin-add.c a little to provide a unique entry point
for launching git add --interactive, which will be used by
builtin-commit too. If we later want to make add --interactive a
builtin or change how it is launched, we just start from this function.
It also exports the private function update() which is used to
add all modified paths to the index as add_files_to_cache().
Signed-off-by: Kristian Høgsberg <krh@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* cr/reset:
Simplify cache API
An additional test for "git-reset -- path"
Make "git reset" a builtin.
Move make_cache_entry() from merge-recursive.c into read-cache.c
Add tests for documented features of "git reset".
An earlier commit fixed type-change case in "git add -u".
This adds a test to make sure we do not introduce regression.
At the same time, it fixes a stupid typo in the error message.
Signed-off-by: Benoit Sigoure <tsuna@lrde.epita.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Earlier, add_file_to_index() invalidated the path in the cache-tree
but remove_file_from_cache() did not, and the user of the latter
needed to invalidate the entry himself. This led to a few bugs due to
missed invalidate calls already. This patch makes the management of
cache-tree less error prone by making more invalidate calls from lower
level cache API functions.
The rules are:
- If you are going to write the index, you should either maintain
cache_tree correctly.
- If you cannot, alternatively you can remove the entire cache_tree
by calling cache_tree_free() before you call write_cache().
- When you modify the index, cache_tree_invalidate_path() should be
called with the path you are modifying, to discard the entry from
the cache-tree structure.
- The following cache API functions exported from read-cache.c (and
the macro whose names have "cache" instead of "index")
automatically call cache_tree_invalidate_path() for you:
- remove_file_from_index();
- add_file_to_index();
- add_index_entry();
You can modify the index bypassing the above API functions
(e.g. find an existing cache entry from the index and modify it in
place). You need to call cache_tree_invalidate_path() yourself in
such a case.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently the error message seems to imply (at least to me) that only
the listed files were withheld and the rest of the files was added to the
index, even though that's obviously not the case.
Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The usage string for the executable was missing --refresh. In
addition, the documentation referred to "file", but the usage string
referred to "filepattern". Updated the documentation to
"filepattern", as git-add does handle patterns.
Signed-off-by: Brian Hetro <whee@smaertness.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-add -u also takes the path limiters, but unlike the
command without the -u option, the code forgot that it
could be invoked from a subdirectory, and did not correctly
handle the prefix.
Signed-off-by: Salikh Zakirov <salikh@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This applies to 'maint' to fix a rather serious data corruption
issue. When "git add -u" affects a subdirectory in such a way
that the only changes to its contents are path removals, the
next tree object written out of that index was bogus, as the
remove codepath forgot to invalidate the cache-tree entry.
Reported by Salikh Zakirov.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows to refresh only a subset of the project files, based on
the specified pathspecs.
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Calling access(p, m) with p == NULL is not specified, so don't do that. On
GNU/Hurd systems doing so will result in a SIGSEGV.
Signed-off-by: Thomas Schwinge <tschwinge@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, the code would always set up the excludes, and then manually
pick through the pathspec we were given, assuming that non-added but
existing paths were just ignored. This was mostly correct, but would
erroneously mark a totally empty directory as 'ignored'.
Instead, we now use the collect_ignored option of dir_struct, which
unambiguously tells us whether a path was ignored. This simplifies the
code, and means empty directories are now just not mentioned at all.
Furthermore, we now conditionally ask dir_struct to respect excludes,
depending on whether the '-f' flag has been set. This means we don't have
to pick through the result, checking for an 'ignored' flag; ignored entries
were either added or not in the first place.
We can safely get rid of the special 'ignored' flags to dir_entry, which
were not used anywhere else.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than updating all working tree paths, we limit
ourselves to paths listed on the command line.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a shorthand of what "git commit -a" does in preparation
for making a commit, which is:
git diff-files --name-only -z | git update-index --remove -z --stdin
Signed-off-by: Junio C Hamano <junkio@cox.net>
* jc/index-output:
git-read-tree --index-output=<file>
_GIT_INDEX_OUTPUT: allow plumbing to output to an alternative index file.
Conflicts:
builtin-apply.c
This function was not called "add_file_to_cache()" only because
an ancient program, update-cache, used that name as an internal
function name that does something slightly different. Now that
is gone, we can take over the better name.
The plan is to name all functions that operate on the default
index xxx_cache(). Later patches create a variant of them that
take an explicit parameter xxx_index(), and then turn
xxx_cache() functions into macros that use "the_index".
Signed-off-by: Junio C Hamano <junkio@cox.net>
When defined, this allows plumbing commands that update the
index (add, apply, checkout-index, merge-recursive, mv,
read-tree, rm, update-index, and write-tree) to write their
resulting index to an alternative index file while holding a
lock to the original index file. With this, git-commit that
jumps the index does not have to make an extra copy of the index
file, and more importantly, it can do the update while holding
the lock on the index.
However, I think the interface to let an environment variable
specify the output is a mistake, as shown in the documentation.
If a curious user has the environment variable set to something
other than the file GIT_INDEX_FILE points at, almost everything
will break. This should instead be a command line parameter to
tell these plumbing commands to write the result in the named
file, to prevent stupid mistakes.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The way things are set up, you can now pass a "pathspec" to the
"read_directory()" function. If you pass NULL, it acts exactly
like it used to do (read everything). If you pass a non-NULL
pointer, it will simplify it into a "these are the prefixes
without any special characters", and stop any readdir() early if
the path in question doesn't match any of the prefixes.
NOTE! This does *not* obviate the need for the caller to do the *exact*
pathspec match later. It's a first-level filter on "read_directory()", but
it does not do the full pathspec thing. Maybe it should. But in the
meantime, builtin-add.c really does need to do first
read_directory(dir, .., pathspec);
if (pathspec)
prune_directory(dir, pathspec, baselen);
ie the "prune_directory()" part will do the *exact* pathspec pruning,
while the "read_directory()" will use the pathspec just to do some quick
high-level pruning of the directories it will recurse into.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds the 'core.excludesfile' configuration variable. This variable can
hold a path to a file containing patterns of file names to exclude from
git-add, like $GIT_DIR/info/exclude. Patterns in the excludes file are used
in addition to those in info/exclude.
Signed-off-by: James Bowes <jbowes@dangerouslyinc.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When '*.ig' is ignored, and you have two files f.ig and d.ig/foo
in the working tree,
$ git add .
correctly ignored f.ig but failed to ignore d.ig/foo. This was
caused by a thinko in an earlier commit 4888c534, when we tried
to allow adding otherwise ignored files.
After reverting that commit, this takes a much simpler approach.
When we have an unmatched pathspec that talks about an existing
pathname, we know it is an ignored path the user tried to add,
so we include it in the set of paths directory walker returned.
This does not let you say "git add -f D" on an ignored directory
D and add everything under D. People can submit a patch to
further allow it if they want to, but I think it is a saner
behaviour to require explicit paths to be spelled out in such a
case.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Instead of just warning, refuse to add otherwise ignored files
by default, and allow it with an -f option.
Signed-off-by: Junio C Hamano <junkio@cox.net>
We allow otherwise ignored paths to be added to the index by
spelling its path out on the command line, but we would warn the
user about them when we do so.
Signed-off-by: Junio C Hamano <junkio@cox.net>
One thing many people found confusing about git-add was that a
file whose name matches an ignored pattern could not be added to
the index. With this, such a file can be added by explicitly
spelling its name to git-add.
Fileglobs and recursive behaviour do not add ignored files to
the index. That is, if a pattern '*.o' is in .gitignore, and
two files foo.o, bar/baz.o are in the working tree:
$ git add foo.o
$ git add '*.o'
$ git add bar
Only the first form adds foo.o to the index.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a mechanical clean-up of the way *.c files include
system header files.
(1) sources under compat/, platform sha-1 implementations, and
xdelta code are exempt from the following rules;
(2) the first #include must be "git-compat-util.h" or one of
our own header file that includes it first (e.g. config.h,
builtin.h, pkt-line.h);
(3) system headers that are included in "git-compat-util.h"
need not be included in individual C source files.
(4) "git-compat-util.h" does not have to include subsystem
specific header files (e.g. expat.h).
Signed-off-by: Junio C Hamano <junkio@cox.net>
A script to be driven when the user says "git add --interactive"
is introduced.
When it is run, first it runs its internal 'status' command to
show the current status, and then goes into its internactive
command loop.
The command loop shows the list of subcommands available, and
gives a prompt "What now> ". In general, when the prompt ends
with a single '>', you can pick only one of the choices given
and type return, like this:
*** Commands ***
1: status 2: update 3: revert 4: add untracked
5: patch 6: diff 7: quit 8: help
What now> 1
You also could say "s" or "sta" or "status" above as long as the
choice is unique.
The main command loop has 6 subcommands (plus help and quit).
* 'status' shows the change between HEAD and index (i.e. what
will be committed if you say "git commit"), and between index
and working tree files (i.e. what you could stage further
before "git commit" using "git-add") for each path. A sample
output looks like this:
staged unstaged path
1: binary nothing foo.png
2: +403/-35 +1/-1 git-add--interactive.perl
It shows that foo.png has differences from HEAD (but that is
binary so line count cannot be shown) and there is no
difference between indexed copy and the working tree
version (if the working tree version were also different,
'binary' would have been shown in place of 'nothing'). The
other file, git-add--interactive.perl, has 403 lines added
and 35 lines deleted if you commit what is in the index, but
working tree file has further modifications (one addition and
one deletion).
* 'update' shows the status information and gives prompt
"Update>>". When the prompt ends with double '>>', you can
make more than one selection, concatenated with whitespace or
comma. Also you can say ranges. E.g. "2-5 7,9" to choose
2,3,4,5,7,9 from the list. You can say '*' to choose
everything.
What you chose are then highlighted with '*', like this:
staged unstaged path
1: binary nothing foo.png
* 2: +403/-35 +1/-1 git-add--interactive.perl
To remove selection, prefix the input with - like this:
Update>> -2
After making the selection, answer with an empty line to
stage the contents of working tree files for selected paths
in the index.
* 'revert' has a very similar UI to 'update', and the staged
information for selected paths are reverted to that of the
HEAD version. Reverting new paths makes them untracked.
* 'add untracked' has a very similar UI to 'update' and
'revert', and lets you add untracked paths to the index.
* 'patch' lets you choose one path out of 'status' like
selection. After choosing the path, it presents diff between
the index and the working tree file and asks you if you want
to stage the change of each hunk. You can say:
y - add the change from that hunk to index
n - do not add the change from that hunk to index
a - add the change from that hunk and all the rest to index
d - do not the change from that hunk nor any of the rest to index
j - do not decide on this hunk now, and view the next
undecided hunk
J - do not decide on this hunk now, and view the next hunk
k - do not decide on this hunk now, and view the previous
undecided hunk
K - do not decide on this hunk now, and view the previous hunk
After deciding the fate for all hunks, if there is any hunk
that was chosen, the index is updated with the selected hunks.
* 'diff' lets you review what will be committed (i.e. between
HEAD and index).
This is still rough, but does everything except a few things I
think are needed.
* 'patch' should be able to allow splitting a hunk into
multiple hunks.
* 'patch' does not adjust the line offsets @@ -k,l +m,n @@
in the hunk header. This does not have major problem in
practice, but it _should_ do the adjustment.
* It does not have any explicit support for a merge in
progress; it may not work at all.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This brings the power of the index up front using a proper mental model
without talking about the index at all. See for example how all the
technical discussion has been evacuated from the git-add man page.
Any content to be committed must be added together. Whether that
content comes from new files or modified files doesn't matter. You
just need to "add" it, either with git-add, or by providing
git-commit with -a (for already known files only of course).
No need for a separate command to distinguish new vs modified files
please. That would only screw the mental model everybody should have
when using GIT.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Most of the callers except the one in refs.c use the function to
update the index file. Among the index writers, everybody
except write-tree dies if they cannot open it for writing.
This gives the function an extra argument, to tell it to die
when it cannot create a new file as the lockfile.
The only caller that does not have to die is write-tree, because
updating the index for the cache-tree part is optional and not
being able to do so does not affect the correctness. I think we
do not have to be so careful and make the failure into die() the
same way as other callers, but that would be a different patch.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The only change in behaviour should be having a "usage: " prefix
on the output string rather than "fatal: ", and an exit code of
129 rather than 128.
Signed-off-by: Ramsay Allan Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This merges the new built-in calling convention code into Johannes's
builtin-mv topic in order to resolve their conflicts early on.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This changes the calling convention of built-in commands and
passes the "prefix" (i.e. pathname of $PWD relative to the
project root level) down to them.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This also moves add_file_to_index() to read-cache.c. Oh, and while
touching builtin-add.c, it also removes a duplicate git_config() call.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The framework to create lockfiles that are removed at exit is
first used to reliably write the index file, but it is
applicable to other things, so stop calling it "cache_file".
This also rewords a few remaining error message that called the
index file "cache file".
Signed-off-by: Junio C Hamano <junkio@cox.net>
This commit is what this branch is all about. It records the
evil merge needed to adjust built-in git-add and git-rm for
the cache-tree extension.
* lt/dirwalk:
Add builtin "git rm" command
Move pathspec matching from builtin-add.c into dir.c
Prevent bogus paths from being added to the index.
builtin-add: fix unmatched pathspec warnings.
Remove old "git-add.sh" remnants
builtin-add: warn on unmatched pathspecs
Do "git add" as a builtin
Clean up git-ls-file directory walking library interface
libify git-ls-files directory traversal
Conflicts:
Makefile
builtin.h
git.c
update-index.c
"git add Documentation/" when Documentation directory exists
does not barf (as it should not), but "git add ." barfed when it
did not add anything. This was because we checked for the path
prefix ("Documentation/" in the former case, and an empty string
in the latter case) for existence, and lstat("", &st) would say
"Huh?".
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is in the same spirit as what bba319b5 and 45e48120 tried
to do to help users. A command such as "git add Documentaiton"
with misspelled pathspecs would give a friendly reminder with
this.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
First try. Let's see how well this works.
In many ways, the hard parts of "git commit" are not so different from
this, and a builtin commit would share a lot of the code, I think.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>