214 Commits

Author SHA1 Message Date
Junio C Hamano
2855e70ad1 Merge branch 'py/diff-submodule'
* py/diff-submodule:
  is_racy_timestamp(): do not check timestamp for gitlinks
  diff-lib.c: rename check_work_tree_entity()
  diff: a submodule not checked out is not modified
  Add t7506 to test submodule related functions for git-status
  t4027: test diff for submodule with empty directory
2008-05-10 18:16:25 -07:00
Junio C Hamano
380a742679 Merge branch 'lt/case-insensitive'
* lt/case-insensitive:
  Make git-add behave more sensibly in a case-insensitive environment
  When adding files to the index, add support for case-independent matches
  Make unpack-tree update removed files before any updated files
  Make branch merging aware of underlying case-insensitive filesystems
  Add 'core.ignorecase' option
  Make hash_name_lookup able to do case-independent lookups
  Make "index_name_exists()" return the cache_entry it found
  Move name hashing functions into a file of its own
  Make unpack_trees_options bit flags actual bitfields
2008-05-10 18:14:28 -07:00
Junio C Hamano
050288d52d is_racy_timestamp(): do not check timestamp for gitlinks
Because we do not even check the timestamp to determine if a gitlink
is up to date or not, triggering the racy-timestamp check for gitlinks
does not make sense.

This fixes the recently added test in t7506.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-04 17:41:27 -07:00
Junio C Hamano
e06c43c795 write_index(): optimize ce_smudge_racily_clean_entry() calls with CE_UPTODATE
When writing the index out, we need to check the work tree again for any
entry whose timestamp indicates that it could be "racily clean", in
order to smudge it if it is stat-clean but has modified contents.

However, we can skip this step for entries marked with CE_UPTODATE,
which are known to be really clean (i.e. the ones we have already
checked when we prepared the index).  This will reduce lstat(2) calls
necessary in git-status.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-12 19:42:17 -07:00
Linus Torvalds
1102952b45 Make git-add behave more sensibly in a case-insensitive environment
This expands on the previous patch, and allows "git add" to sanely handle
a filename that has changed case, keeping the case in the index constant,
and avoiding aliases.

In particular, if you have an index entry called "File", but the
checked-out tree is case-corrupted and has an entry called "file"
instead, doing a

	git add .

(or naming "file" explicitly) will automatically notice that we have an
alias, and will replace the name "file" with the existing index
capitalization (ie "File").

However, if we actually have *both* a file called "File" and one called
"file", and they don't have the same lstat() information (ie we're on a
case-sensitive filesystem but have the "core.ignorecase" flag set), we
will error out if we try to add them both.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
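
The "git add" behaviour described in the commit above can be modelled with a minimal Python sketch. This is not git's C implementation: the `Index` class, its `add()` method and the `same_file` parameter (standing in for "lstat() says both names refer to one file") are hypothetical, and case folding is plain `lower()` for simplicity.

```python
class Index:
    """Toy index that keeps the recorded capitalization of each path."""

    def __init__(self, ignore_case=True):
        self.ignore_case = ignore_case   # models the core.ignorecase setting
        self.names = {}                  # folded name -> name as recorded

    def _fold(self, name):
        return name.lower() if self.ignore_case else name

    def add(self, name, same_file=True):
        """Return the name under which the path ends up in the index."""
        key = self._fold(name)
        existing = self.names.get(key)
        if existing is None or existing == name:
            self.names[key] = name
            return name
        if same_file:
            # Alias of an existing entry: keep the index capitalization.
            return existing
        # Two distinct files whose names differ only in case.
        raise ValueError(f"{name!r} conflicts with {existing!r}")
```

With "File" already recorded, adding "file" returns "File"; the error path corresponds to the last paragraph of the message, where both names exist as separate files.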
Linus Torvalds
6835550def When adding files to the index, add support for case-independent matches
This simplifies the matching case of "I already have this file and it is
up-to-date" and makes it do the right thing in the face of
case-insensitive aliases.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Linus Torvalds
96872bc200 Move name hashing functions into a file of its own
It's really totally separate functionality, and if we want to start
doing case-insensitive hash lookups, I'd rather do it when it's
separated out.

It also renames "remove_index_entry()" to "remove_name_hash()", because
that really describes the thing better. It doesn't actually remove the
index entry, that's done by "remove_index_entry_at()", which is something
very different, despite the similarity in names.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Linus Torvalds
d1f128b050 Add 'const' where appropriate to index handling functions
This is part of an effort to make the source index of 'unpack_trees()'
const, so that the compiler helps us verify that we only access it for
reading.

The constification also extended to some of the hashing helpers that get
called indirectly.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-09 00:43:48 -08:00
Linus Torvalds
0ab9e1e8cd Add 'df_name_compare()' helper function
This new helper is identical to base_name_compare(), except it compares
conflicting directory/file entries as equal in order to help handling DF
conflicts (thus the name).

Note that while a directory name compares as equal to a regular file
with the new helper, they then individually compare _differently_ to a
filename that has a dot after the basename (because '\0' < '.' < '/').

So a directory called "foo/" will compare equal to a file "foo", even
though "foo.c" will compare after "foo" and before "foo/".

This will be used by routines that want to traverse the git namespace
but then handle conflicting entries together when possible.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-09 00:43:46 -08:00
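
The ordering rules in the commit above can be illustrated with a small Python sketch. It follows the description in the message rather than the C source: the function signatures (a name as bytes plus an `is_dir` boolean instead of a mode) and the helpers `_byte_after_prefix()` and `_sign()` are assumptions made for illustration.

```python
SLASH = ord("/")


def _byte_after_prefix(name: bytes, pos: int, is_dir: bool) -> int:
    """Byte at pos, or the implicit terminator: '/' for a directory, NUL for a file."""
    if pos < len(name):
        return name[pos]
    return SLASH if is_dir else 0


def _sign(x: int) -> int:
    return (x > 0) - (x < 0)


def base_name_compare(n1: bytes, dir1: bool, n2: bytes, dir2: bool) -> int:
    """Directories sort as if their name ended in '/'."""
    n = min(len(n1), len(n2))
    if n1[:n] != n2[:n]:
        return -1 if n1[:n] < n2[:n] else 1
    return _sign(_byte_after_prefix(n1, n, dir1) - _byte_after_prefix(n2, n, dir2))


def df_name_compare(n1: bytes, dir1: bool, n2: bytes, dir2: bool) -> int:
    """Like base_name_compare(), but a file and a directory of the same name are equal."""
    n = min(len(n1), len(n2))
    if n1[:n] != n2[:n]:
        return -1 if n1[:n] < n2[:n] else 1
    if len(n1) == len(n2):
        return 0                      # same name: D/F conflict compares equal
    c1 = _byte_after_prefix(n1, n, dir1)
    c2 = _byte_after_prefix(n2, n, dir2)
    if {c1, c2} == {0, SLASH}:        # file "foo" versus "foo/"
        return 0
    return _sign(c1 - c2)
```

Because '\0' < '.' < '/', a file "foo" still sorts before "foo.c", and "foo.c" still sorts before a directory "foo", even though the file and the directory compare equal to each other.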
Junio C Hamano
5a4d707a6d Merge branch 'db/checkout'
* db/checkout: (21 commits)
  checkout: error out when index is unmerged even with -m
  checkout: show progress when checkout takes long time while switching branches
  Add merge-subtree back
  checkout: updates to tracking report
  builtin-checkout.c: Remove unused prefix arguments in switch_branches path
  checkout: work from a subdirectory
  checkout: tone down the "forked status" diagnostic messages
  Clean up reporting differences on branch switch
  builtin-checkout.c: fix possible usage segfault
  checkout: notice when the switched branch is behind or forked
  Build in checkout
  Move code to clean up after a branch change to branch.c
  Library function to check for unmerged index entries
  Use diff -u instead of diff in t7201
  Move create_branch into a library file
  Build-in merge-recursive
  Add "skip_unmerged" option to unpack_trees.
  Discard "deleted" cache entries after using them to update the working tree
  Send unpack-trees debugging output to stderr
  Add flag to make unpack_trees() not print errors.
  ...

Conflicts:

	Makefile
2008-02-27 12:53:26 -08:00
Linus Torvalds
d070e3a31b Name hash fixups: export (and rename) remove_hash_entry
This makes the name hash removal function (which really just sets the
bit that disables lookups of it) available to external routines, and
makes read_cache_unmerged() use it when it drops an unmerged entry from
the index.

It's renamed to remove_index_entry(), and we drop the (unused) 'istate'
argument.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-22 21:24:47 -08:00
Linus Torvalds
a22c637124 Fix name re-hashing semantics
We handled the case of removing and re-inserting cache entries badly,
which is something that merging commonly needs to do (removing the
different stages, and then re-inserting one of them as the merged
state).

We even had a rather ugly special case for this failure case, where
replace_index_entry() basically turned itself into a no-op if the new
and the old entries were the same, exactly because the hash routines
didn't handle it on their own.

So what this patch does is to not just have the UNHASHED bit, but a
HASHED bit too, and when you insert an entry into the name hash, that
involves:

 - clear the UNHASHED bit, because now it's valid again for lookup
   (which is really all that UNHASHED meant)

 - if we're being lazy, we're done here (but we still want to clear the
   UNHASHED bit regardless of lazy mode, since we can become unlazy
   later, and so we need the UNHASHED bit to always be set correctly,
   even if we never actually insert the entry into the hash list)

 - if it was already hashed, we just leave it on the list

 - otherwise mark it HASHED and insert it into the list

This all means that unhashing and rehashing a name all just works
automatically.  Obviously, you cannot change the name of an entry (that
would be a serious bug), but nothing can validly do that anyway (you'd
have to allocate a new struct cache_entry anyway since the name length
could change), so that's not a new limitation.

The code actually gets simpler in many ways, although the lazy hashing
does mean that there are a few odd cases (ie something can be marked
unhashed even though it was never on the hash in the first place, and
isn't actually marked hashed!).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-22 21:24:47 -08:00
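
The insertion steps listed in the commit above can be sketched in Python as follows. This is a toy model, not the C code: the bit values, the `Entry` and `NameHash` classes and the `initialized` attribute (standing in for "no longer lazy") are assumptions.

```python
CE_HASHED = 0x1     # entry is linked into the hash list (bit values are made up)
CE_UNHASHED = 0x2   # entry must be ignored by lookups


class Entry:
    def __init__(self, name):
        self.name = name
        self.flags = 0


class NameHash:
    def __init__(self):
        self.initialized = False   # lazy until the first lookup
        self.buckets = {}          # name -> list of entries
        self.entries = []          # every entry handed to insert()

    def _link(self, ce):
        if ce.flags & CE_HASHED:
            return                 # already hashed: leave it on the list
        ce.flags |= CE_HASHED
        self.buckets.setdefault(ce.name, []).append(ce)

    def insert(self, ce):
        ce.flags &= ~CE_UNHASHED   # valid for lookup again, lazy or not
        if not any(e is ce for e in self.entries):
            self.entries.append(ce)
        if self.initialized:
            self._link(ce)

    def remove(self, ce):
        ce.flags |= CE_UNHASHED    # only disables lookups of this entry

    def lookup(self, name):
        if not self.initialized:   # become "unlazy" on first use
            self.initialized = True
            for ce in self.entries:
                if not ce.flags & CE_UNHASHED:
                    self._link(ce)
        for ce in self.buckets.get(name, []):
            if not ce.flags & CE_UNHASHED:
                return ce
        return None
```

Removing and re-inserting the same entry leaves exactly one copy on the list, and in lazy mode an entry can end up marked unhashed without ever having been hashed, which is the odd case the message mentions.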
Daniel Barkalow
94a5728cfb Library function to check for unmerged index entries
It's small, but it was in three places already, so it should be in the
library.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
2008-02-09 23:16:51 -08:00
Junio C Hamano
9cb76b8cdc lazy index hashing
This delays the hashing of index names until it becomes necessary for
the first time.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 23:01:13 -08:00
Linus Torvalds
cf558704fb Create pathname-based hash-table lookup into index
This creates a hash index of every single file added to the index.
Right now that hash index isn't actually used for much: I implemented a
"cache_name_exists()" function that uses it to efficiently look up a
filename in the index without having to do the O(logn) binary search,
but quite frankly, that's not why this patch is interesting.

No, the whole and only reason to create the hash of the filenames in the
index is that by modifying the hash function, you can fairly easily do
things like making it always hash equivalent names into the same bucket.

That, in turn, means that suddenly questions like "does this name exist
in the index under an _equivalent_ name?" become much, much cheaper.

Guiding principles behind this patch:

 - it shouldn't be too costly. In fact, my primary goal here was to
   actually speed up "git commit" with a fully populated kernel tree, by
   being faster at checking whether a file already existed in the index. I
   did succeed, but only barely:

	Best before:
		[torvalds@woody linux]$ time git commit > /dev/null
		real    0m0.255s
		user    0m0.168s
		sys     0m0.088s

	Best after:

		[torvalds@woody linux]$ time ~/git/git commit > /dev/null
		real    0m0.233s
		user    0m0.144s
		sys     0m0.088s

   so some things are actually faster (~8%).

   Caveat: that's really the best case. Other things are invariably going
   to be slightly slower, since we populate that index cache, and quite
   frankly, few things really use it to look things up.

   That said, the cost is really quite small. The worst case is probably
   doing a "git ls-files", which will do very little except populate the
   index, and never actually looks anything up in it, just lists it.

	Before:
		[torvalds@woody linux]$ time git ls-files > /dev/null
		real    0m0.016s
		user    0m0.016s
		sys     0m0.000s

	After:
		[torvalds@woody linux]$ time ~/git/git ls-files > /dev/null
		real    0m0.021s
		user    0m0.012s
		sys     0m0.008s

   and while the thing has really gotten relatively much slower, we're
   still talking about something almost unmeasurable (eg 5ms). And that
   really should be pretty much the worst case.

   So we lose 5ms on one "benchmark", but win 22ms on another. Pick your
   poison - this patch has the advantage that it will _likely_ speed up
   the cases that are complex and expensive more than it slows down the
   cases that are already so fast that nobody cares. But if you look at
   relative speedups/slowdowns, it doesn't look so good.

 - It should be simple and clean

   The code may be a bit subtle (the reasons I do hash removal the way I
   do etc), but it re-uses the existing hash.c files, so it really is
   fairly small and straightforward apart from a few odd details.

Now, this patch on its own doesn't really do much, but I think it's worth
looking at, if only because if done correctly, the name hashing really can
make an improvement to the whole issue of "do we have a filename that
looks like this in the index already". And at least it gets real testing
by being used even by default (ie there is a real use-case for it even
without any insane filesystems).

NOTE NOTE NOTE! The current hash is a joke. I'm ashamed of it, I'm just
not ashamed of it enough to really care. I took all the numbers out of my
nether regions - I'm sure it's good enough that it works in practice, but
the whole point was that you can make a really much fancier hash that
hashes characters not directly, but by their upper-case value or something
like that, and thus you get a case-insensitive hash, while still keeping
the name and the index itself totally case sensitive.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 21:46:30 -08:00
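
The idea in the closing note of the commit above (hash characters by their upper-case value so that equivalent names share a bucket, while the stored names stay case sensitive) can be sketched in Python. The hash function here is not git's: the multiplier, the 32-bit wrap, the ASCII-only folding and the `NameTable` class are all assumptions chosen for illustration.

```python
def name_hash(name: bytes, ignore_case: bool = False) -> int:
    h = 0
    for b in name:
        if ignore_case and ord("a") <= b <= ord("z"):
            b -= 32                       # fold ASCII lower case to upper case
        h = (h * 31 + b) & 0xFFFFFFFF     # arbitrary multiplier, 32-bit wrap
    return h


class NameTable:
    def __init__(self, ignore_case: bool = False):
        self.ignore_case = ignore_case
        self.buckets = {}                 # hash value -> list of stored names

    def add(self, name: bytes) -> None:
        key = name_hash(name, self.ignore_case)
        self.buckets.setdefault(key, []).append(name)

    def find_equivalent(self, name: bytes):
        """Return the stored name matching `name`, up to case if enabled."""
        fold = bytes.upper if self.ignore_case else (lambda s: s)
        key = name_hash(name, self.ignore_case)
        for cand in self.buckets.get(key, []):
            if fold(cand) == fold(name):
                return cand               # stored spelling is returned unchanged
        return None
```

Only the bucket that the probe name hashes to is scanned, which is why the "equivalent name" question becomes cheap once the hash itself is case-insensitive.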
Junio C Hamano
6d91da6d3c read-cache.c: introduce is_racy_timestamp() helper
This moves a common boolean expression into a helper function,
and brings the comparison between the filesystem timestamp and the
index timestamp in line with the other places: st.st_mtime should be
cast to (unsigned int) when compared to an index timestamp ce_mtime.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 21:26:40 -08:00
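
A minimal Python sketch of such a helper, assuming it compares the index file's own timestamp with the mtime cached in the entry and treats a zero index timestamp as "not recorded". The parameter names are hypothetical, and the mask only mimics the (unsigned int) cast mentioned in the message.

```python
def is_racy_timestamp(index_timestamp: int, ce_mtime: int) -> bool:
    """True if the entry may have changed in the same tick the index was written."""
    if not index_timestamp:
        return False                            # no index timestamp to compare with
    as_uint = index_timestamp & 0xFFFFFFFF      # mirrors the (unsigned int) cast
    return as_uint <= ce_mtime
```

An entry whose cached mtime is not older than the index file itself cannot be trusted on stat data alone, since a later write within the same timestamp granularity would look identical.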
Junio C Hamano
077c48df8a read-cache.c: fix a couple more CE_REMOVE conversions
It is a D/F conflict if you want to add "foo/bar" to the index
when "foo" already exists.  Also it is a conflict if you want to
add a file "foo" when "foo/bar" exists.

An exception is when the existing entry is there only to mark "I
used to be here but I am being removed".  This is needed for
operations such as "git read-tree -m -u" that update the index
and then reflect the result to the work tree --- we need to
remember what to remove somewhere, and we use the index for
that.  In such a case, an existing file "foo" is being removed
and we can create "foo/" directory and hang "bar" underneath it
without any conflict.

We used to use (ce->ce_mode == 0) to mark an entry that is being
removed, but (CE_REMOVE & ce->ce_flags) is used for that purpose
these days.  An earlier commit forgot to convert the logic in
the code that checks D/F conflict condition.

The old code knew that "to be removed" entries cannot be at
higher stage and actively checked that condition, but it was an
unnecessary check.  This patch removes the extra check as well.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 21:24:21 -08:00
Junio C Hamano
eadb583134 Avoid running lstat(2) on the same cache entry.
Aside from the lstat(2) done for work tree files, there are
quite a few lstat(2) calls in the refname dwimming codepath.  This
patch is not about reducing them.

 * It adds a new ce_flag, CE_UPTODATE, that is meant to mark the
   cache entries that record a regular file blob that is up to
   date in the work tree.  If somebody later walks the index and
   wants to see if the work tree has changes, they do not have
   to be checked with lstat(2) again.

 * fill_stat_cache_info() marks the cache entry it just added
   with CE_UPTODATE.  This has the effect of marking the paths
   we write out of the index and lstat(2) immediately as "no
   need to lstat -- we know it is up-to-date", from quite a lot
   of callers:

    - git-apply --index
    - git-update-index
    - git-checkout-index
    - git-add (uses add_file_to_index())
    - git-commit (ditto)
    - git-mv (ditto)

 * refresh_cache_ent() also marks the cache entries that are clean
   with CE_UPTODATE.

 * write_index is changed not to write CE_UPTODATE out to the
   index file, because CE_UPTODATE is meant to be transient only
   in core.  For the same reason, CE_UPDATE is not written to
   prevent an accident from happening.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 12:44:31 -08:00
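
The effect of CE_UPTODATE described in the commit above can be sketched in Python. The bit value, the `Entry` class, and the `refresh()` and `flags_for_disk()` helpers are hypothetical, and the injected `lstat` callable returns a simple (mtime, size) tuple instead of real stat data.

```python
CE_UPTODATE = 0x1   # in-core only; the bit value is made up


class Entry:
    def __init__(self, path, cached_stat):
        self.path = path
        self.cached_stat = cached_stat   # (mtime, size) recorded in the index
        self.flags = 0


def refresh(entries, lstat):
    """Check entries against the work tree; return how many lstat() calls were made."""
    calls = 0
    for ce in entries:
        if ce.flags & CE_UPTODATE:
            continue                     # already verified in this process
        calls += 1
        if lstat(ce.path) == ce.cached_stat:
            ce.flags |= CE_UPTODATE      # clean: remember that in core
    return calls


def flags_for_disk(ce):
    """Flags as they would be written out: the transient bit is dropped."""
    return ce.flags & ~CE_UPTODATE
```

A second walk over the same entries only has to stat the paths that were not found clean the first time.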
Junio C Hamano
7fec10b7f4 index: be careful when handling long names
We currently use the lower 12 bits (masked with CE_NAMEMASK) in the
ce_flags field to store the length of the name in cache_entry,
without checking the length parameter given to
create_ce_flags().  This can make us store incorrect length.

Currently we are mostly protected by the fact that many
codepaths first copy the path in a variable of size PATH_MAX,
which typically is 4096 that happens to match the limit, but
that feels like a bug waiting to happen.  Besides, that would
not allow us to shorten the width of CE_NAMEMASK to use the bits
for new flags.

This redefines the meaning of the name length stored in the
cache_entry.  A name that does not fit is represented by storing
CE_NAMEMASK in the field, and the actual length needs to be
computed by actually counting the bytes in the name[] field.
This way, only the unusually long paths need to suffer.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 12:44:31 -08:00
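
The redefined name-length encoding from the commit above can be sketched in Python. The 12-bit mask comes from the message; the stage shift of 12 and the exact function signatures are assumptions.

```python
CE_NAMEMASK = 0x0FFF    # lower 12 bits of ce_flags hold the name length
CE_STAGESHIFT = 12      # assumed position of the stage bits


def create_ce_flags(name_len: int, stage: int) -> int:
    stored = min(name_len, CE_NAMEMASK)    # too long: store the mask as a sentinel
    return stored | (stage << CE_STAGESHIFT)


def ce_namelen(flags: int, name: bytes) -> int:
    stored = flags & CE_NAMEMASK
    if stored < CE_NAMEMASK:
        return stored                      # common case: length fits in the field
    return len(name)                       # sentinel: count the bytes of the name
```

Only names of 4095 bytes or more pay for the extra count, and a long length can no longer spill into the neighbouring flag bits.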
Linus Torvalds
7a51ed66f6 Make on-disk index representation separate from in-core one
This converts the index explicitly on read and write to its on-disk
format, allowing the in-core format to contain more flags, and be
simpler.

In particular, the in-core format is now host-endian (as opposed to the
on-disk one that is network endian in order to be able to be shared
across machines) and as a result we can dispense with all the
htonl/ntohl on accesses to the cache_entry fields.

This will make it easier to make use of various temporary flags that do
not exist in the on-disk format.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 12:44:31 -08:00
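
The split between the two representations can be illustrated with a Python sketch that converts explicitly on read and write. The three-field record layout is a deliberate simplification and not the real index entry format; `CacheEntry`, `read_entry()` and `write_entry()` are hypothetical names.

```python
import struct
from dataclasses import dataclass

# Simplified on-disk record: mtime, size, flags in network (big-endian) order.
ONDISK = struct.Struct("!IIH")


@dataclass
class CacheEntry:
    mtime: int
    size: int
    flags: int          # in core this may carry extra bits above the low 16


def read_entry(raw: bytes) -> CacheEntry:
    """Convert once on read; afterwards fields are plain host integers."""
    return CacheEntry(*ONDISK.unpack(raw))


def write_entry(ce: CacheEntry) -> bytes:
    """Convert once on write; in-core-only flag bits are masked off."""
    return ONDISK.pack(ce.mtime, ce.size, ce.flags & 0xFFFF)
```

Code that works on `CacheEntry` never needs a byte-order conversion, and temporary flags can live in the upper bits without affecting what is shared across machines.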
Junio C Hamano
c78a24986d Merge branch 'jc/maint-add-sync-stat'
* jc/maint-add-sync-stat:
  t2200: test more cases of "add -u"
  git-add: make the entry stat-clean after re-adding the same contents
  ce_match_stat, run_diff_files: use symbolic constants for readability

Conflicts:

	builtin-add.c
2007-11-14 14:15:40 -08:00
Junio C Hamano
fb63d7f889 git-add: make the entry stat-clean after re-adding the same contents
Earlier in commit 0781b8a9b2
(add_file_to_index: skip rehashing if the cached stat already
matches), add_file_to_index() was taught not to re-add the path
if it already matches the index.

The change meant well, but was not executed quite right.  It
used ie_modified() to see if the file on the work tree is really
different from the index, and skipped adding the contents if the
function says "not modified".

This was wrong.  There are three possible comparison results
between the index and the file in the work tree:

 - with lstat(2) we _know_ they are different.  E.g. if the
   length or the owner in the cached stat information is
   different from the length we just obtained from lstat(2), we
   can tell the file is modified without looking at the actual
   contents.

 - with lstat(2) we _know_ they are the same.  The same length,
   the same owner, the same everything (but this has a twist, as
   described below).

 - we cannot tell from lstat(2) information alone and need to go
   to the filesystem to actually compare.

The last case arises from what we call 'racy git' situation,
that can be caused with this sequence:

    $ echo hello >file
    $ git add file
    $ echo aeiou >file ;# the same length

If the second "echo" is done within the same filesystem
timestamp granularity as the first "echo", then the timestamp
recorded by "git add" and the timestamp we get from lstat(2)
will be the same, and we can mistakenly say the file is not
modified.  The path is called 'racily clean'.  We need to
reliably detect when racily clean paths are in fact modified.

To solve this problem, when we write out the index, we mark the
index entry that has the same timestamp as the index file itself
(that is the time from the point of view of the filesystem) to
tell any later code that does the lstat(2) comparison not to
trust the cached stat info, and ie_modified() then actually goes
to the filesystem to compare the contents for such a path.

That's all good, but it should not be used for this "git add"
optimization, as the goal of "git add" is to actually update the
path in the index and make it stat-clean.  With the false
optimization, we did _not_ cause any data loss (after all, what
we failed to do was only to update the cached stat information),
but it made the following sequence leave the file stat dirty:

    $ echo hello >file
    $ git add file
    $ echo hello >file ;# the same contents
    $ git add file

The solution is not to use ie_modified() which goes to the
filesystem to see if it is really clean, but instead use
ie_match_stat() with "assume racily clean paths are dirty"
option, to force re-adding of such a path.

There was another problem with "git add -u".  The codepath
shares the same issue when adding the paths that are found to be
modified, but in addition, it asked the run_diff_files() function
(the machinery behind "git diff-files") to list the paths that are
modified.  But that machinery uses the same ie_modified() call, so
it does not report racily clean _and_ actually clean paths as
modified, which is not what we want.

The patch allows the callers of run_diff_files() to pass the
same "assume racily clean paths are dirty" option, and makes the
"git-add -u" codepath use that option, to discover and re-add
racily clean _and_ actually clean paths.

We could further optimize on top of this patch to differentiate
the case where the path really needs re-adding (i.e. the content
of the racily clean entry was indeed different) and the case
where only the cached stat information needs to be refreshed
(i.e. the racily clean entry was actually clean), but I do not
think it is worth it.

This patch applies to maint and all the way up.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-10 00:37:39 -08:00
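
The three comparison outcomes and the "assume racily clean paths are dirty" option from the commit above can be sketched in Python. The function name, its parameters and the (mtime, size) tuples are hypothetical simplifications of the cached stat data.

```python
def path_changed(cached, current, index_mtime, racy_is_dirty, contents_differ):
    """Decide whether a path should be treated as modified.

    cached / current: (mtime, size) from the index and from lstat()
    index_mtime:      timestamp of the index file itself
    contents_differ:  callable that compares file contents, used only if needed
    """
    if cached != current:
        return True               # lstat() alone proves a change
    if cached[0] < index_mtime:
        return False              # lstat() alone proves the path is clean
    # Racily clean: the stat data matches but cannot be trusted.
    if racy_is_dirty:
        return True               # what "git add" wants: force a re-add
    return contents_differ()      # ie_modified() style: go and compare contents
```

With `racy_is_dirty` set, a racily clean path is always re-added, which refreshes its cached stat information even when the contents turn out to be identical.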
Junio C Hamano
4bd5b7dacc ce_match_stat, run_diff_files: use symbolic constants for readability
ce_match_stat() can be told:

 (1) to ignore CE_VALID bit (used under "assume unchanged" mode)
     and perform the stat comparison anyway;

 (2) not to perform the contents comparison for racily clean
     entries and report mismatch of cached stat information;

using its "option" parameter.  Give them symbolic constants.

Similarly, run_diff_files() can be told not to report anything
on removed paths.  Also give it a symbolic constant for that.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-10 00:24:51 -08:00
Shawn O. Pearce
e75c55844f Merge branch 'maint'
* maint:
  Yet more 1.5.3.5 fixes mentioned in release notes
  cvsserver: Use exit 1 instead of die when req_Root fails.
  git-blame shouldn't crash if run in an unmerged tree
  git-config: print error message if the config file cannot be read
  fixing output of non-fast-forward output of post-receive-email
2007-10-18 03:11:17 -04:00
Linus Torvalds
cd8ae20195 git-blame shouldn't crash if run in an unmerged tree
If we are in the middle of resolving a merge conflict there may be
one or more files whose entries in the index represent an unmerged
state (index entries in the higher-order stages).

Attempting to run git-blame on any file in such a working directory
resulted in "fatal: internal error: ce_mode is 0", because the magic
marker we use for an unmerged entry is a ce_mode of 0 (set up by things
like diff-lib.c's do_diff_cache() and builtin-read-tree.c's
read_tree_unmerged()) and the ce_match_stat_basic() function gets upset
about this.

I'm not entirely sure that the whole "ce_mode = 0" case is a good
idea to begin with, and maybe the right thing to do is to remove
that horrid freakish special case, but removing the internal error
seems to be the simplest fix for now.

                Linus

[sp: Thanks to Björn Steinbrink for the test case]

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-10-18 02:31:30 -04:00
Carlos Rica
102c2338da Move make_cache_entry() from merge-recursive.c into read-cache.c
The function make_cache_entry() is too useful to be hidden away in
merge-recursive.  So move it to libgit.a (exposing it via cache.h).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-26 13:42:10 -07:00
Pierre Habouzit
1dffb8fa80 Small cache_tree_write refactor.
This function cannot fail, make it void. Also make write_one act on a
const char* instead of a char*.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-26 02:27:06 -07:00
Junio C Hamano
58f6fb53dd Merge branch 'jc/cachetree' into cr/reset
* jc/cachetree:
  Simplify cache API
  git-format-patch --in-reply-to: accept <message@id> with angle brackets
  git-add -u: do not barf on type changes
  Remove duplicate note about removing commits with git-filter-branch
  git-clone: improve error message if curl program is missing or not executable
  git.el: Allow the add and remove commands to be applied to ignored files.
  git.el: Allow selecting whether to display uptodate/unknown/ignored files.
  git.el: Keep the status buffer sorted by filename.
  hooks--update: Explicitly check for all zeros for a deleted ref.
2007-09-14 01:19:30 -07:00
Junio C Hamano
09d5dc32fb Simplify cache API
Earlier, add_file_to_index() invalidated the path in the cache-tree
but remove_file_from_cache() did not, and the user of the latter
needed to invalidate the entry himself.  This led to a few bugs due to
missed invalidate calls already.  This patch makes the management of
cache-tree less error prone by making more invalidate calls from lower
level cache API functions.

The rules are:

 - If you are going to write the index, you should maintain
   cache_tree correctly.

   - If you cannot, alternatively you can remove the entire cache_tree
     by calling cache_tree_free() before you call write_cache().

   - When you modify the index, cache_tree_invalidate_path() should be
     called with the path you are modifying, to discard the entry from
     the cache-tree structure.

 - The following cache API functions exported from read-cache.c (and
   the macros whose names have "cache" instead of "index")
   automatically call cache_tree_invalidate_path() for you:

   - remove_file_from_index();
   - add_file_to_index();
   - add_index_entry();

   You can modify the index bypassing the above API functions
   (e.g. find an existing cache entry from the index and modify it in
   place).  You need to call cache_tree_invalidate_path() yourself in
   such a case.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-14 01:02:21 -07:00
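
The rules in the commit above can be modelled with a short Python sketch. The real cache-tree records more than a validity bit per directory, so the `Index` class and its `tree_valid` dictionary are a simplification, and the method names only loosely follow the C functions.

```python
class Index:
    def __init__(self):
        self.entries = {}       # path -> blob id
        self.tree_valid = {}    # directory ("" is the root) -> cached tree usable?

    def cache_tree_invalidate_path(self, path):
        """Discard cached tree information for every directory above path."""
        parts = path.split("/")[:-1]
        self.tree_valid[""] = False
        for i in range(1, len(parts) + 1):
            self.tree_valid["/".join(parts[:i])] = False

    def add_file(self, path, blob):
        self.entries[path] = blob
        self.cache_tree_invalidate_path(path)    # done for the caller

    def remove_file(self, path):
        self.entries.pop(path, None)
        self.cache_tree_invalidate_path(path)    # done for the caller

    def cache_tree_free(self):
        self.tree_valid.clear()                  # fallback: drop everything
```

Writing to `entries` directly bypasses the invalidation, so in that case the caller has to invoke `cache_tree_invalidate_path()` itself or drop the whole cache-tree before writing.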
Carlos Rica
6640f88165 Move make_cache_entry() from merge-recursive.c into read-cache.c
The function make_cache_entry() is too useful to be hidden away in
merge-recursive.  So move it to libgit.a (exposing it via cache.h).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-12 13:25:07 -07:00
Alexandre Julliard
d616813d75 git-add: Add support for --refresh option.
This allows refreshing only a subset of the project files, based on
the specified pathspecs.

Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-13 12:58:38 -07:00
Junio C Hamano
af3785dc5a Optimize "diff --cached" performance.
The read_tree() function is called only from the call chain to
run "git diff --cached" (this includes the internal call made by
git-runstatus to run_diff_index()).  The function vacates stage
without any funky "merge" magic.  The caller then goes and
compares stage #1 entries from the tree with stage #0 entries
from the original index.

When adding the cache entries this way, it used the general
purpose add_cache_entry().  This function looks for an existing
entry to replace or if there is none to find where to insert the
new entry, resolves D/F conflict and all the other things.

For the purpose of reading entries into an empty stage, none of
that processing is needed.  We can instead append everything and
then sort the result at the end.

This commit changes read_tree() to first make sure that there are
no existing cache entries at the specified stage, and if that is the
case, it runs add_cache_entry() with the (new) ADD_CACHE_JUST_APPEND
flag, and then sorts the resulting cache using qsort().

This new flag tells add_cache_entry() to omit all the checks
such as "Does this path already exist?  Does adding this path
remove other existing entries because it turns a directory to a
file?" and instead append the given cache entry straight at the
end of the active cache.  The caller of course is expected to
sort the resulting cache at the end before using the result.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-10 11:44:23 -07:00
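
The append-then-sort approach from the commit above can be sketched in Python. Entries are reduced to (name, stage) tuples, the general-purpose path is a plain sorted insert without the D/F conflict handling, and the function names and the `just_append` parameter are stand-ins for the C API.

```python
import bisect


def add_entry(cache, entry, just_append=False):
    """cache is a list of (name, stage) tuples, sorted unless just_append is used."""
    if just_append:
        cache.append(entry)             # no lookup, no conflict checks
        return
    pos = bisect.bisect_left(cache, entry)
    if pos < len(cache) and cache[pos] == entry:
        cache[pos] = entry              # replace the existing entry
    else:
        cache.insert(pos, entry)        # insert at the sorted position


def read_tree(cache, tree_names, stage):
    if any(s == stage for _, s in cache):
        for name in tree_names:         # stage already populated: general path
            add_entry(cache, (name, stage))
        return
    for name in tree_names:
        add_entry(cache, (name, stage), just_append=True)
    cache.sort()                        # one sort at the end, as with qsort()
```

Both paths end with the same sorted cache; the fast path simply defers all ordering work to a single sort.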
Junio C Hamano
0781b8a9b2 add_file_to_index: skip rehashing if the cached stat already matches
An earlier commit 366bfcb6 broke git-add by moving read_cache()
call down, because it wanted the directory walking code to grab
paths that are already in the index.  The change serves its
purpose, but introduces a regression because the responsibility
of avoiding unnecessary reindexing by matching the cached stat
is shifted nowhere.

This makes it the job of add_file_to_index() function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-30 17:49:50 -07:00
Johannes Schindelin
2031427167 git add: respect core.filemode with unmerged entries
When a merge left unmerged entries, git add failed to pick up the
file mode from the index, when core.filemode == 0. If more than one
unmerged entry is there, the order of stage preference is 2, 1, 3.

Noticed by Johannes Sixt.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-01 13:26:05 -07:00
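
The stage preference mentioned in the commit above can be written as a small Python sketch. The function name and the dictionary representation of the unmerged entries are hypothetical.

```python
def mode_from_unmerged(stages):
    """stages maps a stage number (1, 2 or 3) to the mode recorded in the index."""
    for stage in (2, 1, 3):        # preference order given in the commit message
        if stage in stages:
            return stages[stage]
    return None                    # no unmerged entry for this path
```

For a path with entries at stages 1 and 3 only, the mode of stage 1 is used.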
Junio C Hamano
a6080a0a44 War on whitespace
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept in to our source files over time.  There are a few files that need
to have trailing whitespace (most notably, test vectors).  The result
still passes the tests, and the build result in the Documentation/ area is unchanged.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-07 00:04:01 -07:00
Martin Waitz
302b9282c9 rename dirlink to gitlink.
Unify naming of plumbing dirlink/gitlink concept:

git ls-files -z '*.[ch]' |
xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;'

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-21 23:34:54 -07:00
Luiz Fernando N. Capitulino
3511a3774e read_cache_from(): small simplification
This change 'opens' the code block which maps the index file into
memory, making the code clearer and easier to read.

Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-25 13:44:27 -07:00
Junio C Hamano
4aab5b46f4 Make read-cache.c "the_index" free.
This makes all low-level functions defined in read-cache.c
take an explicit index_state structure as their first parameter,
to specify which index to work on.  These functions
traditionally operated on "the_index" and were named foo_cache();
the counterparts this patch introduces are called foo_index().

The traditional foo_cache() functions are made into macros that
give "the_index" to their corresponding foo_index() functions.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-22 22:53:54 -07:00
Junio C Hamano
228e94f935 Move index-related variables into a structure.
This defines an index_state structure and moves index-related
global variables into it.  Currently there is one instance of
it, the_index, and everybody accesses it, so there is no code
change.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-22 22:53:54 -07:00
Linus Torvalds
a8ee75bc7a Fix gitlink index entry filesystem matching
The code to match up index entries with the filesystem was stupidly
broken.  We shouldn't compare the filesystem stat() information with
S_IFDIRLNK, since that's purely a git-internal value, and not what the
filesystem uses (on the filesystem, it's just a regular directory).

Also, don't bother to make the stat() time comparisons etc for DIRLNK
entries in ce_match_stat_basic(), since we do an exact match for these
things, and the hints in the stat data simply don't matter.

This fixes "git status" with submodules that haven't been checked out in
the supermodule.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-14 03:14:12 -07:00
Linus Torvalds
095952585c Teach directory traversal about subprojects
This is the promised cleaned-up version of teaching directory traversal
(ie the "read_directory()" logic) about subprojects. That makes "git add"
understand to add/update subprojects.

It now knows to look at the index file to see if a directory is marked as
a subproject, and uses that information to decide whether it should be
recursed into or not.

It also generally cleans up the handling of directory entries when
traversing the working tree, by splitting up the decision-making process
into small functions of their own, and adding a fair number of comments.

Finally, it teaches "add_file_to_cache()" that directory names can have
slashes at the end, since the directory traversal adds them to make the
difference between a file and a directory clear (it always did that, but
my previous too-ugly-to-apply subproject patch had a totally different
path for subproject directories and avoided the slash for that case).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-11 19:09:55 -07:00
Linus Torvalds
1833a92548 Fix thinko in subproject entry sorting
This fixes a total thinko in my original series: subprojects do *not* sort
like directories, because the index is sorted purely by full pathname, and
since a subproject shows up in the index as a normal NUL-terminated
string, it never has the issues with sorting with the '/' at the end.

So if you have a subproject "proj" and a file "proj.c", the subproject
sorts alphabetically before the file in the index (and must thus also sort
that way in a tree object, since trees sort as the index).

In contrast, if you have two files "proj/file" and "proj.c", the "proj.c"
will sort alphabetically before "proj/file" in the index. The index
itself, of course, does not actually contain an entry "proj/", but in the
*tree* that gets written out, the tree entry "proj" will sort after the
file entry "proj.c", which is the only real magic sorting rule.

In other words: the magic sorting rule only affects tree entries, and
*only* affects tree entries that point to other trees (ie are of the type
S_IFDIR).

Anyway, that thinko just means that we should remove the special case to
make S_ISDIRLNK entries sort like S_ISDIR entries. They don't.  They sort
like normal files.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-11 17:21:12 -07:00
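The sorting rule explained in 1833a92548 can be checked with a small sketch. It assumes plain byte-wise comparison of names and models the "magic" rule as an implicit trailing '/' on tree entries that point to other trees; the function names are made up for the illustration.

```python
def index_sort_key(path: str) -> bytes:
    """The index sorts purely by the bytes of the full pathname."""
    return path.encode()


def tree_sort_key(name: str, is_tree: bool) -> bytes:
    """Tree entries that point to other trees sort as if named 'name/'."""
    return (name + "/").encode() if is_tree else name.encode()


def sorted_names(entries):
    """Sort (name, is_tree) pairs the way a tree object would order them."""
    ordered = sorted(entries, key=lambda e: tree_sort_key(e[0], e[1]))
    return [name for name, _ in ordered]


# A subproject "proj" is not a tree entry, so it sorts like a plain file.
with_subproject = [("proj.c", False), ("proj", False)]
# A real directory "proj" is a tree entry and gets the implicit slash.
with_directory = [("proj.c", False), ("proj", True)]
```

With these inputs, `sorted_names(with_subproject)` puts "proj" before "proj.c", while `sorted_names(with_directory)` puts "proj.c" first, because '.' (0x2e) sorts before '/' (0x2f).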
Linus Torvalds
f35a6d3bce Teach core object handling functions about gitlinks
This teaches the really fundamental core SHA1 object handling routines
about gitlinks.  We can compare trees with gitlinks in them (although we
can not actually generate patches for them yet - just raw git diffs),
and they show up as commits in "git ls-tree".

We also know to compare gitlinks as if they were directories (ie the
normal "sort as trees" rules apply).

[jc: amended a cut&paste error]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-10 13:50:43 -07:00
Junio C Hamano
640ee0d1cd Merge branch 'jc/read-tree-df' (early part)
* 'jc/read-tree-df' (early part):
  Fix switching to a branch with D/F when current branch has file D.
  Fix twoway_merge that passed d/f conflict marker to merged_entry().
  Fix read-tree --prefix=dir/.
  unpack-trees: get rid of *indpos parameter.
  unpack_trees.c: pass unpack_trees_options structure to keep_entry() as well.
  add_cache_entry(): removal of file foo does not conflict with foo/bar
2007-04-07 23:52:40 -07:00
Junio C Hamano
fd1c3bf053 Rename add_file_to_index() to add_file_to_cache()
This function was not called "add_file_to_cache()" only because
an ancient program, update-cache, used that name as an internal
function name that does something slightly different.  Now that
is gone, we can take over the better name.

The plan is to name all functions that operate on the default
index xxx_cache().  Later patches create variants of them,
xxx_index(), that take an explicit index parameter, and then turn
the xxx_cache() functions into macros that use "the_index".

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-05 15:07:16 -07:00
Junio C Hamano
ec0cc70469 Propagate cache error internal to refresh_cache() via parameter.
The function refresh_cache() is the only user of cache_errno
that switches its behaviour based on what internal function
refresh_cache_entry() finds; pass the error status back in a
parameter passed down to it, to get rid of the global variable
cache_errno.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-05 15:07:16 -07:00
Junio C Hamano
0424138d57 Fix bogus error message from merge-recursive error path
This error message should not usually trigger, but the function
make_cache_entry() called by add_cacheinfo() can return early
without calling into refresh_cache_entry() that sets cache_errno.

Also the error message had a wrong function name reported, and
it did not say anything about which path failed either.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-05 15:07:16 -07:00
Junio C Hamano
21cd8d00b6 add_cache_entry(): removal of file foo does not conflict with foo/bar
Similarly, removal of file foo/bar does not conflict with a file foo.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-04 00:19:28 -07:00
Shawn O. Pearce
dc49cd769b Cast 64 bit off_t to 32 bit size_t
Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4.
This implies that we are able to access and work on files whose
maximum length is around 2^63-1 bytes, but we can only malloc or
mmap somewhat less than 2^32-1 bytes of memory.

On such a system an implicit conversion of off_t to size_t can cause
the size_t to wrap, resulting in unexpected and exciting behavior.
Right now we are working around all gcc warnings generated by the
-Wshorten-64-to-32 option by passing the off_t through xsize_t().

In the future we should make xsize_t on such problematic platforms
detect the wrapping and die if such a file is accessed.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 11:15:26 -08:00
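The wrap that dc49cd769b worries about, and the check proposed for the future, can be sketched as below. The 32-bit width is the assumption stated in the message; `implicit_cast` and `checked_xsize_t` are hypothetical names, and the checked variant is only one possible reading of "detect the wrapping and die".

```python
SIZE_T_BITS = 32  # assumption: a platform where sizeof(size_t) == 4
SIZE_T_MAX = (1 << SIZE_T_BITS) - 1


def implicit_cast(length: int) -> int:
    """What a plain off_t -> size_t conversion does: keep the low bits."""
    return length & SIZE_T_MAX


def checked_xsize_t(length: int) -> int:
    """Hypothetical checked variant: refuse lengths that would wrap."""
    if length < 0 or length > SIZE_T_MAX:
        raise OverflowError(f"length {length} does not fit in size_t")
    return length
```

A file of 2^32 + 5 bytes silently becomes a 5-byte length under the implicit conversion, which is the "unexpected and exciting behavior" the message refers to.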
Johannes Sixt
78a8d641c1 Add core.symlinks to mark filesystems that do not support symbolic links.
Some file systems that can host git repositories and their working copies
do not support symbolic links. But then if the repository contains a symbolic
link, it is impossible to check out the working copy.

This patch enables partial support of symbolic links so that it is possible
to check out a working copy on such a file system.  A new flag
core.symlinks (which is true by default) can be set to false to indicate
that the filesystem does not support symbolic links. In this case, symbolic
links that exist in the trees are checked out as small plain files, and
checking in modifications of these files preserves the symlink property in
the database (as long as an entry exists in the index).

Of course, this does not magically make symbolic links work on such defective
file systems; hence, this solution does not help if the working copy relies
on an entry being a real symbolic link.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-02 16:58:05 -08:00
Junio C Hamano
53bca91a7d index_fd(): pass optional path parameter as hint for blob conversion
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-28 12:00:00 -08:00
Junio C Hamano
edaec3fbe8 index_fd(): use enum object_type instead of type name string.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-28 12:00:00 -08:00
Nicolas Pitre
21666f1aae convert object type handling from a string to a number
We currently have two parallel notations for dealing with object types
in the code: a string and a numerical value.  One of them is obviously
redundant, and the most used one requires more stack space and a bunch
of strcmp() all over the place.

This is an initial step for the removal of the version using a char array
found in object reading code paths.  The patch is unfortunately large but
there is no sane way to split it in smaller parts without breaking the
system.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-27 01:34:21 -08:00
Junio C Hamano
185c975faa Do not take mode bits from index after type change.
When we do not trust the executable bit from lstat(2), we copied the
existing ce_mode bits without checking whether the filesystem object
is a regular file (the only kind of object the "trust executable
bit" business applies to) or whether the blob in the index is a
regular file (otherwise, we should do the same as when registering a
new regular file, which is to default to non-executable).

Noticed by Johannes Sixt.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-16 22:56:06 -08:00
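The mode selection described in 185c975faa (and in the related core.filemode commits further down this log) can be sketched as follows. This is a simplified model, not the code in read-cache.c: `mode_to_record` is a hypothetical helper that handles only regular files and symlinks.

```python
import stat


def mode_to_record(st_mode, index_mode, trust_executable_bit):
    """Pick the mode to store for a path being added.

    index_mode is the mode already in the index, or None for a new path.
    """
    if stat.S_ISLNK(st_mode):
        return stat.S_IFLNK
    if not stat.S_ISREG(st_mode):
        raise ValueError("only regular files and symlinks are handled here")
    if trust_executable_bit:
        executable = bool(st_mode & stat.S_IXUSR)
    elif index_mode is not None and stat.S_ISREG(index_mode):
        # Same type as before: keep whatever the index already says.
        executable = bool(index_mode & stat.S_IXUSR)
    else:
        # New path, or the type changed: default to non-executable.
        executable = False
    return stat.S_IFREG | (0o755 if executable else 0o644)
```

The type-change case is the one this commit fixes: a path that was a symlink in the index and is now a regular file must not inherit any bits from the old entry.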
Linus Torvalds
2cdf9509df write-cache: do not leak the serialized cache-tree data.
It is not used after getting written, and just is leaking every time
we write the index out.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-11 12:25:16 -08:00
Andy Whitcroft
93822c2239 short i/o: fix calls to write to use xwrite or write_in_full
We have a number of badly checked write() calls.  Often we
expect write() either to write exactly the size we requested or to
fail, which does not handle interrupts or short writes.  Switch those
callers to the new write_in_full().  Elsewhere we need, at a minimum,
to check for EINTR and EAGAIN; where this is appropriate, use xwrite().

Note, the changes to config handling are much larger and handled
in the next patch in the sequence.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-08 15:44:47 -08:00
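The difference between the two helpers named in 93822c2239 can be sketched like this. The names mirror the commit, but the bodies are an illustration written for this log, not a port of the C code; the injectable `write` argument and the choice to treat a zero-byte write as ENOSPC are assumptions made so the sketch is testable.

```python
import errno
import os


def xwrite(fd, data, write=os.write):
    """Retry a single write() while it fails with EINTR or EAGAIN."""
    while True:
        try:
            return write(fd, data)
        except OSError as err:
            if err.errno not in (errno.EINTR, errno.EAGAIN):
                raise


def write_in_full(fd, data, write=os.write):
    """Keep writing until every byte is out; return the total written."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        written = xwrite(fd, view[total:], write)
        if written == 0:
            # Assumption: no progress is reported as "no space left".
            raise OSError(errno.ENOSPC, "write returned 0 bytes")
        total += written
    return total
```

`xwrite()` may still return a short count, so it suits callers that loop themselves; `write_in_full()` is for callers that want all-or-error semantics.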
Shawn O. Pearce
5fe5c8300d Cleanup read_cache_from error handling.
When I converted the mmap() call to xmmap() I failed to cleanup the
way this routine handles errors and left some crufty code behind.
This is a small cleanup, suggested by Johannes.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:45 -08:00
Shawn O. Pearce
c4712e4553 Replace mmap with xmmap, better handling MAP_FAILED.
In some cases we did not even bother to check the return value of
mmap() and just assumed it worked.  This is bad, because if we are
out of virtual address space the kernel returns MAP_FAILED and we
would attempt to dereference that address, segfaulting without any
real error output to the user.

We are replacing all calls to mmap() with xmmap() and moving all
MAP_FAILED checking into that single location.  If a mmap call
fails we try to release enough least-recently-used pack windows
to possibly succeed, then retry the mmap() attempt.  If we cannot
mmap even after releasing pack memory then we die() as none of our
callers have any reasonable recovery strategy for a failed mmap.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:45 -08:00
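The retry policy that c4712e4553 describes for xmmap() can be sketched as a loop. Both callbacks are hypothetical: `try_map` stands for the mmap() attempt and returns None on failure, and `release_window` stands for dropping one least-recently-used pack window and reports whether anything was released.

```python
def xmmap(try_map, release_window):
    """Call try_map(); on failure release a window and retry, else die."""
    while True:
        region = try_map()
        if region is not None:
            return region
        if not release_window():
            # Nothing left to free: no caller could recover anyway.
            raise SystemExit("fatal: mmap failed")
```

Keeping the failure check in one place is the point of the change: callers never see a failed mapping.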
Junio C Hamano
81a361be3b Fix check_file_directory_conflict().
When replacing an existing file A with a directory A that has a
file A/B in it in the index, 'update-index --replace --add A/B'
did not properly remove the file to make room for the new
directory.

There was a trivial logic error, most likely a cut & paste one,
dating back to quite early days of git.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-17 01:14:44 -08:00
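The directory/file (D/F) conflict that 81a361be3b and the neighbouring commits deal with can be sketched on a flat list of index paths. `df_conflicts` and `add_entry` are hypothetical helpers; the `replace` argument models the allow-replace behaviour the messages mention.

```python
def df_conflicts(index_paths, new_path):
    """Return index entries that conflict with adding new_path as a file."""
    conflicts = set()
    parts = new_path.split("/")
    # Any leading directory of new_path that exists as a file conflicts.
    for i in range(1, len(parts)):
        prefix = "/".join(parts[:i])
        if prefix in index_paths:
            conflicts.add(prefix)
    # Any entry below new_path means new_path is currently a directory.
    below = new_path + "/"
    conflicts.update(p for p in index_paths if p.startswith(below))
    return sorted(conflicts)


def add_entry(index_paths, new_path, replace=False):
    """Add new_path; with replace=True, drop conflicting entries first."""
    conflicts = df_conflicts(index_paths, new_path)
    if conflicts and not replace:
        raise ValueError(f"cannot add {new_path}: conflicts with {conflicts}")
    return sorted((set(index_paths) - set(conflicts)) | {new_path})
```

Adding "A/B" to an index that holds the file "A" therefore either fails or, with `replace=True`, removes "A" to make room for the directory.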
Junio C Hamano
c33ab0dd10 git-add: remove conflicting entry when adding.
When replacing an existing file A with a directory A that has a
file A/B in it in the index, 'git add' did not succeed because
it forgot to pass the allow-replace flag to add_cache_entry().

It might be safer to leave this as an error and require the user
to explicitly remove the existing A first before adding A/B
since it is an unusual case, but doing that automatically is
much easier to use.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-17 01:14:43 -08:00
Junio C Hamano
790fa0e297 update-index: make D/F conflict error a bit more verbose.
When you remove a directory D that has a tracked file D/F out of the
way to create a file D and try to "git update-index --add D", it used
to say "cannot add" which was not very helpful.  This issues an extra
error message to explain the situation before the final "fatal" message.

Since D/F conflicts are relatively rare events, extra verbosity would
not make things too noisy.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-17 01:14:43 -08:00
Junio C Hamano
2bbaaed9ee trust-executable-bit: fix breakage for symlinks
An earlier commit fd28b34a broke symlinks when trust-executable-bit
is not set because it incorrectly assumed that everything was a
regular file.

Reported by Juergen Ruehle.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-22 16:36:49 -08:00
Rene Scharfe
a6e8a76770 sparse fix: non-ANSI function declaration
The declaration of discard_cache() in cache.h already has its "void".

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-18 11:40:00 -08:00
Shawn Pearce
fd28b34afd Ignore executable bit when adding files if filemode=0.
If the user has configured core.filemode=0 then we shouldn't set
the execute bit in the index when adding a new file as the user
has indicated that the local filesystem can't be trusted.

This means that when adding files that should be marked executable
in a repository with core.filemode=0 the user must perform a
'git update-index --chmod=+x' on the file before committing the
addition.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-26 22:42:52 -07:00
Junio C Hamano
1e49cb8ad4 Merge branch 'js/c-merge-recursive'
* js/c-merge-recursive: (21 commits)
  discard_cache(): discard index, even if no file was mmap()ed
  merge-recur: do not die unnecessarily
  merge-recur: try to merge older merge bases first
  merge-recur: if there is no common ancestor, fake empty one
  merge-recur: do not setenv("GIT_INDEX_FILE")
  merge-recur: do not call git-write-tree
  merge-recursive: fix rename handling
  .gitignore: git-merge-recur is a built file.
  merge-recur: virtual commits shall never be parsed
  merge-recur: use the unpack_trees() interface instead of exec()ing read-tree
  merge-recur: fix thinko in unique_path()
  Makefile: git-merge-recur depends on xdiff libraries.
  merge-recur: Explain why sha_eq() and struct stage_data cannot go
  merge-recur: Cleanup last mixedCase variables...
  merge-recur: Fix compiler warning with -pedantic
  merge-recur: Remove dead code
  merge-recur: Get rid of debug code
  merge-recur: Convert variable names to lower_case
  Cumulative update of merge-recursive in C
  recur vs recursive: help testing without touching too many stuff.
  ...

This is an evil merge that removes TEST script from the toplevel.
2006-08-27 20:33:46 -07:00
David Rientjes
a89fccd281 Do not use memcmp(sha1_1, sha1_2, 20) with hardcoded length.
Introduces global inline:

	hashcmp(const unsigned char *sha1, const unsigned char *sha2)

Uses memcmp for comparison and returns the result based on the length of
the hash name (a future runtime decision).

Acked-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-17 14:23:53 -07:00
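The idea behind a89fccd281, one comparison helper with the hash length in a single place, can be sketched as below. `HASH_RAWSZ` is a hypothetical constant name, and normalising the result to -1/0/1 is a choice made for this sketch; memcmp() itself only guarantees the sign.

```python
HASH_RAWSZ = 20  # assumption: SHA-1, the only hash in use at the time


def hashcmp(sha1: bytes, sha2: bytes) -> int:
    """memcmp()-style comparison over the first HASH_RAWSZ bytes."""
    a, b = sha1[:HASH_RAWSZ], sha2[:HASH_RAWSZ]
    return (a > b) - (a < b)
```

Because every caller goes through the helper, changing the hash length later means touching one definition instead of every hardcoded 20.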
Junio C Hamano
8e3abd4c97 Merge branch 'jc/racy'
* jc/racy:
  Remove the "delay writing to avoid runtime penalty of racy-git avoidance"
  Add check program "git-check-racy"
  Documentation/technical/racy-git.txt
  avoid nanosleep(2)
2006-08-16 14:00:34 -07:00
Junio C Hamano
0fc82cff12 Remove the "delay writing to avoid runtime penalty of racy-git avoidance"
The work-around should not be needed.  Even if it turns out we
would want it later, git will remember the patch for us ;-).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-15 22:12:54 -07:00
Junio C Hamano
42f774063d Add check program "git-check-racy"
This will help counting the racily clean paths, but it should be
useless for daily use.  Do not even enable it in the makefile.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-15 21:38:07 -07:00
David Rientjes
96f1e58f52 remove unnecessary initializations
[jc: I needed to hand merge the changes to the updated codebase,
 so the result needs to be checked.]

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-15 21:22:20 -07:00
Junio C Hamano
789a09b487 avoid nanosleep(2)
On Solaris nanosleep(2) is not available in libc; you need to
link with -lrt to get it.

The purpose of the loop is to wait until the next filesystem
timestamp granularity, and the code uses subsecond sleep in the
hope that it can shorten the delay to 0.5 seconds on average
instead of a full second.  It is probably not worth depending on
an extra library for this.

We might want to yank out the whole "racy-git avoidance is
costly later at runtime, so let's delay writing the index out"
codepath later, but that is a separate issue and needs some
testing on large trees to figure it out.  After playing with the
kernel tree, I have a feeling that the whole thing may not be
worth it.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-15 03:39:47 -07:00
David Rientjes
968a1d65f4 read-cache.c cleanup
Removes conditional returns.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-14 18:41:12 -07:00
Junio C Hamano
eed94a570e Merge branch 'master' into js/c-merge-recursive
Adjust to hold_lock_file_for_update() change on the master.
2006-08-12 18:35:14 -07:00
Johannes Schindelin
4147d801db discard_cache(): discard index, even if no file was mmap()ed
Since add_cacheinfo() can be called without a mapped index file,
discard_cache() _has_ to discard the entries, even when
cache_mmap == NULL.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-10 14:30:01 -07:00
Junio C Hamano
6015c28b1d read-cache: tweak racy-git delay logic
Instead of looping over the entries and writing out, use a
separate loop after all entries have been written out to check
how many entries are racily clean.  Make sure that the newly
created index file gets the right timestamp when we check by
flushing the buffered data by ce_write().

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-08 17:17:04 -07:00
Junio C Hamano
b7e58b17b5 Racy git: avoid having to be always too careful
Immediately after a bulk checkout, most of the paths in the
working tree would have the same timestamp as the index file,
and this would force ce_match_stat() to take the slow path for all
of them.  When writing an index file out, if many of the paths
have a very new timestamp (read: the same timestamp as the index file
being written out), we are better off delaying the return
from the command, to make sure that later commands that touch the
working tree files will leave newer timestamps than recorded in
the index, thereby avoiding the slow path.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-07 01:58:53 -07:00
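The "slow path" that b7e58b17b5 tries to avoid follows from the racy-timestamp test, which can be sketched as below. The comparison is written from the description in these messages (an entry is suspect when it is not older than the index file); the function names and return strings are chosen for the illustration.

```python
def is_racy_timestamp(index_mtime, entry_mtime):
    """An entry is possibly racy if it is not older than the index file."""
    return index_mtime != 0 and index_mtime <= entry_mtime


def classify(index_mtime, entry_mtime, stat_matches):
    """Decide how much work is needed to trust an index entry."""
    if not stat_matches:
        return "modified"
    if is_racy_timestamp(index_mtime, entry_mtime):
        return "compare contents"  # the slow path
    return "clean"


def count_racy(index_mtime, entry_mtimes):
    """Count entries that would be forced onto the slow path."""
    return sum(is_racy_timestamp(index_mtime, m) for m in entry_mtimes)
```

Right after a bulk checkout nearly every entry shares the index file's timestamp, so `count_racy` is close to the number of paths; that is the situation the delay was meant to address.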
Linus Torvalds
7f8508e8d3 Fix double "close()" in ce_compare_data
Doing an "strace" on "git diff" shows that we close() a file descriptor
twice (getting EBADF on the second one) when we end up in ce_compare_data
if the index does not match the checked-out stat information.

The "index_fd()" function will already have closed the fd for us, so we
should not close it again.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-31 11:55:56 -07:00
Junio C Hamano
c1a788acee Merge branch 'js/read-tree' into js/c-merge-recursive
* js/read-tree: (107 commits)
  read-tree: move merge functions to the library
  read-trees: refactor the unpack_trees() part
  tar-tree: illustrate an obscure feature better
  git.c: allow alias expansion without a git directory
  setup_git_directory_gently: do not barf when GIT_DIR is given.
  Build on Debian GNU/kFreeBSD
  Call setup_git_directory() much earlier
  Call setup_git_directory() early
  Display an error from update-ref if target ref name is invalid.
  Fix http-fetch
  t4103: fix binary patch application test.
  git-apply -R: binary patches are irreversible for now.
  Teach git-apply about '-R'
  Makefile: ssh-pull.o depends on ssh-fetch.c
  log and diff family: honor config even from subdirectories
  git-reset: detect update-ref error and report it.
  lost-found: use fsck-objects --full
  Teach git-http-fetch the --stdin switch
  Teach git-local-fetch the --stdin switch
  Make pull() support fetching multiple targets at once
  ...
2006-07-30 23:42:10 -07:00
Johannes Schindelin
11be42a476 Make git-mv a builtin
This also moves add_file_to_index() to read-cache.c. Oh, and while
touching builtin-add.c, it also removes a duplicate git_config() call.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-26 13:36:36 -07:00
Johannes Schindelin
8fd2cb4069 Extract helper bits from c-merge-recursive work
This backports the pieces that are not uncooked from the merge-recursive
WIP we have seen earlier, to be used in git-mv rewritten in C.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-26 13:36:36 -07:00
Johannes Schindelin
6d297f8137 Status update on merge-recursive in C
This is just an update for interested people. Alex and I have been
busy with that project for a few days now. While it has progressed nicely,
there are quite a few TODOs in merge-recursive.c, just search for "TODO".

For impatient people: yes, it passes all the tests, and yes, according
to the evil test Alex did, it is faster than the Python script.

But no, it is not yet finished. Biggest points are:

- there are still three external calls
- in the end, it should not be necessary to write the index more than once
  (just before exiting)
- a lot of things can be refactored to make the code easier and shorter

BTW we cannot just plug in git-merge-tree yet, because git-merge-tree
does not handle renames at all.

This patch is meant for testing, and as such,

- it compiles the program to git-merge-recur
- it adjusts the scripts and tests to use git-merge-recur instead of
  git-merge-recursive
- it provides "TEST", a script to execute the tests regarding -recursive
- it inlines the changes to read-cache.c (read_cache_from(), discard_cache()
  and refresh_cache_entry())

Brought to you by Alex Riesen and Dscho

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-13 23:10:19 -07:00
Pavel Roskin
a9486b02ec Avoid C99 comments, use old-style C comments instead.
This doesn't make the code uglier or harder to read, yet it makes the
code more portable.  This also simplifies checking for other potential
incompatibilities.  "gcc -std=c89 -pedantic" can flag many incompatible
constructs as warnings, but C99 comments will cause it to emit an error.

Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-10 00:47:13 -07:00
Florian Forster
1d7f171c3a Remove all void-pointer arithmetic.
ANSI C99 doesn't allow void-pointer arithmetic. This patch fixes this in
various ways. Usually the strategy that required the least changes was used.

Signed-off-by: Florian Forster <octo@verplant.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-20 01:59:46 -07:00
Junio C Hamano
7d55561986 Merge branch 'jc/dirwalk-n-cache-tree' into jc/cache-tree
* jc/dirwalk-n-cache-tree: (212 commits)
  builtin-rm: squelch compiler warnings.
  Add builtin "git rm" command
  Move pathspec matching from builtin-add.c into dir.c
  Prevent bogus paths from being added to the index.
  builtin-add: fix unmatched pathspec warnings.
  Remove old "git-add.sh" remnants
  builtin-add: warn on unmatched pathspecs
  Do "git add" as a builtin
  Clean up git-ls-file directory walking library interface
  libify git-ls-files directory traversal
  Add a conversion tool to migrate remote information into the config
  fetch, pull: ask config for remote information
  Fix build procedure for builtin-init-db
  read-tree -m -u: do not overwrite or remove untracked working tree files.
  apply --cached: do not check newly added file in the working tree
  Implement a --dry-run option to git-quiltimport
  Implement git-quiltimport
  Revert "builtin-grep: workaround for non GNU grep."
  builtin-grep: workaround for non GNU grep.
  builtin-grep: workaround for non GNU grep.
  ...
2006-05-28 22:34:34 -07:00
Dennis Stosberg
ac58c7b18e git-write-tree writes garbage on sparc64
In the "next" branch, write_index_ext_header() writes garbage on a
64-bit big-endian machine; the written index file will be unreadable.
I noticed this on NetBSD/sparc64. Reproducible with:

$ git init-db
$ :>file
$ git-update-index --add file
$ git-write-tree
$ git-update-index
error: index uses  extension, which we do not understand
fatal: index file corrupt

Signed-off-by: Dennis Stosberg <dennis@stosberg.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-28 13:31:50 -07:00
Junio C Hamano
93872e0700 Merge branch 'lt/dirwalk' into jc/dirwalk-n-cache-tree
This commit is what this branch is all about.  It records the
evil merge needed to adjust built-in git-add and git-rm for
the cache-tree extension.

* lt/dirwalk:
  Add builtin "git rm" command
  Move pathspec matching from builtin-add.c into dir.c
  Prevent bogus paths from being added to the index.
  builtin-add: fix unmatched pathspec warnings.
  Remove old "git-add.sh" remnants
  builtin-add: warn on unmatched pathspecs
  Do "git add" as a builtin
  Clean up git-ls-file directory walking library interface
  libify git-ls-files directory traversal

Conflicts:

	Makefile
	builtin.h
	git.c
	update-index.c
2006-05-20 01:52:19 -07:00
Junio C Hamano
283c8eef6c Merge branch 'jc/cache-tree' into jc/dirwalk-n-cache-tree
* jc/cache-tree: (24 commits)
  Fix crash when reading the empty tree
  fsck-objects: do not segfault on missing tree in cache-tree
  cache-tree: a bit more debugging support.
  read-tree: invalidate cache-tree entry when a new index entry is added.
  Fix test-dump-cache-tree in one-tree disappeared case.
  fsck-objects: mark objects reachable from cache-tree
  cache-tree: replace a sscanf() by two strtol() calls
  cache-tree.c: typefix
  test-dump-cache-tree: validate the cached data as well.
  cache_tree_update: give an option to update cache-tree only.
  read-tree: teach 1-way merge and plain read to prime cache-tree.
  read-tree: teach 1 and 2 way merges about cache-tree.
  update-index: when --unresolve, smudge the relevant cache-tree entries.
  test-dump-cache-tree: report number of subtrees.
  cache-tree: sort the subtree entries.
  Teach fsck-objects about cache-tree.
  index: make the index file format extensible.
  cache-tree: protect against "git prune".
  Add test-dump-cache-tree
  Use cache-tree in update-index.
  ...
2006-05-20 00:56:11 -07:00
Linus Torvalds
405e5b2fe0 Libify the index refresh logic
This cleans up and libifies the "git update-index --[really-]refresh"
functionality. This will eventually be required for doing the
"commit" and "status" commands as built-ins.

It really just moves "refresh_index()" from update-index.c to
read-cache.c, but it also has to change the calling convention so that the
function uses an "unsigned int flags" argument instead of various static
flags variables for passing down the information about whether to be quiet
or not, and allow unmerged entries etc.

That actually cleans up update-index.c too, since it turns out that all
those flags were really specific to that one function of the index update,
so they shouldn't have had file-scope visibility even before.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-19 15:59:18 -07:00
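The calling-convention change in 405e5b2fe0, a flags argument instead of file-scope flag variables, can be sketched as follows. The flag names and the `(path, state)` input are hypothetical; the message only says that the flags cover quietness, unmerged entries "etc".

```python
from enum import IntFlag


class Refresh(IntFlag):
    """Hypothetical flag names for the sketch."""
    QUIET = 0x01
    ALLOW_UNMERGED = 0x02


def refresh_index(entries, flags=Refresh(0)):
    """Return (has_errors, messages) for a list of (path, state) pairs."""
    has_errors = False
    messages = []
    for path, state in entries:
        if state == "unmerged" and flags & Refresh.ALLOW_UNMERGED:
            continue
        if state in ("unmerged", "modified"):
            has_errors = True
            if not flags & Refresh.QUIET:
                need = "merge" if state == "unmerged" else "update"
                messages.append(f"{path}: needs {need}")
    return has_errors, messages
```

Since the behaviour now depends only on the arguments, any built-in can call the function without first setting globals that belong to update-index.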
Linus Torvalds
8dcf39c46e Prevent bogus paths from being added to the index.
With this one, it's now a fatal error to try to add a pathname
that cannot be added with "git add", i.e.

	[torvalds@g5 git]$ git add .git/config
	fatal: unable to add .git/config to index

and

	[torvalds@g5 git]$ git add foo/../bar
	fatal: unable to add foo/../bar to index

instead of the old "Ignoring path xyz" warning that would end up
silently succeeding on any other paths.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-18 12:07:31 -07:00
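The behaviour change in 8dcf39c46e can be sketched with a simple path check. The rule set here (reject empty, ".", ".." and ".git" components) is an assumption that covers the two examples in the message; it is not claimed to match the real verification code exactly, and both function names are made up.

```python
def verify_path(path: str) -> bool:
    """Reject paths whose components are empty, '.', '..' or '.git'."""
    if not path or path.startswith("/"):
        return False
    bad = ("", ".", "..", ".git")
    return all(part not in bad for part in path.split("/"))


def add_paths(paths):
    """Fail on the first bad path instead of warning and carrying on."""
    for path in paths:
        if not verify_path(path):
            raise SystemExit(f"fatal: unable to add {path} to index")
    return list(paths)
```

The important part is the second function: one bad path now aborts the whole operation, where the old warning let the remaining paths succeed silently.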
Yakov Lerner
28cc4ab422 read-cache.c: use xcalloc() not calloc()
Elsewhere we use xcalloc(); we should consistently do so.

Signed-off-by: Yakov Lerner <iler.ml@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-09 06:28:59 -07:00
Junio C Hamano
bad68ec924 index: make the index file format extensible.
... and move the cache-tree data into it.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-24 21:24:13 -07:00
Junio C Hamano
1af1c2b63d read-cache/write-cache: optionally return cache checksum SHA1.
read_cache_1() and write_cache_1() take an extra parameter
*sha1 that returns the checksum of the index file when non-NULL.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-23 16:57:40 -07:00
Junio C Hamano
7b80be150c cache_name_compare() compares name and stage, nothing else.
The code was a bit unclear in expressing what it wants to compare.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-12 23:46:25 -08:00
Junio C Hamano
5f73076c1a "Assume unchanged" git
This adds "assume unchanged" logic, started by this message in the list
discussion recently:

	<Pine.LNX.4.64.0601311807470.7301@g5.osdl.org>

This is a workaround for filesystems that do not have lstat()
that is quick enough for the index mechanism to take advantage
of.  On the paths marked as "assumed to be unchanged", the user
needs to explicitly use update-index to register the object name
to be in the next commit.

You can use two new options to update-index to set and reset the
CE_VALID bit:

	git-update-index --assume-unchanged path...
	git-update-index --no-assume-unchanged path...

These forms manipulate only the CE_VALID bit; they do not change
the object name recorded in the index file.  Nor do they add a new
entry to the index.

When the configuration variable "core.ignorestat = true" is set,
the index entries are marked with CE_VALID bit automatically
after:

 - update-index to explicitly register the current object name to the
   index file.

 - when update-index --refresh finds the path to be up-to-date.

 - when tools like read-tree -u and apply --index update the working
   tree file and register the current object name to the index file.

The flag is dropped upon read-tree that does not check out the index
entry.  This happens regardless of the core.ignorestat settings.

Index entries marked with CE_VALID bit are assumed to be
unchanged most of the time.  However, there are cases where the
CE_VALID bit is ignored for the sake of safety and usability:

 - when "git-read-tree -m" or git-apply needs to make sure
   that the paths involved in the merge do not have local
   modifications.  This sacrifices performance for safety.

 - when git-checkout-index -f -q -u -a tries to see if it needs
   to check out the paths.  Otherwise you can never check
   anything out ;-).

 - when git-update-index --really-refresh (a new flag) tries to
   see if the index entry is up to date.  You can start with
   everything marked as CE_VALID and run this once to drop
   CE_VALID bit for paths that are modified.

Most notably, "update-index --refresh" honours CE_VALID and does
not actively stat, so after you modify a file in the working
tree, update-index --refresh would not notice until you tell the
index about it with "git-update-index path" or "git-update-index
--no-assume-unchanged path".
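
The difference between --refresh and --really-refresh described above
can be modelled in a few lines of Python.  This is only a sketch under
stated assumptions: the Entry class, the (mtime, size) stat tuple and
the function names are hypothetical, and the flag value is illustrative.

```python
from dataclasses import dataclass

CE_VALID = 0x8000  # illustrative flag value for this sketch


@dataclass
class Entry:
    path: str
    recorded_stat: tuple  # (mtime, size) as remembered in the index
    flags: int = 0


def looks_modified(entry, current_stat, really=False):
    """Model of the up-to-date check; `really` ignores CE_VALID."""
    if not really and entry.flags & CE_VALID:
        return False  # assumed unchanged: the filesystem is not consulted
    return entry.recorded_stat != current_stat


def really_refresh(entry, current_stat):
    """Drop CE_VALID from an entry whose file has actually changed."""
    if looks_modified(entry, current_stat, really=True):
        entry.flags &= ~CE_VALID
```

With the bit set, looks_modified() stays False even when the stat data
differs; after really_refresh() has dropped the bit, the same call
reports the change.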

This version is not expected to be perfect.  I think diff
between index and/or tree and working files may need some
adjustment, and there are probably other cases where we should
automatically unmark paths that are marked CE_VALID.

But the basics seem to work, and are ready to be tested by people
who asked for this feature.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-08 21:54:42 -08:00
Junio C Hamano
4b3511b0f8 ce_smudge_racily_clean_entry: explain why it works.
This is tricky code and warrants extra commenting.  I wasted
30 minutes trying to break it until I realized why it works.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-20 14:18:47 -08:00
Junio C Hamano
407c8eb0d0 Racy GIT (part #2)
The previous round caught the most trivial case well, but broke
down once the index file is updated again.  Smudge problematic
entries (they should be very few if any under normal interactive
workflow) before writing a new index file out.
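
A rough Python model of the smudging step follows.  The message does
not spell out the mechanism; the sketch assumes that "smudging" means
recording a size of zero for an entry that is stat-clean but whose
contents differ, so that a later stat comparison fails.  The Entry
class, read_file callback and content_id helper are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Entry:
    path: str
    mtime: int
    size: int
    content_id: str


def content_id(data: bytes) -> str:
    # Stand-in for the object name; not git's exact blob hashing.
    return hashlib.sha1(data).hexdigest()


def smudge_racily_clean(entries, index_mtime, read_file):
    """Before writing a new index, smudge entries that only look clean."""
    for entry in entries:
        if entry.mtime != index_mtime:
            continue  # not racy: the stat data can be trusted
        data = read_file(entry.path)
        if len(data) == entry.size and content_id(data) != entry.content_id:
            entry.size = 0  # assumed smudge: force a later stat mismatch
```

Only entries sharing the index file's timestamp are re-read, which
matches the expectation that very few entries need this treatment.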

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-20 12:12:18 -08:00
Junio C Hamano
29e4d36357 Racy GIT
This fixes the longstanding "Racy GIT" problem, which was pretty
much there from the beginning of time, but was first
demonstrated by Pasky in this message on October 24, 2005:

    http://marc.theaimsgroup.com/?l=git&m=113014629716878

If you run the following sequence of commands:

	echo frotz >infocom
	git update-index --add infocom
	echo xyzzy >infocom

so that the second update to file "infocom" does not change
st_mtime, what is recorded as the stat information for the cache
entry "infocom" exactly matches what is on the filesystem
(owner, group, inum, mtime, ctime, mode, length).  After this
sequence, we incorrectly think "infocom" file still has string
"frotz" in it, and get really confused.  E.g. git-diff-files
would say there is no change, git-update-index --refresh would
not even look at the filesystem to correct the situation.

Some ways of working around this issue were already suggested by
Linus in the same thread on the same day, including waiting
until the next second before returning from update-index if a
cache entry written out has the current timestamp, but that
means we can make at most one commit per second, and given that
the e-mail patch workflow used by Linus needs to process at
least 5 commits per second, it is not an acceptable solution.
Linus notes that git-apply is primarily used to update the index
while processing e-mailed patches, which is true, and
git-apply's up-to-date check is fooled by the same problem but
luckily in the other direction, so it is not really a big issue,
but still it is disturbing.

The function ce_match_stat() is called to bypass the comparison
against filesystem data when the stat data recorded in the cache
entry matches what stat() returns from the filesystem.  This
patch tackles the problem by changing it to actually go to the
filesystem data for cache entries that have the same mtime as
the index file itself.  This works as long as the index file and
working tree files are on the filesystems that share the same
monotonic clock.  Files on network mounted filesystems sometimes
get skewed timestamps compared to "date" output, but as long as
working tree files' timestamps are skewed the same way as the
index file's, this approach still works.  The only problematic
files are the ones that have the same timestamp as the index
file's, because two file updates that sandwich the index file
update must happen within the same second to trigger the
problem.
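
The check described above can be sketched in Python as follows.  It is
a model, not the actual ce_match_stat() code: the CacheEntry class, the
(mtime, size) stat tuple, the read_data callback and the content_id
helper are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class CacheEntry:
    stat: tuple       # (mtime, size) recorded when the entry was added
    mtime: int
    content_id: str


def content_id(data: bytes) -> str:
    # Stand-in for the object name; not git's exact blob hashing.
    return hashlib.sha1(data).hexdigest()


def is_modified(entry, fs_stat, read_data, index_mtime):
    """Trust matching stat data unless the entry is as new as the index."""
    if entry.stat != fs_stat:
        return True   # stat data differs: clearly modified
    if entry.mtime == index_mtime:
        # Racy case: same timestamp as the index file, so compare contents.
        return content_id(read_data()) != entry.content_id
    return False      # older than the index: the stat match is trusted
```

In the "infocom" sequence both writes leave a six-byte file with the
same mtime, so only the content comparison in the racy branch notices
that "frotz" became "xyzzy".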

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-20 00:22:28 -08:00
Linus Torvalds
e1b10391ea Use git config file for committer name and email info
This starts using the "user.name" and "user.email" config variables if
they exist as the default name and email when committing.  This means
that you don't have to use the GIT_COMMITTER_EMAIL environment variable
to override your email - you can just edit the config file instead.
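
The resulting precedence for the email address can be pictured with a
small Python sketch.  The function name and the fallback argument are
hypothetical; only GIT_COMMITTER_EMAIL and "user.email" come from the
description above.

```python
def committer_email(env, config, fallback):
    """Pick the committer email: environment, then config, then fallback."""
    return (
        env.get("GIT_COMMITTER_EMAIL")   # explicit override still wins
        or config.get("user.email")      # new: value from the config file
        or fallback                      # whatever default was used before
    )
```

For example, with an empty environment and a config that sets
"user.email", the config value is returned.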

The patch looks bigger than it is because it makes the default name and
email information non-static and renames it appropriately.  And it moves
the common git environment variables into a new library file, so that
you can link against libgit.a and get the git environment without having
to link in zlib and libcrypt.

In short, most of it is renaming and moving; the core of the real change
is just a few new lines in "git_default_config()" that copy the user
config values to the new base.

It also changes "git-var -l" to list the config variables.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-11 18:47:34 -07:00
Junio C Hamano
3e09cdfd11 Use core.filemode.
With "[core] filemode = false", you can tell git to ignore differences
between the index and working tree files that are only in the executable bit.

 * "git-update-index --refresh" does not say "needs update" if index
   entry and working tree file differ only in executable bit.

 * "git-update-index" on an existing path takes executable bit
   from the existing index entry, if the path and index entry are
   both regular files.

 * "git-diff-files" and "git-diff-index" without --cached flag
   pretend the path on the filesystem has the same executable
   bit as the existing index entry, if the path and index entry
   are both regular files.

If you are on a filesystem with unreliable mode bits, you may need to
force the executable bit after registering the path in the index.

 * "git-update-index --chmod=+x foo" flips the executable bit of the
   index file entry for path "foo" on.  Use "--chmod=-x" to flip it
   off.

Note that --chmod only works in index file and does not look at nor
update the working tree.

So if you are on a filesystem that does not have a working executable
bit, you would do:

 1. set the appropriate .git/config option;

 2. "git-update-index --add new-file.c"

 3. "git-ls-files --stage new-file.c" to see if it has the desired
   mode bits.  If not, e.g. to drop executable bit picked up from the
   filesystem, say "git-update-index --chmod=-x new-file.c".
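
The mode comparison this implies can be sketched in Python.  This is a
model under assumptions, not the code in this commit: the function and
parameter names are hypothetical, and only the user-execute bit (0100)
is treated as "the executable bit".

```python
import stat

USER_EXEC = 0o100  # the user-execute bit


def mode_differs(index_mode, fs_mode, trust_filemode):
    """Compare modes, ignoring the executable bit when it is untrusted."""
    both_regular = stat.S_ISREG(index_mode) and stat.S_ISREG(fs_mode)
    if not trust_filemode and both_regular:
        # Pretend the file has the same executable bit as the index entry.
        fs_mode = (fs_mode & ~USER_EXEC) | (index_mode & USER_EXEC)
    return index_mode != fs_mode
```

With trust_filemode false, an entry recorded as 100644 and a file that
appears as 100744 compare equal; a regular file replaced by a symlink
still counts as a difference.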

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-11 18:45:33 -07:00
Linus Torvalds
17712991a5 Add ".git/config" file parser
This is a first cut at a very simple parser for a git config file.

The format of the file is a simple ini-file like thing, with simple
variable/value pairs. You can (and should) make the variables have a
simple single-level scope, ie a valid file looks something like this:

	#
	# This is the config file, and
	# a '#' or ';' character indicates
	# a comment
	#

	; core variables
	[core]
		; Don't trust file modes
		filemode = false

	; Our diff algorithm
	[diff]
		external = "/usr/local/bin/gnu-diff -u"
		renames = true

which parses into three variables: "core.filemode" is associated with the
string "false", "diff.external" gets the appropriate quoted value, and
"diff.renames" is associated with the string "true".
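
To make the "section.variable" naming concrete, here is a minimal
Python sketch of such a parser.  It is not the C parser added by this
commit; parse_config is a hypothetical name, and the sketch only
handles whole-line comments, "[section]" headers and "name = value"
lines with optional double quotes around the value.

```python
def parse_config(text):
    """Parse ini-like text into a dict of "section.name" -> string value."""
    variables = {}
    section = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
            continue
        name, _, value = line.partition("=")
        name = name.strip().lower()
        value = value.strip()
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]  # drop the surrounding quotes
        key = f"{section}.{name}" if section else name
        variables[key] = value
    return variables
```

Feeding it the example file above yields the three variables
"core.filemode", "diff.external" and "diff.renames", all as strings;
interpreting "false" as a boolean is left to the caller.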

Right now we only react to one variable: "core.filemode" is a boolean that
decides if we should care about the 0100 (user-execute) bit of the stat
information. Even that is just a parsing demonstration - this doesn't
actually implement that st_mode compare logic itself.

Different programs can react to different config options, although they
should always fall back to calling "git_default_config()" on any config
option name that they don't recognize.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-10 16:31:08 -07:00