git-commit-vandalism

Author	SHA1	Message	Date
Junio C Hamano	9cb76b8cdc	lazy index hashing This delays the hashing of index names until it becomes necessary for the first time. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-01-22 23:01:13 -08:00
Linus Torvalds	cf558704fb	Create pathname-based hash-table lookup into index This creates a hash index of every single file added to the index. Right now that hash index isn't actually used for much: I implemented a "cache_name_exists()" function that uses it to efficiently look up a filename in the index without having to do the O(logn) binary search, but quite frankly, that's not why this patch is interesting. No, the whole and only reason to create the hash of the filenames in the index is that by modifying the hash function, you can fairly easily do things like making it always hash equivalent names into the same bucket. That, in turn, means that suddenly questions like "does this name exist in the index under an _equivalent_ name?" becomes much much cheaper. Guiding principles behind this patch: - it shouldn't be too costly. In fact, my primary goal here was to actually speed up "git commit" with a fully populated kernel tree, by being faster at checking whether a file already existed in the index. I did succeed, but only barely: Best before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.255s user 0m0.168s sys 0m0.088s Best after: [torvalds@woody linux]$ time ~/git/git commit > /dev/null real 0m0.233s user 0m0.144s sys 0m0.088s so some things are actually faster (~8%). Caveat: that's really the best case. Other things are invariably going to be slightly slower, since we populate that index cache, and quite frankly, few things really use it to look things up. That said, the cost is really quite small. The worst case is probably doing a "git ls-files", which will do very little except puopulate the index, and never actually looks anything up in it, just lists it. Before: [torvalds@woody linux]$ time git ls-files > /dev/null real 0m0.016s user 0m0.016s sys 0m0.000s After: [torvalds@woody linux]$ time ~/git/git ls-files > /dev/null real 0m0.021s user 0m0.012s sys 0m0.008s and while the thing has really gotten relatively much slower, we're still talking about something almost unmeasurable (eg 5ms). And that really should be pretty much the worst case. So we lose 5ms on one "benchmark", but win 22ms on another. Pick your poison - this patch has the advantage that it will _likely_ speed up the cases that are complex and expensive more than it slows down the cases that are already so fast that nobody cares. But if you look at relative speedups/slowdowns, it doesn't look so good. - It should be simple and clean The code may be a bit subtle (the reasons I do hash removal the way I do etc), but it re-uses the existing hash.c files, so it really is fairly small and straightforward apart from a few odd details. Now, this patch on its own doesn't really do much, but I think it's worth looking at, if only because if done correctly, the name hashing really can make an improvement to the whole issue of "do we have a filename that looks like this in the index already". And at least it gets real testing by being used even by default (ie there is a real use-case for it even without any insane filesystems). NOTE NOTE NOTE! The current hash is a joke. I'm ashamed of it, I'm just not ashamed of it enough to really care. I took all the numbers out of my nether regions - I'm sure it's good enough that it works in practice, but the whole point was that you can make a really much fancier hash that hashes characters not directly, but by their upper-case value or something like that, and thus you get a case-insensitive hash, while still keeping the name and the index itself totally case sensitive. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-01-22 21:46:30 -08:00
Junio C Hamano	6d91da6d3c	read-cache.c: introduce is_racy_timestamp() helper This moves a common boolean expression into a helper function, and makes the comparison between filesystem timestamp and index timestamp done in the function in line with the other places. st.st_mtime should be casted to (unsigned int) when compared to an index timestamp ce_mtime. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-01-22 21:26:40 -08:00
Junio C Hamano	077c48df8a	read-cache.c: fix a couple more CE_REMOVE conversion It is a D/F conflict if you want to add "foo/bar" to the index when "foo" already exists. Also it is a conflict if you want to add a file "foo" when "foo/bar" exists. An exception is when the existing entry is there only to mark "I used to be here but I am being removed". This is needed for operations such as "git read-tree -m -u" that update the index and then reflect the result to the work tree --- we need to remember what to remove somewhere, and we use the index for that. In such a case, an existing file "foo" is being removed and we can create "foo/" directory and hang "bar" underneath it without any conflict. We used to use (ce->ce_mode == 0) to mark an entry that is being removed, but (CE_REMOVE & ce->ce_flags) is used for that purpose these days. An earlier commit forgot to convert the logic in the code that checks D/F conflict condition. The old code knew that "to be removed" entries cannot be at higher stage and actively checked that condition, but it was an unnecessary check. This patch removes the extra check as well. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-01-22 21:24:21 -08:00
Junio C Hamano	eadb583134	Avoid running lstat(2) on the same cache entry. Aside from the lstat(2) done for work tree files, there are quite many lstat(2) calls in refname dwimming codepath. This patch is not about reducing them. * It adds a new ce_flag, CE_UPTODATE, that is meant to mark the cache entries that record a regular file blob that is up to date in the work tree. If somebody later walks the index and wants to see if the work tree has changes, they do not have to be checked with lstat(2) again. * fill_stat_cache_info() marks the cache entry it just added with CE_UPTODATE. This has the effect of marking the paths we write out of the index and lstat(2) immediately as "no need to lstat -- we know it is up-to-date", from quite a lot fo callers: - git-apply --index - git-update-index - git-checkout-index - git-add (uses add_file_to_index()) - git-commit (ditto) - git-mv (ditto) * refresh_cache_ent() also marks the cache entry that are clean with CE_UPTODATE. * write_index is changed not to write CE_UPTODATE out to the index file, because CE_UPTODATE is meant to be transient only in core. For the same reason, CE_UPDATE is not written to prevent an accident from happening. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-21 12:44:31 -08:00
Junio C Hamano	7fec10b7f4	index: be careful when handling long names We currently use lower 12-bit (masked with CE_NAMEMASK) in the ce_flags field to store the length of the name in cache_entry, without checking the length parameter given to create_ce_flags(). This can make us store incorrect length. Currently we are mostly protected by the fact that many codepaths first copy the path in a variable of size PATH_MAX, which typically is 4096 that happens to match the limit, but that feels like a bug waiting to happen. Besides, that would not allow us to shorten the width of CE_NAMEMASK to use the bits for new flags. This redefines the meaning of the name length stored in the cache_entry. A name that does not fit is represented by storing CE_NAMEMASK in the field, and the actual length needs to be computed by actually counting the bytes in the name[] field. This way, only the unusually long paths need to suffer. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-21 12:44:31 -08:00
Linus Torvalds	7a51ed66f6	Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-21 12:44:31 -08:00
Junio C Hamano	c78a24986d	Merge branch 'jc/maint-add-sync-stat' * jc/maint-add-sync-stat: t2200: test more cases of "add -u" git-add: make the entry stat-clean after re-adding the same contents ce_match_stat, run_diff_files: use symbolic constants for readability Conflicts: builtin-add.c	2007-11-14 14:15:40 -08:00
Junio C Hamano	fb63d7f889	git-add: make the entry stat-clean after re-adding the same contents Earlier in commit `0781b8a9b2` (add_file_to_index: skip rehashing if the cached stat already matches), add_file_to_index() were taught not to re-add the path if it already matches the index. The change meant well, but was not executed quite right. It used ie_modified() to see if the file on the work tree is really different from the index, and skipped adding the contents if the function says "not modified". This was wrong. There are three possible comparison results between the index and the file in the work tree: - with lstat(2) we _know_ they are different. E.g. if the length or the owner in the cached stat information is different from the length we just obtained from lstat(2), we can tell the file is modified without looking at the actual contents. - with lstat(2) we _know_ they are the same. The same length, the same owner, the same everything (but this has a twist, as described below). - we cannot tell from lstat(2) information alone and need to go to the filesystem to actually compare. The last case arises from what we call 'racy git' situation, that can be caused with this sequence: $ echo hello >file $ git add file $ echo aeiou >file ;# the same length If the second "echo" is done within the same filesystem timestamp granularity as the first "echo", then the timestamp recorded by "git add" and the timestamp we get from lstat(2) will be the same, and we can mistakenly say the file is not modified. The path is called 'racily clean'. We need to reliably detect racily clean paths are in fact modified. To solve this problem, when we write out the index, we mark the index entry that has the same timestamp as the index file itself (that is the time from the point of view of the filesystem) to tell any later code that does the lstat(2) comparison not to trust the cached stat info, and ie_modified() then actually goes to the filesystem to compare the contents for such a path. That's all good, but it should not be used for this "git add" optimization, as the goal of "git add" is to actually update the path in the index and make it stat-clean. With the false optimization, we did _not_ cause any data loss (after all, what we failed to do was only to update the cached stat information), but it made the following sequence leave the file stat dirty: $ echo hello >file $ git add file $ echo hello >file ;# the same contents $ git add file The solution is not to use ie_modified() which goes to the filesystem to see if it is really clean, but instead use ie_match_stat() with "assume racily clean paths are dirty" option, to force re-adding of such a path. There was another problem with "git add -u". The codepath shares the same issue when adding the paths that are found to be modified, but in addition, it asked "git diff-files" machinery run_diff_files() function (which is "git diff-files") to list the paths that are modified. But "git diff-files" machinery uses the same ie_modified() call so that it does not report racily clean _and_ actually clean paths as modified, which is not what we want. The patch allows the callers of run_diff_files() to pass the same "assume racily clean paths are dirty" option, and makes "git-add -u" codepath to use that option, to discover and re-add racily clean _and_ actually clean paths. We could further optimize on top of this patch to differentiate the case where the path really needs re-adding (i.e. the content of the racily clean entry was indeed different) and the case where only the cached stat information needs to be refreshed (i.e. the racily clean entry was actually clean), but I do not think it is worth it. This patch applies to maint and all the way up. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-11-10 00:37:39 -08:00
Junio C Hamano	4bd5b7dacc	ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-11-10 00:24:51 -08:00
Shawn O. Pearce	e75c55844f	Merge branch 'maint' * maint: Yet more 1.5.3.5 fixes mentioned in release notes cvsserver: Use exit 1 instead of die when req_Root fails. git-blame shouldn't crash if run in an unmerged tree git-config: print error message if the config file cannot be read fixing output of non-fast-forward output of post-receive-email	2007-10-18 03:11:17 -04:00
Linus Torvalds	cd8ae20195	git-blame shouldn't crash if run in an unmerged tree If we are in the middle of resolving a merge conflict there may be one or more files whose entries in the index represent an unmerged state (index entries in the higher-order stages). Attempting to run git-blame on any file in such a working directory resulted in "fatal: internal error: ce_mode is 0" as we use the magic marker for an unmerged entry is 0 (set up by things like diff-lib.c's do_diff_cache() and builtin-read-tree.c's read_tree_unmerged()) and the ce_match_stat_basic() function gets upset about this. I'm not entirely sure that the whole "ce_mode = 0" case is a good idea to begin with, and maybe the right thing to do is to remove that horrid freakish special case, but removing the internal error seems to be the simplest fix for now. Linus [sp: Thanks to Björn Steinbrink for the test case] Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2007-10-18 02:31:30 -04:00
Carlos Rica	102c2338da	Move make_cache_entry() from merge-recursive.c into read-cache.c The function make_cache_entry() is too useful to be hidden away in merge-recursive. So move it to libgit.a (exposing it via cache.h). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-09-26 13:42:10 -07:00
Pierre Habouzit	1dffb8fa80	Small cache_tree_write refactor. This function cannot fail, make it void. Also make write_one act on a const char* instead of a char*. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-09-26 02:27:06 -07:00
Junio C Hamano	58f6fb53dd	Merge branch 'jc/cachetree' into cr/reset * jc/cachetree: Simplify cache API git-format-patch --in-reply-to: accept <message@id> with angle brackets git-add -u: do not barf on type changes Remove duplicate note about removing commits with git-filter-branch git-clone: improve error message if curl program is missing or not executable git.el: Allow the add and remove commands to be applied to ignored files. git.el: Allow selecting whether to display uptodate/unknown/ignored files. git.el: Keep the status buffer sorted by filename. hooks--update: Explicitly check for all zeros for a deleted ref.	2007-09-14 01:19:30 -07:00
Junio C Hamano	09d5dc32fb	Simplify cache API Earlier, add_file_to_index() invalidated the path in the cache-tree but remove_file_from_cache() did not, and the user of the latter needed to invalidate the entry himself. This led to a few bugs due to missed invalidate calls already. This patch makes the management of cache-tree less error prone by making more invalidate calls from lower level cache API functions. The rules are: - If you are going to write the index, you should either maintain cache_tree correctly. - If you cannot, alternatively you can remove the entire cache_tree by calling cache_tree_free() before you call write_cache(). - When you modify the index, cache_tree_invalidate_path() should be called with the path you are modifying, to discard the entry from the cache-tree structure. - The following cache API functions exported from read-cache.c (and the macro whose names have "cache" instead of "index") automatically call cache_tree_invalidate_path() for you: - remove_file_from_index(); - add_file_to_index(); - add_index_entry(); You can modify the index bypassing the above API functions (e.g. find an existing cache entry from the index and modify it in place). You need to call cache_tree_invalidate_path() yourself in such a case. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-09-14 01:02:21 -07:00
Carlos Rica	6640f88165	Move make_cache_entry() from merge-recursive.c into read-cache.c The function make_cache_entry() is too useful to be hidden away in merge-recursive. So move it to libgit.a (exposing it via cache.h). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-09-12 13:25:07 -07:00
Alexandre Julliard	d616813d75	git-add: Add support for --refresh option. This allows to refresh only a subset of the project files, based on the specified pathspecs. Signed-off-by: Alexandre Julliard <julliard@winehq.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-08-13 12:58:38 -07:00
Junio C Hamano	af3785dc5a	Optimize "diff --cached" performance. The read_tree() function is called only from the call chain to run "git diff --cached" (this includes the internal call made by git-runstatus to run_diff_index()). The function vacates stage without any funky "merge" magic. The caller then goes and compares stage #1 entries from the tree with stage #0 entries from the original index. When adding the cache entries this way, it used the general purpose add_cache_entry(). This function looks for an existing entry to replace or if there is none to find where to insert the new entry, resolves D/F conflict and all the other things. For the purpose of reading entries into an empty stage, none of that processing is needed. We can instead append everything and then sort the result at the end. This commit changes read_tree() to first make sure that there is no existing cache entries at specified stage, and if that is the case, it runs add_cache_entry() with ADD_CACHE_JUST_APPEND flag (new), and then sort the resulting cache using qsort(). This new flag tells add_cache_entry() to omit all the checks such as "Does this path already exist? Does adding this path remove other existing entries because it turns a directory to a file?" and instead append the given cache entry straight at the end of the active cache. The caller of course is expected to sort the resulting cache at the end before using the result. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-08-10 11:44:23 -07:00
Junio C Hamano	0781b8a9b2	add_file_to_index: skip rehashing if the cached stat already matches An earlier commit `366bfcb6` broke git-add by moving read_cache() call down, because it wanted the directory walking code to grab paths that are already in the index. The change serves its purpose, but introduces a regression because the responsibility of avoiding unnecessary reindexing by matching the cached stat is shifted nowhere. This makes it the job of add_file_to_index() function. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-07-30 17:49:50 -07:00
Johannes Schindelin	2031427167	git add: respect core.filemode with unmerged entries When a merge left unmerged entries, git add failed to pick up the file mode from the index, when core.filemode == 0. If more than one unmerged entry is there, the order of stage preference is 2, 1, 3. Noticed by Johannes Sixt. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-07-01 13:26:05 -07:00
Junio C Hamano	a6080a0a44	War on whitespace This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-06-07 00:04:01 -07:00
Martin Waitz	302b9282c9	rename dirlink to gitlink. Unify naming of plumbing dirlink/gitlink concept: git ls-files -z '*.[ch]' \| xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;' Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-05-21 23:34:54 -07:00
Luiz Fernando N. Capitulino	3511a3774e	read_cache_from(): small simplification This change 'opens' the code block which maps the index file into memory, making the code clearer and easier to read. Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br> Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-04-25 13:44:27 -07:00
Junio C Hamano	4aab5b46f4	Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-04-22 22:53:54 -07:00
Junio C Hamano	228e94f935	Move index-related variables into a structure. This defines a index_state structure and moves index-related global variables into it. Currently there is one instance of it, the_index, and everybody accesses it, so there is no code change. Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-04-22 22:53:54 -07:00
Linus Torvalds	a8ee75bc7a	Fix gitlink index entry filesystem matching The code to match up index entries with the filesystem was stupidly broken. We shouldn't compare the filesystem stat() information with S_IFDIRLNK, since that's purely a git-internal value, and not what the filesystem uses (on the filesystem, it's just a regular directory). Also, don't bother to make the stat() time comparisons etc for DIRLNK entries in ce_match_stat_basic(), since we do an exact match for these things, and the hints in the stat data simply doesn't matter. This fixes "git status" with submodules that haven't been checked out in the supermodule. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-04-14 03:14:12 -07:00
Linus Torvalds	095952585c	Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-04-11 19:09:55 -07:00
Linus Torvalds	1833a92548	Fix thinko in subproject entry sorting This fixes a total thinko in my original series: subprojects do not sort like directories, because the index is sorted purely by full pathname, and since a subproject shows up in the index as a normal NUL-terminated string, it never has the issues with sorting with the '/' at the end. So if you have a subproject "proj" and a file "proj.c", the subproject sorts alphabetically before the file in the index (and must thus also sort that way in a tree object, since trees sort as the index). In contrast, it you have two files "proj/file" and "proj.c", the "proj.c" will sort alphabetically before "proj/file" in the index. The index itself, of course, does not actually contain an entry "proj/", but in the tree that gets written out, the tree entry "proj" will sort after the file entry "proj.c", which is the only real magic sorting rule. In other words: the magic sorting rule only affects tree entries, and only affects tree entries that point to other trees (ie are of the type S_IFDIR). Anyway, that thinko just means that we should remove the special case to make S_ISDIRLNK entries sort like S_ISDIR entries. They don't. They sort like normal files. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-04-11 17:21:12 -07:00
Linus Torvalds	f35a6d3bce	Teach core object handling functions about gitlinks This teaches the really fundamental core SHA1 object handling routines about gitlinks. We can compare trees with gitlinks in them (although we can not actually generate patches for them yet - just raw git diffs), and they show up as commits in "git ls-tree". We also know to compare gitlinks as if they were directories (ie the normal "sort as trees" rules apply). [jc: amended a cut&paste error] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-04-10 13:50:43 -07:00
Junio C Hamano	640ee0d1cd	Merge branch 'jc/read-tree-df' (early part) * 'jc/read-tree-df' (early part): Fix switching to a branch with D/F when current branch has file D. Fix twoway_merge that passed d/f conflict marker to merged_entry(). Fix read-tree --prefix=dir/. unpack-trees: get rid of *indpos parameter. unpack_trees.c: pass unpack_trees_options structure to keep_entry() as well. add_cache_entry(): removal of file foo does not conflict with foo/bar	2007-04-07 23:52:40 -07:00
Junio C Hamano	fd1c3bf053	Rename add_file_to_index() to add_file_to_cache() This function was not called "add_file_to_cache()" only because an ancient program, update-cache, used that name as an internal function name that does something slightly different. Now that is gone, we can take over the better name. The plan is to name all functions that operate on the default index xxx_cache(). Later patches create a variant of them that take an explicit parameter xxx_index(), and then turn xxx_cache() functions into macros that use "the_index". Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-04-05 15:07:16 -07:00
Junio C Hamano	ec0cc70469	Propagate cache error internal to refresh_cache() via parameter. The function refresh_cache() is the only user of cache_errno that switches its behaviour based on what internal function refresh_cache_entry() finds; pass the error status back in a parameter passed down to it, to get rid of the global variable cache_errno. Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-04-05 15:07:16 -07:00
Junio C Hamano	0424138d57	Fix bogus error message from merge-recursive error path This error message should not usually trigger, but the function make_cache_entry() called by add_cacheinfo() can return early without calling into refresh_cache_entry() that sets cache_errno. Also the error message had a wrong function name reported, and it did not say anything about which path failed either. Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-04-05 15:07:16 -07:00
Junio C Hamano	21cd8d00b6	add_cache_entry(): removal of file foo does not conflict with foo/bar Similarly, removal of file foo/bar does not conflict with a file foo. Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-04-04 00:19:28 -07:00
Shawn O. Pearce	dc49cd769b	Cast 64 bit off_t to 32 bit size_t Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4. This implies that we are able to access and work on files whose maximum length is around 2^63-1 bytes, but we can only malloc or mmap somewhat less than 2^32-1 bytes of memory. On such a system an implicit conversion of off_t to size_t can cause the size_t to wrap, resulting in unexpected and exciting behavior. Right now we are working around all gcc warnings generated by the -Wshorten-64-to-32 option by passing the off_t through xsize_t(). In the future we should make xsize_t on such problematic platforms detect the wrapping and die if such a file is accessed. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-03-07 11:15:26 -08:00
Johannes Sixt	78a8d641c1	Add core.symlinks to mark filesystems that do not support symbolic links. Some file systems that can host git repositories and their working copies do not support symbolic links. But then if the repository contains a symbolic link, it is impossible to check out the working copy. This patch enables partial support of symbolic links so that it is possible to check out a working copy on such a file system. A new flag core.symlinks (which is true by default) can be set to false to indicate that the filesystem does not support symbolic links. In this case, symbolic links that exist in the trees are checked out as small plain files, and checking in modifications of these files preserve the symlink property in the database (as long as an entry exists in the index). Of course, this does not magically make symbolic links work on such defective file systems; hence, this solution does not help if the working copy relies on that an entry is a real symbolic link. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-03-02 16:58:05 -08:00
Junio C Hamano	53bca91a7d	index_fd(): pass optional path parameter as hint for blob conversion Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-02-28 12:00:00 -08:00
Junio C Hamano	edaec3fbe8	index_fd(): use enum object_type instead of type name string. Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-02-28 12:00:00 -08:00
Nicolas Pitre	21666f1aae	convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-02-27 01:34:21 -08:00
Junio C Hamano	185c975faa	Do not take mode bits from index after type change. When we do not trust executable bit from lstat(2), we copied existing ce_mode bits without checking if the filesystem object is a regular file (which is the only thing we apply the "trust executable bit" business) nor if the blob in the index is a regular file (otherwise, we should do the same as registering a new regular file, which is to default non-executable). Noticed by Johannes Sixt. Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-02-16 22:56:06 -08:00
Linus Torvalds	2cdf9509df	write-cache: do not leak the serialized cache-tree data. It is not used after getting written, and just is leaking every time we write the index out. Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-01-11 12:25:16 -08:00
Andy Whitcroft	93822c2239	short i/o: fix calls to write to use xwrite or write_in_full We have a number of badly checked write() calls. Often we are expecting write() to write exactly the size we requested or fail, this fails to handle interrupts or short writes. Switch to using the new write_in_full(). Otherwise we at a minimum need to check for EINTR and EAGAIN, where this is appropriate use xwrite(). Note, the changes to config handling are much larger and handled in the next patch in the sequence. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-01-08 15:44:47 -08:00
Shawn O. Pearce	5fe5c8300d	Cleanup read_cache_from error handling. When I converted the mmap() call to xmmap() I failed to cleanup the way this routine handles errors and left some crufty code behind. This is a small cleanup, suggested by Johannes. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-12-29 11:36:45 -08:00
Shawn O. Pearce	c4712e4553	Replace mmap with xmmap, better handling MAP_FAILED. In some cases we did not even bother to check the return value of mmap() and just assume it worked. This is bad, because if we are out of virtual address space the kernel returned MAP_FAILED and we would attempt to dereference that address, segfaulting without any real error output to the user. We are replacing all calls to mmap() with xmmap() and moving all MAP_FAILED checking into that single location. If a mmap call fails we try to release enough least-recently-used pack windows to possibly succeed, then retry the mmap() attempt. If we cannot mmap even after releasing pack memory then we die() as none of our callers have any reasonable recovery strategy for a failed mmap. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-12-29 11:36:45 -08:00
Junio C Hamano	81a361be3b	Fix check_file_directory_conflict(). When replacing an existing file A with a directory A that has a file A/B in it in the index, 'update-index --replace --add A/B' did not properly remove the file to make room for the new directory. There was a trivial logic error, most likely a cut & paste one, dating back to quite early days of git. Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-12-17 01:14:44 -08:00
Junio C Hamano	c33ab0dd10	git-add: remove conflicting entry when adding. When replacing an existing file A with a directory A that has a file A/B in it in the index, 'git add' did not succeed because it forgot to pass the allow-replace flag to add_cache_entry(). It might be safer to leave this as an error and require the user to explicitly remove the existing A first before adding A/B since it is an unusual case, but doing that automatically is much easier to use. Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-12-17 01:14:43 -08:00
Junio C Hamano	790fa0e297	update-index: make D/F conflict error a bit more verbose. When you remove a directory D that has a tracked file D/F out of the way to create a file D and try to "git update-index --add D", it used to say "cannot add" which was not very helpful. This issues an extra error message to explain the situation before the final "fatal" message. Since D/F conflicts are relatively rare event, extra verbosity would not make things too noisy. Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-12-17 01:14:43 -08:00
Junio C Hamano	2bbaaed9ee	trust-executable-bit: fix breakage for symlinks An earlier commit f28b34a broke symlinks when trust-executable-bit is not set because it incorrectly assumed that everything was a regular file. Reported by Juergen Ruehle. Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-11-22 16:36:49 -08:00
Rene Scharfe	a6e8a76770	sparse fix: non-ANSI function declaration The declaration of discard_cache() in cache.h already has its "void". Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-11-18 11:40:00 -08:00
Shawn Pearce	fd28b34afd	Ignore executable bit when adding files if filemode=0. If the user has configured core.filemode=0 then we shouldn't set the execute bit in the index when adding a new file as the user has indicated that the local filesystem can't be trusted. This means that when adding files that should be marked executable in a repository with core.filemode=0 the user must perform a 'git update-index --chmod=+x' on the file before committing the addition. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-09-26 22:42:52 -07:00
Junio C Hamano	1e49cb8ad4	Merge branch 'js/c-merge-recursive' * js/c-merge-recursive: (21 commits) discard_cache(): discard index, even if no file was mmap()ed merge-recur: do not die unnecessarily merge-recur: try to merge older merge bases first merge-recur: if there is no common ancestor, fake empty one merge-recur: do not setenv("GIT_INDEX_FILE") merge-recur: do not call git-write-tree merge-recursive: fix rename handling .gitignore: git-merge-recur is a built file. merge-recur: virtual commits shall never be parsed merge-recur: use the unpack_trees() interface instead of exec()ing read-tree merge-recur: fix thinko in unique_path() Makefile: git-merge-recur depends on xdiff libraries. merge-recur: Explain why sha_eq() and struct stage_data cannot go merge-recur: Cleanup last mixedCase variables... merge-recur: Fix compiler warning with -pedantic merge-recur: Remove dead code merge-recur: Get rid of debug code merge-recur: Convert variable names to lower_case Cumulative update of merge-recursive in C recur vs recursive: help testing without touching too many stuff. ... This is an evil merge that removes TEST script from the toplevel.	2006-08-27 20:33:46 -07:00
David Rientjes	a89fccd281	Do not use memcmp(sha1_1, sha1_2, 20) with hardcoded length. Introduces global inline: hashcmp(const unsigned char sha1, const unsigned char sha2) Uses memcmp for comparison and returns the result based on the length of the hash name (a future runtime decision). Acked-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-08-17 14:23:53 -07:00
Junio C Hamano	8e3abd4c97	Merge branch 'jc/racy' * jc/racy: Remove the "delay writing to avoid runtime penalty of racy-git avoidance" Add check program "git-check-racy" Documentation/technical/racy-git.txt avoid nanosleep(2)	2006-08-16 14:00:34 -07:00
Junio C Hamano	0fc82cff12	Remove the "delay writing to avoid runtime penalty of racy-git avoidance" The work-around should not be needed. Even if it turns out we would want it later, git will remember the patch for us ;-). Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-08-15 22:12:54 -07:00
Junio C Hamano	42f774063d	Add check program "git-check-racy" This will help counting the racily clean paths, but it should be useless for daily use. Do not even enable it in the makefile. Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-08-15 21:38:07 -07:00
David Rientjes	96f1e58f52	remove unnecessary initializations [jc: I needed to hand merge the changes to the updated codebase, so the result needs to be checked.] Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-08-15 21:22:20 -07:00
Junio C Hamano	789a09b487	avoid nanosleep(2) On Solaris nanosleep(2) is not available in libc; you need to link with -lrt to get it. The purpose of the loop is to wait until the next filesystem timestamp granularity, and the code uses subsecond sleep in the hope that it can shorten the delay to 0.5 seconds on average instead of a full second. It is probably not worth depending on an extra library for this. We might want to yank out the whole "racy-git avoidance is costly later at runtime, so let's delay writing the index out" codepath later, but that is a separate issue and needs some testing on large trees to figure it out. After playing with the kernel tree, I have a feeling that the whole thing may not be worth it. Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-08-15 03:39:47 -07:00
David Rientjes	968a1d65f4	read-cache.c cleanup Removes conditional returns. Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-08-14 18:41:12 -07:00
Junio C Hamano	eed94a570e	Merge branch 'master' into js/c-merge-recursive Adjust to hold_lock_file_for_update() change on the master.	2006-08-12 18:35:14 -07:00
Johannes Schindelin	4147d801db	discard_cache(): discard index, even if no file was mmap()ed Since add_cacheinfo() can be called without a mapped index file, discard_cache() _has_ to discard the entries, even when cache_mmap == NULL. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-08-10 14:30:01 -07:00
Junio C Hamano	6015c28b1d	read-cache: tweak racy-git delay logic Instead of looping over the entries and writing out, use a separate loop after all entries have been written out to check how many entries are racily clean. Make sure that the newly created index file gets the right timestamp when we check by flushing the buffered data by ce_write(). Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-08-08 17:17:04 -07:00
Junio C Hamano	b7e58b17b5	Racy git: avoid having to be always too careful Immediately after a bulk checkout, most of the paths in the working tree would have the same timestamp as the index file, and this would force ce_match_stat() to take slow path for all of them. When writing an index file out, if many of the paths have very new (read: the same timestamp as the index file being written out) timestamp, we are better off delaying the return from the command, to make sure that later command to touch the working tree files will leave newer timestamps than recorded in the index, thereby avoiding to take the slow path. Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-08-07 01:58:53 -07:00
Linus Torvalds	7f8508e8d3	Fix double "close()" in ce_compare_data Doing an "strace" on "git diff" shows that we close() a file descriptor twice (getting EBADFD on the second one) when we end up in ce_compare_data if the index does not match the checked-out stat information. The "index_fd()" function will already have closed the fd for us, so we should not close it again. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-07-31 11:55:56 -07:00
Junio C Hamano	c1a788acee	Merge branch 'js/read-tree' into js/c-merge-recursive * js/read-tree: (107 commits) read-tree: move merge functions to the library read-trees: refactor the unpack_trees() part tar-tree: illustrate an obscure feature better git.c: allow alias expansion without a git directory setup_git_directory_gently: do not barf when GIT_DIR is given. Build on Debian GNU/kFreeBSD Call setup_git_directory() much earlier Call setup_git_directory() early Display an error from update-ref if target ref name is invalid. Fix http-fetch t4103: fix binary patch application test. git-apply -R: binary patches are irreversible for now. Teach git-apply about '-R' Makefile: ssh-pull.o depends on ssh-fetch.c log and diff family: honor config even from subdirectories git-reset: detect update-ref error and report it. lost-found: use fsck-objects --full Teach git-http-fetch the --stdin switch Teach git-local-fetch the --stdin switch Make pull() support fetching multiple targets at once ...	2006-07-30 23:42:10 -07:00
Johannes Schindelin	11be42a476	Make git-mv a builtin This also moves add_file_to_index() to read-cache.c. Oh, and while touching builtin-add.c, it also removes a duplicate git_config() call. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-07-26 13:36:36 -07:00
Johannes Schindelin	8fd2cb4069	Extract helper bits from c-merge-recursive work This backports the pieces that are not uncooked from the merge-recursive WIP we have seen earlier, to be used in git-mv rewritten in C. Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-07-26 13:36:36 -07:00
Johannes Schindelin	6d297f8137	Status update on merge-recursive in C This is just an update for people being interested. Alex and me were busy with that project for a few days now. While it has progressed nicely, there are quite a couple TODOs in merge-recursive.c, just search for "TODO". For impatient people: yes, it passes all the tests, and yes, according to the evil test Alex did, it is faster than the Python script. But no, it is not yet finished. Biggest points are: - there are still three external calls - in the end, it should not be necessary to write the index more than once (just before exiting) - a lot of things can be refactored to make the code easier and shorter BTW we cannot just plug in git-merge-tree yet, because git-merge-tree does not handle renames at all. This patch is meant for testing, and as such, - it compile the program to git-merge-recur - it adjusts the scripts and tests to use git-merge-recur instead of git-merge-recursive - it provides "TEST", a script to execute the tests regarding -recursive - it inlines the changes to read-cache.c (read_cache_from(), discard_cache() and refresh_cache_entry()) Brought to you by Alex Riesen and Dscho Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-07-13 23:10:19 -07:00
Pavel Roskin	a9486b02ec	Avoid C99 comments, use old-style C comments instead. This doesn't make the code uglier or harder to read, yet it makes the code more portable. This also simplifies checking for other potential incompatibilities. "gcc -std=c89 -pedantic" can flag many incompatible constructs as warnings, but C99 comments will cause it to emit an error. Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-07-10 00:47:13 -07:00
Florian Forster	1d7f171c3a	Remove all void-pointer arithmetic. ANSI C99 doesn't allow void-pointer arithmetic. This patch fixes this in various ways. Usually the strategy that required the least changes was used. Signed-off-by: Florian Forster <octo@verplant.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-06-20 01:59:46 -07:00
Junio C Hamano	7d55561986	Merge branch 'jc/dirwalk-n-cache-tree' into jc/cache-tree * jc/dirwalk-n-cache-tree: (212 commits) builtin-rm: squelch compiler warnings. Add builtin "git rm" command Move pathspec matching from builtin-add.c into dir.c Prevent bogus paths from being added to the index. builtin-add: fix unmatched pathspec warnings. Remove old "git-add.sh" remnants builtin-add: warn on unmatched pathspecs Do "git add" as a builtin Clean up git-ls-file directory walking library interface libify git-ls-files directory traversal Add a conversion tool to migrate remote information into the config fetch, pull: ask config for remote information Fix build procedure for builtin-init-db read-tree -m -u: do not overwrite or remove untracked working tree files. apply --cached: do not check newly added file in the working tree Implement a --dry-run option to git-quiltimport Implement git-quiltimport Revert "builtin-grep: workaround for non GNU grep." builtin-grep: workaround for non GNU grep. builtin-grep: workaround for non GNU grep. ...	2006-05-28 22:34:34 -07:00
Dennis Stosberg	ac58c7b18e	git-write-tree writes garbage on sparc64 In the "next" branch, write_index_ext_header() writes garbage on a 64-bit big-endian machine; the written index file will be unreadable. I noticed this on NetBSD/sparc64. Reproducible with: $ git init-db $ :>file $ git-update-index --add file $ git-write-tree $ git-update-index error: index uses extension, which we do not understand fatal: index file corrupt Signed-off-by: Dennis Stosberg <dennis@stosberg.net> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-05-28 13:31:50 -07:00
Junio C Hamano	93872e0700	Merge branch 'lt/dirwalk' into jc/dirwalk-n-cache-tree This commit is what this branch is all about. It records the evil merge needed to adjust built-in git-add and git-rm for the cache-tree extension. * lt/dirwalk: Add builtin "git rm" command Move pathspec matching from builtin-add.c into dir.c Prevent bogus paths from being added to the index. builtin-add: fix unmatched pathspec warnings. Remove old "git-add.sh" remnants builtin-add: warn on unmatched pathspecs Do "git add" as a builtin Clean up git-ls-file directory walking library interface libify git-ls-files directory traversal Conflicts: Makefile builtin.h git.c update-index.c	2006-05-20 01:52:19 -07:00
Junio C Hamano	283c8eef6c	Merge branch 'jc/cache-tree' into jc/dirwalk-n-cache-tree * jc/cache-tree: (24 commits) Fix crash when reading the empty tree fsck-objects: do not segfault on missing tree in cache-tree cache-tree: a bit more debugging support. read-tree: invalidate cache-tree entry when a new index entry is added. Fix test-dump-cache-tree in one-tree disappeared case. fsck-objects: mark objects reachable from cache-tree cache-tree: replace a sscanf() by two strtol() calls cache-tree.c: typefix test-dump-cache-tree: validate the cached data as well. cache_tree_update: give an option to update cache-tree only. read-tree: teach 1-way merege and plain read to prime cache-tree. read-tree: teach 1 and 2 way merges about cache-tree. update-index: when --unresolve, smudge the relevant cache-tree entries. test-dump-cache-tree: report number of subtrees. cache-tree: sort the subtree entries. Teach fsck-objects about cache-tree. index: make the index file format extensible. cache-tree: protect against "git prune". Add test-dump-cache-tree Use cache-tree in update-index. ...	2006-05-20 00:56:11 -07:00
Linus Torvalds	405e5b2fe0	Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-05-19 15:59:18 -07:00
Linus Torvalds	8dcf39c46e	Prevent bogus paths from being added to the index. With this one, it's now a fatal error to try to add a pathname that cannot be added with "git add", i.e. [torvalds@g5 git]$ git add .git/config fatal: unable to add .git/config to index and [torvalds@g5 git]$ git add foo/../bar fatal: unable to add foo/../bar to index instead of the old "Ignoring path xyz" warning that would end up silently succeeding on any other paths. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-05-18 12:07:31 -07:00
Yakov Lerner	28cc4ab422	read-cache.c: use xcalloc() not calloc() Elsewhere we use xcalloc(); we should consistently do so. Signed-off-by: Yakov Lerner <iler.ml@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-05-09 06:28:59 -07:00
Junio C Hamano	bad68ec924	index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-04-24 21:24:13 -07:00
Junio C Hamano	1af1c2b63d	read-cache/write-cache: optionally return cache checksum SHA1. read_cache_1() and write_cache_1() takes an extra parameter *sha1 that returns the checksum of the index file when non-NULL. Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-04-23 16:57:40 -07:00
Junio C Hamano	7b80be150c	cache_name_compare() compares name and stage, nothing else. The code was a bit unclear in expressing what it wants to compare. Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-02-12 23:46:25 -08:00
Junio C Hamano	5f73076c1a	"Assume unchanged" git This adds "assume unchanged" logic, started by this message in the list discussion recently: <Pine.LNX.4.64.0601311807470.7301@g5.osdl.org> This is a workaround for filesystems that do not have lstat() that is quick enough for the index mechanism to take advantage of. On the paths marked as "assumed to be unchanged", the user needs to explicitly use update-index to register the object name to be in the next commit. You can use two new options to update-index to set and reset the CE_VALID bit: git-update-index --assume-unchanged path... git-update-index --no-assume-unchanged path... These forms manipulate only the CE_VALID bit; it does not change the object name recorded in the index file. Nor they add a new entry to the index. When the configuration variable "core.ignorestat = true" is set, the index entries are marked with CE_VALID bit automatically after: - update-index to explicitly register the current object name to the index file. - when update-index --refresh finds the path to be up-to-date. - when tools like read-tree -u and apply --index update the working tree file and register the current object name to the index file. The flag is dropped upon read-tree that does not check out the index entry. This happens regardless of the core.ignorestat settings. Index entries marked with CE_VALID bit are assumed to be unchanged most of the time. However, there are cases that CE_VALID bit is ignored for the sake of safety and usability: - while "git-read-tree -m" or git-apply need to make sure that the paths involved in the merge do not have local modifications. This sacrifices performance for safety. - when git-checkout-index -f -q -u -a tries to see if it needs to checkout the paths. Otherwise you can never check anything out ;-). - when git-update-index --really-refresh (a new flag) tries to see if the index entry is up to date. You can start with everything marked as CE_VALID and run this once to drop CE_VALID bit for paths that are modified. Most notably, "update-index --refresh" honours CE_VALID and does not actively stat, so after you modified a file in the working tree, update-index --refresh would not notice until you tell the index about it with "git-update-index path" or "git-update-index --no-assume-unchanged path". This version is not expected to be perfect. I think diff between index and/or tree and working files may need some adjustment, and there probably needs other cases we should automatically unmark paths that are marked to be CE_VALID. But the basics seem to work, and ready to be tested by people who asked for this feature. Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-02-08 21:54:42 -08:00
Junio C Hamano	4b3511b0f8	ce_smudge_racily_clean_entry: explain why it works. This is a tricky code and warrants extra commenting. I wasted 30 minutes trying to break it until I realized why it works. Signed-off-by: Junio C Hamano <junkio@cox.net>	2005-12-20 14:18:47 -08:00
Junio C Hamano	407c8eb0d0	Racy GIT (part #2 ) The previous round caught the most trivial case well, but broke down once index file is updated again. Smudge problematic entries (they should be very few if any under normal interactive workflow) before writing a new index file out. Signed-off-by: Junio C Hamano <junkio@cox.net>	2005-12-20 12:12:18 -08:00
Junio C Hamano	29e4d36357	Racy GIT This fixes the longstanding "Racy GIT" problem, which was pretty much there from the beginning of time, but was first demonstrated by Pasky in this message on October 24, 2005: http://marc.theaimsgroup.com/?l=git&m=113014629716878 If you run the following sequence of commands: echo frotz >infocom git update-index --add infocom echo xyzzy >infocom so that the second update to file "infocom" does not change st_mtime, what is recorded as the stat information for the cache entry "infocom" exactly matches what is on the filesystem (owner, group, inum, mtime, ctime, mode, length). After this sequence, we incorrectly think "infocom" file still has string "frotz" in it, and get really confused. E.g. git-diff-files would say there is no change, git-update-index --refresh would not even look at the filesystem to correct the situation. Some ways of working around this issue were already suggested by Linus in the same thread on the same day, including waiting until the next second before returning from update-index if a cache entry written out has the current timestamp, but that means we can make at most one commit per second, and given that the e-mail patch workflow used by Linus needs to process at least 5 commits per second, it is not an acceptable solution. Linus notes that git-apply is primarily used to update the index while processing e-mailed patches, which is true, and git-apply's up-to-date check is fooled by the same problem but luckily in the other direction, so it is not really a big issue, but still it is disturbing. The function ce_match_stat() is called to bypass the comparison against filesystem data when the stat data recorded in the cache entry matches what stat() returns from the filesystem. This patch tackles the problem by changing it to actually go to the filesystem data for cache entries that have the same mtime as the index file itself. This works as long as the index file and working tree files are on the filesystems that share the same monotonic clock. Files on network mounted filesystems sometimes get skewed timestamps compared to "date" output, but as long as working tree files' timestamps are skewed the same way as the index file's, this approach still works. The only problematic files are the ones that have the same timestamp as the index file's, because two file updates that sandwitch the index file update must happen within the same second to trigger the problem. Signed-off-by: Junio C Hamano <junkio@cox.net>	2005-12-20 00:22:28 -08:00
Linus Torvalds	e1b10391ea	Use git config file for committer name and email info This starts using the "user.name" and "user.email" config variables if they exist as the default name and email when committing. This means that you don't have to use the GIT_COMMITTER_EMAIL environment variable to override your email - you can just edit the config file instead. The patch looks bigger than it is because it makes the default name and email information non-static and renames it appropriately. And it moves the common git environment variables into a new library file, so that you can link against libgit.a and get the git environment without having to link in zlib and libcrypt. In short, most of it is renaming and moving, the real change core is just a few new lines in "git_default_config()" that copies the user config values to the new base. It also changes "git-var -l" to list the config variables. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2005-10-11 18:47:34 -07:00
Junio C Hamano	3e09cdfd11	Use core.filemode. With "[core] filemode = false", you can tell git to ignore differences in the working tree file only in executable bit. * "git-update-index --refresh" does not say "needs update" if index entry and working tree file differs only in executable bit. * "git-update-index" on an existing path takes executable bit from the existing index entry, if the path and index entry are both regular files. * "git-diff-files" and "git-diff-index" without --cached flag pretend the path on the filesystem has the same executable bit as the existing index entry, if the path and index entry are both regular files. If you are on a filesystem with unreliable mode bits, you may need to force the executable bit after registering the path in the index. * "git-update-index --chmod=+x foo" flips the executable bit of the index file entry for path "foo" on. Use "--chmod=-x" to flip it off. Note that --chmod only works in index file and does not look at nor update the working tree. So if you are on a filesystem and do not have working executable bit, you would do: 1. set the appropriate .git/config option; 2. "git-update-index --add new-file.c" 3. "git-ls-files --stage new-file.c" to see if it has the desired mode bits. If not, e.g. to drop executable bit picked up from the filesystem, say "git-update-index --chmod=-x new-file.c". Signed-off-by: Junio C Hamano <junkio@cox.net>	2005-10-11 18:45:33 -07:00
Linus Torvalds	17712991a5	Add ".git/config" file parser This is a first cut at a very simple parser for a git config file. The format of the file is a simple ini-file like thing, with simple variable/value pairs. You can (and should) make the variables have a simple single-level scope, ie a valid file looks something like this: # # This is the config file, and # a '#' or ';' character indicates # a comment # ; core variables [core] ; Don't trust file modes filemode = false ; Our diff algorithm [diff] external = "/usr/local/bin/gnu-diff -u" renames = true which parses into three variables: "core.filemode" is associated with the string "false", and "diff.external" gets the appropriate quoted value. Right now we only react to one variable: "core.filemode" is a boolean that decides if we should care about the 0100 (user-execute) bit of the stat information. Even that is just a parsing demonstration - this doesn't actually implement that st_mode compare logic itself. Different programs can react to different config options, although they should always fall back to calling "git_default_config()" on any config option name that they don't recognize. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2005-10-10 16:31:08 -07:00
Linus Torvalds	5d1a5c02e8	[PATCH] Better error reporting for "git status" Instead of "git status" ignoring (and hiding) potential errors from the "git-update-index" call, make it exit if it fails, and show the error. In order to do this, use the "-q" flag (to ignore not-up-to-date files) and add a new "--unmerged" flag that allows unmerged entries in the index without any errors. This also avoids marking the index "changed" if an entry isn't actually modified, and makes sure that we exit with an understandable error message if the index is corrupt or unreadable. "read_cache()" no longer returns an error for the caller to check. Finally, make die() and usage() exit with recognizable error codes, if we ever want to check the failure reason in scripts. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2005-10-01 23:55:47 -07:00
Junio C Hamano	6b5ee137e5	Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net>	2005-09-24 23:50:43 -07:00
Junio C Hamano	b0391890d2	Show modified files in git-ls-files Add -m/--modified to show files that have been modified wrt. the index. [jc: The original came from Brian Gerst on Sep 1st but it only checked if the paths were cache dirty without actually checking the files were modified. I also added the usage string and a new test.] Signed-off-by: Junio C Hamano <junkio@cox.net>	2005-09-20 15:07:53 -07:00
Qingning Huo	2c865d9aa7	[PATCH] Fix buffer overflow in ce_flush(). Add a check before appending SHA1 signature to write_buffer, flush it first if necessary. Signed-off-by: Junio C Hamano <junkio@cox.net>	2005-09-11 10:51:13 -07:00
Linus Torvalds	f332726eaa	[PATCH] Improve handling of "." and ".." in git-diff-* This fixes up usage of ".." (without an ending slash) and "." (with or without the ending slash) in the git diff family. It also fixes pathspec matching for the case of an empty pathspec, since a "." in the top-level directory (or enough ".." under subdirectories) will result in an empty pathspec. We used to not match it against anything, but it should in fact match everything. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2005-08-16 21:33:25 -07:00
Pavel Roskin	e35f982415	[PATCH] mmap error handling I have reviewed all occurrences of mmap() in git and fixed three types of errors/defects: 1) The result is not checked. 2) The file descriptor is closed if mmap() succeeds, but not when it fails. 3) Various casts applied to -1 are used instead of MAP_FAILED, which is specifically defined to check mmap() return value. [jc: This is a second round of Pavel's patch. He fixed up the problem that close() potentially clobbering the errno from mmap, which the first round had.] Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Junio C Hamano <junkio@cox.net>	2005-07-29 17:21:48 -07:00
Linus Torvalds	c0fd1f517e	Make "ce_match_path()" a generic helper function ... and make git-diff-files use it too. This all _should_ make the diffcore-pathspec.c phase unnecessary, since the diff'ers now all do the path matching early interally.	2005-07-14 16:55:06 -07:00
Junio C Hamano	b155725dae	[PATCH] Fix oversimplified optimization for add_cache_entry(). An earlier change to optimize directory-file conflict check broke what "read-tree --emu23" expects. This is fixed by this commit. (1) Introduces an explicit flag to tell add_cache_entry() not to check for conflicts and use it when reading an existing tree into an empty stage --- by definition this case can never introduce such conflicts. (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name() aware of the cache stages, and flag conflict only with paths in the same stage. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:52:16 -07:00
Linus Torvalds	2160a025d2	read-cache.c: remove stray debugging printf Pointed out by Junio, part of my debugging of the rewrite of the file/dir conflict handling.	2005-06-18 23:34:12 -07:00
Linus Torvalds	12676608fe	Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day.	2005-06-18 20:21:34 -07:00
Junio C Hamano	025a0709b6	[PATCH] Bugfix: read-cache.c:write_cache() misrecords number of entries. When we choose to omit deleted entries, we should subtract numbers of such entries from the total number in the header. Signed-off-by: Junio C Hamano <junkio@cox.net> Oops. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-10 07:42:45 -07:00
Linus Torvalds	aa16021efc	git-read-tree: remove deleted files in the working directory Only when "-u" is used of course.	2005-06-09 15:34:04 -07:00
Timo Hirvonen	972d1bb067	[PATCH] Use ntohs instead of htons to convert ce_flags to host byte order Use ntohs instead of htons to convert ce_flags to host byte order Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-07 13:44:11 -07:00

1 2 3 4 5

201 Commits