Commit Graph

13301 Commits

Author SHA1 Message Date
Junio C Hamano
bb8eebb94f Revert "filter-branch docs: remove brackets so not to imply revision arg is optional"
This reverts commit c41b439244, as
we decided to default to HEAD when revision parameters are missing
and they are no longer mandatory.
2008-01-31 13:51:42 -08:00
Jean-Luc Herren
c02792edd6 Documentation/git-cvsserver: Fix typo
Signed-off-by: Jean-Luc Herren <jlh@gmx.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-30 18:50:41 -08:00
Brandon Casey
0f047f3b47 filter-branch: assume HEAD if no revision supplied
filter-branch previously took the first non-option argument as the name for
a new branch. Since dfd05e38, it now takes a revision or a revision range
and modifies the current branch. Update to operate on HEAD by default to
conform with standard git interface practice.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-30 18:50:25 -08:00
Brandon Casey
c41b439244 filter-branch docs: remove brackets so not to imply revision arg is optional
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-30 17:21:02 -08:00
Shawn O. Pearce
733f1815ab Use 'printf %s $x' notation in t5401
We only care about getting what should be an empty string and
sending it to a file, without a trailing LF, so the empty string
translates into a 0 byte file.  Earlier when I originally wrote
these lines Mac OS X allowed the format string of printf to be
the empty string, but more recent versions appear to have been
'improved' with error messages if the format is not given.

This may cause problems if we ever wind up with changes to the hook
tests.  A minor cleanup makes the test more safe on all systems,
by conforming to accepted printf conventions.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-30 17:17:39 -08:00
Brandon Casey
def16e7181 filter-branch.sh: remove temporary directory on failure
One of the first things filter-branch does is to create a temporary
directory. This directory is eventually removed by the script during
normal operation, but is not removed if the script encounters an error.

Set a trap to remove it when the script terminates for any reason.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-30 11:56:12 -08:00
Brandon Casey
0eab8ca68a git-relink: avoid hard linking in objects/info directory
git-relink is intended to search for packs and loose objects in
common between two repositories and to replace the one set with
hard links to the other. Files other than packs and loose objects
should not be touched, so add the "info" sub-directory to the
pattern of directory excludes.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-30 00:01:27 -08:00
Bruno Ribas
c1dcf7ebf2 gitweb: Make use of the $git_dir variable at sub git_get_project_description
Signed-off-by: Bruno Ribas <ribas@c3sl.ufpr.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-29 23:55:18 -08:00
Jakub Narebski
d661146ac2 gitweb: Add info about $projectroot and $projects_list to gitweb/README
Those two configuration variables are important enough that it is
worth to explicitely write about them in the "Gitweb config file
variables" section even if they are usually set during build by
GITWEB_PROJECTROOT and GITWEB_LIST build (Makefile) configuration
variables.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-29 22:01:20 -08:00
Jim Meyering
a5d86f7406 fix doc typos
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-29 22:00:34 -08:00
Junio C Hamano
bda3a31cc7 reflog-expire: Avoid creating new files in a directory inside readdir(3) loop
"git reflog expire --all" opened a directory in $GIT_DIR/logs/,
read reflog files in there readdir(3), and rewrote the file by
creating a new file and renaming it back inside the loop.  This
code structure can cause the newly created file to be returned
by subsequent call to readdir(3), and fall into an infinite loop
in the worst case.

This separates the processing to two phase.  Running
for_each_reflog() to find out and collect all refs, and then
iterate over them, calling expire_reflog().  This way, the
program would behave exactly the same way as if all the refs
were given by the user from the command line.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-29 21:48:57 -08:00
Yasushi SHOJI
7720224ceb gitweb: Convert generated contents to utf8 in commitdiff_plain
If the commit message, or commit author contains non-ascii, it must be
converted from Perl internal representation to utf-8, to follow what
got declared in HTTP header.  Use to_utf8() to do the conversion.

This necessarily replaces here-doc with "print" statements.

Signed-off-by: Yasushi SHOJI <yashi@atmark-techno.com>
Acked-by: İsmail Dönmez <ismail@pardus.org.tr>
Acked-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-29 21:23:57 -08:00
Christian Couder
ab989adf6a instaweb: use 'browser.<tool>.path' config option if it's set.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-29 00:49:28 -08:00
Christian Couder
f7ff09d718 Documentation: help: specify supported html browsers.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-29 00:49:27 -08:00
Christian Couder
584627b4a6 Documentation: config: add "browser.<tool>.path".
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-29 00:49:23 -08:00
Johannes Schindelin
752527f513 Add test for rebase -i with commits that do not pass pre-commit
This accompanies c5b09feb78 (Avoid
update hook during git-rebase --interactive) to make sure that
any regression to make Debian's Bug#458782 (git-core: git-rebase
doesn't work when trying to squash changes into commits created
with --no-verify) resurface will be caught.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-28 11:04:00 -08:00
Jeff King
c0d4528119 t9001: add missing && operators
The exit value of some commands was not being used for the
test output.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-28 11:02:59 -08:00
Junio C Hamano
bf5aeb1506 GIT 1.5.4-rc5
Hopefully the last rc before the final...

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-26 22:48:03 -08:00
Johannes Schindelin
c85c79279d pull --rebase: be cleverer with rebased upstream branches
When the upstream branch is tracked, we can detect if that branch
was rebased since it was last fetched.  Teach git to use that
information to rebase from the old remote head onto the new remote head.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-26 18:24:24 -08:00
Steffen Prohaska
e509db990b cvsserver: Fix for histories with multiple roots
Git histories may have multiple roots, which can cause
git merge-base to fail and this caused git cvsserver to die.

This commit teaches git cvsserver to handle a failing git
merge-base gracefully, and modifies the test case to verify this.
All the test cases now use a history with two roots.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>

 git-cvsserver.perl              |    9 ++++++++-
 t/t9400-git-cvsserver-server.sh |   10 +++++++++-
 2 files changed, 17 insertions(+), 2 deletions(-)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-26 17:58:18 -08:00
Steffen Prohaska
7549376587 t9400-git-cvsserver-server: Wrap setup into test case
It is preferable to have the test setup in a test case.  The
setup itself may fail and having it as a test case handles this
situation more gracefully.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-26 17:44:05 -08:00
Mike Hommey
ab8daa1836 Documentation: add a bit about sendemail.to configuration
While there is information about this in the configuration section, it was
missing in the options section.

Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-26 10:59:07 -08:00
Pierre Habouzit
3a9f0f41db parse-options: catch likely typo in presense of aggregated options.
If options are aggregated, and that the whole token is an exact
prefix of a long option that is longer than 2 letters, reject
it.  This is to prevent a common typo:

	$ git commit -amend

to get interpreted as "commit all with message 'end'".

The typo check isn't performed if there is no aggregation,
because the stuck form is the recommended one.  If we have `-o`
being a valid short option that takes an argument, and --option
a long one, then we _MUST_ accept -option as "'o' option with
argument 'ption'", which is our official recommended form.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-26 10:53:31 -08:00
Mike Hommey
923e3ec84a Add a missing dependency on http.h
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-26 10:52:40 -08:00
Miklos Vajna
10eb64f5fd git pull manpage: don't include -n from fetch-options.txt
The -n option stands for --no-summary in git pull

[jes: reworded the description to avoid mentioning 'git-fetch';
 also exclude '-n' conditional on git-pull -- ugly because of
 the missing "else" statement in asciidoc]

Signed-off-by: Miklos Vajna <vmiklos@frugalware.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-25 22:42:36 -08:00
Sam Vilain
2b0d1033a3 git-svn(1): update instructions for resuming a git-svn clone
git-svn expects its references under refs/remotes/*; but these will
not be copied or set by "git clone"; put in this man page the manual
fiddling that is required with current git-svn to get this to work.

Signed-off-by: Sam Vilain <sam.vilain@catalyst.net.nz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-25 22:31:52 -08:00
Jakub Narebski
3cf3237400 autoconf: define NO_SYS_SELECT_H on systems without <sys/select.h>.
Pre-POSIX.1-2001 systems don't have <sys/select.h>, but select(2)
is declared in <sys/time.h>, which git-compat-util.h includes.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-25 22:26:08 -08:00
Robert Schiele
81cc66a526 Makefile: customization for supporting HP-UX
Signed-off-by: Robert Schiele <rschiele@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-24 21:58:53 -08:00
Robert Schiele
2600973f2c pre-POSIX.1-2001 systems do not have <sys/select.h>
POSIX.1-2001 has declaration of select(2) in <sys/select.h>, but
in the previous version of SUS, it was declared in <sys/time.h>
(which is already included in git-compat-util.h).

This introduces NO_SYS_SELECT_H macro in the Makefile to be set
on older systems, to skip inclusion of <sys/select.h> that does
not exist on them.

We could check _POSIX_VERSION with 200112L and do this
automatically, but earlier it was reported that the approach
does not work well on some vintage of HP-UX.  Other systems may
get _POSIX_VERSION itself wrong.  At least for now, this manual
configuration is safer.

Signed-off-by: Robert Schiele <rschiele@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-24 14:01:28 -08:00
Junio C Hamano
cab31fa076 Merge git://repo.or.cz/git-gui
* git://repo.or.cz/git-gui:
  git-gui: Correctly cleanup msgfmt '1 message untranslated' output
  git-gui: Make the statistics of po2msg match those of msgfmt
  git-gui: Fallback to Tcl based po2msg.sh if msgfmt isn't available
  git-gui: Work around random missing scrollbar in revision list
2008-01-23 21:37:12 -08:00
Brandon Casey
5a9dd3998f git-commit: exit non-zero if we fail to commit the index
In certain rare cases, the creation of the commit object
and update of HEAD can succeed, but then installing the
updated index will fail. This is most likely caused by a
full disk or exceeded disk quota. When this happens the
new index file will be removed, and the repository will
be left with the original now-out-of-sync index. The
user can recover with a "git reset HEAD" once the disk
space issue is resolved.

We should detect this failure and offer the user some
helpful guidance.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-23 10:10:11 -08:00
Miklos Vajna
28678b4f2f git-clone -s: document problems with git gc --prune
There is a scenario when using git clone -s and git gc --prune togother is
dangerous. Document this.

Signed-off-by: Miklos Vajna <vmiklos@frugalware.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-23 10:09:54 -08:00
Junio C Hamano
9cb76b8cdc lazy index hashing
This delays the hashing of index names until it becomes necessary for
the first time.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 23:01:13 -08:00
Linus Torvalds
cf558704fb Create pathname-based hash-table lookup into index
This creates a hash index of every single file added to the index.
Right now that hash index isn't actually used for much: I implemented a
"cache_name_exists()" function that uses it to efficiently look up a
filename in the index without having to do the O(logn) binary search,
but quite frankly, that's not why this patch is interesting.

No, the whole and only reason to create the hash of the filenames in the
index is that by modifying the hash function, you can fairly easily do
things like making it always hash equivalent names into the same bucket.

That, in turn, means that suddenly questions like "does this name exist
in the index under an _equivalent_ name?" becomes much much cheaper.

Guiding principles behind this patch:

 - it shouldn't be too costly. In fact, my primary goal here was to
   actually speed up "git commit" with a fully populated kernel tree, by
   being faster at checking whether a file already existed in the index. I
   did succeed, but only barely:

	Best before:
		[torvalds@woody linux]$ time git commit > /dev/null
		real    0m0.255s
		user    0m0.168s
		sys     0m0.088s

	Best after:

		[torvalds@woody linux]$ time ~/git/git commit > /dev/null
		real    0m0.233s
		user    0m0.144s
		sys     0m0.088s

   so some things are actually faster (~8%).

   Caveat: that's really the best case. Other things are invariably going
   to be slightly slower, since we populate that index cache, and quite
   frankly, few things really use it to look things up.

   That said, the cost is really quite small. The worst case is probably
   doing a "git ls-files", which will do very little except puopulate the
   index, and never actually looks anything up in it, just lists it.

	Before:
		[torvalds@woody linux]$ time git ls-files > /dev/null
		real    0m0.016s
		user    0m0.016s
		sys     0m0.000s

	After:
		[torvalds@woody linux]$ time ~/git/git ls-files > /dev/null
		real    0m0.021s
		user    0m0.012s
		sys     0m0.008s

   and while the thing has really gotten relatively much slower, we're
   still talking about something almost unmeasurable (eg 5ms). And that
   really should be pretty much the worst case.

   So we lose 5ms on one "benchmark", but win 22ms on another. Pick your
   poison - this patch has the advantage that it will _likely_ speed up
   the cases that are complex and expensive more than it slows down the
   cases that are already so fast that nobody cares. But if you look at
   relative speedups/slowdowns, it doesn't look so good.

 - It should be simple and clean

   The code may be a bit subtle (the reasons I do hash removal the way I
   do etc), but it re-uses the existing hash.c files, so it really is
   fairly small and straightforward apart from a few odd details.

Now, this patch on its own doesn't really do much, but I think it's worth
looking at, if only because if done correctly, the name hashing really can
make an improvement to the whole issue of "do we have a filename that
looks like this in the index already". And at least it gets real testing
by being used even by default (ie there is a real use-case for it even
without any insane filesystems).

NOTE NOTE NOTE! The current hash is a joke. I'm ashamed of it, I'm just
not ashamed of it enough to really care. I took all the numbers out of my
nether regions - I'm sure it's good enough that it works in practice, but
the whole point was that you can make a really much fancier hash that
hashes characters not directly, but by their upper-case value or something
like that, and thus you get a case-insensitive hash, while still keeping
the name and the index itself totally case sensitive.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 21:46:30 -08:00
Junio C Hamano
6d91da6d3c read-cache.c: introduce is_racy_timestamp() helper
This moves a common boolean expression into a helper function,
and makes the comparison between filesystem timestamp and index
timestamp done in the function in line with the other places.
st.st_mtime should be casted to (unsigned int) when compared to
an index timestamp ce_mtime.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 21:26:40 -08:00
Junio C Hamano
077c48df8a read-cache.c: fix a couple more CE_REMOVE conversion
It is a D/F conflict if you want to add "foo/bar" to the index
when "foo" already exists.  Also it is a conflict if you want to
add a file "foo" when "foo/bar" exists.

An exception is when the existing entry is there only to mark "I
used to be here but I am being removed".  This is needed for
operations such as "git read-tree -m -u" that update the index
and then reflect the result to the work tree --- we need to
remember what to remove somewhere, and we use the index for
that.  In such a case, an existing file "foo" is being removed
and we can create "foo/" directory and hang "bar" underneath it
without any conflict.

We used to use (ce->ce_mode == 0) to mark an entry that is being
removed, but (CE_REMOVE & ce->ce_flags) is used for that purpose
these days.  An earlier commit forgot to convert the logic in
the code that checks D/F conflict condition.

The old code knew that "to be removed" entries cannot be at
higher stage and actively checked that condition, but it was an
unnecessary check.  This patch removes the extra check as well.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 21:24:21 -08:00
Shawn O. Pearce
3b8f19a02c git-gui: Correctly cleanup msgfmt '1 message untranslated' output
In the multiple message case we remove the word "messages" from the
statistics output of msgfmt as it looks cleaner on the tty when you
are watching the build process.  However we failed to strip the word
"message" when only 1 message was found to be untranslated or fuzzy,
as msgfmt does not produce the 's' suffix.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-01-22 23:56:15 -05:00
Shawn O. Pearce
2cd9ad2e71 git-gui: Make the statistics of po2msg match those of msgfmt
The strings we were showing from po2msg didn't exactly match those
of msgfmt's --statistics output so we didn't show quite the same
results when building git-gui's message files.  Now we're closer
to what msgfmt shows (at least for an en_US locale) so the make
output matches.

I noticed that the fuzzy translation count is off by one for the
current po/zh_cn.po file.  Not sure why and I'm not going to try
and debug it at this time as the po2msg is strictly a fallback,
users building from source really should prefer msgfmt.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-01-22 23:52:07 -05:00
Shawn O. Pearce
3470adabad git-gui: Fallback to Tcl based po2msg.sh if msgfmt isn't available
If msgfmt fails with exit code 127 that typically means the program
is not found in the user's PATH and thus cannot be executed by make.
In such a case we can try to fallback to the Tcl based po2msg program
that we distributed with git-gui, as it does a "good enough" job.

We still don't default to po2msg.sh however as it does not perform
a lot of the sanity checks that msgfmt does, and quite a few of
those are too useful to give up.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-01-22 23:44:36 -05:00
Shawn O. Pearce
3ddff72e58 git-gui: Work around random missing scrollbar in revision list
If the horizontal scrollbar isn't currently visible (because it has
not been needed) but we get an update to the scroll port we may find
the scrollbar window exists but the Tcl command doesn't.  Apparently
it is possible for Tk to have partially destroyed the scrollbar by
removing the Tcl procedure name but still leaving the widget name in
the window registry.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-01-22 23:37:15 -05:00
Eric Wong
3b839fd861 git-svn: default to repacking every 1000 commits
This should reduce disk space usage when doing large imports.
We'll be switching to "gc --auto" post-1.5.4 to handle
repacking for us.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 01:45:40 -08:00
Junio C Hamano
4f5f998fbd Clarify that http-push being temporarily disabled with older cURL
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 00:48:29 -08:00
Nicolas Pitre
6fc74703de pack-objects: Fix segfault when object count is less than thread count
When partitioning the work amongst threads, dividing the number of
objects by the number of threads may return 0 when there are less
objects than threads; this will cause the subsequent code to segfault
when accessing list[sub_size-1].  Allow some threads to have
zero objects to work on instead of barfing, while letting others
to have more.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-21 17:24:12 -08:00
Alex Riesen
9288bedafa Make t5710 more strict when creating nested repos
The test 'creating too deep nesting' can fail even when cloning the repos,
but is not its main purpose (it has to prepare nested repos and ensure
the last one is invalid). So split the test into the creation and
invalidity checking parts.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-21 17:24:12 -08:00
Gustaf Hendeby
97394ee430 send-email, fix breakage in combination with --compose
This fixes the subtile bug in git send-email that was introduced into
git send-email with aa54892f5a (send-email:
detect invocation errors earlier), which caused no patches to be sent
out if the --compose flag was used.

Signed-off-by: Gustaf Hendeby <hendeby@isy.liu.se>
Tested-by: Seth Falcon <seth@userprimary.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-21 17:24:12 -08:00
Johannes Schindelin
204ce979a5 Also use unpack_trees() in do_diff_cache()
As in run_diff_index(), we call unpack_trees() with the oneway_diff()
function in do_diff_cache() now.  This makes the function diff_cache()
obsolete.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 13:09:07 -08:00
Linus Torvalds
d1f2d7e8ca Make run_diff_index() use unpack_trees(), not read_tree()
A plain "git commit" would still run lstat() a lot more than necessary,
because wt_status_print() would cause the index to be repeatedly flushed
and re-read by wt_read_cache(), and that would cause the CE_UPTODATE bit
to be lost, resulting in the files in the index being lstat'ed three
times each.

The reason why wt-status.c ended up invalidating and re-reading the
cache multiple times was that it uses "run_diff_index()", which in turn
uses "read_tree()" to populate the index with *both* the old index and
the tree we want to compare against.

So this patch re-writes run_diff_index() to not use read_tree(), but
instead use "unpack_trees()" to diff the index to a tree.  That, in
turn, means that we don't need to modify the index itself, which then
means that we don't need to invalidate it and re-read it!

This, together with the lstat() optimizations, means that "git commit"
on the kernel tree really only needs to lstat() the index entries once.
That noticeably cuts down on the cached timings.

Best time before:

	[torvalds@woody linux]$ time git commit > /dev/null
	real    0m0.399s
	user    0m0.232s
	sys     0m0.164s

Best time after:

	[torvalds@woody linux]$ time git commit > /dev/null
	real    0m0.254s
	user    0m0.140s
	sys     0m0.112s

so it's a noticeable improvement in addition to being a nice conceptual
cleanup (it's really not that pretty that "run_diff_index()" dirties the
index!)

Doing an "strace -c" on it also shows that as it cuts the number of
lstat() calls by two thirds, it goes from being lstat()-limited to being
limited by getdents() (which is the readdir system call):

Before:
	% time     seconds  usecs/call     calls    errors syscall
	------ ----------- ----------- --------- --------- ----------------
	 60.69    0.000704           0     69230        31 lstat
	 23.62    0.000274           0      5522           getdents
	  8.36    0.000097           0      5508      2638 open
	  2.59    0.000030           0      2869           close
	  2.50    0.000029           0       274           write
	  1.47    0.000017           0      2844           fstat

After:
	% time     seconds  usecs/call     calls    errors syscall
	------ ----------- ----------- --------- --------- ----------------
	 45.17    0.000276           0      5522           getdents
	 26.51    0.000162           0     23112        31 lstat
	 19.80    0.000121           0      5503      2638 open
	  4.91    0.000030           0      2864           close
	  1.48    0.000020           0       274           write
	  1.34    0.000018           0      2844           fstat
	...

It passes the test-suite for me, but this is another of one of those
really core functions, and certainly pretty subtle, so..

NOTE! The Linux lstat() system call is really quite cheap when everything
is cached, so the fact that this is quite noticeable on Linux is likely to
mean that it is *much* more noticeable on other operating systems. I bet
you'll see a much bigger performance improvement from this on Windows in
particular.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 13:05:27 -08:00
Junio C Hamano
eadb583134 Avoid running lstat(2) on the same cache entry.
Aside from the lstat(2) done for work tree files, there are
quite many lstat(2) calls in refname dwimming codepath.  This
patch is not about reducing them.

 * It adds a new ce_flag, CE_UPTODATE, that is meant to mark the
   cache entries that record a regular file blob that is up to
   date in the work tree.  If somebody later walks the index and
   wants to see if the work tree has changes, they do not have
   to be checked with lstat(2) again.

 * fill_stat_cache_info() marks the cache entry it just added
   with CE_UPTODATE.  This has the effect of marking the paths
   we write out of the index and lstat(2) immediately as "no
   need to lstat -- we know it is up-to-date", from quite a lot
   fo callers:

    - git-apply --index
    - git-update-index
    - git-checkout-index
    - git-add (uses add_file_to_index())
    - git-commit (ditto)
    - git-mv (ditto)

 * refresh_cache_ent() also marks the cache entry that are clean
   with CE_UPTODATE.

 * write_index is changed not to write CE_UPTODATE out to the
   index file, because CE_UPTODATE is meant to be transient only
   in core.  For the same reason, CE_UPDATE is not written to
   prevent an accident from happening.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 12:44:31 -08:00
Junio C Hamano
7fec10b7f4 index: be careful when handling long names
We currently use lower 12-bit (masked with CE_NAMEMASK) in the
ce_flags field to store the length of the name in cache_entry,
without checking the length parameter given to
create_ce_flags().  This can make us store incorrect length.

Currently we are mostly protected by the fact that many
codepaths first copy the path in a variable of size PATH_MAX,
which typically is 4096 that happens to match the limit, but
that feels like a bug waiting to happen.  Besides, that would
not allow us to shorten the width of CE_NAMEMASK to use the bits
for new flags.

This redefines the meaning of the name length stored in the
cache_entry.  A name that does not fit is represented by storing
CE_NAMEMASK in the field, and the actual length needs to be
computed by actually counting the bytes in the name[] field.
This way, only the unusually long paths need to suffer.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 12:44:31 -08:00
Linus Torvalds
7a51ed66f6 Make on-disk index representation separate from in-core one
This converts the index explicitly on read and write to its on-disk
format, allowing the in-core format to contain more flags, and be
simpler.

In particular, the in-core format is now host-endian (as opposed to the
on-disk one that is network endian in order to be able to be shared
across machines) and as a result we can dispense with all the
htonl/ntohl on accesses to the cache_entry fields.

This will make it easier to make use of various temporary flags that do
not exist in the on-disk format.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 12:44:31 -08:00