27599 Commits

Author SHA1 Message Date
Jeff King
90108a2441 upload-pack: avoid parsing tag destinations
When upload-pack advertises refs, it dereferences any tags
it sees, and shows the resulting sha1 to the client. It does
this by calling deref_tag. That function must load and parse
each tag object to find the sha1 of the tagged object.
However, it also ends up parsing the tagged object itself,
which is not strictly necessary for upload-pack's use.

Each tag produces two object loads (assuming it is not a
recursive tag), when it could get away with only a single
one. Dropping the second load halves the effort we spend.

The downside is that we are no longer verifying the
resulting object by loading it. In particular:

  1. We never cross-check the "type" field given in the tag
     object with the type of the pointed-to object.  If the
     tag says it points to a tag but doesn't, then we will
     keep peeling and realize the error.  If the tag says it
     points to a non-tag but actually points to a tag, we
     will stop peeling and just advertise the pointed-to
     tag.

  2. If we are missing the pointed-to object, we will not
     realize (because we never even look it up in the object
     db).

However, both of these are errors in the object database,
and both will be detected if a client actually requests the
broken objects in question. So we are simply pushing the
verification away from the advertising stage, and down to
the actual fetching stage.
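
As a rough illustration of the trade-off described above, here is a
minimal sketch of peeling that loads only tag objects and trusts each
tag's recorded "type" field. The names toy_obj, toy_load and toy_peel
are hypothetical stand-ins, not git's real structures or functions, and
the object store is just an array indexed by a small integer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; not git's real types. */
enum toy_type { TOY_COMMIT, TOY_TAG };

struct toy_obj {
	enum toy_type type;        /* real type of this object */
	enum toy_type tagged_type; /* tags only: the "type" field stored in the tag */
	int tagged;                /* tags only: index of the tagged object */
};

static int loads; /* counts simulated object loads */

static const struct toy_obj *toy_load(const struct toy_obj *store, int id)
{
	loads++;
	return &store[id];
}

/*
 * Peel a tag chain starting at "id". Only tag objects are loaded; the
 * final target is never loaded, so a missing or mistyped target goes
 * unnoticed here. Returns the index of the peeled object, or -1 if an
 * object that was claimed to be a tag turns out not to be one.
 * (Cycles are not handled in this sketch.)
 */
int toy_peel(const struct toy_obj *store, int id)
{
	assert(store != NULL);
	for (;;) {
		const struct toy_obj *tag = toy_load(store, id);
		if (tag->type != TOY_TAG)
			return -1; /* a tag claimed this was a tag, but it is not */
		id = tag->tagged;
		if (tag->tagged_type != TOY_TAG)
			return id; /* trust the tag; do not load the target */
	}
}
```

In this model a non-recursive tag costs one load instead of two, and a
tag whose "type" field is wrong is only caught when it claims to point
at a tag.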

On my test repo with 120K refs, this drops the time to
advertise the refs from ~3.2s to ~2.0s.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-06 13:28:57 -08:00
Jeff King
926f1dd954 upload-pack: avoid parsing objects during ref advertisement
When we advertise a ref, the first thing we do is parse the
pointed-to object. This gives us two things:

  1. a "struct object" we can use to store flags

  2. the type of the object, so we know whether we need to
     dereference it as a tag

Instead, we can just use lookup_unknown_object to get an
object struct, and then fill in just the type field using
sha1_object_info (which, in the case of packed files, can
find the information without actually inflating the object
data).
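
The idea can be sketched with a toy model in which each entry's type is
available from a header, while reading the body stands in for inflating
the object data. All names here (toy_entry, toy_object_info,
toy_inflate, toy_advertise) are hypothetical and only illustrate the
"ask for the type first" pattern; they are not git's API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; not git's real types. */
enum toy_type { TOY_COMMIT, TOY_TAG };

struct toy_entry {
	enum toy_type type; /* known from the entry header alone */
	const char *body;   /* reading this stands in for inflating */
};

static int inflations; /* counts simulated inflate operations */

/* Header-only query: reports the type without touching the body. */
static enum toy_type toy_object_info(const struct toy_entry *e)
{
	return e->type;
}

static const char *toy_inflate(const struct toy_entry *e)
{
	inflations++;
	return e->body;
}

/*
 * Walk the refs to advertise. Only tags need their contents (to be
 * dereferenced); everything else is advertised from the type alone.
 * Returns the number of refs that had to be inflated for peeling.
 */
int toy_advertise(const struct toy_entry *refs, size_t n)
{
	int peeled = 0;
	size_t i;

	assert(refs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (toy_object_info(&refs[i]) != TOY_TAG)
			continue; /* non-tags: no need to read the body */
		if (toy_inflate(&refs[i]))
			peeled++;
	}
	return peeled;
}
```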

This can save time if you have a large number of refs, and
the client isn't actually going to request those refs (e.g.,
because most of them are already up-to-date).

The downside is that we are no longer verifying objects that
we advertise by fully parsing them (however, we do still
know we actually have them, because sha1_object_info must
find them to get the type). While we might fail to detect a
corrupt object here, if the client actually fetches the
object, we will parse (and verify) it then.

On a repository with 120K refs, the advertisement portion of
upload-pack goes from ~3.4s to ~3.2s (the failure to speed up
more is largely due to the fact that most of these refs are
tags, which need to be dereferenced to find the tag
destination anyway).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-06 13:28:55 -08:00
Jeff King
ccdc6037fe parse_object: try internal cache before reading object db
When parse_object is called, we do the following:

  1. read the object data into a buffer via read_sha1_file

  2. call parse_object_buffer, which then:

     a. calls the appropriate lookup_{commit,tree,blob,tag}
        to either create a new "struct object", or to find
        an existing one. We know the appropriate type from
        the lookup in step 1.

     b. calls the appropriate parse_{commit,tree,blob,tag}
        to parse the buffer for the new (or existing) object

In step 2b, all of the called functions are no-ops for
object "X" if "X->object.parsed" is set. I.e., when we have
already parsed an object, we end up going to a lot of work
just to find out at a low level that there is nothing left
for us to do (and we throw away the data from read_sha1_file
unread).

We can optimize this by moving the check for "do we have an
in-memory object" from step 2a to before the expensive call
to read_sha1_file in step 1.

This might seem circular, since step 2a uses the type
information determined in step 1 to call the appropriate
lookup function. However, we can notice that all of the
lookup_* functions are backed by lookup_object. In other
words, all of the objects are kept in a master hash table,
and we don't actually need the type to do the "do we have
it" part of the lookup, only to do the "and create it if it
doesn't exist" part.

This can save time whenever we call parse_object on the same
sha1 twice in a single program. Some code paths already
perform this optimization manually, with either:

  if (!obj->parsed)
	  obj = parse_object(obj->sha1);

if you already have a "struct object", or:

  struct object *obj = lookup_unknown_object(sha1);
  if (!obj || !obj->parsed)
	  obj = parse_object(sha1);

if you don't.  This patch moves the optimization into
parse_object itself.
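
A minimal sketch of the reordering, using a toy object table rather
than git's real code: toy_object, toy_lookup, toy_create and
toy_parse_object are hypothetical names, an unsigned integer stands in
for the sha1, and a counter stands in for the expensive read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these are git's real types. */
enum { TABLE_SIZE = 64 };

struct toy_object {
	unsigned id; /* stands in for the sha1 */
	int used;
	int parsed;
};

static struct toy_object table[TABLE_SIZE];
static int disk_reads; /* counts simulated expensive reads */

/* Type-agnostic lookup: find an existing entry, never create one. */
static struct toy_object *toy_lookup(unsigned id)
{
	unsigned i;

	for (i = 0; i < TABLE_SIZE; i++) {
		unsigned slot = (id + i) % TABLE_SIZE;
		if (!table[slot].used)
			return NULL;
		if (table[slot].id == id)
			return &table[slot];
	}
	return NULL;
}

/* Insert a new, unparsed entry; returns NULL if the table is full. */
static struct toy_object *toy_create(unsigned id)
{
	unsigned i;

	for (i = 0; i < TABLE_SIZE; i++) {
		unsigned slot = (id + i) % TABLE_SIZE;
		if (!table[slot].used) {
			table[slot].used = 1;
			table[slot].id = id;
			table[slot].parsed = 0;
			return &table[slot];
		}
	}
	return NULL;
}

struct toy_object *toy_parse_object(unsigned id)
{
	struct toy_object *obj = toy_lookup(id);

	if (obj && obj->parsed)
		return obj; /* already parsed: skip the expensive read */

	disk_reads++; /* simulated read of the object data */
	if (!obj)
		obj = toy_create(id);
	assert(!obj || obj->id == id);
	if (obj)
		obj->parsed = 1;
	return obj;
}
```

Calling toy_parse_object twice with the same id performs only one
simulated read; the second call returns the cached entry.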

Most git operations won't notice any impact. Either they
don't parse a lot of duplicate sha1s, or the calling code
takes special care not to re-parse objects. I timed two
code paths that do benefit (there may be more, but these two
were immediately obvious and easy to time).

The first is fast-export, which calls parse_object on each
object it outputs, like this:

  object = parse_object(sha1);
  if (!object)
	  die(...);
  if (object->flags & SHOWN)
	  return;

which means that just to realize we have already shown an
object, we will read the whole object from disk!

With this patch, my best-of-five time for "fast-export --all" on
git.git dropped from 26.3s to 21.3s.

The second case is upload-pack, which will call parse_object
for each advertised ref (because it needs to peel tags to
show "^{}" entries). This doesn't matter for most
repositories, because they don't have a lot of refs pointing
to the same objects. However, if you have a big alternates
repository with a shared object db for a number of child
repositories, then the alternates repository will have
duplicated refs representing each of its children.

For example, GitHub's alternates repository for git.git has
~120,000 refs, of which only ~3200 are unique. The time for
upload-pack to print its list of advertised refs dropped
from 3.4s to 0.76s.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-05 13:30:54 -08:00
Junio C Hamano
247f9d23da Merge branch 'maint'
* maint:
  t5550: repack everything into one file
  Catch invalid --depth option passed to clone or fetch
2012-01-04 11:21:42 -08:00
Clemens Buchacher
1327d83954 t5550: repack everything into one file
Subsequently we assume that there is only one pack. Currently this is
true only by accident. Pass '-a -d' to repack to guarantee that the
assumption holds.

The prune-packed command is now redundant since repack -d already calls
it.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-04 10:04:59 -08:00
Junio C Hamano
6ea9385426 Merge branch 'nd/maint-parse-depth' into maint
* nd/maint-parse-depth:
  Catch invalid --depth option passed to clone or fetch
2012-01-04 09:43:26 -08:00
Nguyễn Thái Ngọc Duy
e7622ce8c4 Catch invalid --depth option passed to clone or fetch
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-04 09:39:36 -08:00
Junio C Hamano
4570aeb0d8 Merge branch 'pw/p4-docs-and-tests'
* pw/p4-docs-and-tests:
  git-p4: document and test submit options
  git-p4: test and document --use-client-spec
  git-p4: test --keep-path
  git-p4: test --max-changes
  git-p4: document and test --import-local
  git-p4: honor --changesfile option and test
  git-p4: document and test clone --branch
  git-p4: test cloning with two dirs, clarify doc
  git-p4: clone does not use --git-dir
  git-p4: introduce asciidoc documentation
  rename git-p4 tests
2012-01-03 14:09:28 -08:00
Junio C Hamano
228c341835 Merge branch 'maint'
* maint:
  docs: describe behavior of relative submodule URLs
  fix hang in git fetch if pointed at a 0 length bundle
  Documentation: read-tree --prefix works with existing subtrees
  Add MYMETA.json to perl/.gitignore
2012-01-03 13:48:00 -08:00
Junio C Hamano
bc0fe84b06 Merge branch 'maint-1.7.7' into maint
* maint-1.7.7:
  docs: describe behavior of relative submodule URLs
  Documentation: read-tree --prefix works with existing subtrees
  Add MYMETA.json to perl/.gitignore
2012-01-03 13:47:46 -08:00
Junio C Hamano
c07aa5b218 Merge branch 'maint-1.7.6' into maint-1.7.7
* maint-1.7.6:
  Documentation: read-tree --prefix works with existing subtrees
  Add MYMETA.json to perl/.gitignore
2012-01-03 13:47:15 -08:00
Jens Lehmann
9e6ed475e7 docs: describe behavior of relative submodule URLs
Ever since relative submodule URLs were introduced in f31a522a2d, they
have not conformed to the rules for resolving relative URIs but rather
to those for relative directories.

Document that behavior.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-03 12:47:58 -08:00
Brian Harring
54440e154f fix hang in git fetch if pointed at a 0 length bundle
If git-repo is interrupted at exactly the wrong time, it generates
zero-length bundles: literal empty files.  git-repo is wrong here, but
git fetch shouldn't effectively spin in a loop when pointed at a
zero-length bundle.

Signed-off-by: Brian Harring <ferringb@chromium.org>
Helped-by: Johannes Sixt
Helped-by: Nguyen Thai Ngoc Duy
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-03 12:13:28 -08:00
Clemens Buchacher
5c951ef47b Documentation: read-tree --prefix works with existing subtrees
Since 34110cd4 (Make 'unpack_trees()' have a separate source and
destination index) it is no longer true that a subdirectory with
the same prefix must not exist.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-01 01:18:53 -08:00
Jack Nagel
0eddcbf161 Add MYMETA.json to perl/.gitignore
ExtUtils::MakeMaker generates MYMETA.json in addition to MYMETA.yml
since version 6.57_07. As the name suggests, it is just meta information
about the build and is cleaned up with 'make clean', so it should be
ignored.

Signed-off-by: Jack Nagel <jacknagel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-29 13:08:47 -08:00
Junio C Hamano
17b4e93d5b Update draft release notes to 1.7.9
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-28 12:07:22 -08:00
Junio C Hamano
48de6569eb Sync with 1.7.8.2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-28 12:04:25 -08:00
Junio C Hamano
f3f778df69 Git 1.7.8.2
Contains accumulated fixes since 1.7.8 that have been merged to the
'master' branch in preparation for the 1.7.9 release.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-28 12:03:24 -08:00
Junio C Hamano
9a8e485430 Merge branch 'jv/maint-config-set' into maint
* jv/maint-config-set:
  Fix an incorrect reference to --set-all.
2011-12-28 12:03:19 -08:00
Junio C Hamano
0d57085943 Merge branch 'jk/follow-rename-score' into maint
* jk/follow-rename-score:
  use custom rename score during --follow
2011-12-28 11:49:37 -08:00
Junio C Hamano
9b0b0b4f45 Merge branch 'jc/checkout-m-twoway' into maint
* jc/checkout-m-twoway:
  t/t2023-checkout-m.sh: fix use of test_must_fail
  checkout_merged(): squelch false warning from some gcc
  Test 'checkout -m -- path'
  checkout -m: no need to insist on having all 3 stages
2011-12-28 11:44:54 -08:00
Junio C Hamano
00754b20f9 Merge branch 'tr/doc-sh-setup' into maint
* tr/doc-sh-setup:
  git-sh-setup: make require_clean_work_tree part of the interface
2011-12-28 11:42:51 -08:00
Junio C Hamano
b42e81afe2 Merge branch 'jk/maint-strbuf-missing-init' into maint
* jk/maint-strbuf-missing-init:
  commit, merge: initialize static strbuf
2011-12-28 11:42:46 -08:00
Junio C Hamano
4a242d6cb7 Merge branch 'jk/maint-push-v-is-verbose' into maint
* jk/maint-push-v-is-verbose:
  make "git push -v" actually verbose
2011-12-28 11:42:42 -08:00
Junio C Hamano
b5c12797b4 Merge branch 'jk/http-push-to-empty' into maint
* jk/http-push-to-empty:
  remote-curl: don't pass back fake refs

Conflicts:
	remote-curl.c
2011-12-28 11:42:37 -08:00
Junio C Hamano
81eaa0655f Merge branch 'jk/doc-fsck' into maint
* jk/doc-fsck:
  docs: brush up obsolete bits of git-fsck manpage
2011-12-28 11:42:33 -08:00
Junio C Hamano
23838b8a15 Merge branch 'jc/maint-lf-to-crlf-keep-crlf' into maint
* jc/maint-lf-to-crlf-keep-crlf:
  lf_to_crlf_filter(): resurrect CRLF->CRLF hack
2011-12-28 11:42:27 -08:00
Junio C Hamano
e8f6b51a6b Merge branch 'ef/setenv-putenv' into maint
* ef/setenv-putenv:
  compat/setenv.c: error if name contains '='
  compat/setenv.c: update errno when erroring out
2011-12-28 11:42:24 -08:00
Junio C Hamano
3c06ab69b1 Merge branch 'jc/advice-doc' into maint
* jc/advice-doc:
  advice: Document that they all default to true
2011-12-28 11:32:39 -08:00
Junio C Hamano
770dd00ebd Merge branch 'jn/maint-sequencer-fixes' into maint
* jn/maint-sequencer-fixes:
  revert: stop creating and removing sequencer-old directory
  Revert "reset: Make reset remove the sequencer state"
  revert: do not remove state until sequence is finished
  revert: allow single-pick in the middle of cherry-pick sequence
  revert: pass around rev-list args in already-parsed form
  revert: allow cherry-pick --continue to commit before resuming
  revert: give --continue handling its own function
2011-12-28 11:32:39 -08:00
Junio C Hamano
7fc1495b18 Merge branch 'jk/maint-snprintf-va-copy' into maint
* jk/maint-snprintf-va-copy:
  compat/snprintf: don't look at va_list twice
2011-12-28 11:32:38 -08:00
Junio C Hamano
f1c12e1b4a Merge branch 'jk/maint-push-over-dav' into maint
* jk/maint-push-over-dav:
  http-push: enable "proactive auth"
  t5540: test DAV push with authentication
2011-12-28 11:32:37 -08:00
Junio C Hamano
699eb54876 Merge branch 'jk/maint-mv' into maint
* jk/maint-mv:
  mv: be quiet about overwriting
  mv: improve overwrite warning
  mv: make non-directory destination error more clear
  mv: honor --verbose flag
  docs: mention "-k" for both forms of "git mv"
2011-12-28 11:32:36 -08:00
Junio C Hamano
7a5638a159 Merge branch 'jk/fetch-no-tail-match-refs' into maint
* jk/fetch-no-tail-match-refs:
  connect.c: drop path_match function
  fetch-pack: match refs exactly
  t5500: give fully-qualified refs to fetch-pack
  drop "match" parameter from get_remote_heads
2011-12-28 11:32:36 -08:00
Junio C Hamano
2cb1ff9ac3 Merge branch 'ew/keepalive' into maint
* ew/keepalive:
  enable SO_KEEPALIVE for connected TCP sockets
2011-12-28 11:32:36 -08:00
Junio C Hamano
474294963e Merge branch 'ci/stripspace-docs' into maint
* ci/stripspace-docs:
  Update documentation for stripspace
2011-12-28 11:32:35 -08:00
Junio C Hamano
9ddb7ead52 Merge branch 'jh/fast-import-notes' into maint
* jh/fast-import-notes:
  fast-import: Fix incorrect fanout level when modifying existing notes refs
  t9301: Add 2nd testcase exposing bugs in fast-import's notes fanout handling
  t9301: Fix testcase covering up a bug in fast-import's notes fanout handling
2011-12-28 11:32:35 -08:00
Junio C Hamano
d9d73b37f3 Merge branch 'aw/rebase-i-stop-on-failure-to-amend' into maint
* aw/rebase-i-stop-on-failure-to-amend:
  rebase -i: interrupt rebase when "commit --amend" failed during "reword"
2011-12-28 11:32:34 -08:00
Junio C Hamano
4df989f953 Merge branch 'tj/maint-imap-send-remove-unused' into maint
* tj/maint-imap-send-remove-unused:
  imap-send: Remove unused 'use_namespace' variable
2011-12-28 11:32:34 -08:00
Junio C Hamano
79587741cb Merge branch 'jn/branch-move-to-self' into maint
* jn/branch-move-to-self:
  Allow checkout -B <current-branch> to update the current branch
  branch: allow a no-op "branch -M <current-branch> HEAD"
2011-12-28 11:32:33 -08:00
Junio C Hamano
e39888ba21 Merge branch 'na/strtoimax' into maint
* na/strtoimax:
  Support sizes >=2G in various config options accepting 'g' sizes.
  Compatibility: declare strtoimax() under NO_STRTOUMAX
  Add strtoimax() compatibility function.
2011-12-28 11:32:33 -08:00
Junio C Hamano
786a9611f4 Merge branch 'jk/refresh-porcelain-output' into maint
* jk/refresh-porcelain-output:
  refresh_index: make porcelain output more specific
  refresh_index: rename format variables
  read-cache: let refresh_cache_ent pass up changed flags
2011-12-28 11:32:32 -08:00
Jelmer Vernooij
67e223edc4 Fix an incorrect reference to --set-all.
Signed-off-by: Jelmer Vernooij <jelmer@samba.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-27 11:14:18 -08:00
Pete Wyckoff
28755dbaa5 git-p4: document and test submit options
Clarify there is a -M option, but no -C.  These are both
configurable through variables.

Explain that the allowSubmit variable takes a comma-separated
list of branch names.

Detect earlier when an invalid branch name is given as an
argument to "git p4 clone".

Test option --origin, variable allowSubmit, and explicit master
branch name.

Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-27 10:19:31 -08:00
Pete Wyckoff
09fca77b9e git-p4: test and document --use-client-spec
The depot path is required, even with this option.  Make sure
git-p4 fails and exits with non-zero.

Contents in the specified depot path will be rearranged according
to the client spec.  Test this and add a note in the docs.

Leave an XXX suggesting that this is somewhat confusing behavior
that might be good to fix later.

Function stripRepoPath() looks at self.useClientSpec.  Make sure
this is set both for command-line option --use-client-spec and
for configuration variable git-p4.useClientSpec.  Test this.

Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-27 10:19:31 -08:00
Pete Wyckoff
ae3f41f20a git-p4: test --keep-path
Make sure it leaves the path, below //depot, in git.

Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-27 10:19:31 -08:00
Pete Wyckoff
7fbe1ce9e2 git-p4: test --max-changes
Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-27 10:19:30 -08:00
Pete Wyckoff
5a92a6ce90 git-p4: document and test --import-local
Explain that it is needed on future syncs to find p4 branches
in refs/heads.  Test this behavior.

Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-27 10:19:30 -08:00
Pete Wyckoff
58c8bc7c1a git-p4: honor --changesfile option and test
When an explicit list of changes is given, it makes no sense to
use @all or @3,5 or any of the other p4 revision specifiers.
Make the code notice when this happens, instead of just ignoring
--changesfile.  Test it.

Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-27 10:19:30 -08:00
Pete Wyckoff
1471c6b155 git-p4: document and test clone --branch
Clone with --branch will not check out HEAD, unless the branch
happens to be called the default refs/remotes/p4/master.  The
--branch option is most useful with sync; give an example of
that.

Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-27 10:19:30 -08:00