Use the ARRAY_SIZE macro to get the number
of elements in an array.
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improvements to our hash table to get it to meet the needs of the
msysgit fscache project, with some nice performance improvements.
* kb/fast-hashmap:
name-hash: retire unused index_name_exists()
hashmap.h: use 'unsigned int' for hash-codes everywhere
test-hashmap.c: drop unnecessary #includes
.gitignore: test-hashmap is a generated file
read-cache.c: fix memory leaks caused by removed cache entries
builtin/update-index.c: cleanup update_one
fix 'git update-index --verbose --again' output
remove old hash.[ch] implementation
name-hash.c: remove cache entries instead of marking them CE_UNHASHED
name-hash.c: use new hash map implementation for cache entries
name-hash.c: remove unreferenced directory entries
name-hash.c: use new hash map implementation for directories
diffcore-rename.c: use new hash map implementation
diffcore-rename.c: simplify finding exact renames
diffcore-rename.c: move code around to prepare for the next patch
buitin/describe.c: use new hash map implementation
add a hashtable implementation that supports O(1) removal
submodule: don't access the .gitmodules cache entry after removing it
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 5fee995244 introduced the stage_updated_gitmodules() function to
add submodule configuration updates to the index. It assumed that even
after calling remove_cache_entry_at() the same cache entry would still be
valid. This was true in the old days, as cache entries could never be
freed, but that is not so sure in the present as there is ongoing work to
free removed cache entries, which makes this code segfault.
Fix that by calling add_file_to_cache() instead of open coding it. Also
remove the "could not find .gitmodules in index" warning, as that won't
happen in regular use cases (and by then just silently adding it to the
index we do the right thing).
Thanks-to: Karsten Blees <karsten.blees@gmail.com>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git mv A B" when moving a submodule A does "the right thing",
inclusing relocating its working tree and adjusting the paths in
the .gitmodules file.
* jl/submodule-mv: (53 commits)
rm: delete .gitmodules entry of submodules removed from the work tree
mv: update the path entry in .gitmodules for moved submodules
submodule.c: add .gitmodules staging helper functions
mv: move submodules using a gitfile
mv: move submodules together with their work trees
rm: do not set a variable twice without intermediate reading.
t6131 - skip tests if on case-insensitive file system
parse_pathspec: accept :(icase)path syntax
pathspec: support :(glob) syntax
pathspec: make --literal-pathspecs disable pathspec magic
pathspec: support :(literal) syntax for noglob pathspec
kill limit_pathspec_to_literal() as it's only used by parse_pathspec()
parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN
parse_pathspec: make sure the prefix part is wildcard-free
rename field "raw" to "_raw" in struct pathspec
tree-diff: remove the use of pathspec's raw[] in follow-rename codepath
remove match_pathspec() in favor of match_pathspec_depth()
remove init_pathspec() in favor of parse_pathspec()
remove diff_tree_{setup,release}_paths
convert common_prefix() to use struct pathspec
...
Git fails due to a segmentation fault if a submodule path is empty.
Here is an example .gitmodules that will cause a segmentation fault:
[submodule "foo-module"]
path
url = http://host/repo.git
$ git status
Segmentation fault (core dumped)
This is because the parsing of "submodule.*.path" is not prepared to
see a value-less "true" and assumes that the value is always
non-NULL (parsing of "ignore" has the same problem).
Fix it by checking the NULL-ness of value and complain with
config_error_nonbool().
Signed-off-by: Jharrod LaFon <jlafon@eyesopen.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently using "git rm" on a submodule removes the submodule's work tree
from that of the superproject and the gitlink from the index. But the
submodule's section in .gitmodules is left untouched, which is a leftover
of the now removed submodule and might irritate users (as opposed to the
setting in .git/config, this must stay as a reminder that the user showed
interest in this submodule so it will be repopulated later when an older
commit is checked out).
Let "git rm" help the user by not only removing the submodule from the
work tree but by also removing the "submodule.<submodule name>" section
from the .gitmodules file and stage both. This doesn't happen when the
"--cached" option is used, as it would modify the work tree. This also
silently does nothing when no .gitmodules file is found and only issues a
warning when it doesn't have a section for this submodule. This is because
the user might just use plain gitlinks without the .gitmodules file or has
already removed the section by hand before issuing the "git rm" command
(in which case the warning reminds him that rm would have done that for
him). Only when .gitmodules is found and contains merge conflicts the rm
command will fail and tell the user to resolve the conflict before trying
again.
Also extend the man page to inform the user about this new feature. While
at it promote the submodule sub-section to a chapter as it made not much
sense under "REMOVING FILES THAT HAVE DISAPPEARED FROM THE FILESYSTEM".
In t7610 three uses of "git rm submod" had to be replaced with "git rm
--cached submod" because that test expects .gitmodules and the work tree
to stay untouched. Also in t7400 the tests for the remaining settings in
the .gitmodules file had to be changed to assert that these settings are
missing.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently using "git mv" on a submodule moves the submodule's work tree in
that of the superproject. But the submodule's path setting in .gitmodules
is left untouched, which is now inconsistent with the work tree and makes
git commands that rely on the proper path -> name mapping (like status and
diff) behave strangely.
Let "git mv" help here by not only moving the submodule's work tree but
also updating the "submodule.<submodule name>.path" setting from the
.gitmodules file and stage both. This doesn't happen when no .gitmodules
file is found and only issues a warning when it doesn't have a section for
this submodule. This is because the user might just use plain gitlinks
without the .gitmodules file or has already updated the path setting by
hand before issuing the "git mv" command (in which case the warning
reminds him that mv would have done that for him). Only when .gitmodules
is found and contains merge conflicts the mv command will fail and tell
the user to resolve the conflict before trying again.
Also extend the man page to inform the user about this new feature.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the new is_staging_gitmodules_ok() and stage_updated_gitmodules()
functions to submodule.c. The first makes it possible for call sites to
see if the .gitmodules file did contain any unstaged modifications they
would accidentally stage in addition to those they intend to stage
themselves. The second function stages all modifications to the
.gitmodules file, both will be used by subsequent patches for the mv
and rm commands.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When moving a submodule which uses a gitfile to point to the git directory
stored in .git/modules/<name> of the superproject two changes must be made
to make the submodule work: the .git file and the core.worktree setting
must be adjusted to point from work tree to git directory and back.
Achieve that by remembering which submodule uses a gitfile by storing the
result of read_gitfile() of each submodule. If that is not NULL the new
function connect_work_tree_and_git_dir() is called after renaming the
submodule's work tree which updates the two settings to the new values.
Extend the man page to inform the user about that feature (and while at it
change the description to not talk about a script anymore, as mv is a
builtin for quite some time now).
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"log --format=" did not honor i18n.logoutputencoding configuration
and this attempts to fix it.
* as/log-output-encoding-in-user-format:
t4205 (log-pretty-formats): avoid using `sed`
t6006 (rev-list-format): add tests for "%b" and "%s" for the case i18n.commitEncoding is not set
t4205, t6006, t7102: make functions better readable
t4205 (log-pretty-formats): revert back single quotes
t4041, t4205, t6006, t7102: use iso8859-1 rather than iso-8859-1
t4205: replace .\+ with ..* in sed commands
pretty: --format output should honor logOutputEncoding
pretty: Add failing tests: --format output should honor logOutputEncoding
t4205 (log-pretty-formats): don't hardcode SHA-1 in expected outputs
t7102 (reset): don't hardcode SHA-1 in expected outputs
t6006 (rev-list-format): don't hardcode SHA-1 in expected outputs
I attempted to make index_state->cache[] a "const struct cache_entry **"
to find out how existing entries in index are modified and where. The
question I have is what do we do if we really need to keep track of on-disk
changes in the index. The result is
- diff-lib.c: setting CE_UPTODATE
- name-hash.c: setting CE_HASHED
- preload-index.c, read-cache.c, unpack-trees.c and
builtin/update-index: obvious
- entry.c: write_entry() may refresh the checked out entry via
fill_stat_cache_info(). This causes "non-const struct cache_entry
*" in builtin/apply.c, builtin/checkout-index.c and
builtin/checkout.c
- builtin/ls-files.c: --with-tree changes stagemask and may set
CE_UPDATE
Of these, write_entry() and its call sites are probably most
interesting because it modifies on-disk info. But this is stat info
and can be retrieved via refresh, at least for porcelain
commands. Other just uses ce_flags for local purposes.
So, keeping track of "dirty" entries is just a matter of setting a
flag in index modification functions exposed by read-cache.c. Except
unpack-trees, the rest of the code base does not do anything funny
behind read-cache's back.
The actual patch is less valueable than the summary above. But if
anyone wants to re-identify the above sites. Applying this patch, then
this:
diff --git a/cache.h b/cache.h
index 430d021..1692891 100644
--- a/cache.h
+++ b/cache.h
@@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode)
#define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)
struct index_state {
- struct cache_entry **cache;
+ const struct cache_entry **cache;
unsigned int version;
unsigned int cache_nr, cache_alloc, cache_changed;
struct string_list *resolve_undo;
will help quickly identify them without bogus warnings.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One can set an alias
$ git config [--global] alias.lg "log --graph --pretty=format:'%Cred%h%Creset
-%C(yellow)%d%Creset %s %Cgreen(%cd) %C(bold blue)<%an>%Creset'
--abbrev-commit --date=local"
to see the log as a pretty tree (like *gitk* but in a terminal).
However, log messages written in an encoding i18n.commitEncoding which differs
from terminal encoding are shown corrupted even when i18n.logOutputEncoding
and terminal encoding are the same (e.g. log messages committed on a Cygwin box
with Windows-1251 encoding seen on a Linux box with a UTF-8 encoding and vice versa).
To simplify an example we can say the following two commands are expected
to give the same output to a terminal:
$ git log --oneline --no-color
$ git log --pretty=format:'%h %s'
However, the former pays attention to i18n.logOutputEncoding
configuration, while the latter does not when it formats "%s".
The same corruption is true for
$ git diff --submodule=log
and
$ git rev-list --pretty=format:%s HEAD
and
$ git reset --hard
This patch makes pretty --format honor logOutputEncoding when it formats
log message.
Signed-off-by: Alexey Shumkin <Alex.Crezoff@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Define memory ownership and lifetime rules for what for-each-ref
feeds to its callbacks (in short, "you do not own it, so make a
copy if you want to keep it").
* mh/reflife: (25 commits)
refs: document the lifetime of the args passed to each_ref_fn
register_ref(): make a copy of the bad reference SHA-1
exclude_existing(): set existing_refs.strdup_strings
string_list_add_refs_by_glob(): add a comment about memory management
string_list_add_one_ref(): rename first parameter to "refname"
show_head_ref(): rename first parameter to "refname"
show_head_ref(): do not shadow name of argument
add_existing(): do not retain a reference to sha1
do_fetch(): clean up existing_refs before exiting
do_fetch(): reduce scope of peer_item
object_array_entry: fix memory handling of the name field
find_first_merges(): remove unnecessary code
find_first_merges(): initialize merges variable using initializer
fsck: don't put a void*-shaped peg in a char*-shaped hole
object_array_remove_duplicates(): rewrite to reduce copying
revision: use object_array_filter() in implementation of gc_boundary()
object_array: add function object_array_filter()
revision: split some overly-long lines
cmd_diff(): make it obvious which cases are exclusive of each other
cmd_diff(): rename local variable "list" -> "entry"
...
read_cache already performs the same check and returns immediately if
the cache has already been loaded.
Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
No names are ever set for the object_array_entries in merges, so there
is no need to pretend to copy them to the result array.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running "git log -p --submodule=log", the submodule log is not
indented by the graph output, although all other lines are. Fix this by
prepending the current line prefix to each line of the submodule log.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two uses of the "left" and "right" commit variables that
make it hard to be sure what values they have (both for the reader,
and for gcc, which wrongly complains that they might be used
uninitialized).
The function starts with a cascading if statement, checking that the
input sha1s exist, and finally working up to preparing a revision
walk. We only prepare the walk if the cascading conditional did not
find any problems, which we check by seeing whether it set the
"message" variable or not. It's simpler and more obvious to just add
a condition to the end of the cascade.
Later, we check the same "message" variable when deciding whether to
clear commit marks on the left/right commits; if it is set, we
presumably never started the walk. This is wrong, though; we might
have started the walk and munged commit flags, only to encounter an
error afterwards. We should always clear the flags on left/right if
they exist, whether the walk was successful or not.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We keep a strbuf for the name of the submodule, even though
we only ever add one string to it. Let's just use xmemdupz
instead, which is slightly more efficient and makes it
easier to follow what is going on.
Unfortunately, we still end up having to deal with some
memory ownership issues in some code branches, as we have to
allocate the string in order to do a string list lookup, and
then only sometimes want to hand ownership of that string
over to the string_list. Still, making that explicit in the
code (as opposed to sometimes detaching the strbuf, and then
always releasing it) makes it a little more obvious what is
going on.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes the code a lot simpler to read by dropping a
whole bunch of constant offsets.
As a bonus, it means we also feed the whole config variable
name to our error functions:
[before]
$ git -c submodule.foo.fetchrecursesubmodules=bogus checkout
fatal: bad foo.fetchrecursesubmodules argument: bogus
[after]
$ git -c submodule.foo.fetchrecursesubmodules=bogus checkout
fatal: bad submodule.foo.fetchrecursesubmodules argument: bogus
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, 'git diff --submodule' displays output with a bold diff
header for non-submodules. So this part is in bold:
diff --git a/file1 b/file1
index 30b2f6c..2638038 100644
--- a/file1
+++ b/file1
For submodules, the header looks like this:
Submodule submodule1 012b072..248d0fd:
Unfortunately, it's easy to miss in the output because it's not bold.
Change this.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rm submodule" cannot blindly remove a submodule directory as
its working tree may have local changes, and worse yet, it may even
have its repository embedded in it. Teach it some special cases
where it is safe to remove a submodule, specifically, when there is
no local changes in the submodule working tree, and its repository
is not embedded in its working tree but is elsewhere and uses the
gitfile mechanism to point at it.
* jl/submodule-rm:
submodule: teach rm to remove submodules unless they contain a git directory
Currently using "git rm" on a submodule - populated or not - fails with
this error:
fatal: git rm: '<submodule path>': Is a directory
This made sense in the past as there was no way to remove a submodule
without possibly removing unpushed parts of the submodule's history
contained in its .git directory too, so erroring out here protected the
user from possible loss of data.
But submodules cloned with a recent git version do not contain the .git
directory anymore, they use a gitfile to point to their git directory
which is safely stored inside the superproject's .git directory. The work
tree of these submodules can safely be removed without losing history, so
let's teach git to do so.
Using rm on an unpopulated submodule now removes the empty directory from
the work tree and the gitlink from the index. If the submodule's directory
is missing from the work tree, it will still be removed from the index.
Using rm on a populated submodule using a gitfile will apply the usual
checks for work tree modification adapted to submodules (unless forced).
For a submodule that means that the HEAD is the same as recorded in the
index, no tracked files are modified and no untracked files that aren't
ignored are present in the submodules work tree (ignored files are deemed
expendable and won't stop a submodule's work tree from being removed).
That logic has to be applied in all nested submodules too.
Using rm on a submodule which has its .git directory inside the work trees
top level directory will just error out like it did before to protect the
repository, even when forced. In the future git could either provide a
message informing the user to convert the submodule to use a gitfile or
even attempt to do the conversion itself, but that is not part of this
change.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use argv-array API in "git fetch" implementation.
* jk/argv-array:
submodule: use argv_array instead of hand-building arrays
fetch: use argv_array instead of hand-building arrays
argv-array: fix bogus cast when freeing array
argv-array: add pop function
Optimise the "merge-base" computation a bit, and also update its
users that do not need the full merge-base information to call a
cheaper subset.
* jc/merge-bases:
reduce_heads(): reimplement on top of remove_redundant()
merge-base: "--is-ancestor A B"
get_merge_bases_many(): walk from many tips in parallel
in_merge_bases(): use paint_down_to_common()
merge_bases_many(): split out the logic to paint history
in_merge_bases(): omit unnecessary redundant common ancestor reduction
http-push: use in_merge_bases() for fast-forward check
receive-pack: use in_merge_bases() for fast-forward check
in_merge_bases(): support only one "other" commit
fetch_populated_submodules() allocates the full argv array it uses to
recurse into the submodules from the number of given options plus the six
argv values it is going to add. It then initializes it with those values
which won't change during the iteration and copies the given options into
it. Inside the loop the two argv values different for each submodule get
replaced with those currently valid.
However, this technique is brittle and error-prone (as the comment to
explain the magic number 6 indicates), so let's replace it with an
argv_array. Instead of replacing the argv values, push them to the
argv_array just before the run_command() call (including the option
separating them) and pop them from the argv_array right after that.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In early days of its life, I planned to make it possible to compute
"is a commit contained in all of these other commits?" with this
function, but it turned out that no caller needed it.
Just make it take two commit objects and add a comment to say what
these two functions do.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff_setup_done() has historically returned an error code, but lost
the last nonzero return in 943d5b7 (allow diff.renamelimit to be set
regardless of -M/-C, 2006-08-09). The callers were in a pretty
confused state: some actually checked for the return code, and some
did not.
Let it return void, and patch all callers to take this into account.
This conveniently also gets rid of a handful of different(!) error
messages that could never be triggered anyway.
Note that the function can still die().
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When peeking into object stores of submodules, the code forgot that they
might borrow objects from alternate object stores on their own.
By Heiko Voigt
* hv/submodule-alt-odb:
teach add_submodule_odb() to look for alternates
Since we allow to link other object databases when loading a submodules
database we should also load possible alternates.
Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using this option git will search for all submodules that
have changed in the revisions to be send. It will then try to
push the currently checked out branch of each submodule.
This helps when a user has finished working on a change which
involves submodules and just wants to push everything in one go.
Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com>
Mentored-by: Jens Lehmann <Jens.Lehmann@web.de>
Mentored-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows us to tell the user which submodules have not been pushed.
Additionally this is helpful when we want to automatically try to push
submodules that have not been pushed.
Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously it was not possible to iterate revisions twice using the
revision walking api. We add a reset_revision_walk() which clears the
used flags. This allows us to do multiple sequencial revision walks.
We add the appropriate calls to the existing submodule machinery doing
revision walks. This is done to avoid surprises if future code wants to
call these functions more than once during the processes lifetime.
Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use diff_tree_combined_merge() instead of open-coding it.
Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Maintaining an array of hashes is easier using sha1_array than
open-coding it. This patch also fixes a leak of the SHA1 array
in diff_tree_combined_merge().
Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff and status run "git status --porcelain" inside each populated
submodule to see if it contains changes (unless told not to do so via
config or command line option). When that fails, e.g. due to a corrupt
submodule .git directory, it just prints "git status --porcelain failed"
or "Could not run git status --porcelain" without giving the user a clue
where that happened.
Add '"in submodule %s", path' to these error strings to tell the user
where exactly the problem occurred.
Reported-by: Seth Robertson <in-gitvger@baka.org>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Both of these free() calls are freeing a "const unsigned char (*)[20]"
type while free() expects a "void *". This results in the following
warning under clang 2.9:
builtin/diff.c:185:7: warning: passing 'const unsigned char (*)[20]' to parameter of type 'void *' discards qualifiers
free(parent);
^~~~~~
submodule.c:394:7: warning: passing 'const unsigned char (*)[20]' to parameter of type 'void *' discards qualifiers
free(parents);
^~~~~~~
This free()-ing without a cast was added by Jim Meyering to
builtin/diff.c in v1.7.6-rc3~4 and later by Fredrik Gustafsson in
submodule.c in v1.7.7-rc1~25^2.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The submodule merge search is not useful during virtual merges because
the results cannot be used automatically. Furthermore any suggestions
made by the search may apply to commits different than HEAD:sub and
MERGE_HEAD:sub, thus confusing the user. Skip searching for submodule
merges during a virtual merge such as that between B and C while merging
the heads of:
B---BC
/ \ /
A X
\ / \
C---CB
Run the search only when the recursion level is zero (!o->call_depth).
This fixes known breakage tested in t7405-submodule-merge.
Signed-off-by: Brad King <brad.king@kitware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/argv-array:
run_hook: use argv_array API
checkout: use argv_array API
bisect: use argv_array API
quote: provide sq_dequote_to_argv_array
refactor argv_array into generic code
quote.h: fix bogus comment
add sha1_array API docs
The submodule code recently grew generic code to build a
dynamic argv array. Many other parts of the code can reuse
this, too, so let's make it generically available.
There are two enhancements not found in the original code:
1. We now handle the NULL-termination invariant properly,
even when no strings have been pushed (before, you
could have an empty, NULL argv). This was not a problem
for the submodule code, which always pushed at least
one argument, but was not sufficiently safe for
generic code.
2. There is a formatted variant of the "push" function.
This is a convenience function which was not needed by
the submodule code, but will make it easier to port
other users to the new code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recent versions of git can be slow to fetch repositories with a
large number of refs (or when they already have a large
number of refs). For example, GitHub makes pull-requests
available as refs, which can lead to a large number of
available refs. This slowness goes away when submodule
recursion is turned off:
$ git ls-remote git://github.com/rails/rails.git | wc -l
3034
[this takes ~10 seconds of CPU time to complete]
git fetch --recurse-submodules=no \
git://github.com/rails/rails.git "refs/*:refs/*"
[this still isn't done after 10 _minutes_ of pegging the CPU]
git fetch \
git://github.com/rails/rails.git "refs/*:refs/*"
You can produce a quicker and simpler test case like this:
doit() {
head=`git rev-parse HEAD`
for i in `seq 1 $1`; do
echo $head refs/heads/ref$i
done >.git/packed-refs
echo "==> $1"
rm -rf dest
git init -q --bare dest &&
(cd dest && time git.compile fetch -q .. refs/*:refs/*)
}
rm -rf repo
git init -q repo && cd repo &&
>file && git add file && git commit -q -m one
doit 100
doit 200
doit 400
doit 800
doit 1600
doit 3200
Which yields timings like:
# refs seconds of CPU
100 0.06
200 0.24
400 0.95
800 3.39
1600 13.66
3200 54.09
Notice that although the number of refs doubles in each
trial, the CPU time spent quadruples.
The problem is that the submodule recursion code works
something like:
- for each ref we fetch
- for each commit in git rev-list $new_sha1 --not --all
- add modified submodules to list
- fetch any newly referenced submodules
But that means if we fetch N refs, we start N revision
walks. Worse, because we use "--all", the number of refs we
must process that constitute "--all" keeps growing, too. And
you end up doing O(N^2) ref resolutions.
Instead, this patch structures the code like this:
- for each sha1 we already have
- add $old_sha1 to list $old
- for each ref we fetch
- add $new_sha1 to list $new
- for each commit in git rev-list $new --not $old
- add modified submodules to list
- fetch any newly referenced submodules
This yields timings like:
# refs seconds of CPU
100 0.00
200 0.04
400 0.04
800 0.10
1600 0.21
3200 0.39
Note that the amount of effort doubles as the number of refs
doubles. Similarly, the fetch of rails.git takes about as
much time as it does with --recurse-submodules=no.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>