* lt/config-fsync:
Add config option to enable 'fsync()' of object files
Split up default "i18n" and "branch" config parsing into helper routines
Split up default "user" config parsing into helper routine
Split up default "core" config parsing into helper routine
Using find_pack_entry_one() to get object offsets is rather suboptimal
when nth_packed_object_offset() can be used directly.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The shell version used to use "mkdir -p" to create the repo
path, but the C version just calls "mkdir". Let's replicate
the old behavior. We have to create the git and worktree
leading dirs separately; while most of the time, the
worktree dir contains the git dir (as .git), the user can
override this using GIT_WORK_TREE.
We can reuse safe_create_leading_directories, but we need to
make a copy of our const buffer to do so. Since
merge-recursive uses the same pattern, we can factor this
out into a global function. This has two other cleanup
advantages for merge-recursive:
1. mkdir_p wasn't a very good name. "mkdir -p foo/bar" actually
creates bar, but this function just creates the leading
directories.
2. mkdir_p took a mode argument, but it was completely
ignored.
Acked-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We should be able to fall back to loose objects or alternative packs when
a pack becomes corrupted. This is especially true when an object exists
in one pack only as a delta but its base object is corrupted. Currently
there is no way to retrieve the former object even if the later is
available in another pack or loose.
This patch allows for a delta to be resolved (with a performance cost)
using a base object from a source other than the pack where that delta
is located. Same thing for non-delta objects: rather than failing
outright, a search is made in other packs or used loose when the
currently active pack has it but corrupted.
Of course git will become extremely noisy with error messages when that
happens. However, if the operation succeeds nevertheless, a simple
'git repack -a -f -d' will "fix" the corrupted repository given that all
corrupted objects have a good duplicate somewhere in the object store,
possibly manually copied from another source.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once we find the absolute paths for git_dir and work_tree, we can make
git_dir a relative path since we know pwd will be work_tree. This should
save the kernel some time traversing the path to work_tree all the time
if git_dir is inside work_tree.
Daniel's patch didn't apply for me as-is, so I recreated it with some
differences, and here are the numbers from ten runs each.
There is some IO for me - probably due to more-or-less random flushing of
the journal - so the variation is bigger than I'd like, but whatever:
Before:
real 0m8.135s
real 0m7.933s
real 0m8.080s
real 0m7.954s
real 0m7.949s
real 0m8.112s
real 0m7.934s
real 0m8.059s
real 0m7.979s
real 0m8.038s
After:
real 0m7.685s
real 0m7.968s
real 0m7.703s
real 0m7.850s
real 0m7.995s
real 0m7.817s
real 0m7.963s
real 0m7.955s
real 0m7.848s
real 0m7.969s
Now, going by "best of ten" (on the assumption that the longer numbers
are all due to IO), I'm saying a 7.933s -> 7.685s reduction, and it does
seem to be outside of the noise (ie the "after" case never broke 8s, while
the "before" case did so half the time).
So looks like about 3% to me.
Doing it for a slightly smaller test-case (just the "arch" subdirectory)
gets more stable numbers probably due to not filling the journal with
metadata updates, so we have:
Before:
real 0m1.633s
real 0m1.633s
real 0m1.633s
real 0m1.632s
real 0m1.632s
real 0m1.630s
real 0m1.634s
real 0m1.631s
real 0m1.632s
real 0m1.632s
After:
real 0m1.610s
real 0m1.609s
real 0m1.610s
real 0m1.608s
real 0m1.607s
real 0m1.610s
real 0m1.609s
real 0m1.611s
real 0m1.608s
real 0m1.611s
where I'ld just take the averages and say 1.632 vs 1.610, which is just
over 1% peformance improvement.
So it's not in the noise, but it's not as big as I initially thought and
measured.
(That said, it obviously depends on how deep the working directory path is
too, and whether it is behind NFS or something else that might need to
cause more work to look up).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As explained in the documentation[*] this is totally useless on
filesystems that do ordered/journalled data writes, but it can be a
useful safety feature on filesystems like HFS+ that only journal the
metadata, not the actual file contents.
It defaults to off, although we could presumably in theory some day
auto-enable it on a per-filesystem basis.
[*] Yes, I updated the docs for the thing. Hell really _has_ frozen
over, and the four horsemen are probably just beyond the horizon.
EVERYBODY PANIC!
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was implemented as a thin wrapper around an otherwise unused
helper function parse_pack_index_file(). The code becomes simpler
and easier to read by consolidating the two.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
write_sha1_from_fd() and write_sha1_to_fd() were dead code nobody called,
neither the latter's helper repack_object() was.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Particularly for the "alternates" file, if one will be created, we
want a path that doesn't depend on the current directory, but we want
to retain any symlinks in the path as given and any in the user's view
of the current directory when the path was given.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This means that we can depend on packs always being stable on disk,
simplifying a lot of the object serialization worries. And unlike loose
objects, serializing pack creation IO isn't going to be a performance
killer.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/add-n-u:
Make git add -n and git -u -n output consistent
"git-add -n -u" should not add but just report
Conflicts:
builtin-add.c
builtin-mv.c
cache.h
read-cache.c
* db/clone-in-c:
Add test for cloning with "--reference" repo being a subset of source repo
Add a test for another combination of --reference
Test that --reference actually suppresses fetching referenced objects
clone: fall back to copying if hardlinking fails
builtin-clone.c: Need to closedir() in copy_or_link_directory()
builtin-clone: fix initial checkout
Build in clone
Provide API access to init_db()
Add a function to set a non-default work tree
Allow for having for_each_ref() list extra refs
Have a constant extern refspec for "--tags"
Add a library function to add an alternate to the alternates file
Add a lockfile function to append to a file
Mark the list of refs to fetch as const
Conflicts:
cache.h
t/t5700-clone-reference.sh
* js/ignore-submodule:
Ignore dirty submodule states during rebase and stash
Teach update-index about --ignore-submodules
diff options: Introduce --ignore-submodules
* bc/repack:
Documentation/git-repack.txt: document new -A behaviour
let pack-objects do the writing of unreachable objects as loose objects
add a force_object_loose() function
builtin-gc.c: deprecate --prune, it now really has no effect
git-gc: always use -A when manually repacking
repack: modify behavior of -A option to leave unreferenced objects unpacked
Conflicts:
builtin-pack-objects.c
* ar/add-unreadable:
Add a config option to ignore errors for git-add
Add a test for git-add --ignore-errors
Add --ignore-errors to git-add to allow it to skip files with read errors
Extend interface of add_files_to_cache to allow ignore indexing errors
Make the exit code of add_file_to_index actually useful
Like with the diff machinery, update-index should sometimes just
ignore submodules (e.g. to determine a clean state before a rebase).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* sb/committer:
commit: Show committer if automatic
commit: Show author if different from committer
Preparation to call determine_author_info from prepare_to_commit
git_config() only had a function parameter, but no callback data
parameter. This assumes that all callback functions only modify
global variables.
With this patch, every callback gets a void * parameter, and it is hoped
that this will help the libification effort.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is meant to force the creation of a loose object even if it
already exists packed. Needed for the next commit.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* lt/core-optim:
Optimize symlink/directory detection
Avoid some unnecessary lstat() calls
is_racy_timestamp(): do not check timestamp for gitlinks
diff-lib.c: rename check_work_tree_entity()
diff: a submodule not checked out is not modified
Add t7506 to test submodule related functions for git-status
t4027: test diff for submodule with empty directory
Make git-add behave more sensibly in a case-insensitive environment
When adding files to the index, add support for case-independent matches
Make unpack-tree update removed files before any updated files
Make branch merging aware of underlying case-insensitive filsystems
Add 'core.ignorecase' option
Make hash_name_lookup able to do case-independent lookups
Make "index_name_exists()" return the cache_entry it found
Move name hashing functions into a file of its own
Make unpack_trees_options bit flags actual bitfields
Change cd67e4d4 introduced a new configuration parameter that told
pull to automatically perform a rebase instead of a merge. This
change provides a configuration option to enable this feature
automatically when creating a new branch.
If the variable branch.autosetuprebase applies for a branch that's
being created, that branch will have branch.<name>.rebase set to true.
Signed-off-by: Dustin Sallings <dustin@spy.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the base for making symlink detection in the middle fo a pathname
saner and (much) more efficient.
Under various loads, we want to verify that the full path leading up to a
filename is a real directory tree, and that when we successfully do an
'lstat()' on a filename, we don't get a false positive due to a symlink in
the middle of the path that git should have seen as a symlink, not as a
normal path component.
The 'has_symlink_leading_path()' function already did this, and cached
a single level of symlink information, but didn't cache the _lack_ of a
symlink, so the normal behaviour was actually the wrong way around, and we
ended up doing an 'lstat()' on each path component to check that it was a
real directory.
This caches the last detected full directory and symlink entries, and
speeds up especially deep directory structures a lot by avoiding to
lstat() all the directories leading up to each entry in the index.
[ This can - and should - probably be extended upon so that we eventually
never do a bare 'lstat()' on any path entries at *all* when checking the
index, but always check the full path carefully. Right now we do not
generally check the whole path for all our normal quick index
revalidation.
We should also make sure that we're careful about all the invalidation,
ie when we remove a link and replace it by a directory we should
invalidate the symlink cache if it matches (and vice versa for the
directory cache).
But regardless, the basic function needs to be sane to do that. The old
'has_symlink_leading_path()' was not capable enough - or indeed the code
readable enough - to really do that sanely. So I'm pushing this as not
just an optimization, but as a base for further work. ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit sequence used to do
if (file_exists(p->path))
add_file_to_cache(p->path, 0);
where both "file_exists()" and "add_file_to_cache()" needed to do a
lstat() on the path to do their work.
This cuts down 'lstat()' calls for the partial commit case by two
for each path we know about (because we do this twice per path).
Just move the lstat() to the caller instead (that's all that
"file_exists()" really does), and pass the stat information down to the
add_to_cache() function.
This essentially makes 'add_to_index()' the core function that adds a path
to the index, getting the index pointer, the pathname and the stat
information as arguments. There are then shorthand helper functions that
use this core function:
- 'add_to_cache()' is just 'add_to_index()' with the default index
- 'add_file_to_cache/index()' is the same, but does the lstat() call
itself, so you can pass just the pathname if you don't already have the
stat information available.
So old users of the 'add_file_to_xyzzy()' are essentially left unchanged,
and this just exposes the more generic helper function that can take
existing stat information into account.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* lt/case-insensitive:
Make git-add behave more sensibly in a case-insensitive environment
When adding files to the index, add support for case-independent matches
Make unpack-tree update removed files before any updated files
Make branch merging aware of underlying case-insensitive filsystems
Add 'core.ignorecase' option
Make hash_name_lookup able to do case-independent lookups
Make "index_name_exists()" return the cache_entry it found
Move name hashing functions into a file of its own
Make unpack_trees_options bit flags actual bitfields
To warn the user in case he/she might be using an unintended
committer identity.
Signed-off-by: Santi Béjar <sbejar@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* lh/git-file:
Teach GIT-VERSION-GEN about the .git file
Teach git-submodule.sh about the .git file
Teach resolve_gitlink_ref() about the .git file
Add platform-independent .git "symlink"
The caller first calls set_git_dir() to specify the GIT_DIR, and then
calls init_db() to initialize it. This also cleans up various parts of
the code to account for the fact that everything is done with GIT_DIR
set, so it's unnecessary to pass the specified directory around.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function may only be used before the work tree is used.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is in the core so that, if the alternates file has already been
read, the addition can be parsed and put into effect for the current
process.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This takes care of copying the original contents into the replacement
file after the lock is held, so that concurrent additions can't miss
each other's changes.
[jc: munged to drop mmap in favor of copy_file.]
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
xread() and xwrite() return ssize_t values as their native POSIX
counterparts read(2) and write(2).
To be consistent, read_in_full() and write_in_full() should also return
ssize_t values.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes a struct ref able to represent a symref, and makes http.c
able to recognize one, and makes transport.c look for "HEAD" as a ref
in the list, and makes it dereference symrefs for the resulting ref,
if any.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git init --shared=0xxx, where '0xxx' is an octal number, will create
a repository with file modes set to '0xxx'. Users with a safe umask
value (0077) can use this option to force file modes. For example,
'0640' is a group-readable but not group-writable regardless of
user's umask value. Values compatible with old Git versions are written
as they were before, for compatibility reasons. That is, "1" for
"group" and "2" for "everybody".
"git config core.sharedRepository 0xxx" is also handled.
Signed-off-by: Heikki Orsila <heikki.orsila@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This new function can be used by config parsers to tell if a variable
is simply set, set to 1, or set to "true".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch allows .git to be a regular textfile containing the path of
the real git directory (prefixed with "gitdir: "), which can be useful on
platforms lacking support for real symlinks.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This expands on the previous patch, and allows "git add" to sanely handle
a filename that has changed case, keeping the case in the index constant,
and avoiding aliases.
In particular, if you have an index entry called "File", but the
checked-out tree is case-corrupted and has an entry called "file"
instead, doing a
git add .
(or naming "file" explicitly) will automatically notice that we have an
alias, and will replace the name "file" with the existing index
capitalization (ie "File").
However, if we actually have *both* a file called "File" and one called
"file", and they don't have the same lstat() information (ie we're on a
case-sensitive filesystem but have the "core.ignorecase" flag set), we
will error out if we try to add them both.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
..and start using it for directory entry traversal (ie "git status" will
not consider entries that match an existing entry case-insensitively to
be a new file)
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Right now nobody uses it, but "index_name_exists()" gets a flag so
you can enable it on a case-by-case basis.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows verify_absent() in unpack_trees() to use the hash chains
rather than looking it up using the binary search.
Perhaps more importantly, it's also going to be useful for the next phase,
where we actually start looking at the cache entry when we do
case-insensitive lookups and checking the result.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's really totally separate functionality, and if we want to start
doing case-insensitive hash lookups, I'd rather do it when it's
separated out.
It also renames "remove_index_entry()" to "remove_name_hash()", because
that really describes the thing better. It doesn't actually remove the
index entry, that's done by "remove_index_entry_at()", which is something
very different, despite the similarity in names.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* lt/unpack-trees:
unpack_trees(): fix diff-index regression.
traverse_trees_recursive(): propagate merge errors up
unpack_trees(): minor memory leak fix in unused destination index
Make 'unpack_trees()' have a separate source and destination index
Make 'unpack_trees()' take the index to work on as an argument
Add 'const' where appropriate to index handling functions
Fix tree-walking compare_entry() in the presense of --prefix
Move 'unpack_trees()' over to 'traverse_trees()' interface
Make 'traverse_trees()' traverse conflicting DF entries in parallel
Add return value to 'traverse_tree()' callback
Make 'traverse_tree()' use linked structure rather than 'const char *base'
Add 'df_name_compare()' helper function
This is in an effort to make the source index of 'unpack_trees()' as
being const, and thus making the compiler help us verify that we only
access it for reading.
The constification also extended to some of the hashing helpers that get
called indirectly.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>