Commit Graph

643 Commits

Author SHA1 Message Date
David Reiss
ae299be0e5 Implement normalize_absolute_path
normalize_absolute_path removes several oddities form absolute paths,
giving nice clean paths like "/dir/sub1/sub2".  Also add a test case
for this utility, based on a new test program (in the style of test-sha1).

Signed-off-by: David Reiss <dreiss@facebook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-23 14:11:20 -07:00
Junio C Hamano
9d880582ee Merge branch 'ar/add-unreadable'
* ar/add-unreadable:
  Add a config option to ignore errors for git-add
  Add a test for git-add --ignore-errors
  Add --ignore-errors to git-add to allow it to skip files with read errors
  Extend interface of add_files_to_cache to allow ignore indexing errors
  Make the exit code of add_file_to_index actually useful
2008-05-21 14:16:46 -07:00
Junio C Hamano
38ed1d89f7 "git-add -n -u" should not add but just report
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-21 12:04:41 -07:00
Johannes Schindelin
5fdeacb0ca Teach update-index about --ignore-submodules
Like with the diff machinery, update-index should sometimes just
ignore submodules (e.g. to determine a clean state before a rebase).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-15 16:12:43 -07:00
Junio C Hamano
b66ae7955c Merge branch 'sb/committer'
* sb/committer:
  commit: Show committer if automatic
  commit: Show author if different from committer
  Preparation to call determine_author_info from prepare_to_commit
2008-05-14 13:45:20 -07:00
Johannes Schindelin
ef90d6d420 Provide git_config with a callback-data parameter
git_config() only had a function parameter, but no callback data
parameter.  This assumes that all callback functions only modify
global variables.

With this patch, every callback gets a void * parameter, and it is hoped
that this will help the libification effort.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-14 12:34:44 -07:00
Nicolas Pitre
bbac73117e add a force_object_loose() function
This is meant to force the creation of a loose object even if it
already exists packed.  Needed for the next commit.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-13 22:42:33 -07:00
Alex Riesen
7ae02a30e8 Extend interface of add_files_to_cache to allow ignore indexing errors
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-12 20:54:52 -07:00
Junio C Hamano
dccb3a6acb Merge branch 'lt/core-optim'
* lt/core-optim:
  Optimize symlink/directory detection
  Avoid some unnecessary lstat() calls
  is_racy_timestamp(): do not check timestamp for gitlinks
  diff-lib.c: rename check_work_tree_entity()
  diff: a submodule not checked out is not modified
  Add t7506 to test submodule related functions for git-status
  t4027: test diff for submodule with empty directory
  Make git-add behave more sensibly in a case-insensitive environment
  When adding files to the index, add support for case-independent matches
  Make unpack-tree update removed files before any updated files
  Make branch merging aware of underlying case-insensitive filsystems
  Add 'core.ignorecase' option
  Make hash_name_lookup able to do case-independent lookups
  Make "index_name_exists()" return the cache_entry it found
  Move name hashing functions into a file of its own
  Make unpack_trees_options bit flags actual bitfields
2008-05-11 12:08:20 -07:00
Dustin Sallings
c998ae9baa Allow tracking branches to set up rebase by default.
Change cd67e4d4 introduced a new configuration parameter that told
pull to automatically perform a rebase instead of a merge.  This
change provides a configuration option to enable this feature
automatically when creating a new branch.

If the variable branch.autosetuprebase applies for a branch that's
being created, that branch will have branch.<name>.rebase set to true.

Signed-off-by: Dustin Sallings <dustin@spy.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-11 09:28:52 -07:00
Linus Torvalds
c40641b77b Optimize symlink/directory detection
This is the base for making symlink detection in the middle fo a pathname
saner and (much) more efficient.

Under various loads, we want to verify that the full path leading up to a
filename is a real directory tree, and that when we successfully do an
'lstat()' on a filename, we don't get a false positive due to a symlink in
the middle of the path that git should have seen as a symlink, not as a
normal path component.

The 'has_symlink_leading_path()' function already did this, and cached
a single level of symlink information, but didn't cache the _lack_ of a
symlink, so the normal behaviour was actually the wrong way around, and we
ended up doing an 'lstat()' on each path component to check that it was a
real directory.

This caches the last detected full directory and symlink entries, and
speeds up especially deep directory structures a lot by avoiding to
lstat() all the directories leading up to each entry in the index.

[ This can - and should - probably be extended upon so that we eventually
  never do a bare 'lstat()' on any path entries at *all* when checking the
  index, but always check the full path carefully. Right now we do not
  generally check the whole path for all our normal quick index
  revalidation.

  We should also make sure that we're careful about all the invalidation,
  ie when we remove a link and replace it by a directory we should
  invalidate the symlink cache if it matches (and vice versa for the
  directory cache).

  But regardless, the basic function needs to be sane to do that. The old
  'has_symlink_leading_path()' was not capable enough - or indeed the code
  readable enough - to really do that sanely. So I'm pushing this as not
  just an optimization, but as a base for further work. ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-10 18:16:31 -07:00
Linus Torvalds
d177cab048 Avoid some unnecessary lstat() calls
The commit sequence used to do

	if (file_exists(p->path))
		add_file_to_cache(p->path, 0);

where both "file_exists()" and "add_file_to_cache()" needed to do a
lstat() on the path to do their work.

This cuts down 'lstat()' calls for the partial commit case by two
for each path we know about (because we do this twice per path).

Just move the lstat() to the caller instead (that's all that
"file_exists()" really does), and pass the stat information down to the
add_to_cache() function.

This essentially makes 'add_to_index()' the core function that adds a path
to the index, getting the index pointer, the pathname and the stat
information as arguments. There are then shorthand helper functions that
use this core function:

 - 'add_to_cache()' is just 'add_to_index()' with the default index

 - 'add_file_to_cache/index()' is the same, but does the lstat() call
   itself, so you can pass just the pathname if you don't already have the
   stat information available.

So old users of the 'add_file_to_xyzzy()' are essentially left unchanged,
and this just exposes the more generic helper function that can take
existing stat information into account.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-10 18:16:30 -07:00
Junio C Hamano
380a742679 Merge branch 'lt/case-insensitive'
* lt/case-insensitive:
  Make git-add behave more sensibly in a case-insensitive environment
  When adding files to the index, add support for case-independent matches
  Make unpack-tree update removed files before any updated files
  Make branch merging aware of underlying case-insensitive filsystems
  Add 'core.ignorecase' option
  Make hash_name_lookup able to do case-independent lookups
  Make "index_name_exists()" return the cache_entry it found
  Move name hashing functions into a file of its own
  Make unpack_trees_options bit flags actual bitfields
2008-05-10 18:14:28 -07:00
Junio C Hamano
31a3c6bb45 Merge branch 'db/learn-HEAD'
* db/learn-HEAD:
  Make ls-remote http://... list HEAD, like for git://...
  Make walker.fetch_ref() take a struct ref.
2008-05-08 20:06:23 -07:00
Santi Béjar
bb1ae3f6ff commit: Show committer if automatic
To warn the user in case he/she might be using an unintended
committer identity.

Signed-off-by: Santi Béjar <sbejar@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-06 16:50:17 -07:00
Junio C Hamano
e2e2defc14 Merge branch 'lh/git-file'
* lh/git-file:
  Teach GIT-VERSION-GEN about the .git file
  Teach git-submodule.sh about the .git file
  Teach resolve_gitlink_ref() about the .git file
  Add platform-independent .git "symlink"
2008-05-05 19:16:16 -07:00
Daniel Barkalow
f225aeb278 Provide API access to init_db()
The caller first calls set_git_dir() to specify the GIT_DIR, and then
calls init_db() to initialize it. This also cleans up various parts of
the code to account for the fact that everything is done with GIT_DIR
set, so it's unnecessary to pass the specified directory around.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-04 17:41:45 -07:00
Daniel Barkalow
19757d80e5 Add a function to set a non-default work tree
This function may only be used before the work tree is used.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-04 17:41:44 -07:00
Daniel Barkalow
bef70b22ba Add a library function to add an alternate to the alternates file
This is in the core so that, if the alternates file has already been
read, the addition can be parsed and put into effect for the current
process.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-04 17:41:44 -07:00
Daniel Barkalow
ea3cd5c7c6 Add a lockfile function to append to a file
This takes care of copying the original contents into the replacement
file after the lock is held, so that concurrent additions can't miss
each other's changes.

[jc: munged to drop mmap in favor of copy_file.]

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-04 17:41:44 -07:00
Heikki Orsila
0104ca09e3 Make read_in_full() and write_in_full() consistent with xread() and xwrite()
xread() and xwrite() return ssize_t values as their native POSIX
counterparts read(2) and write(2).

To be consistent, read_in_full() and write_in_full() should also return
ssize_t values.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-29 23:11:57 -07:00
Daniel Barkalow
be885d96fe Make ls-remote http://... list HEAD, like for git://...
This makes a struct ref able to represent a symref, and makes http.c
able to recognize one, and makes transport.c look for "HEAD" as a ref
in the list, and makes it dereference symrefs for the resulting ref,
if any.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-26 17:36:18 -07:00
Heikki Orsila
06cbe85503 Make core.sharedRepository more generic
git init --shared=0xxx, where '0xxx' is an octal number, will create
a repository with file modes set to '0xxx'. Users with a safe umask
value (0077) can use this option to force file modes. For example,
'0640' is a group-readable but not group-writable regardless of
user's umask value. Values compatible with old Git versions are written
as they were before, for compatibility reasons. That is, "1" for
"group" and "2" for "everybody".

"git config core.sharedRepository 0xxx" is also handled.

Signed-off-by: Heikki Orsila <heikki.orsila@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-16 18:23:54 -07:00
Junio C Hamano
a53f2ec617 git_config_bool_or_int()
This new function can be used by config parsers to tell if a variable
is simply set, set to 1, or set to "true".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-12 18:38:40 -07:00
Lars Hjemli
b44ebb19e3 Add platform-independent .git "symlink"
This patch allows .git to be a regular textfile containing the path of
the real git directory (prefixed with "gitdir: "), which can be useful on
platforms lacking support for real symlinks.

Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:50 -07:00
Linus Torvalds
1102952b45 Make git-add behave more sensibly in a case-insensitive environment
This expands on the previous patch, and allows "git add" to sanely handle
a filename that has changed case, keeping the case in the index constant,
and avoiding aliases.

In particular, if you have an index entry called "File", but the
checked-out tree is case-corrupted and has an entry called "file"
instead, doing a

	git add .

(or naming "file" explicitly) will automatically notice that we have an
alias, and will replace the name "file" with the existing index
capitalization (ie "File").

However, if we actually have *both* a file called "File" and one called
"file", and they don't have the same lstat() information (ie we're on a
case-sensitive filesystem but have the "core.ignorecase" flag set), we
will error out if we try to add them both.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Linus Torvalds
0a9b88b7de Add 'core.ignorecase' option
..and start using it for directory entry traversal (ie "git status" will
not consider entries that match an existing entry case-insensitively to
be a new file)

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Linus Torvalds
cd2fef59ed Make hash_name_lookup able to do case-independent lookups
Right now nobody uses it, but "index_name_exists()" gets a flag so
you can enable it on a case-by-case basis.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Linus Torvalds
df292c791a Make "index_name_exists()" return the cache_entry it found
This allows verify_absent() in unpack_trees() to use the hash chains
rather than looking it up using the binary search.

Perhaps more importantly, it's also going to be useful for the next phase,
where we actually start looking at the cache entry when we do
case-insensitive lookups and checking the result.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Linus Torvalds
96872bc200 Move name hashing functions into a file of its own
It's really totally separate functionality, and if we want to start
doing case-insensitive hash lookups, I'd rather do it when it's
separated out.

It also renames "remove_index_entry()" to "remove_name_hash()", because
that really describes the thing better. It doesn't actually remove the
index entry, that's done by "remove_index_entry_at()", which is something
very different, despite the similarity in names.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Junio C Hamano
b85997d14d Merge branch 'lt/unpack-trees'
* lt/unpack-trees:
  unpack_trees(): fix diff-index regression.
  traverse_trees_recursive(): propagate merge errors up
  unpack_trees(): minor memory leak fix in unused destination index
  Make 'unpack_trees()' have a separate source and destination index
  Make 'unpack_trees()' take the index to work on as an argument
  Add 'const' where appropriate to index handling functions
  Fix tree-walking compare_entry() in the presense of --prefix
  Move 'unpack_trees()' over to 'traverse_trees()' interface
  Make 'traverse_trees()' traverse conflicting DF entries in parallel
  Add return value to 'traverse_tree()' callback
  Make 'traverse_tree()' use linked structure rather than 'const char *base'
  Add 'df_name_compare()' helper function
2008-03-11 22:13:44 -07:00
Junio C Hamano
5d921e2931 Merge branch 'jc/cherry-pick' (early part)
* 'jc/cherry-pick' (early part):
  expose a helper function peel_to_type().
  merge-recursive: split low-level merge functions out.

Conflicts:

	Makefile
	builtin-merge-recursive.c
	sha1_name.c
2008-03-11 02:05:12 -07:00
Linus Torvalds
d1f128b050 Add 'const' where appropriate to index handling functions
This is in an effort to make the source index of 'unpack_trees()' as
being const, and thus making the compiler help us verify that we only
access it for reading.

The constification also extended to some of the hashing helpers that get
called indirectly.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-09 00:43:48 -08:00
Linus Torvalds
0ab9e1e8cd Add 'df_name_compare()' helper function
This new helper is identical to base_name_compare(), except it compares
conflicting directory/file entries as equal in order to help handling DF
conflicts (thus the name).

Note that while a directory name compares as equal to a regular file
with the new helper, they then individually compare _differently_ to a
filename that has a dot after the basename (because '\0' < '.' < '/').

So a directory called "foo/" will compare equal to a file "foo", even
though "foo.c" will compare after "foo" and before "foo/"

This will be used by routines that want to traverse the git namespace
but then handle conflicting entries together when possible.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-09 00:43:46 -08:00
Junio C Hamano
eadbcd498a Merge branch 'mk/maint-parse-careful'
* mk/maint-parse-careful:
  receive-pack: use strict mode for unpacking objects
  index-pack: introduce checking mode
  unpack-objects: prevent writing of inconsistent objects
  unpack-object: cache for non written objects
  add common fsck error printing function
  builtin-fsck: move common object checking code to fsck.c
  builtin-fsck: reports missing parent commits
  Remove unused object-ref code
  builtin-fsck: move away from object-refs to fsck_walk
  add generic, type aware object chain walker

Conflicts:

	Makefile
	builtin-fsck.c
2008-03-02 15:11:07 -08:00
Junio C Hamano
60b188a984 Merge branch 'js/branch-track'
* js/branch-track:
  doc: documentation update for the branch track changes
  branch: optionally setup branch.*.merge from upstream local branches

Conflicts:

	Documentation/config.txt
	Documentation/git-branch.txt
	Documentation/git-checkout.txt
	builtin-branch.c
	cache.h
	t/t7201-co.sh
2008-02-27 13:02:57 -08:00
Junio C Hamano
5a4d707a6d Merge branch 'db/checkout'
* db/checkout: (21 commits)
  checkout: error out when index is unmerged even with -m
  checkout: show progress when checkout takes long time while switching branches
  Add merge-subtree back
  checkout: updates to tracking report
  builtin-checkout.c: Remove unused prefix arguments in switch_branches path
  checkout: work from a subdirectory
  checkout: tone down the "forked status" diagnostic messages
  Clean up reporting differences on branch switch
  builtin-checkout.c: fix possible usage segfault
  checkout: notice when the switched branch is behind or forked
  Build in checkout
  Move code to clean up after a branch change to branch.c
  Library function to check for unmerged index entries
  Use diff -u instead of diff in t7201
  Move create_branch into a library file
  Build-in merge-recursive
  Add "skip_unmerged" option to unpack_trees.
  Discard "deleted" cache entries after using them to update the working tree
  Send unpack-trees debugging output to stderr
  Add flag to make unpack_trees() not print errors.
  ...

Conflicts:

	Makefile
2008-02-27 12:53:26 -08:00
Junio C Hamano
d87aa32935 Merge branch 'jk/help-alias'
* jk/help-alias:
  help: respect aliases
  make alias lookup a public, procedural function
  help: use parseopt
2008-02-27 11:55:43 -08:00
Junio C Hamano
2db511fdbd Merge branch 'maint'
* maint:
  Documentation/git-am.txt: Pass -r in the example invocation of rm -f .dotest
  timezone_names[]: fixed the tz offset for New Zealand.
  filter-branch documentation: non-zero exit status in command abort the filter
  rev-parse: fix potential bus error with --parseopt option spec handling
  Use a single implementation and API for copy_file()
  Documentation/git-filter-branch: add a new msg-filter example
  Correct fast-export file mode strings to match fast-import standard
2008-02-26 00:14:22 -08:00
Martin Koegler
355885d531 add generic, type aware object chain walker
The requirements are:
* it may not crash on NULL pointers
* a callback function is needed, as index-pack/unpack-objects
  need to do different things
* the type information is needed to check the expected <-> real type
  and print better error messages

Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-25 23:57:34 -08:00
Daniel Barkalow
1468bd4783 Use a single implementation and API for copy_file()
Originally by Kristian Hï¿œgsberg; I fixed the conversion of rerere, which
had a different API.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-25 13:06:49 -08:00
Jeff King
94351118c0 make alias lookup a public, procedural function
This converts git_config_alias to the public alias_lookup
function. Because of the nature of our config parser, we
still have to rely on setting static data. However, that
interface is wrapped so that you can just say

  value = alias_lookup(key);

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-24 18:31:49 -08:00
Junio C Hamano
e38f892d18 Merge branch 'jc/apply-whitespace'
* jc/apply-whitespace:
  ws_fix_copy(): move the whitespace fixing function to ws.c
  apply: do not barf on patch with too large an offset
  core.whitespace: cr-at-eol
  git-apply --whitespace=fix: fix whitespace fuzz introduced by previous run
  builtin-apply.c: pass ws_rule down to match_fragment()
  builtin-apply.c: move copy_wsfix() function a bit higher.
  builtin-apply.c: do not feed copy_wsfix() leading '+'
  builtin-apply.c: simplify calling site to apply_line()
  builtin-apply.c: clean-up apply_one_fragment()
  builtin-apply.c: mark common context lines in lineinfo structure.
  builtin-apply.c: optimize match_beginning/end processing a bit.
  builtin-apply.c: make it more line oriented
  builtin-apply.c: push match-beginning/end logic down
  builtin-apply.c: restructure "offset" matching
  builtin-apply.c: refactor small part that matches context
2008-02-24 17:23:17 -08:00
Junio C Hamano
fe3403c320 ws_fix_copy(): move the whitespace fixing function to ws.c
This is used by git-apply but we can use it elsewhere by slightly
generalizing it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-23 16:59:16 -08:00
Linus Torvalds
eb7a2f1d50 Use helper function for copying index entry information
We used to just memcpy() the index entry when we copied the stat() and
SHA1 hash information, which worked well enough back when the index
entry was just an exact bit-for-bit representation of the information on
disk.

However, these days we actually have various management information in
the cache entry too, and we should be careful to not overwrite it when
we copy the stat information from another index entry.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-22 21:24:47 -08:00
Linus Torvalds
d070e3a31b Name hash fixups: export (and rename) remove_hash_entry
This makes the name hash removal function (which really just sets the
bit that disables lookups of it) available to external routines, and
makes read_cache_unmerged() use it when it drops an unmerged entry from
the index.

It's renamed to remove_index_entry(), and we drop the (unused) 'istate'
argument.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-22 21:24:47 -08:00
Linus Torvalds
a22c637124 Fix name re-hashing semantics
We handled the case of removing and re-inserting cache entries badly,
which is something that merging commonly needs to do (removing the
different stages, and then re-inserting one of them as the merged
state).

We even had a rather ugly special case for this failure case, where
replace_index_entry() basically turned itself into a no-op if the new
and the old entries were the same, exactly because the hash routines
didn't handle it on their own.

So what this patch does is to not just have the UNHASHED bit, but a
HASHED bit too, and when you insert an entry into the name hash, that
involves:

 - clear the UNHASHED bit, because now it's valid again for lookup
   (which is really all that UNHASHED meant)

 - if we're being lazy, we're done here (but we still want to clear the
   UNHASHED bit regardless of lazy mode, since we can become unlazy
   later, and so we need the UNHASHED bit to always be set correctly,
   even if we never actually insert the entry into the hash list)

 - if it was already hashed, we just leave it on the list

 - otherwise mark it HASHED and insert it into the list

this all means that unhashing and rehashing a name all just works
automatically.  Obviously, you cannot change the name of an entry (that
would be a serious bug), but nothing can validly do that anyway (you'd
have to allocate a new struct cache_entry anyway since the name length
could change), so that's not a new limitation.

The code actually gets simpler in many ways, although the lazy hashing
does mean that there are a few odd cases (ie something can be marked
unhashed even though it was never on the hash in the first place, and
isn't actually marked hashed!).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-22 21:24:47 -08:00
Jay Soffian
9ed36cfa35 branch: optionally setup branch.*.merge from upstream local branches
"git branch" and "git checkout -b" now honor --track option even when
the upstream branch is local.  Previously --track was silently ignored
when forking from a local branch.  Also the command did not error out
when --track was explicitly asked for but the forked point specified
was not an existing branch (i.e. when there is no way to set up the
tracking configuration), but now it correctly does.

The configuration setting branch.autosetupmerge can now be set to
"always", which is equivalent to using --track from the command line.
Setting branch.autosetupmerge to "true" will retain the former behavior
of only setting up branch.*.merge for remote upstream branches.

Includes test cases for the new functionality.

Signed-off-by: Jay Soffian <jaysoffian@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-19 21:17:45 -08:00
Junio C Hamano
8177631547 expose a helper function peel_to_type().
This helper function is the core of "$object^{type}" parser.
Now it is made available to callers outside sha1_name.c
2008-02-18 00:51:05 -08:00
Junio C Hamano
2ac4b4b222 Merge branch 'sp/safecrlf'
* sp/safecrlf:
  safecrlf: Add mechanism to warn about irreversible crlf conversions
2008-02-16 17:59:20 -08:00
Junio C Hamano
987e315a6b Merge branch 'jc/gitignore-ends-with-slash'
* jc/gitignore-ends-with-slash:
  gitignore: lazily find dtype
  gitignore(5): Allow "foo/" in ignore list to match directory "foo"
2008-02-16 17:57:06 -08:00
Junio C Hamano
fef1c4c0a0 Merge branch 'jk/noetcconfig'
* jk/noetcconfig:
  fix config reading in tests
  allow suppressing of global and system config

Conflicts:

	cache.h
2008-02-16 17:56:51 -08:00
Junio C Hamano
d5558581d2 Merge branch 'maint'
* maint:
  commit: discard index after setting up partial commit
  filter-branch: handle filenames that need quoting
  diff: Fix miscounting of --check output
  hg-to-git: fix parent analysis
  mailinfo: feed only one line to handle_filter() for QP input
  diff.c: add "const" qualifier to "char *cmd" member of "struct ll_diff_driver"
  Add "const" qualifier to "char *excludes_file".
  Add "const" qualifier to "char *editor_program".
  Add "const" qualifier to "char *pager_program".
  config: add 'git_config_string' to refactor string config variables.
  diff.c: remove useless check for value != NULL
  fast-import: check return value from unpack_entry()
  Validate nicknames of remote branches to prohibit confusing ones
  diff.c: replace a 'strdup' with 'xstrdup'.
  diff.c: fixup garding of config parser from value=NULL
2008-02-16 00:20:37 -08:00
Christian Couder
dfb068be8d Add "const" qualifier to "char *excludes_file".
Also use "git_config_string" to simplify "config.c" code
where "excludes_file" is set.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-15 21:24:54 -08:00
Christian Couder
ee9601e6be Add "const" qualifier to "char *editor_program".
Also use "git_config_string" to simplify "config.c" code
where "editor_program" is set.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-15 21:24:53 -08:00
Christian Couder
872da32d80 Add "const" qualifier to "char *pager_program".
Also use "git_config_string" to simplify "config.c" code
where "pager_program" is set.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-15 21:24:53 -08:00
Christian Couder
ea5105a5e3 config: add 'git_config_string' to refactor string config variables.
In many places we just check if a value from the config file is not
NULL, then we duplicate it and return 0. This patch introduces the new
'git_config_string' function to do that.

This function is also used to refactor some code in 'config.c'.
Refactoring other files is left for other patches.

Also not all the code in "config.c" is refactored, because the function
takes a "const char **" as its first parameter, but in many places a
"char *" is used instead of a "const char *". (And C does not allow
using a "char **" instead of a "const char **" without a warning.)

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-15 21:24:53 -08:00
Junio C Hamano
e0197c9aae Merge branch 'lt/in-core-index'
* lt/in-core-index:
  lazy index hashing
  Create pathname-based hash-table lookup into index
  read-cache.c: introduce is_racy_timestamp() helper
  read-cache.c: fix a couple more CE_REMOVE conversion
  Also use unpack_trees() in do_diff_cache()
  Make run_diff_index() use unpack_trees(), not read_tree()
  Avoid running lstat(2) on the same cache entry.
  index: be careful when handling long names
  Make on-disk index representation separate from in-core one
2008-02-11 16:46:20 -08:00
Junio C Hamano
40ea4ed903 Add config_error_nonbool() helper function
This is used to report misconfigured configuration file that does not
give any value to a non-boolean variable, e.g.

	[section]
		var

It is perfectly fine to say it if the section.var is a boolean (it means
true), but if a variable expects a string value it should be flagged as
a configuration error.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-11 13:11:36 -08:00
Daniel Barkalow
94a5728cfb Library function to check for unmerged index entries
It's small, but it was in three places already, so it should be in the
library.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
2008-02-09 23:16:51 -08:00
Jeff King
ab88c36321 allow suppressing of global and system config
The GIT_CONFIG_NOGLOBAL and GIT_CONFIG_NOSYSTEM environment
variables are magic undocumented switches that can be used
to ensure a totally clean environment. This is necessary for
running reliable tests, since those config files may contain
settings that change the outcome of tests.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-06 14:52:23 -08:00
Steffen Prohaska
21e5ad50fc safecrlf: Add mechanism to warn about irreversible crlf conversions
CRLF conversion bears a slight chance of corrupting data.
autocrlf=true will convert CRLF to LF during commit and LF to
CRLF during checkout.  A file that contains a mixture of LF and
CRLF before the commit cannot be recreated by git.  For text
files this is the right thing to do: it corrects line endings
such that we have only LF line endings in the repository.
But for binary files that are accidentally classified as text the
conversion can corrupt data.

If you recognize such corruption early you can easily fix it by
setting the conversion type explicitly in .gitattributes.  Right
after committing you still have the original file in your work
tree and this file is not yet corrupted.  You can explicitly tell
git that this file is binary and git will handle the file
appropriately.

Unfortunately, the desired effect of cleaning up text files with
mixed line endings and the undesired effect of corrupting binary
files cannot be distinguished.  In both cases CRLFs are removed
in an irreversible way.  For text files this is the right thing
to do because CRLFs are line endings, while for binary files
converting CRLFs corrupts data.

This patch adds a mechanism that can either warn the user about
an irreversible conversion or can even refuse to convert.  The
mechanism is controlled by the variable core.safecrlf, with the
following values:

 - false: disable safecrlf mechanism
 - warn: warn about irreversible conversions
 - true: refuse irreversible conversions

The default is to warn.  Users are only affected by this default
if core.autocrlf is set.  But the current default of git is to
leave core.autocrlf unset, so users will not see warnings unless
they deliberately chose to activate the autocrlf mechanism.

The safecrlf mechanism's details depend on the git command.  The
general principles when safecrlf is active (not false) are:

 - we warn/error out if files in the work tree can modified in an
   irreversible way without giving the user a chance to backup the
   original file.

 - for read-only operations that do not modify files in the work tree
   we do not not print annoying warnings.

There are exceptions.  Even though...

 - "git add" itself does not touch the files in the work tree, the
   next checkout would, so the safety triggers;

 - "git apply" to update a text file with a patch does touch the files
   in the work tree, but the operation is about text files and CRLF
   conversion is about fixing the line ending inconsistencies, so the
   safety does not trigger;

 - "git diff" itself does not touch the files in the work tree, it is
   often run to inspect the changes you intend to next "git add".  To
   catch potential problems early, safety triggers.

The concept of a safety check was originally proposed in a similar
way by Linus Torvalds.  Thanks to Dimitry Potapov for insisting
on getting the naked LF/autocrlf=true case right.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
2008-02-06 13:07:28 -08:00
Junio C Hamano
d6b8fc303b gitignore(5): Allow "foo/" in ignore list to match directory "foo"
A pattern "foo/" in the exclude list did not match directory
"foo", but a pattern "foo" did.  This attempts to extend the
exclude mechanism so that it would while not matching a regular
file or a symbolic link "foo".  In order to differentiate a
directory and non directory, this passes down the type of path
being checked to excluded() function.

A downside is that the recursive directory walk may need to run
lstat(2) more often on systems whose "struct dirent" do not give
the type of the entry; earlier it did not have to do so for an
excluded path, but we now need to figure out if a path is a
directory before deciding to exclude it.  This is especially bad
because an idea similar to the earlier CE_UPTODATE optimization
to reduce number of lstat(2) calls would by definition not apply
to the codepaths involved, as (1) directories will not be
registered in the index, and (2) excluded paths will not be in
the index anyway.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-05 00:46:49 -08:00
Junio C Hamano
b2979ff599 core.whitespace: cr-at-eol
This new error mode allows a line to have a carriage return at the
end of the line when checking and fixing trailing whitespace errors.

Some people like to keep CRLF line ending recorded in the repository,
and still want to take advantage of the automated trailing whitespace
stripping.  We still show ^M in the diff output piped to "less" to
remind them that they do have the CR at the end, but these carriage
return characters at the end are no longer flagged as errors.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-05 00:38:41 -08:00
Junio C Hamano
9cb76b8cdc lazy index hashing
This delays the hashing of index names until it becomes necessary for
the first time.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 23:01:13 -08:00
Linus Torvalds
cf558704fb Create pathname-based hash-table lookup into index
This creates a hash index of every single file added to the index.
Right now that hash index isn't actually used for much: I implemented a
"cache_name_exists()" function that uses it to efficiently look up a
filename in the index without having to do the O(logn) binary search,
but quite frankly, that's not why this patch is interesting.

No, the whole and only reason to create the hash of the filenames in the
index is that by modifying the hash function, you can fairly easily do
things like making it always hash equivalent names into the same bucket.

That, in turn, means that suddenly questions like "does this name exist
in the index under an _equivalent_ name?" becomes much much cheaper.

Guiding principles behind this patch:

 - it shouldn't be too costly. In fact, my primary goal here was to
   actually speed up "git commit" with a fully populated kernel tree, by
   being faster at checking whether a file already existed in the index. I
   did succeed, but only barely:

	Best before:
		[torvalds@woody linux]$ time git commit > /dev/null
		real    0m0.255s
		user    0m0.168s
		sys     0m0.088s

	Best after:

		[torvalds@woody linux]$ time ~/git/git commit > /dev/null
		real    0m0.233s
		user    0m0.144s
		sys     0m0.088s

   so some things are actually faster (~8%).

   Caveat: that's really the best case. Other things are invariably going
   to be slightly slower, since we populate that index cache, and quite
   frankly, few things really use it to look things up.

   That said, the cost is really quite small. The worst case is probably
   doing a "git ls-files", which will do very little except puopulate the
   index, and never actually looks anything up in it, just lists it.

	Before:
		[torvalds@woody linux]$ time git ls-files > /dev/null
		real    0m0.016s
		user    0m0.016s
		sys     0m0.000s

	After:
		[torvalds@woody linux]$ time ~/git/git ls-files > /dev/null
		real    0m0.021s
		user    0m0.012s
		sys     0m0.008s

   and while the thing has really gotten relatively much slower, we're
   still talking about something almost unmeasurable (eg 5ms). And that
   really should be pretty much the worst case.

   So we lose 5ms on one "benchmark", but win 22ms on another. Pick your
   poison - this patch has the advantage that it will _likely_ speed up
   the cases that are complex and expensive more than it slows down the
   cases that are already so fast that nobody cares. But if you look at
   relative speedups/slowdowns, it doesn't look so good.

 - It should be simple and clean

   The code may be a bit subtle (the reasons I do hash removal the way I
   do etc), but it re-uses the existing hash.c files, so it really is
   fairly small and straightforward apart from a few odd details.

Now, this patch on its own doesn't really do much, but I think it's worth
looking at, if only because if done correctly, the name hashing really can
make an improvement to the whole issue of "do we have a filename that
looks like this in the index already". And at least it gets real testing
by being used even by default (ie there is a real use-case for it even
without any insane filesystems).

NOTE NOTE NOTE! The current hash is a joke. I'm ashamed of it, I'm just
not ashamed of it enough to really care. I took all the numbers out of my
nether regions - I'm sure it's good enough that it works in practice, but
the whole point was that you can make a really much fancier hash that
hashes characters not directly, but by their upper-case value or something
like that, and thus you get a case-insensitive hash, while still keeping
the name and the index itself totally case sensitive.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 21:46:30 -08:00
Junio C Hamano
eadb583134 Avoid running lstat(2) on the same cache entry.
Aside from the lstat(2) done for work tree files, there are
quite many lstat(2) calls in refname dwimming codepath.  This
patch is not about reducing them.

 * It adds a new ce_flag, CE_UPTODATE, that is meant to mark the
   cache entries that record a regular file blob that is up to
   date in the work tree.  If somebody later walks the index and
   wants to see if the work tree has changes, they do not have
   to be checked with lstat(2) again.

 * fill_stat_cache_info() marks the cache entry it just added
   with CE_UPTODATE.  This has the effect of marking the paths
   we write out of the index and lstat(2) immediately as "no
   need to lstat -- we know it is up-to-date", from quite a lot
   fo callers:

    - git-apply --index
    - git-update-index
    - git-checkout-index
    - git-add (uses add_file_to_index())
    - git-commit (ditto)
    - git-mv (ditto)

 * refresh_cache_ent() also marks the cache entry that are clean
   with CE_UPTODATE.

 * write_index is changed not to write CE_UPTODATE out to the
   index file, because CE_UPTODATE is meant to be transient only
   in core.  For the same reason, CE_UPDATE is not written to
   prevent an accident from happening.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 12:44:31 -08:00
Junio C Hamano
7fec10b7f4 index: be careful when handling long names
We currently use lower 12-bit (masked with CE_NAMEMASK) in the
ce_flags field to store the length of the name in cache_entry,
without checking the length parameter given to
create_ce_flags().  This can make us store incorrect length.

Currently we are mostly protected by the fact that many
codepaths first copy the path in a variable of size PATH_MAX,
which typically is 4096 that happens to match the limit, but
that feels like a bug waiting to happen.  Besides, that would
not allow us to shorten the width of CE_NAMEMASK to use the bits
for new flags.

This redefines the meaning of the name length stored in the
cache_entry.  A name that does not fit is represented by storing
CE_NAMEMASK in the field, and the actual length needs to be
computed by actually counting the bytes in the name[] field.
This way, only the unusually long paths need to suffer.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 12:44:31 -08:00
Linus Torvalds
7a51ed66f6 Make on-disk index representation separate from in-core one
This converts the index explicitly on read and write to its on-disk
format, allowing the in-core format to contain more flags, and be
simpler.

In particular, the in-core format is now host-endian (as opposed to the
on-disk one that is network endian in order to be able to be shared
across machines) and as a result we can dispense with all the
htonl/ntohl on accesses to the cache_entry fields.

This will make it easier to make use of various temporary flags that do
not exist in the on-disk format.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 12:44:31 -08:00
Shawn O. Pearce
c9ced051c3 Fix random fast-import errors when compiled with NO_MMAP
fast-import was relying on the fact that on most systems mmap() and
write() are synchronized by the filesystem's buffer cache.  We were
relying on the ability to mmap() 20 bytes beyond the current end
of the file, then later fill in those bytes with a future write()
call, then read them through the previously obtained mmap() address.

This isn't always true with some implementations of NFS, but it is
especially not true with our NO_MMAP=YesPlease build time option used
on some platforms.  If fast-import was built with NO_MMAP=YesPlease
we used the malloc()+pread() emulation and the subsequent write()
call does not update the trailing 20 bytes of a previously obtained
"mmap()" (aka malloc'd) address.

Under NO_MMAP that behavior causes unpack_entry() in sha1_file.c to
be unable to read an object header (or data) that has been unlucky
enough to be written to the packfile at a location such that it
is in the trailing 20 bytes of a window previously opened on that
same packfile.

This bug has gone unnoticed for a very long time as it is highly data
dependent.  Not only does the object have to be placed at the right
position, but it also needs to be positioned behind some other object
that has been accessed due to a branch cache invalidation.  In other
words the stars had to align just right, and if you did run into
this bug you probably should also have purchased a lottery ticket.

Fortunately the workaround is a lot easier than the bug explanation.

Before we allow unpack_entry() to read data from a pack window
that has also (possibly) been modified through write() we force
all existing windows on that packfile to be closed.  By closing
the windows we ensure that any new access via the emulated mmap()
will reread the packfile, updating to the current file content.

This comes at a slight performance degredation as we cannot reuse
previously cached windows when we update the packfile.  But it
is a fairly minor difference as the window closes happen at only
two points:

 - When the packfile is finalized and its .idx is generated:

   At this stage we are getting ready to update the refs and any
   data access into the packfile is going to be random, and is
   going after only the branch tips (to ensure they are valid).
   Our existing windows (if any) are not likely to be positioned
   at useful locations to access those final tip commits so we
   probably were closing them before anyway.

 - When the branch cache missed and we need to reload:

   At this point fast-import is getting change commands for the next
   commit and it needs to go re-read a tree object it previously
   had written out to the packfile.  What windows we had (if any)
   are not likely to cover the tree in question so we probably were
   closing them before anyway.

We do try to avoid unnecessarily closing windows in the second case
by checking to see if the packfile size has increased since the
last time we called unpack_entry() on that packfile.  If the size
has not changed then we have not written additional data, and any
existing window is still vaild.  This nicely handles the cases where
fast-import is going through a branch cache reload and needs to read
many trees at once.  During such an event we are not likely to be
updating the packfile so we do not cycle the windows between reads.

With this change in place t9301-fast-export.sh (which was broken
by c3b0dec509) finally works again.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-17 22:39:20 -08:00
Brandon Casey
d6cf61bfd4 close_lock_file(): new function in the lockfile API
The lockfile API is a handy way to obtain a file that is cleaned
up if you die().  But sometimes you would need this sequence to
work:

 1. hold_lock_file_for_update() to get a file descriptor for
    writing;

 2. write the contents out, without being able to decide if the
    results should be committed or rolled back;

 3. do something else that makes the decision --- and this
    "something else" needs the lockfile not to have an open file
    descriptor for writing (e.g. Windows do not want a open file
    to be renamed);

 4. call commit_lock_file() or rollback_lock_file() as
    appropriately.

This adds close_lock_file() you can call between step 2 and 3 in
the above sequence.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-16 15:35:03 -08:00
Wincent Colaiuta
c1795bb08a Unify whitespace checking
This commit unifies three separate places where whitespace checking was
performed:

 - the whitespace checking previously done in builtin-apply.c is
extracted into a function in ws.c

 - the equivalent logic in "git diff" is removed

 - the emit_line_with_ws() function is also removed because that also
rechecks the whitespace, and its functionality is rolled into ws.c

The new function is called check_and_emit_line() and it does two things:
checks a line for whitespace errors and optionally emits it. The checking
is based on lines of content rather than patch lines (in other words, the
caller must strip the leading "+" or "-"); this was suggested by Junio on
the mailing list to allow for a future extension to "git show" to display
whitespace errors in blobs.

At the same time we teach it to report all classes of whitespace errors
found for a given line rather than reporting only the first found error.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-13 23:43:58 -08:00
Jeff King
6e9af863ee Support GIT_PAGER_IN_USE environment variable
When deciding whether or not to turn on automatic color
support, git_config_colorbool checks whether stdout is a
tty. However, because we run a pager, if stdout is not a
tty, we must check whether it is because we started the
pager. This used to be done by checking the pager_in_use
variable.

This variable was set only when the git program being run
started the pager; there was no way for an external program
running git indicate that it had already started a pager.
This patch allows a program to set GIT_PAGER_IN_USE to a
true value to indicate that even though stdout is not a tty,
it is because a pager is being used.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-11 00:42:05 -08:00
Junio C Hamano
4eb39e9bcc Merge branch 'jc/spht'
* jc/spht:
  Use gitattributes to define per-path whitespace rule
  core.whitespace: documentation updates.
  builtin-apply: teach whitespace_rules
  builtin-apply: rename "whitespace" variables and fix styles
  core.whitespace: add test for diff whitespace error highlighting
  git-diff: complain about >=8 consecutive spaces in initial indent
  War on whitespace: first, a bit of retreat.

Conflicts:

	cache.h
	config.c
	diff.c
2007-12-09 01:23:48 -08:00
Junio C Hamano
774751a8bc Re-fix "builtin-commit: fix --signoff"
An earlier fix to the said commit was incomplete; it mixed up the
meaning of the flag parameter passed to the internal fmt_ident()
function, so this corrects it.

git_author_info() and git_committer_info() can be told to issue a
warning when no usable user information is found, and optionally can be
told to error out.  Operations that actually use the information to
record a new commit or a tag will still error out, but the caller to
leave reflog record will just silently use bogus user information.

Not warning on misconfigured user information while writing a reflog
entry is somewhat debatable, but it is probably nicer to the users to
silently let it pass, because the only information you are losing is who
checked out the branch.

 * git_author_info() and git_committer_info() used to take 1 (positive
   int) to error out with a warning on misconfiguration; this is now
   signalled with a symbolic constant IDENT_ERROR_ON_NO_NAME.

 * These functions used to take -1 (negative int) to warn but continue;
   this is now signalled with a symbolic constant IDENT_WARN_ON_NO_NAME.

 * fmt_ident() function implements the above error reporting behaviour
   common to git_author_info() and git_committer_info().  A symbolic
   constant IDENT_NO_DATE can be or'ed in to the flag parameter to make
   it return only the "Name <email@address.xz>".

 * fmt_name() is a thin wrapper around fmt_ident() that always passes
   IDENT_ERROR_ON_NO_NAME and IDENT_NO_DATE.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-09 00:55:55 -08:00
Junio C Hamano
cf1b7869f0 Use gitattributes to define per-path whitespace rule
The `core.whitespace` configuration variable allows you to define what
`diff` and `apply` should consider whitespace errors for all paths in
the project (See gitlink:git-config[1]).  This attribute gives you finer
control per path.

For example, if you have these in the .gitattributes:

    frotz   whitespace
    nitfol  -whitespace
    xyzzy   whitespace=-trailing

all types of whitespace problems known to git are noticed in path 'frotz'
(i.e. diff shows them in diff.whitespace color, and apply warns about
them), no whitespace problem is noticed in path 'nitfol', and the
default types of whitespace problems except "trailing whitespace" are
noticed for path 'xyzzy'.  A project with mixed Python and C might want
to have:

    *.c    whitespace
    *.py   whitespace=-indent-with-non-tab

in its toplevel .gitattributes file.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-06 00:45:30 -08:00
Junio C Hamano
31cbb5d961 Merge branch 'kh/commit'
* kh/commit: (33 commits)
  git-commit --allow-empty
  git-commit: Allow to amend a merge commit that does not change the tree
  quote_path: fix collapsing of relative paths
  Make git status usage say git status instead of git commit
  Fix --signoff in builtin-commit differently.
  git-commit: clean up die messages
  Do not generate full commit log message if it is not going to be used
  Remove git-status from list of scripts as it is builtin
  Fix off-by-one error when truncating the diff out of the commit message.
  builtin-commit.c: export GIT_INDEX_FILE for launch_editor as well.
  Add a few more tests for git-commit
  builtin-commit: Include the diff in the commit message when verbose.
  builtin-commit: fix partial-commit support
  Fix add_files_to_cache() to take pathspec, not user specified list of files
  Export three helper functions from ls-files
  builtin-commit: run commit-msg hook with correct message file
  builtin-commit: do not color status output shown in the message template
  file_exists(): dangling symlinks do exist
  Replace "runstatus" with "status" in the tests
  t7501-commit: Add test for git commit <file> with dirty index.
  ...
2007-12-04 17:16:33 -08:00
Junio C Hamano
9bbe6db85f Merge branch 'sp/refspec-match'
* sp/refspec-match:
  refactor fetch's ref matching to use refname_match()
  push: use same rules as git-rev-parse to resolve refspecs
  add refname_match()
  push: support pushing HEAD to real branch name
2007-12-04 17:07:10 -08:00
Christian Couder
b319ce4c14 Trace and quote with argv: get rid of unneeded count argument.
Now that str_buf takes care of all the allocations, there is
no more gain to pass an argument count.

So this patch removes the "count" argument from:
	- "sq_quote_argv"
	- "trace_argv_printf"
and all the callers.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-03 22:11:53 -08:00
Junio C Hamano
d9ccfe7711 Fix --signoff in builtin-commit differently.
Introduce fmt_name() specifically meant for formatting the name and
email pair, to add signed-off-by value.  This reverts parts of
13208572fb (builtin-commit: fix --signoff)
so that an empty datestamp string given to fmt_ident() by mistake will
error out as before.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-02 23:35:46 -08:00
Junio C Hamano
b45563a229 rename: Break filepairs with different types.
When we consider if a path has been totally rewritten, we did not
touch changes from symlinks to files or vice versa.  But a change
that modifies even the type of a blob surely should count as a
complete rewrite.

While we are at it, modernise diffcore-break to be aware of gitlinks (we
do not want to touch them).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-02 02:24:46 -08:00
Junio C Hamano
fd200790dc Merge branch 'jk/send-pack'
* jk/send-pack: (24 commits)
  send-pack: cluster ref status reporting
  send-pack: fix "everything up-to-date" message
  send-pack: tighten remote error reporting
  make "find_ref_by_name" a public function
  Fix warning about bitfield in struct ref
  send-pack: assign remote errors to each ref
  send-pack: check ref->status before updating tracking refs
  send-pack: track errors for each ref
  git-push: add documentation for the newly added --mirror mode
  Add tests for git push'es mirror mode
  Update the tracking references only if they were succesfully updated on remote
  Add a test checking if send-pack updated local tracking branches correctly
  git-push: plumb in --mirror mode
  Teach send-pack a mirror mode
  send-pack: segfault fix on forced push
  Reteach builtin-ls-remote to understand remotes
  send-pack: require --verbose to show update of tracking refs
  receive-pack: don't mention successful updates
  more terse push output
  Build in ls-remote
  ...
2007-11-24 16:45:37 -08:00
Junio C Hamano
b6ec1d619f Fix add_files_to_cache() to take pathspec, not user specified list of files
This separates the logic to limit the extent of change to the
index by where you are (controlled by "prefix") and what you
specify from the command line (controlled by "pathspec").

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-22 17:05:05 -08:00
Junio C Hamano
ee425e4643 Export three helper functions from ls-files
This exports three helper functions from ls-files.

 * pathspec_match() checks if a given path matches a set of pathspecs
   and optionally records which pathspec was used.  This function used
   to be called "match()" but renamed to be a bit less vague.

 * report_path_error() takes a set of pathspecs and the record
   pathspec_match() above leaves, and gives error message.  This
   was split out of the main function of ls-files.

 * overlay_tree_on_cache() takes a tree-ish (typically "HEAD")
   and overlays it on the current in-core index.  By iterating
   over the resulting index, the caller can find out the paths
   in either the index or the HEAD.  This function used to be
   called "overlay_tree()" but renamed to be a bit more
   descriptive.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-22 17:05:05 -08:00
Steffen Prohaska
605b4978a1 refactor fetch's ref matching to use refname_match()
The old rules used by fetch were coded as a series of ifs.  The old
rules are:
1) match full refname if it starts with "refs/" or matches "HEAD"
2) verify that full refname starts with "refs/"
3) match abbreviated name in "refs/" if it starts with "heads/",
    "tags/", or "remotes/".
4) match abbreviated name in "refs/heads/"

This is replaced by the new rules
a) match full refname
b) match abbreviated name prefixed with "refs/"
c) match abbreviated name prefixed with "refs/heads/"

The details of the new rules are different from the old rules.  We no
longer verify that the full refname starts with "refs/".  The new rule
(a) matches any full string.  The old rules (1) and (2) were stricter.
Now, the caller is responsible for using sensible full refnames.  This
should be the case for the current code.  The new rule (b) is less
strict than old rule (3).  The new rule accepts abbreviated names that
start with a non-standard prefix below "refs/".

Despite this modifications the new rules should handle all cases as
expected.  Two tests are added to verify that fetch does not resolve
short tags or HEAD in remotes.

We may even think about loosening the rules a bit more and unify them
with the rev-parse rules.  This would be done by replacing
ref_ref_fetch_rules with ref_ref_parse_rules.  Note, the two new test
would break.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-18 18:39:01 -08:00
Steffen Prohaska
79803322c1 add refname_match()
We use at least two rulesets for matching abbreviated refnames with
full refnames (starting with 'refs/').  git-rev-parse and git-fetch
use slightly different rules.

This commit introduces a new function refname_match
(const char *abbrev_name, const char *full_name, const char **rules).

abbrev_name is expanded using the rules and matched against full_name.
If a match is found the function returns true.  rules is a NULL-terminate
list of format patterns with "%.*s", for example:

    const char *ref_rev_parse_rules[] = {
               "%.*s",
               "refs/%.*s",
               "refs/tags/%.*s",
               "refs/heads/%.*s",
               "refs/remotes/%.*s",
               "refs/remotes/%.*s/HEAD",
               NULL
    };

Asterisks are included in the format strings because this is the form
required in sha1_name.c.  Sharing the list with the functions there is
a good idea to avoid duplicating the rules.  Hopefully this
facilitates unified matching rules in the future.

This commit makes the rules used by rev-parse for resolving refs to
sha1s available for string comparison.  Before this change, the rules
were buried in get_sha1*() and dwim_ref().

A follow-up commit will refactor the rules used by fetch.

refname_match() will be used for matching refspecs in git-send-pack.

Thanks to Daniel Barkalow <barkalow@iabervon.org> for pointing
out that ref_matches_abbrev in remote.c solves a similar problem
and care should be taken to avoid confusion.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-18 18:39:00 -08:00
Jeff King
2a0fe89a97 send-pack: tighten remote error reporting
Previously, we set all ref pushes to 'OK', and then marked
them as errors if the remote reported so. This has the
problem that if the remote dies or fails to report a ref, we
just assume it was OK.

Instead, we use a new non-OK state to indicate that we are
expecting status (if the remote doesn't support the
report-status feature, we fall back on the old behavior).
Thus we can flag refs for which we expected a status, but
got none (conversely, we now also print a warning for refs
for which we get a status, but weren't expecting one).

This also allows us to simplify the receive_status exit
code, since each ref is individually marked with failure
until we get a success response. We can just print the usual
status table, so the user still gets a sense of what we were
trying to do when the failure happened.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-18 02:34:52 -08:00
Jeff King
cda69f481d make "find_ref_by_name" a public function
This was a static in remote.c, but is generally useful.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-18 02:34:34 -08:00
Shawn O. Pearce
9f8a15c734 Fix warning about bitfield in struct ref
cache.h:503: warning: type of bit-field 'force' is a GCC extension
cache.h:504: warning: type of bit-field 'merge' is a GCC extension
cache.h:505: warning: type of bit-field 'nonfastforward' is a GCC extension
cache.h:506: warning: type of bit-field 'deletion' is a GCC extension

So we change it to an 'unsigned int' which is not a GCC extension.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-18 02:23:25 -08:00
Jeff King
ca74c458a3 send-pack: assign remote errors to each ref
This lets us show remote errors (e.g., a denied hook) along
with the usual push output.

There is a slightly clever optimization in receive_status
that bears explanation. We need to correlate the returned
status and our ref objects, which naively could be an O(m*n)
operation. However, since the current implementation of
receive-pack returns the errors to us in the same order that
we sent them, we optimistically look for the next ref to be
looked up to come after the last one we have found. So it
should be an O(m+n) merge if the receive-pack behavior
holds, but we fall back to a correct but slower behavior if
it should change.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-17 12:10:50 -08:00
Jeff King
8736a84890 send-pack: track errors for each ref
Instead of keeping the 'ret' variable, we instead have a
status flag for each ref that tracks what happened to it.
We then print the ref status after all of the refs have
been examined.

This paves the way for three improvements:
  - updating tracking refs only for non-error refs
  - incorporating remote rejection into the printed status
  - printing errors in a different order than we processed
    (e.g., consolidating non-ff errors near the end with
    a special message)

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Alex Riesen <raa.lkml@gmail.com>
Acked-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-17 12:10:50 -08:00
Johannes Sixt
506b17b136 Introduce git_etc_gitconfig() that encapsulates access of ETC_GITCONFIG.
In a subsequent patch the path to the system-wide config file will be
computed. This is a preparation for that change. It turns all accesses
of ETC_GITCONFIG into function calls. There is no change in behavior.

As a consequence, config.c is the only file that needs the definition of
ETC_GITCONFIG. Hence, -DETC_GITCONFIG is removed from the CFLAGS and a
special build rule for config.c is introduced. As a side-effect, changing
the defintion of ETC_GITCONFIG (e.g. in config.mak) does not trigger a
complete rebuild anymore.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-14 15:18:39 -08:00
Johannes Schindelin
4723ee992c Close files opened by lock_file() before unlinking.
This is needed on Windows since open files cannot be unlinked.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-14 15:18:39 -08:00
Junio C Hamano
039bc64e88 core.excludesfile clean-up
There are inconsistencies in the way commands currently handle
the core.excludesfile configuration variable.  The problem is
the variable is too new to be noticed by anything other than
git-add and git-status.

 * git-ls-files does not notice any of the "ignore" files by
   default, as it predates the standardized set of ignore files.
   The calling scripts established the convention to use
   .git/info/exclude, .gitignore, and later core.excludesfile.

 * git-add and git-status know about it because they call
   add_excludes_from_file() directly with their own notion of
   which standard set of ignore files to use.  This is just a
   stupid duplication of code that need to be updated every time
   the definition of the standard set of ignore files is
   changed.

 * git-read-tree takes --exclude-per-directory=<gitignore>,
   not because the flexibility was needed.  Again, this was
   because the option predates the standardization of the ignore
   files.

 * git-merge-recursive uses hardcoded per-directory .gitignore
   and nothing else.  git-clean (scripted version) does not
   honor core.* because its call to underlying ls-files does not
   know about it.  git-clean in C (parked in 'pu') doesn't either.

We probably could change git-ls-files to use the standard set
when no excludes are specified on the command line and ignore
processing was asked, or something like that, but that will be a
change in semantics and might break people's scripts in a subtle
way.  I am somewhat reluctant to make such a change.

On the other hand, I think it makes perfect sense to fix
git-read-tree, git-merge-recursive and git-clean to follow the
same rule as other commands.  I do not think of a valid use case
to give an exclude-per-directory that is nonstandard to
read-tree command, outside a "negative" test in the t1004 test
script.

This patch is the first step to untangle this mess.

The next step would be to teach read-tree, merge-recursive and
clean (in C) to use setup_standard_excludes().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-14 15:08:04 -08:00
Junio C Hamano
c78a24986d Merge branch 'jc/maint-add-sync-stat'
* jc/maint-add-sync-stat:
  t2200: test more cases of "add -u"
  git-add: make the entry stat-clean after re-adding the same contents
  ce_match_stat, run_diff_files: use symbolic constants for readability

Conflicts:

	builtin-add.c
2007-11-14 14:15:40 -08:00
Junio C Hamano
a108e53861 Merge branch 'db/remote-builtin' into jk/send-pack
* db/remote-builtin:
  Reteach builtin-ls-remote to understand remotes
  Build in ls-remote
  Use built-in send-pack.
  Build-in send-pack, with an API for other programs to call.
  Build-in peek-remote, using transport infrastructure.
  Miscellaneous const changes and utilities

Conflicts:

	transport.c
2007-11-14 03:09:52 -08:00
Junio C Hamano
4bd5b7dacc ce_match_stat, run_diff_files: use symbolic constants for readability
ce_match_stat() can be told:

 (1) to ignore CE_VALID bit (used under "assume unchanged" mode)
     and perform the stat comparison anyway;

 (2) not to perform the contents comparison for racily clean
     entries and report mismatch of cached stat information;

using its "option" parameter.  Give them symbolic constants.

Similarly, run_diff_files() can be told not to report anything
on removed paths.  Also give it a symbolic constant for that.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-10 00:24:51 -08:00
Junio C Hamano
0380074315 Merge branch 'ds/maint-deflatebound'
* ds/maint-deflatebound:
  Improve accuracy of check for presence of deflateBound.
2007-11-07 18:17:20 -08:00
David Symonds
609a2289d7 Improve accuracy of check for presence of deflateBound.
ZLIB_VERNUM isn't defined in some zlib versions, so this patch does a proper
linking test in autoconf to see whether deflateBound exists in zlib. Also,
setting NO_DEFLATE_BOUND will also work for folk not using autoconf.

Signed-off-by: David Symonds <dsymonds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-07 17:06:14 -08:00
Mike Hommey
59f0f2f33a Refactor working tree setup
Create a setup_work_tree() that can be used from any command requiring
a working tree conditionally.

Signed-off-by: Mike Hommey <mh@glandium.org>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-05 22:47:57 -08:00
Daniel Barkalow
4577370e9b Miscellaneous const changes and utilities
The list of remote refs in struct transport should be const, because
builtin-fetch will get confused if it changes.

The url in git_connect should be const (and work on a copy) instead of
requiring the caller to copy it.

match_refs doesn't modify the refspecs it gets.

get_fetch_map and get_remote_ref don't change the list they get.

Allow transport get_refs_list methods to modify the struct transport.

Add a function to copy a list of refs, when a function needs a mutable
copy of a const list.

Add a function to check the type of a ref, as per the code in connect.c

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-02 22:40:43 -07:00
Junio C Hamano
459fa6d0fe git-diff: complain about >=8 consecutive spaces in initial indent
This introduces a new whitespace error type, "indent-with-non-tab".
The error is about starting a line with 8 or more SP, instead of
indenting it with a HT.

This is not enabled by default, as some projects employ an
indenting policy to use only SPs and no HTs.

The kernel folks and git contributors may want to enable this
detection with:

	[core]
		whitespace = indent-with-non-tab

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-02 17:58:09 -07:00
Junio C Hamano
a9cc857ada War on whitespace: first, a bit of retreat.
This introduces core.whitespace configuration variable that lets
you specify the definition of "whitespace error".

Currently there are two kinds of whitespace errors defined:

 * trailing-space: trailing whitespaces at the end of the line.

 * space-before-tab: a SP appears immediately before HT in the
   indent part of the line.

You can specify the desired types of errors to be detected by
listing their names (unique abbreviations are accepted)
separated by comma.  By default, these two errors are always
detected, as that is the traditional behaviour.  You can disable
detection of a particular type of error by prefixing a '-' in
front of the name of the error, like this:

	[core]
		whitespace = -trailing-space

This patch teaches the code to output colored diff with
DIFF_WHITESPACE color to highlight the detected whitespace
errors to honor the new configuration.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-02 17:58:08 -07:00
Johannes Sixt
98158e9cfd Change git_connect() to return a struct child_process instead of a pid_t.
This prepares the API of git_connect() and finish_connect() to operate on
a struct child_process. Currently, we just use that object as a placeholder
for the pid that we used to return. A follow-up patch will change the
implementation of git_connect() and finish_connect() to make full use
of the object.

Old code had early-return-on-error checks at the calling sites of
git_connect(), but since git_connect() dies on errors anyway, these checks
were removed.

[sp: Corrected style nit of "conn == NULL" to "!conn"]

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-10-21 01:30:39 -04:00
Shawn O. Pearce
2e13e5d892 Merge branch 'master' into db/fetch-pack
There's a number of tricky conflicts between master and
this topic right now due to the rewrite of builtin-push.
Junio must have handled these via rerere; I'd rather not
deal with them again so I'm pre-merging master into the
topic.  Besides this topic somehow started to depend on
the strbuf series that was in next, but is now in master.
It no longer compiles on its own without the strbuf API.

* master: (184 commits)
  Whip post 1.5.3.4 maintenance series into shape.
  Minor usage update in setgitperms.perl
  manual: use 'URL' instead of 'url'.
  manual: add some markup.
  manual: Fix example finding commits referencing given content.
  Fix wording in push definition.
  Fix some typos, punctuation, missing words, minor markup.
  manual: Fix or remove em dashes.
  Add a --dry-run option to git-push.
  Add a --dry-run option to git-send-pack.
  Fix in-place editing functions in convert.c
  instaweb: support for Ruby's WEBrick server
  instaweb: allow for use of auto-generated scripts
  Add 'git-p4 commit' as an alias for 'git-p4 submit'
  hg-to-git speedup through selectable repack intervals
  git-svn: respect Subversion's [auth] section configuration values
  gtksourceview2 support for gitview
  fix contrib/hooks/post-receive-email hooks.recipients error message
  Support cvs via git-shell
  rebase -i: use diff plumbing instead of porcelain
  ...

Conflicts:

	Makefile
	builtin-push.c
	rsh.c
2007-10-16 00:15:25 -04:00
Junio C Hamano
66d4035e10 Merge branch 'ph/strbuf'
* ph/strbuf: (44 commits)
  Make read_patch_file work on a strbuf.
  strbuf_read_file enhancement, and use it.
  strbuf change: be sure ->buf is never ever NULL.
  double free in builtin-update-index.c
  Clean up stripspace a bit, use strbuf even more.
  Add strbuf_read_file().
  rerere: Fix use of an empty strbuf.buf
  Small cache_tree_write refactor.
  Make builtin-rerere use of strbuf nicer and more efficient.
  Add strbuf_cmp.
  strbuf_setlen(): do not barf on setting length of an empty buffer to 0
  sq_quote_argv and add_to_string rework with strbuf's.
  Full rework of quote_c_style and write_name_quoted.
  Rework unquote_c_style to work on a strbuf.
  strbuf API additions and enhancements.
  nfv?asprintf are broken without va_copy, workaround them.
  Fix the expansion pattern of the pseudo-static path buffer.
  builtin-for-each-ref.c::copy_name() - do not overstep the buffer.
  builtin-apply.c: fix a tiny leak introduced during xmemdupz() conversion.
  Use xmemdupz() in many places.
  ...
2007-10-03 03:06:02 -07:00
Junio C Hamano
0341091a9e Merge branch 'jc/autogc'
* jc/autogc:
  git-gc --auto: run "repack -A -d -l" as necessary.
  git-gc --auto: restructure the way "repack" command line is built.
  git-gc --auto: protect ourselves from accumulated cruft
  git-gc --auto: add documentation.
  git-gc --auto: move threshold check to need_to_gc() function.
  repack -A -d: use --keep-unreachable when repacking
  pack-objects --keep-unreachable
  Export matches_pack_name() and fix its return value
  Invoke "git gc --auto" from commit, merge, am and rebase.
  Implement git gc --auto
2007-10-03 03:05:32 -07:00
Andy Parkins
856665f827 parse_date_format(): convert a format name to an enum date_mode
Factor out the code to parse --date=<format> parameter to revision
walkers into a separate function, parse_date_format().  This function
is passed a string and converts it to an enum date_format:

 - "relative"         => DATE_RELATIVE
 - "iso8601" or "iso" => DATE_ISO8601
 - "rfc2822"          => DATE_RFC2822
 - "short"            => DATE_SHORT
 - "local"            => DATE_LOCAL
 - "default"          => DATE_NORMAL

In the event that none of these strings is found, the function die()s.

Signed-off-by: Andy Parkins <andyparkins@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-29 20:31:59 -07:00
Carlos Rica
102c2338da Move make_cache_entry() from merge-recursive.c into read-cache.c
The function make_cache_entry() is too useful to be hidden away in
merge-recursive.  So move it to libgit.a (exposing it via cache.h).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-26 13:42:10 -07:00
Pierre Habouzit
19247e5510 nfv?asprintf are broken without va_copy, workaround them.
* drop nfasprintf.
* move nfvasprintf into imap-send.c back, and let it work on a 8k buffer,
  and die() in case of overflow. It should be enough for imap commands, if
  someone cares about imap-send, he's welcomed to fix it properly.
* replace nfvasprintf use in merge-recursive with a copy of the strbuf_addf
  logic, it's one place, we'll live with it.
  To ease the change, output_buffer string list is replaced with a strbuf ;)
* rework trace.c to call vsnprintf itself.  It's used to format strerror()s
  and git command names, it should never be more than a few octets long, let
  it work on a 8k static buffer with vsnprintf or die loudly.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
2007-09-20 23:17:22 -07:00
Daniel Barkalow
b888d61c83 Make fetch a builtin
Thanks to Johannes Schindelin for review and fixes, and Julian
Phillips for the original C translation.

This changes a few small bits of behavior:

branch.<name>.merge is parsed as if it were the lhs of a fetch
refspec, and does not have to exactly match the actual lhs of a
refspec, so long as it is a valid abbreviation for the same ref.

branch.<name>.merge is no longer ignored if the remote is configured
with a branches/* file. Neither behavior is useful, because there can
only be one ref that gets fetched, but this is more consistant.

Also, fetch prints different information to standard out.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-19 03:22:30 -07:00
Junio C Hamano
39bd2eb56a Merge branch 'master' into ph/strbuf
* master: (94 commits)
  Fixed update-hook example allow-users format.
  Documentation/git-svn: updated design philosophy notes
  t/t4014: test "am -3" with mode-only change.
  git-commit.sh: Shell script cleanup
  preserve executable bits in zip archives
  Fix lapsus in builtin-apply.c
  git-push: documentation and tests for pushing only branches
  git-svnimport: Use separate arguments in the pipe for git-rev-parse
  contrib/fast-import: add perl version of simple example
  contrib/fast-import: add simple shell example
  rev-list --bisect: Bisection "distance" clean up.
  rev-list --bisect: Move some bisection code into best_bisection.
  rev-list --bisect: Move finding bisection into do_find_bisection.
  Document ls-files --with-tree=<tree-ish>
  git-commit: partial commit of paths only removed from the index
  git-commit: Allow partial commit of file removal.
  send-email: make message-id generation a bit more robust
  git-apply: fix whitespace stripping
  git-gui: Disable native platform text selection in "lists"
  apply --index-info: fall back to current index for mode changes
  ...
2007-09-18 17:42:15 -07:00
Junio C Hamano
000dfd3f6e Export matches_pack_name() and fix its return value
The function sounds boolean; make it behave as one, not "0 for
success, non-zero for failure".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-17 12:25:26 -07:00
Pierre Habouzit
5ecd293d14 Rewrite convert_to_{git,working_tree} to use strbuf's.
* Now, those functions take an "out" strbuf argument, where they store their
  result if any. In that case, it also returns 1, else it returns 0.
* those functions support "in place" editing, in the sense that it's OK to
  call them this way:
    convert_to_git(path, sb->buf, sb->len, sb);
  When doable, conversions are done in place for real, else the strbuf
  content is just replaced with the new one, transparentely for the caller.

If you want to create a new filter working this way, being the accumulation
of filter1, filter2, ... filtern, then your meta_filter would be:

    int meta_filter(..., const char *src, size_t len, struct strbuf *sb)
    {
        int ret = 0;
        ret |= filter1(...., src, len, sb);
        if (ret) {
            src = sb->buf;
            len = sb->len;
        }
        ret |= filter2(...., src, len, sb);
        if (ret) {
            src = sb->buf;
            len = sb->len;
        }
        ....
        return ret | filtern(..., src, len, sb);
    }

That's why subfilters the convert_to_* functions called were also rewritten
to work this way.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-16 17:30:03 -07:00
Carlos Rica
6640f88165 Move make_cache_entry() from merge-recursive.c into read-cache.c
The function make_cache_entry() is too useful to be hidden away in
merge-recursive.  So move it to libgit.a (exposing it via cache.h).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-12 13:25:07 -07:00
Pierre Habouzit
fd17f5b5f7 Replace all read_fd use with strbuf_read, and get rid of it.
This brings builtin-stripspace, builtin-tag and mktag to use strbufs.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-10 12:50:58 -07:00
René Scharfe
89b4256cfb Remove unused function convert_sha1_file()
convert_sha1_file() became unused by the previous patch -- remove it.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-03 16:46:23 -07:00
Junio C Hamano
aecbf914c4 git-diff: resurrect the traditional empty "diff --git" behaviour
The warning message to suggest "Consider running git-status" from
"git-diff" that we experimented with during the 1.5.3 cycle turns
out to be a bad idea.  It robbed cache-dirty information from people
who valued it, while still asking users to run "update-index --refresh".
It was hoped that the new behaviour would at least have some educational
value, but not showing the cache-dirty paths like before meant that the
user would not even know easily which paths were cache-dirty, and it
made the need to refresh the index look like even more unnecessary chore.

This commit reinstates the traditional behaviour, but with a twist.

By default, the empty "diff --git" output is totally squelched out
from "git diff" output.  At the end of the command, it automatically
runs "update-index --refresh" as needed, without even bothering the
user.  In other words, people who do not care about the cache-dirtyness
do not even have to see the warning.

The traditional behaviour to see the stat-dirty output and to bypassing
the overhead of content comparison can be specified by setting the
configuration variable diff.autorefreshindex to false.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-31 23:30:14 -07:00
Alexandre Julliard
d616813d75 git-add: Add support for --refresh option.
This allows to refresh only a subset of the project files, based on
the specified pathspecs.

Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-13 12:58:38 -07:00
Junio C Hamano
55d1932bce Merge branch 'cr/tag'
* cr/tag:
  Teach "git stripspace" the --strip-comments option
  Make verify-tag a builtin.
  builtin-tag.c: Fix two memory leaks and minor notation changes.
  launch_editor(): Heed GIT_EDITOR and core.editor settings
  Make git tag a builtin.
2007-08-10 23:17:46 -07:00
Junio C Hamano
af3785dc5a Optimize "diff --cached" performance.
The read_tree() function is called only from the call chain to
run "git diff --cached" (this includes the internal call made by
git-runstatus to run_diff_index()).  The function vacates stage
without any funky "merge" magic.  The caller then goes and
compares stage #1 entries from the tree with stage #0 entries
from the original index.

When adding the cache entries this way, it used the general
purpose add_cache_entry().  This function looks for an existing
entry to replace or if there is none to find where to insert the
new entry, resolves D/F conflict and all the other things.

For the purpose of reading entries into an empty stage, none of
that processing is needed.  We can instead append everything and
then sort the result at the end.

This commit changes read_tree() to first make sure that there is
no existing cache entries at specified stage, and if that is the
case, it runs add_cache_entry() with ADD_CACHE_JUST_APPEND flag
(new), and then sort the resulting cache using qsort().

This new flag tells add_cache_entry() to omit all the checks
such as "Does this path already exist?  Does adding this path
remove other existing entries because it turns a directory to a
file?" and instead append the given cache entry straight at the
end of the active cache.  The caller of course is expected to
sort the resulting cache at the end before using the result.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-10 11:44:23 -07:00
Johannes Schindelin
e90fdc39b6 Clean up work-tree handling
The old version of work-tree support was an unholy mess, barely readable,
and not to the point.

For example, why do you have to provide a worktree, when it is not used?
As in "git status".  Now it works.

Another riddle was: if you can have work trees inside the git dir, why
are some programs complaining that they need a work tree?

IOW it is allowed to call

	$ git --git-dir=../ --work-tree=. bla

when you really want to.  In this case, you are both in the git directory
and in the working tree.  So, programs have to actually test for the right
thing, namely if they are inside a working tree, and not if they are
inside a git directory.

Also, GIT_DIR=../.git should behave the same as if no GIT_DIR was
specified, unless there is a repository in the current working directory.
It does now.

The logic to determine if a repository is bare, or has a work tree
(tertium non datur), is this:

--work-tree=bla overrides GIT_WORK_TREE, which overrides core.bare = true,
which overrides core.worktree, which overrides GIT_DIR/.. when GIT_DIR
ends in /.git, which overrides the directory in which .git/ was found.

In related news, a long standing bug was fixed: when in .git/bla/x.git/,
which is a bare repository, git formerly assumed ../.. to be the
appropriate git dir.  This problem was reported by Shawn Pearce to have
caused much pain, where a colleague mistakenly ran "git init" in "/" a
long time ago, and bare repositories just would not work.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 00:38:31 -07:00
Johannes Schindelin
d7ac12b25d Add set_git_dir() function
With the function set_git_dir() you can reset the path that will
be used for git_path(), git_dir() and friends.

The responsibility to close files and throw away information from the
old git_dir lies with the caller.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 00:38:31 -07:00
Johannes Schindelin
e5392c5146 Add is_absolute_path() and make_absolute_path()
This patch adds convenience functions to work with absolute paths.
The function is_absolute_path() should help the efforts to integrate
the MinGW fork.

Note that make_absolute_path() returns a pointer to a static buffer.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 00:38:30 -07:00
Johannes Schindelin
4d87b9c5db launch_editor(): Heed GIT_EDITOR and core.editor settings
In the commit 'Add GIT_EDITOR environment and core.editor
configuration variables', this was done for the shell scripts.
Port it over to builtin-tag's version of launch_editor(), which
is just about to be refactored into editor.c.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-21 16:51:14 -07:00
Carlos Rica
c4fba0a358 Rename read_pipe() with read_fd() and make its buffer nul-terminated.
The new name is closer to the purpose of the function.

A NUL-terminated buffer makes things easier when callers need that.
Since the function returns only the memory written with data,
almost always allocating more space than needed because final
size is unknown, an extra NUL terminating the buffer is harmless.
It is not included in the returned size, so the function
remains working as before.

Also, now the function allows the buffer passed to be NULL at first,
and alloc_nr is now used for growing the buffer, instead size=*2.

Signed-off-by: Carlos Rica <jasampler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-18 17:30:03 -07:00
Junio C Hamano
73013afd14 Make show_rfc2822_date() just another date output format.
These days, show_date() takes a date_mode parameter to specify
the output format, and a separate specialized function for dates
in E-mails does not make much sense anymore.

This retires show_rfc2822_date() function and make it just
another date output format.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-13 23:14:52 -07:00
Robin Rosenberg
ee8f838e03 Support output ISO 8601 format dates
Support output of full ISO 8601 style dates in e.g. git log
and other places that use interpolation for formatting.

Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-13 22:47:49 -07:00
Brian Downing
0b87b6e081 Add functions for parsing integers with size suffixes
Split out the nnn{k,m,g} parsing code from git_config_int into
git_parse_long, so command-line parameters can enjoy the same
functionality.  Also add get_parse_ulong for unsigned values.

Make git_config_int use git_parse_long, and add get_config_ulong
as well.

Signed-off-by: Brian Downing <bdowning@lavos.net>
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-12 14:32:35 -07:00
Brian Gernhardt
54adf3706c Add core.pager config variable.
This adds a configuration variable that performs the same function as,
but is overridden by, GIT_PAGER.

Signed-off-by: Brian Gernhardt <benji@silverinsanity.com>
Acked-by: Johannes E. Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-04 10:09:32 -07:00
Junio C Hamano
660d579d6f Merge branch 'jc/quote'
* jc/quote:
  Add core.quotepath configuration variable.
2007-07-01 14:57:51 -07:00
Junio C Hamano
0305b63654 Merge branch 'ei/worktree+filter'
* ei/worktree+filter:
  filter-branch: always export GIT_DIR if it is set
  setup_git_directory: fix segfault if repository is found in cwd
  test GIT_WORK_TREE
  extend rev-parse test for --is-inside-work-tree
  Use new semantics of is_bare/inside_git_dir/inside_work_tree
  introduce GIT_WORK_TREE to specify the work tree
  test git rev-parse
  rev-parse: introduce --is-bare-repository
  rev-parse: document --is-inside-git-dir
2007-07-01 13:10:42 -07:00
Theodore Ts'o
06f59e9f5d Don't fflush(stdout) when it's not helpful
This patch arose from a discussion started by Jim Meyering's patch
whose intention was to provide better diagnostics for failed writes.
Linus proposed a better way to do things, which also had the added
benefit that adding a fflush() to git-log-* operations and incremental
git-blame operations could improve interactive respose time feel, at
the cost of making things a bit slower when we aren't piping the
output to a downstream program.

This patch skips the fflush() calls when stdout is a regular file, or
if the environment variable GIT_FLUSH is set to "0".  This latter can
speed up a command such as:

GIT_FLUSH=0 strace -c -f -e write time git-rev-list HEAD | wc -l

a tiny amount.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-30 20:16:12 -07:00
Junio C Hamano
9378c16135 Add core.quotepath configuration variable.
We always quote "unusual" byte values in a pathname using
C-string style, to make it safer for parsing scripts that do not
handle NUL separated records well (or just too lazy to bother).
The absolute minimum bytes that need to be quoted for this
purpose are TAB, LF (and other control characters), double quote
and backslash.

However, we have also always quoted the bytes in high 8-bit
range; this was partly because we were lazy and partly because
we were being cautious.

This introduces an internal "quote_path_fully" variable, and
core.quotepath configuration variable to control it.  When set
to false, it does not quote bytes in high 8-bit range anymore
but passes them intact.

The variable defaults to "true" to retain the traditional
behaviour for now.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-24 15:11:42 -07:00
Junio C Hamano
9bee7aabcd Merge branch 'ei/oneline+add-empty'
* ei/oneline+add-empty:
  Fix ALLOC_GROW calls with obsolete semantics
  Fix ALLOC_GROW off-by-one
  builtin-add: simplify (and increase accuracy of) exclude handling
  dir_struct: add collect_ignored option
  Extend --pretty=oneline to cover the first paragraph,
  Lift 16kB limit of log message output
2007-06-22 23:32:19 -07:00
Jeff King
c927e6c69b Fix ALLOC_GROW off-by-one
The ALLOC_GROW macro will never let us fill the array completely,
instead allocating an extra chunk if that would be the case. This is
because the 'nr' argument was originally treated as "how much we do have
now" instead of "how much do we want". The latter makes much more
sense because you can grow by more than one item.

This off-by-one never resulted in an error because it meant we were
overly conservative about when to allocate. Any callers which passed
"how much we have now" need to be updated, or they will fail to allocate
enough.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-16 17:55:44 -07:00
Junio C Hamano
4175e9e3a8 More static
There still are quite a few symbols that ought to be static.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-13 02:02:10 -07:00
Junio C Hamano
4234a76167 Extend --pretty=oneline to cover the first paragraph,
so that an ugly commit message like this can be
handled sanely.

Currently, --pretty=oneline and --pretty=email (hence
format-patch) take and use only the first line of the commit log
message.  This changes them to:

 - Take the first paragraph, where the definition of the first
   paragraph is "skip all blank lines from the beginning, and
   then grab everything up to the next empty line".

 - Replace all line breaks with a whitespace.

This change would not affect a well-behaved commit message that
adheres to the convention of "single line summary, a blank line,
and then body of message", as its first paragraph always
consists of a single line.  Commit messages from different
culture, such as the ones imported from CVS/SVN, can however get
chomped with the existing behaviour at the first linebreak in
the middle of sentence right now, which would become much easier
to see with this change.

The Subject: and --pretty=oneline output would become very long
and unsightly for non-conforming commits, but their messages are
already ugly anyway, and thischange at least avoids the loss of
information.

The Subject: line from a multi-line paragraph is folded using
RFC2822 line folding rules at the places where line breaks were
in the original.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-13 00:41:21 -07:00
Jeff King
6815e56933 refactor dir_add_name
This is in preparation for keeping two entry lists in the
dir object.

This patch adds and uses the ALLOC_GROW() macro, which
implements the commonly used idiom of growing a dynamic
array using the alloc_nr function (not just in dir.c, but
everywhere).

We also move creation of a dir_entry to dir_entry_new.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-12 23:00:31 -07:00
Junio C Hamano
a6080a0a44 War on whitespace
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept in to our source files over time.  There are a few files that need
to have trailing whitespaces (most notably, test vectors).  The results
still passes the test, and build result in Documentation/ area is unchanged.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-07 00:04:01 -07:00
Matthias Lederhofer
892c41b98a introduce GIT_WORK_TREE to specify the work tree
setup_gdg is used as abbreviation for setup_git_directory_gently.

The work tree can be specified using the environment variable
GIT_WORK_TREE and the config option core.worktree (the environment
variable has precendence over the config option).  Additionally
there is a command line option --work-tree which sets the
environment variable.

setup_gdg does the following now:

GIT_DIR unspecified
repository in .git directory
    parent directory of the .git directory is used as work tree,
    GIT_WORK_TREE is ignored

GIT_DIR unspecified
repository in cwd
    GIT_DIR is set to cwd
    see the cases with GIT_DIR specified what happens next and
    also see the note below

GIT_DIR specified
GIT_WORK_TREE/core.worktree unspecified
    cwd is used as work tree

GIT_DIR specified
GIT_WORK_TREE/core.worktree specified
    the specified work tree is used

Note on the case where GIT_DIR is unspecified and repository is in cwd:
    GIT_WORK_TREE is used but is_inside_git_dir is always true.
    I did it this way because setup_gdg might be called multiple
    times (e.g. when doing alias expansion) and in successive calls
    setup_gdg should do the same thing every time.

Meaning of is_bare/is_inside_work_tree/is_inside_git_dir:

(1) is_bare_repository
    A repository is bare if core.bare is true or core.bare is
    unspecified and the name suggests it is bare (directory not
    named .git).  The bare option disables a few protective
    checks which are useful with a working tree.  Currently
    this changes if a repository is bare:
        updates of HEAD are allowed
        git gc packs the refs
        the reflog is disabled by default

(2) is_inside_work_tree
    True if the cwd is inside the associated working tree (if there
    is one), false otherwise.

(3) is_inside_git_dir
    True if the cwd is inside the git directory, false otherwise.
    Before this patch is_inside_git_dir was always true for bare
    repositories.

When setup_gdg finds a repository git_config(git_default_config) is
always called.  This ensure that is_bare_repository makes use of
core.bare and does not guess even though core.bare is specified.

inside_work_tree and inside_git_dir are set if setup_gdg finds a
repository.  The is_inside_work_tree and is_inside_git_dir functions
will die if they are called before a successful call to setup_gdg.

Signed-off-by: Matthias Lederhofer <matled@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-06 16:07:53 -07:00
Junio C Hamano
17c2929aa2 Merge branch 'sp/pack'
* sp/pack:
  Style nit - don't put space after function names
  Ensure the pack index is opened before access
  Simplify index access condition in count-objects, pack-redundant
  Test for recent rev-parse $abbrev_sha1 regression
  rev-parse: Identify short sha1 sums correctly.
  Attempt to delay prepare_alt_odb during get_sha1
  Micro-optimize prepare_alt_odb
  Lazily open pack index files on demand
2007-06-02 12:18:51 -07:00
Junio C Hamano
bd724be4be Merge branch 'maint'
* maint:
  git-config: Improve documentation of git-config file handling
  git-config: Various small fixes to asciidoc documentation
  decode_85(): fix missing return.
  fix signed range problems with hex conversions
2007-05-31 00:15:14 -07:00
Junio C Hamano
8e29f903eb Merge branch 'maint-1.5.1' into maint
* maint-1.5.1:
  git-config: Improve documentation of git-config file handling
  git-config: Various small fixes to asciidoc documentation
  decode_85(): fix missing return.
  fix signed range problems with hex conversions
2007-05-31 00:09:26 -07:00
Linus Torvalds
192a6be2a7 fix signed range problems with hex conversions
Make hexval_table[] "const".  Also make sure that the accessor
function hexval() does not access the table with out-of-range
values by declaring its parameter "unsigned char", instead of
"unsigned int".

With this, gcc can just generate:

	movzbl  (%rdi), %eax
	movsbl  hexval_table(%rax),%edx
	movzbl  1(%rdi), %eax
	movsbl  hexval_table(%rax),%eax
	sall    $4, %edx
	orl     %eax, %edx

for the code to generate a byte from two hex characters.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-30 15:01:37 -07:00
Junio C Hamano
322bcd9a9a Merge branch 'db/remote'
* db/remote:
  Move refspec pattern matching to match_refs().
  Update local tracking refs when pushing
  Add handlers for fetch-side configuration of remotes.
  Move refspec parser from connect.c and cache.h to remote.{c,h}
  Move remote parsing into a library file out of builtin-push.
2007-05-29 01:24:20 -07:00
Shawn O. Pearce
d079837eee Lazily open pack index files on demand
In some repository configurations the user may have many packfiles,
but all of the recent commits/trees/tags/blobs are likely to
be in the most recent packfile (the one with the newest mtime).
It is therefore common to be able to complete an entire operation
by accessing only one packfile, even if there are 25 packfiles
available to the repository.

Rather than opening and mmaping the corresponding .idx file for
every pack found, we now only open and map the .idx when we suspect
there might be an object of interest in there.

Of course we cannot known in advance which packfile contains an
object, so we still need to scan the entire packed_git list to
locate anything.  But odds are users want to access objects in the
most recently created packfiles first, and that may be all they
ever need for the current operation.

Junio observed in b867092f that placing recent packfiles before
older ones can slightly improve access times for recent objects,
without degrading it for historical object access.

This change improves upon Junio's observations by trying even harder
to avoid the .idx files that we won't need.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-26 20:28:08 -07:00
Martin Waitz
302b9282c9 rename dirlink to gitlink.
Unify naming of plumbing dirlink/gitlink concept:

git ls-files -z '*.[ch]' |
xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;'

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-21 23:34:54 -07:00
Daniel Barkalow
6b62816cb1 Move refspec parser from connect.c and cache.h to remote.{c,h}
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-20 21:32:56 -07:00
Junio C Hamano
45bde46bfb Merge branch 'dh/pack'
* dh/pack:
  Custom compression levels for objects and packs
2007-05-20 02:19:19 -07:00
Junio C Hamano
e223249a13 Merge branch 'mst/connect'
* mst/connect:
  connect: display connection progress
2007-05-20 02:18:50 -07:00
Junio C Hamano
cc93020f52 Merge branch 'np/pack'
* np/pack:
  deprecate the new loose object header format
  make "repack -f" imply "pack-objects --no-reuse-object"
  allow for undeltified objects not to be reused
2007-05-20 02:18:43 -07:00
René Scharfe
5e6cfc80e2 git-archive: convert archive entries like checkouts do
As noted by Johan Herland, git-archive is a kind of checkout and needs
to apply any checkout filters that might be configured.

This patch adds the convenience function convert_sha1_file which returns
a buffer containing the object's contents, after converting, if necessary
(i.e. it's a combination of read_sha1_file and convert_to_working_tree).
Direct calls to read_sha1_file in git-archive are then replaced by calls
to convert_sha1_file.

Since convert_sha1_file expects its path argument to be NUL-terminated --
a convention it inherits from convert_to_working_tree -- the patch also
changes the path handling in archive-tar.c to always NUL-terminate the
string.  It used to solely rely on the len field of struct strbuf before.

archive-zip.c already NUL-terminates the path and thus needs no such
change.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-18 16:36:45 -07:00
Michael S. Tsirkin
7841ce7985 connect: display connection progress
Make git notify the user about host resolution/connection attempts.
This is useful both as a progress indicator on slow links, and helps
reassure the user there are no firewall problems.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-16 12:48:18 -07:00
Junio C Hamano
f859c846e9 Add has_symlink_leading_path() function.
When we are applying a patch that creates a blob at a path, or
when we are switching from a branch that does not have a blob at
the path to another branch that has one, we need to make sure
that there is nothing at the path in the working tree, as such a
file is a local modification made by the user that would be lost
by the operation.

Normally, lstat() on the path and making sure ENOENT is returned
is good enough for that purpose.  However there is a twist.  We
may be creating a regular file arch/x86_64/boot/Makefile, while
removing an existing symbolic link at arch/x86_64/boot that
points at existing ../i386/boot directory that has Makefile in
it.  We always first check without touching filesystem and then
perform the actual operation, so when we verify the new file,
arch/x86_64/boot/Makefile, does not exist, we haven't removed
the symbolic link arc/x86_64/boot symbolic link yet.  lstat() on
the file sees through the symbolic link and reports the file is
there, which is not what we want.

The function has_symlink_leading_path() function takes a path,
and sees if any of the leading directory component is a symbolic
link.

When files in a new directory are created, we tend to process
them together because both index and tree are sorted.  The
function takes advantage of this and allows the caller to cache
and reuse which symbolic link on the filesystem caused the
function to return true.

The calling sequence would be:

	char last_symlink[PATH_MAX];

        *last_symlink = '\0';
        for each index entry {
		if (!lose)
			continue;
		if (lstat(it))
			if (errno == ENOENT)
				; /* happy */
			else
				error;
		else if (has_symlink_leading_path(it, last_symlink))
			; /* happy */
		else
			error; /* would lose local changes */
		unlink_entry(it, last_symlink);
	}

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-11 22:11:07 -07:00
Dana How
960ccca680 Custom compression levels for objects and packs
Add config variables pack.compression and core.loosecompression ,
and switch --compression=level to pack-objects.

Loose objects will be compressed using core.loosecompression if set,
else core.compression if set, else Z_BEST_SPEED.
Packed objects will be compressed using --compression=level if seen,
else pack.compression if set, else core.compression if set,
else Z_DEFAULT_COMPRESSION.  This is the "pack compression level".

Loose objects added to a pack undeltified will be recompressed
to the pack compression level if it is unequal to the current
loose compression level by the preceding rules,  or if the loose
object was written while core.legacyheaders = true.  Newly
deltified loose objects are always compressed to the current
pack compression level.

Previously packed objects added to a pack are recompressed
to the current pack compression level exactly when their
deltification status changes,  since the previous pack data
cannot be reused.

In either case,  the --no-reuse-object switch from the first
patch below will always force recompression to the current pack
compression level,  instead of assuming the pack compression level
hasn't changed and pack data can be reused when possible.

This applies on top of the following patches from Nicolas Pitre:
[PATCH] allow for undeltified objects not to be reused
[PATCH] make "repack -f" imply "pack-objects --no-reuse-object"

Signed-off-by: Dana L. How <danahow@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-10 15:23:09 -07:00
Nicolas Pitre
726f852b0e deprecate the new loose object header format
Now that we encourage and actively preserve objects in a packed form
more agressively than we did at the time the new loose object format and
core.legacyheaders were introduced, that extra loose object format
doesn't appear to be worth it anymore.

Because the packing of loose objects has to go through the delta match
loop anyway, and since most of them should end up being deltified in
most cases, there is really little advantage to have this parallel loose
object format as the CPU savings it might provide is rather lost in the
noise in the end.

This patch gets rid of core.legacyheaders, preserve the legacy format as
the only writable loose object format and deprecate the other one to
keep things simpler.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-10 15:22:33 -07:00
Junio C Hamano
a7b02ccf9a Add --date={local,relative,default}
This adds --date={local,relative,default} option to log family of commands,
to allow displaying timestamps in user's local timezone, relative time, or
the default format.

Existing --relative-date option is a synonym of --date=relative; we could
probably deprecate it in the long run.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-25 21:39:43 -07:00
Luiz Fernando N. Capitulino
efbc583126 entry.c: Use const qualifier for 'struct checkout' parameters
Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-25 13:15:59 -07:00
Junio C Hamano
da94faf671 Merge branch 'jc/the-index'
* jc/the-index:
  Make read-cache.c "the_index" free.
  Move index-related variables into a structure.
2007-04-24 22:13:22 -07:00
Martin Koegler
a0cd87a570 add get_sha1_with_mode
get_sha1_with_mode basically behaves as get_sha1. It has an additional
parameter for storing the mode of the object.

If the mode can not be determined, it stores S_IFINVALID.

Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-24 00:08:49 -07:00
Martin Koegler
40689ae1ef Add S_IFINVALID mode
S_IFINVALID is used to signal, that no mode information is available.

Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-24 00:08:49 -07:00
Junio C Hamano
4aab5b46f4 Make read-cache.c "the_index" free.
This makes all low-level functions defined in read-cache.c to
take an explicit index_state structure as their first parameter,
to specify which index to work on.  These functions
traditionally operated on "the_index" and were named foo_cache();
the counterparts this patch introduces are called foo_index().

The traditional foo_cache() functions are made into macros that
give "the_index" to their corresponding foo_index() functions.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-22 22:53:54 -07:00
Junio C Hamano
228e94f935 Move index-related variables into a structure.
This defines a index_state structure and moves index-related
global variables into it.  Currently there is one instance of
it, the_index, and everybody accesses it, so there is no code
change.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-22 22:53:54 -07:00
Junio C Hamano
42c4b58059 Merge branch 'lt/objalloc'
* 'lt/objalloc':
  Clean up object creation to use more common code
  Use proper object allocators for unknown object nodes too
2007-04-21 17:42:02 -07:00
Junio C Hamano
a2d7c6c620 Merge branch 'jc/attr'
* 'jc/attr': (28 commits)
  lockfile: record the primary process.
  convert.c: restructure the attribute checking part.
  Fix bogus linked-list management for user defined merge drivers.
  Simplify calling of CR/LF conversion routines
  Document gitattributes(5)
  Update 'crlf' attribute semantics.
  Documentation: support manual section (5) - file formats.
  Simplify code to find recursive merge driver.
  Counto-fix in merge-recursive
  Fix funny types used in attribute value representation
  Allow low-level driver to specify different behaviour during internal merge.
  Custom low-level merge driver: change the configuration scheme.
  Allow the default low-level merge driver to be configured.
  Custom low-level merge driver support.
  Add a demonstration/test of customized merge.
  Allow specifying specialized merge-backend per path.
  merge-recursive: separate out xdl_merge() interface.
  Allow more than true/false to attributes.
  Document git-check-attr
  Change attribute negation marker from '!' to '-'.
  ...
2007-04-21 17:38:00 -07:00
Junio C Hamano
afb5b6a24b Merge branch 'lt/gitlink'
* lt/gitlink:
  Tests for core subproject support
  Expose subprojects as special files to "git diff" machinery
  Fix some "git ls-files -o" fallout from gitlinks
  Teach "git-read-tree -u" to check out submodules as a directory
  Teach git list-objects logic to not follow gitlinks
  Fix gitlink index entry filesystem matching
  Teach "git-read-tree -u" to check out submodules as a directory
  Teach git list-objects logic not to follow gitlinks
  Don't show gitlink directories when we want "other" files
  Teach git-update-index about gitlinks
  Teach directory traversal about subprojects
  Fix thinko in subproject entry sorting
  Teach core object handling functions about gitlinks
  Teach "fsck" not to follow subproject links
  Add "S_IFDIRLNK" file mode infrastructure for git links
  Add 'resolve_gitlink_ref()' helper function
  Avoid overflowing name buffer in deep directory structures
  diff-lib: use ce_mode_from_stat() rather than messing with modes manually
2007-04-21 17:21:10 -07:00
Junio C Hamano
99ebd06c18 Merge branch 'np/pack'
* np/pack: (27 commits)
  document --index-version for index-pack and pack-objects
  pack-objects: remove obsolete comments
  pack-objects: better check_object() performances
  add get_size_from_delta()
  pack-objects: make in_pack_header_size a variable of its own
  pack-objects: get rid of create_final_object_list()
  pack-objects: get rid of reuse_cached_pack
  pack-objects: clean up list sorting
  pack-objects: rework check_delta_limit usage
  pack-objects: equal objects in size should delta against newer objects
  pack-objects: optimize preferred base handling a bit
  clean up add_object_entry()
  tests for various pack index features
  use test-genrandom in tests instead of /dev/urandom
  simple random data generator for tests
  validate reused pack data with CRC when possible
  allow forcing index v2 and 64-bit offset treshold
  pack-redundant.c: learn about index v2
  show-index.c: learn about index v2
  sha1_file.c: learn about index version 2
  ...
2007-04-21 17:20:50 -07:00
Junio C Hamano
5e635e3960 lockfile: record the primary process.
The usual process flow is the main process opens and holds the lock to
the index, does its thing, perhaps spawning children during the course,
and then writes the resulting index out by releaseing the lock.

However, the lockfile interface uses atexit(3) to clean it up, without
regard to who actually created the lock.  This typically leads to a
confusing behaviour of lock being released too early when the child
exits, and then the parent process when it calls commit_lockfile()
finds that it cannot unlock it.

This fixes the problem by recording who created and holds the lock, and
upon atexit(3) handler, child simply ignores the lockfile the parent
created.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-21 11:55:23 -07:00
Alex Riesen
ac78e54804 Simplify calling of CR/LF conversion routines
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-20 23:24:34 -07:00
Junio C Hamano
17bee1947a Merge branch 'maint'
* maint:
  Use const qualifier for 'sha1' parameter in delete_ref function
2007-04-17 22:17:29 -07:00
Carlos Rica
1401f46bb4 Use const qualifier for 'sha1' parameter in delete_ref function
delete_ref function does not change the 'sha1' parameter. Non-const pointer
causes a compiler warning if you call to the function using a const argument.

Signed-off-by: Carlos Rica <jasampler@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-17 22:00:18 -07:00
Linus Torvalds
100c5f3b0b Clean up object creation to use more common code
This replaces the fairly odd "created_object()" function that did _most_
of the object setup with a more complete "create_object()" function that
also has a more natural calling convention.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-16 23:36:16 -07:00
Linus Torvalds
2c1cbec1e2 Use proper object allocators for unknown object nodes too
We used to use a different allocator scheme for when we didn't know the
object type.  That meant that objects that were created without any
up-front knowledge of the type would not go through the same allocation
paths as normal object allocations, and would miss out on the statistics.

But perhaps more importantly than the statistics (that are useful when
looking at memory usage but not much else), if we want to make the
object hash tables use a denser object pointer representation, we need
to make sure that they all go through the same blocking allocator.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-16 23:36:11 -07:00
Nicolas Pitre
54dab52ae8 add get_size_from_delta()
... which consists of existing code split out of packed_delta_info()
for other callers to use it as well.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-16 17:43:31 -07:00
Junio C Hamano
f48fd68887 attribute macro support
This adds "attribute macros" (for lack of better name).  So far,
we have low-level attributes such as crlf and diff, which are
defined in operational terms --- setting or unsetting them on a
particular path directly affects what is done to the path.  For
example, in order to decline diffs or crlf conversions on a
binary blob, no diffs on PostScript files, and treat all other
files normally, you would have something like these:

	*		diff crlf
	*.ps		!diff
	proprietary.o	!diff !crlf

That is fine as the operation goes, but gets unwieldy rather
rapidly, when we start adding more low-level attributes that are
defined in operational terms.  A near-term example of such an
attribute would be 'merge-3way' which would control if git
should attempt the usual 3-way file-level merge internally, or
leave merging to a specialized external program of user's
choice.  When it is added, we do _not_ want to force the users
to update the above to:

	*		diff crlf merge-3way
	*.ps		!diff
	proprietary.o	!diff !crlf !merge-3way

The way this patch solves this issue is to realize that the
attributes the user is assigning to paths are not defined in
terms of operations but in terms of what they are.

All of the three low-level attributes usually make sense for
most of the files that sane SCM users have git operate on (these
files are typically called "text').  Only a few cases, such as
binary blob, need exception to decline the "usual treatment
given to text files" -- and people mark them as "binary".

So this allows the $GIT_DIR/info/alternates and .gitattributes
at the toplevel of the project to also specify attributes that
assigns other attributes.  The syntax is '[attr]' followed by an
attribute name followed by a list of attribute names:

	[attr] binary	!diff !crlf !merge-3way

When "binary" attribute is set to a path, if the path has not
got diff/crlf/merge-3way attribute set or unset by other rules,
this rule unsets the three low-level attributes.

It is expected that the user level .gitattributes will be
expressed mostly in terms of attributes based on what the files
are, and the above sample would become like this:

	(built-in attribute configuration)
	[attr] binary	!diff !crlf !merge-3way
	*		diff crlf merge-3way

	(project specific .gitattributes)
	proprietary.o	binary

	(user preference $GIT_DIR/info/attributes)
	*.ps		!diff

There are a few caveats.

 * As described above, you can define these macros only in
   $GIT_DIR/info/attributes and toplevel .gitattributes.

 * There is no attempt to detect circular definition of macro
   attributes, and definitions are evaluated from bottom to top
   as usual to fill in other attributes that have not yet got
   values.  The following would work as expected:

	[attr] text	diff crlf
	[attr] ps	text !diff
	*.ps	ps

   while this would most likely not (I haven't tried):

	[attr] ps	text !diff
	[attr] text	diff crlf
	*.ps	ps

 * When a macro says "[attr] A B !C", saying that a path does
   not have attribute A does not let you tell anything about
   attributes B or C.  That is, given this:

	[attr] text	diff crlf
	[attr] ps	text !diff
	*.txt !ps

  path hello.txt, which would match "*.txt" pattern, would have
  "ps" attribute set to zero, but that does not make text
  attribute of hello.txt set to false (nor diff attribute set to
  true).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-15 15:43:06 -07:00
Junio C Hamano
d0bfd026a8 Add basic infrastructure to assign attributes to paths
This adds the basic infrastructure to assign attributes to
paths, in a way similar to what the exclusion mechanism does
based on $GIT_DIR/info/exclude and .gitignore files.

An attribute is just a simple string that does not contain any
whitespace.  They can be specified in $GIT_DIR/info/attributes
file, and .gitattributes file in each directory.

Each line in these files defines a pattern matching rule.
Similar to the exclusion mechanism, a later match overrides an
earlier match in the same file, and entries from .gitattributes
file in the same directory takes precedence over the ones from
parent directories.  Lines in $GIT_DIR/info/attributes file are
used as the lowest precedence default rules.

A line is either a comment (an empty line, or a line that begins
with a '#'), or a rule, which is a whitespace separated list of
tokens.  The first token on the line is a shell glob pattern.
The rest are names of attributes, each of which can optionally
be prefixed with '!'.  Such a line means "if a path matches this
glob, this attribute is set (or unset -- if the attribute name
is prefixed with '!').  For glob matching, the same "if the
pattern does not have a slash in it, the basename of the path is
matched with fnmatch(3) against the pattern, otherwise, the path
is matched with the pattern with FNM_PATHNAME" rule as the
exclusion mechanism is used.

This does not define what an attribute means.  Tying an
attribute to various effects it has on git operation for paths
that have it will be specified separately.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-14 08:57:06 -07:00
Junio C Hamano
566f5b217d Merge branch 'maint'
* maint:
  GIT 1.5.1.1
  cvsserver: Fix handling of diappeared files on update
  fsck: do not complain on detached HEAD.
  (encode_85, decode_85): Mark source buffer pointer as "const".
2007-04-11 18:43:01 -07:00
Jim Meyering
f981577202 (encode_85, decode_85): Mark source buffer pointer as "const".
Signed-off-by: Jim Meyering <jim@meyering.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-11 00:51:20 -07:00
Linus Torvalds
9eec4795d4 Add "S_IFDIRLNK" file mode infrastructure for git links
This just adds the basic helper functions to recognize and work with git
tree entries that are links to other git repositories ("subprojects").
They still aren't actually connected up to any of the code-paths, but
now all the infrastructure is in place.

The next commit will start actually adding actual subproject support.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-10 13:46:58 -07:00
Nicolas Pitre
57059091fa get rid of num_packed_objects()
The coming index format change doesn't allow for the number of objects
to be determined from the size of the index file directly.  Instead, Let's
initialize a field in the packed_git structure with the object count when
the index is validated since the count is always known at that point.

While at it let's reorder some struct packed_git fields to avoid padding
due to needed 64-bit alignment for some of them.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-10 12:48:14 -07:00
Junio C Hamano
68faf68938 A new merge stragety 'subtree'.
This merge strategy largely piggy-backs on git-merge-recursive.
When merging trees A and B, if B corresponds to a subtree of A,
B is first adjusted to match the tree structure of A, instead of
reading the trees at the same level.  This adjustment is also
done to the common ancestor tree.

If you are pulling updates from git-gui repository into git.git
repository, the root level of the former corresponds to git-gui/
subdirectory of the latter.  The tree object of git-gui's toplevel
is wrapped in a fake tree object, whose sole entry has name 'git-gui'
and records object name of the true tree, before being used by
the 3-way merge code.

If you are merging the other way, only the git-gui/ subtree of
git.git is extracted and merged into git-gui's toplevel.

The detection of corresponding subtree is done by comparing the
pathnames and types in the toplevel of the tree.

Heuristics galore!  That's the git way ;-).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-07 02:29:40 -07:00
Junio C Hamano
ee9693e246 Merge branch 'jc/index-output'
* jc/index-output:
  git-read-tree --index-output=<file>
  _GIT_INDEX_OUTPUT: allow plumbing to output to an alternative index file.

Conflicts:

	builtin-apply.c
2007-04-07 02:26:24 -07:00
Junio C Hamano
fd1c3bf053 Rename add_file_to_index() to add_file_to_cache()
This function was not called "add_file_to_cache()" only because
an ancient program, update-cache, used that name as an internal
function name that does something slightly different.  Now that
is gone, we can take over the better name.

The plan is to name all functions that operate on the default
index xxx_cache().  Later patches create a variant of them that
take an explicit parameter xxx_index(), and then turn
xxx_cache() functions into macros that use "the_index".

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-05 15:07:16 -07:00
Junio C Hamano
0424138d57 Fix bogus error message from merge-recursive error path
This error message should not usually trigger, but the function
make_cache_entry() called by add_cacheinfo() can return early
without calling into refresh_cache_entry() that sets cache_errno.

Also the error message had a wrong function name reported, and
it did not say anything about which path failed either.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-05 15:07:16 -07:00
Nicolas Pitre
d72308e01c clean up and optimize nth_packed_object_sha1() usage
Let's avoid the open coded pack index reference in pack-object and use
nth_packed_object_sha1() instead.  This will help encapsulating index
format differences in one place.

And while at it there is no reason to copy SHA1's over and over while a
direct pointer to it in the index will do just fine.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-05 14:59:47 -07:00
Junio C Hamano
5e7f56ac33 git-read-tree --index-output=<file>
This corrects the interface mistake of the previous one, and
gives a command line parameter to the only plumbing command that
currently needs it: "git-read-tree".

We can add the calls to set_alternate_index_output() to other
plumbing commands that update the index if/when needed.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-03 23:44:32 -07:00
Junio C Hamano
30ca07a249 _GIT_INDEX_OUTPUT: allow plumbing to output to an alternative index file.
When defined, this allows plumbing commands that update the
index (add, apply, checkout-index, merge-recursive, mv,
read-tree, rm, update-index, and write-tree) to write their
resulting index to an alternative index file while holding a
lock to the original index file.  With this, git-commit that
jumps the index does not have to make an extra copy of the index
file, and more importantly, it can do the update while holding
the lock on the index.

However, I think the interface to let an environment variable
specify the output is a mistake, as shown in the documentation.
If a curious user has the environment variable set to something
other than the file GIT_INDEX_FILE points at, almost everything
will break.  This should instead be a command line parameter to
tell these plumbing commands to write the result in the named
file, to prevent stupid mistakes.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-03 23:44:32 -07:00
Nicolas Pitre
ce9fbf16e0 index-pack: use hash_sha1_file()
Use hash_sha1_file() instead of duplicating code to compute object SHA1.
While at it make it accept a const pointer.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-20 22:09:57 -07:00
Shawn O. Pearce
18bdec1118 Limit the size of the new delta_base_cache
The new configuration variable core.deltaBaseCacheLimit allows the
user to control how much memory they are willing to give to Git for
caching base objects of deltas.  This is not normally meant to be
a user tweakable knob; the "out of the box" settings are meant to
be suitable for almost all workloads.

We default to 16 MiB under the assumption that the cache is not
meant to consume all of the user's available memory, and that the
cache's main purpose was to cache trees, for faster path limiters
during revision traversal.  Since trees tend to be relatively small
objects, this relatively small limit should still allow a large
number of objects.

On the other hand we don't want the cache to start storing 200
different versions of a 200 MiB blob, as this could easily blow
the entire address space of a 32 bit process.

We evict OBJ_BLOB from the cache first (credit goes to Junio) as
we want to favor OBJ_TREE within the cache.  These are the objects
that have the highest inflate() startup penalty, as they tend to
be small and thus don't have that much of a chance to ammortize
that penalty over the entire data.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-18 22:43:37 -07:00
Junio C Hamano
3635a18770 Merge branch 'sp/run-command'
* sp/run-command:
  Use run_command within send-pack
  Use run_command within receive-pack to invoke index-pack
  Use run_command within merge-index
  Use run_command for proxy connections
  Use RUN_GIT_CMD to run push backends
  Correct new compiler warnings in builtin-revert
  Replace fork_with_pipe in bundle with run_command
  Teach run-command to redirect stdout to /dev/null
  Teach run-command about stdout redirection
2007-03-18 22:21:06 -07:00
Nicolas Pitre
4287307833 [PATCH] clean up pack index handling a bit
Especially with the new index format to come, it is more appropriate
to encapsulate more into check_packed_git_idx() and assume less of the
index format in struct packed_git.

To that effect, the index_base is renamed to index_data with void * type
so it is not used directly but other pointers initialized with it. This
allows for a couple pointer cast removal, as well as providing a better
generic name to grep for when adding support for new index versions or
formats.

And index_data is declared const too while at it.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-16 21:27:36 -07:00
Junio C Hamano
c49b260e99 Merge branch 'jc/repack'
* jc/repack:
  prepare_packed_git(): sort packs by age and localness.
2007-03-14 02:08:48 -07:00
Shawn O. Pearce
1a8f27413b Correct new compiler warnings in builtin-revert
The new builtin-revert code introduces a few new compiler errors
when I'm building with my stricter set of checks enabled in CFLAGS.
These all just stem from trying to store a constant string into
a non-const char*.  Simple fix, make the variables const char*.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-12 23:40:18 -07:00
Junio C Hamano
b867092fec prepare_packed_git(): sort packs by age and localness.
When accessing objects, we first look for them in packs that
are linked together in the reverse order of discovery.

Since younger packs tend to contain more recent objects, which
are more likely to be accessed often, and local packs tend to
contain objects more relevant to our specific projects, sort the
list of packs before starting to access them.  In addition,
favoring local packs over the ones borrowed from alternates can
be a win when alternates are mounted on network file systems.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-11 00:04:05 -08:00
Paolo Bonzini
0746d19a82 git-branch, git-checkout: autosetup for remote branch tracking
In order to track and build on top of a branch 'topic' you track from
your upstream repository, you often would end up doing this sequence:

  git checkout -b mytopic origin/topic
  git config --add branch.mytopic.remote origin
  git config --add branch.mytopic.merge refs/heads/topic

This would first fork your own 'mytopic' branch from the 'topic'
branch you track from the 'origin' repository; then it would set up two
configuration variables so that 'git pull' without parameters does the
right thing while you are on your own 'mytopic' branch.

This commit adds a --track option to git-branch, so that "git
branch --track mytopic origin/topic" performs the latter two actions
when creating your 'mytopic' branch.

If the configuration variable branch.autosetupmerge is set to true, you
do not have to pass the --track option explicitly; further patches in
this series allow setting the variable with a "git remote add" option.
The configuration variable is off by default, and there is a --no-track
option to countermand it even if the variable is set.

Signed-off-by: Paolo Bonzini  <bonzini@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-10 23:41:58 -08:00
Junio C Hamano
8509fed75d Merge branch 'jc/fsck'
* jc/fsck:
  fsck: exit with non-zero status upon errors
  unpack_sha1_file(): detect corrupt loose object files.
  fsck: fix broken loose object check.
2007-03-10 23:10:26 -08:00
Shawn O. Pearce
c4001d92be Use off_t when we really mean a file offset.
Not all platforms have declared 'unsigned long' to be a 64 bit value,
but we want to support a 64 bit packfile (or close enough anyway)
in the near future as some projects are getting large enough that
their packed size exceeds 4 GiB.

By using off_t, the POSIX type that is declared to mean an offset
within a file, we support whatever maximum file size the underlying
operating system will handle.  For most modern systems this is up
around 2^60 or higher.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 11:06:25 -08:00
Shawn O. Pearce
326bf39677 Use uint32_t for all packed object counts.
As we permit up to 2^32-1 objects in a single packfile we cannot
use a signed int to represent the object offset within a packfile,
after 2^31-1 objects we will start seeing negative indexes and
error out or compute bad addresses within the mmap'd index.

This is a minor cleanup that does not introduce any significant
logic changes.  It is roach free.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 11:02:33 -08:00
Shawn O. Pearce
3a55602eec General const correctness fixes
We shouldn't attempt to assign constant strings into char*, as the
string is not writable at runtime.  Likewise we should always be
treating unsigned values as unsigned values, not as signed values.

Most of these are very straightforward.  The only exception is the
(unnecessary) xstrdup/free in builtin-branch.c for the detached
head case.  Since this is a user-level interactive type program
and that particular code path is executed no more than once, I feel
that the extra xstrdup call is well worth the easy elimination of
this warning.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 10:47:10 -08:00