Commit Graph

1011 Commits

Author SHA1 Message Date
Junio C Hamano
8c5b85ce87 Merge branch 'maint'
* maint:
  More friendly message when locking the index fails.
  Document git blame --reverse.
  Documentation: Note file formats send-email accepts
2009-02-19 23:44:07 -08:00
Matthieu Moy
e43a6fd3e9 More friendly message when locking the index fails.
Just saying that index.lock exists doesn't tell the user _what_ to do
to fix the problem. We should give an indication that it's normally
safe to delete index.lock after making sure git isn't running here.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-19 23:22:57 -08:00
Johannes Schindelin
4fcc86b07d Introduce the function strip_path_suffix()
The function strip_path_suffix() will try to strip a given suffix from
a given path.  The suffix must start at a directory boundary (i.e. "core"
is not a path suffix of "libexec/git-core", but "git-core" is).

Arbitrary runs of directory separators ("slashes") are assumed identical.

Example:

	strip_path_suffix("C:\\msysgit/\\libexec\\git-core",
		"libexec///git-core", &prefix)

will set prefix to "C:\\msysgit" and return 0.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-19 22:45:48 -08:00
Kjetil Barvik
fba2f38a2c make USE_NSEC work as expected
Since the filesystem ext4 is now defined as stable in Linux v2.6.28,
and ext4 supports nanonsecond resolution timestamps natively, it is
time to make USE_NSEC work as expected.

This will make racy git situations less likely to happen.  For 'git
checkout' this means it will be less likely that we have to open, read
the contents of the file into RAM, and check if file is really
modified or not.  The result sould be a litle less used CPU time, less
pagefaults and a litle faster program, at least for 'git checkout'.

Since the number of possible racy git situations would increase when
disks gets faster, this patch would be more and more helpfull as times
go by.  For a fast Solid State Disk, this patch should be helpfull.

Note that, when file operations starts to take less than 1 nanosecond,
one would again start to get more racy git situations.

For more info on racy git, see Documentation/technical/racy-git.txt
For more info on ext4, see http://kernelnewbies.org/Ext4

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-19 21:39:48 -08:00
Kjetil Barvik
36419c8ee4 check_updates(): effective removal of cache entries marked CE_REMOVE
Below is oprofile output from GIT command 'git chekcout -q my-v2.6.25'
(move from tag v2.6.27 to tag v2.6.25 of the Linux kernel):

CPU: Core 2, speed 1999.95 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
                         mask of 0x00 (Unhalted core cycles) count 20000
Counted INST_RETIRED_ANY_P events (number of instructions retired) with a
                           unit mask of 0x00 (No unit mask) count 20000
CPU_CLK_UNHALT...|INST_RETIRED:2...|
  samples|      %|  samples|      %|
------------------------------------
   409247 100.000    342878 100.000 git
        CPU_CLK_UNHALT...|INST_RETIRED:2...|
          samples|      %|  samples|      %|
        ------------------------------------
           260476 63.6476    257843 75.1996 libz.so.1.2.3
           100876 24.6492     64378 18.7758 kernel-2.6.28.4_2.vmlinux
            30850  7.5382      7874  2.2964 libc-2.9.so
            14775  3.6103      8390  2.4469 git
             2020  0.4936      4325  1.2614 libcrypto.so.0.9.8
              191  0.0467        32  0.0093 libpthread-2.9.so
               58  0.0142        36  0.0105 ld-2.9.so
                1 2.4e-04         0       0 libldap-2.3.so.0.2.31

Detail list of the top 20 function entries (libz counted in one blob):

CPU_CLK_UNHALTED  INST_RETIRED_ANY_P
samples  %        samples  %        image name               symbol name
260476   63.6862  257843   75.2725  libz.so.1.2.3            /lib/libz.so.1.2.3
16587     4.0555  3636      1.0615  libc-2.9.so              memcpy
7710      1.8851  277       0.0809  libc-2.9.so              memmove
3679      0.8995  1108      0.3235  kernel-2.6.28.4_2.vmlinux d_validate
3546      0.8670  2607      0.7611  kernel-2.6.28.4_2.vmlinux __getblk
3174      0.7760  1813      0.5293  libc-2.9.so              _int_malloc
2396      0.5858  3681      1.0746  kernel-2.6.28.4_2.vmlinux copy_to_user
2270      0.5550  2528      0.7380  kernel-2.6.28.4_2.vmlinux __link_path_walk
2205      0.5391  1797      0.5246  kernel-2.6.28.4_2.vmlinux ext4_mark_iloc_dirty
2103      0.5142  1203      0.3512  kernel-2.6.28.4_2.vmlinux find_first_zero_bit
2077      0.5078  997       0.2911  kernel-2.6.28.4_2.vmlinux do_get_write_access
2070      0.5061  514       0.1501  git                      cache_name_compare
2043      0.4995  1501      0.4382  kernel-2.6.28.4_2.vmlinux rcu_irq_exit
2022      0.4944  1732      0.5056  kernel-2.6.28.4_2.vmlinux __ext4_get_inode_loc
2020      0.4939  4325      1.2626  libcrypto.so.0.9.8       /usr/lib/libcrypto.so.0.9.8
1965      0.4804  1384      0.4040  git                      patch_delta
1708      0.4176  984       0.2873  kernel-2.6.28.4_2.vmlinux rcu_sched_grace_period
1682      0.4112  727       0.2122  kernel-2.6.28.4_2.vmlinux sysfs_slab_alias
1659      0.4056  290       0.0847  git                      find_pack_entry_one
1480      0.3619  1307      0.3816  kernel-2.6.28.4_2.vmlinux ext4_writepage_trans_blocks

Notice the memmove line, where the CPU did 7710 / 277 = 27.8 cycles
per instruction, and compared to the total cycles spent inside the
source code of GIT for this command, all the memmove() calls
translates to (7710 * 100) / 14775 = 52.2% of this.

Retesting with a GIT program compiled for gcov usage, I found out that
the memmove() calls came from remove_index_entry_at() in read-cache.c,
where we have:

        memmove(istate->cache + pos,
                istate->cache + pos + 1,
                (istate->cache_nr - pos) * sizeof(struct cache_entry *));

remove_index_entry_at() is called 4902 times from check_updates() in
unpack-trees.c, and each time called we move each cache_entry pointers
(from the removed one) one step to the left.

Since we have 28828 entries in the cache this time, and if we on
average move half of them each time, we in total move approximately
4902 * 0.5 * 28828 * 4 = 282 629 712 bytes, or twice this amount if
each pointer is 8 bytes (64 bit).

OK, is seems that the function check_updates() is called 28 times, so
the estimated guess above had been more correct if check_updates() had
been called only once, but the point is: we get lots of bytes moved.

To fix this, and use an O(N) algorithm instead, where N is the number
of cache_entries, we delete/remove all entries in one loop through all
entries.

From a retest, the new remove_marked_cache_entries() from the patch
below, ended up with the following output line from oprofile:

46        0.0105  15        0.0041  git                      remove_marked_cache_entries

If we can trust the numbers from oprofile in this case, we saved
approximately ((7710 - 46) * 20000) / (2 * 1000 * 1000 * 1000) = 0.077
seconds CPU time with this fix for this particular test.  And notice
that now the CPU did only 46 / 15 = 3.1 cycles/instruction.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-18 17:11:21 -08:00
Junio C Hamano
160d2bc353 Merge branch 'ms/mailmap'
* ms/mailmap:
  Move mailmap documentation into separate file
  Change current mailmap usage to do matching on both name and email of author/committer.
  Add map_user() and clear_mailmap() to mailmap
  Add find_insert_index, insert_at_index and clear_func functions to string_list
  Add mailmap.file as configurational option for mailmap location
2009-02-15 01:44:15 -08:00
Junio C Hamano
954cfb5cfd Revert "Merge branch 'js/notes'"
This reverts commit 7b75b331f6, reversing
changes made to 5d680a67d7.
2009-02-10 21:32:10 -08:00
Junio C Hamano
6e5d7ddc49 Merge branch 'js/maint-1.6.0-path-normalize'
* js/maint-1.6.0-path-normalize:
  Remove unused normalize_absolute_path()
  Test and fix normalize_path_copy()
  Fix GIT_CEILING_DIRECTORIES on Windows
  Move sanitary_path_copy() to path.c and rename it to normalize_path_copy()
  Make test-path-utils more robust against incorrect use
2009-02-10 21:30:52 -08:00
Junio C Hamano
fd8475d9fb Merge branch 'maint'
* maint:
  Clear the delta base cache during fast-import checkpoint
2009-02-10 21:30:45 -08:00
Junio C Hamano
9b27ea9518 Merge branch 'maint-1.6.0' into maint
* maint-1.6.0:
  Clear the delta base cache during fast-import checkpoint
2009-02-10 15:32:26 -08:00
Shawn O. Pearce
3d20c636af Clear the delta base cache during fast-import checkpoint
Otherwise we may reuse the same memory address for a totally
different "struct packed_git", and a previously cached object from
the prior occupant might be returned when trying to unpack an object
from the new pack.

Found-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-10 15:30:59 -08:00
Kjetil Barvik
7847892716 unlink_entry(): introduce schedule_dir_for_removal()
Currently inside unlink_entry() if we get a successful removal of one
file with unlink(), we try to remove the leading directories each and
every time.  So if one directory containing 200 files is moved to an
other location we get 199 failed calls to rmdir() and 1 successful
call.

To fix this and avoid some unnecessary calls to rmdir(), we schedule
each directory for removal and wait much longer before we do the real
call to rmdir().

Since the unlink_entry() function is called with alphabetically sorted
names, this new function end up being very effective to avoid
unnecessary calls to rmdir().  In some cases over 95% of all calls to
rmdir() is removed with this patch.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-09 20:59:26 -08:00
Kjetil Barvik
571998921d lstat_cache(): swap func(length, string) into func(string, length)
Swap function argument pair (length, string) into (string, length) to
conform with the commonly used order inside the GIT source code.

Also, add a note about this fact into the coding guidelines.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-09 20:59:26 -08:00
Marius Storm-Olsen
d551a48816 Add mailmap.file as configurational option for mailmap location
This allows us to augment the repo mailmap file, and to use
mailmap files elsewhere than the repository root. Meaning
that the entries in mailmap.file will override the entries
in "./.mailmap", should they match.

Signed-off-by: Marius Storm-Olsen <marius@trolltech.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-08 12:36:26 -08:00
Johannes Sixt
f2a782b8ba Remove unused normalize_absolute_path()
This function is now superseded by normalize_path_copy().

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-07 12:23:30 -08:00
Johannes Sixt
f3cad0ad82 Move sanitary_path_copy() to path.c and rename it to normalize_path_copy()
This function and normalize_absolute_path() do almost the same thing. The
former already works on Windows, but the latter crashes.

In subsequent changes we will remove normalize_absolute_path(). Here we
make the replacement function reusable. On the way we rename it to reflect
that it does some path normalization. Apart from that this is only moving
around code.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-07 12:23:29 -08:00
Junio C Hamano
7b75b331f6 Merge branch 'js/notes'
* js/notes:
  git-notes: fix printing of multi-line notes
  notes: fix core.notesRef documentation
  Add an expensive test for git-notes
  Speed up git notes lookup
  Add a script to edit/inspect notes
  Introduce commit notes

Conflicts:
	pretty.c
2009-02-05 19:40:39 -08:00
Junio C Hamano
141b6b83d7 Merge branch 'lt/maint-wrap-zlib' into maint
* lt/maint-wrap-zlib:
  Wrap inflate and other zlib routines for better error reporting

Conflicts:
	http-push.c
	http-walker.c
	sha1_file.c
2009-02-05 18:01:00 -08:00
Junio C Hamano
8712b3cdb0 Merge branch 'tr/previous-branch'
* tr/previous-branch:
  t1505: remove debugging cruft
  Simplify parsing branch switching events in reflog
  Introduce for_each_recent_reflog_ent().
  interpret_nth_last_branch(): plug small memleak
  Fix reflog parsing for a malformed branch switching entry
  Fix parsing of @{-1}@{1}
  interpret_nth_last_branch(): avoid traversing the reflog twice
  checkout: implement "-" abbreviation, add docs and tests
  sha1_name: support @{-N} syntax in get_sha1()
  sha1_name: tweak @{-N} lookup
  checkout: implement "@{-N}" shortcut name for N-th last branch

Conflicts:
	sha1_name.c
2009-01-28 15:00:27 -08:00
Junio C Hamano
0990e7aaaa Merge branch 'kb/lstat-cache'
* kb/lstat-cache:
  lstat_cache(): introduce clear_lstat_cache() function
  lstat_cache(): introduce invalidate_lstat_cache() function
  lstat_cache(): introduce has_dirs_only_path() function
  lstat_cache(): introduce has_symlink_or_noent_leading_path() function
  lstat_cache(): more cache effective symlink/directory detection
2009-01-25 17:13:34 -08:00
Junio C Hamano
d64d4835b8 Merge branch 'cb/add-pathspec'
* cb/add-pathspec:
  remove pathspec_match, use match_pathspec instead
  clean up pathspec matching
2009-01-25 17:13:11 -08:00
Junio C Hamano
36dd939393 Merge branch 'lt/maint-wrap-zlib'
* lt/maint-wrap-zlib:
  Wrap inflate and other zlib routines for better error reporting

Conflicts:
	http-push.c
	http-walker.c
	sha1_file.c
2009-01-21 16:55:17 -08:00
Kjetil Barvik
bda6eb0da9 lstat_cache(): introduce clear_lstat_cache() function
If you want to completely clear the contents of the lstat_cache(), then
call this new function.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-18 13:58:34 -08:00
Kjetil Barvik
aeabab5c71 lstat_cache(): introduce invalidate_lstat_cache() function
In some cases it could maybe be necessary to say to the cache that
"Hey, I deleted/changed the type of this pathname and if you currently
have it inside your cache, you should deleted it".

This patch introduce a function which support this.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-18 13:58:31 -08:00
Kjetil Barvik
bad4a54fa6 lstat_cache(): introduce has_dirs_only_path() function
The create_directories() function in entry.c currently calls stat()
or lstat() for each path component of the pathname 'path' each and every
time.  For the 'git checkout' command, this function is called on each
file for which we must do an update (ce->ce_flags & CE_UPDATE), so we get
lots and lots of calls.

To fix this, we make a new wrapper to the lstat_cache() function, and
call the wrapper function instead of the calls to the stat() or the
lstat() functions.  Since the paths given to the create_directories()
function, is sorted alphabetically, the new wrapper would be very
cache effective in this situation.

To support it we must update the lstat_cache() function to be able to
say that "please test the complete length of 'name'", and also to give
it the length of a prefix, where the cache should use the stat()
function instead of the lstat() function to test each path component.

Thanks to Junio C Hamano, Linus Torvalds and Rene Scharfe for valuable
comments to this patch!

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-18 13:54:54 -08:00
Kjetil Barvik
09c9306658 lstat_cache(): introduce has_symlink_or_noent_leading_path() function
In some cases, especially inside the unpack-trees.c file, and inside
the verify_absent() function, we can avoid some unnecessary calls to
lstat(), if the lstat_cache() function can also be told to keep track
of non-existing directories.

So we update the lstat_cache() function to handle this new fact,
introduce a new wrapper function, and the result is that we save lots
of lstat() calls for a removed directory which previously contained
lots of files, when we call this new wrapper of lstat_cache() instead
of the old one.

We do similar changes inside the unlink_entry() function, since if we
can already say that the leading directory component of a pathname
does not exist, it is not necessary to try to remove a pathname below
it!

Thanks to Junio C Hamano, Linus Torvalds and Rene Scharfe for valuable
comments to this patch!

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-18 13:54:49 -08:00
Junio C Hamano
ae5a6c3684 checkout: implement "@{-N}" shortcut name for N-th last branch
Implement a shortcut @{-N} for the N-th last branch checked out, that
works by parsing the reflog for the message added by previous
git-checkout invocations.  We expand the @{-N} to the branch name, so
that you end up on an attached HEAD on that branch.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-17 18:36:49 -08:00
Clemens Buchacher
0b50922abf remove pathspec_match, use match_pathspec instead
Both versions have the same functionality. This removes any
redundancy.

This also adds makes two extensions to match_pathspec:

- If pathspec is NULL, return 1. This reflects the behavior of git
  commands, for which no paths usually means "match all paths".

- If seen is NULL, do not use it.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-14 19:18:44 -08:00
Christian Couder
c2c5b27051 sha1_file: make "read_object" static
This function is only used from "sha1_file.c".

And as we want to add a "replace_object" hook in "read_sha1_file",
we must not let people bypass the hook using something other than
"read_sha1_file".

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-13 00:14:55 -08:00
Linus Torvalds
39c68542fc Wrap inflate and other zlib routines for better error reporting
R. Tyler Ballance reported a mysterious transient repository corruption;
after much digging, it turns out that we were not catching and reporting
memory allocation errors from some calls we make to zlib.

This one _just_ wraps things; it doesn't do the "retry on low memory
error" part, at least not yet. It is an independent issue from the
reporting.  Some of the errors are expected and passed back to the caller,
but we die when zlib reports it failed to allocate memory for now.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-11 02:13:06 -08:00
Johannes Schindelin
879ef2485d Introduce commit notes
Commit notes are blobs which are shown together with the commit
message.  These blobs are taken from the notes ref, which you can
configure by the config variable core.notesRef, which in turn can
be overridden by the environment variable GIT_NOTES_REF.

The notes ref is a branch which contains "files" whose names are
the names of the corresponding commits (i.e. the SHA-1).

The rationale for putting this information into a ref is this: we
want to be able to fetch and possibly union-merge the notes,
maybe even look at the date when a note was introduced, and we
want to store them efficiently together with the other objects.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-21 02:47:21 -08:00
Junio C Hamano
de0db42278 Merge branch 'maint'
* maint:
  fsck: reduce stack footprint
  make sure packs to be replaced are closed beforehand
2008-12-11 00:36:31 -08:00
Nicolas Pitre
c74faea19e make sure packs to be replaced are closed beforehand
Especially on Windows where an opened file cannot be replaced, make
sure pack-objects always close packs it is about to replace. Even on
non Windows systems, this could save potential bad results if ever
objects were to be read from the new pack file using offset from the old
index.

This should fix t5303 on Windows.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Tested-by: Johannes Sixt <j6t@kdbg.org> (MinGW)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-10 17:56:05 -08:00
Junio C Hamano
0fd9d7e66d Merge branch 'bc/maint-keep-pack' into maint
* bc/maint-keep-pack:
  repack: only unpack-unreachable if we are deleting redundant packs
  t7700: test that 'repack -a' packs alternate packed objects
  pack-objects: extend --local to mean ignore non-local loose objects too
  sha1_file.c: split has_loose_object() into local and non-local counterparts
  t7700: demonstrate mishandling of loose objects in an alternate ODB
  builtin-gc.c: use new pack_keep bitfield to detect .keep file existence
  repack: do not fall back to incremental repacking with [-a|-A]
  repack: don't repack local objects in packs with .keep file
  pack-objects: new option --honor-pack-keep
  packed_git: convert pack_local flag into a bitfield and add pack_keep
  t7700: demonstrate mishandling of objects in packs with a .keep file
2008-12-02 23:00:04 -08:00
Junio C Hamano
388b2acd6e git add --intent-to-add: fix removal of cached emptiness
This uses the extended index flag mechanism introduced earlier to mark
the entries added to the index via "git add -N" with CE_INTENT_TO_ADD.

The logic to detect an "intent to add" entry for the purpose of allowing
"git rm --cached $path" is tightened to check not just for a staged empty
blob, but with the CE_INTENT_TO_ADD bit.  This protects an empty blob that
was explicitly added and then modified in the work tree from being dropped
with this sequence:

	$ >empty
	$ git add empty
	$ echo "non empty" >empty
	$ git rm --cached empty

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-28 19:58:24 -08:00
Junio C Hamano
fe60dff744 Merge branch 'nd/narrow' (early part) into jc/add-i-t-a
* 'nd/narrow' (early part):
  Extend index to save more flags
2008-11-28 17:22:35 -08:00
Junio C Hamano
2af9664776 Merge branch 'lt/preload-lstat'
* lt/preload-lstat:
  Fix index preloading for racy dirty case
  Add cache preload facility
2008-11-27 19:24:13 -08:00
Junio C Hamano
47a792539a Merge branch 'jk/commit-v-strip'
* jk/commit-v-strip:
  status: show "-v" diff even for initial commit
  wt-status: refactor initial commit printing
  define empty tree sha1 as a macro
2008-11-16 00:48:59 -08:00
Linus Torvalds
671c9b7e31 Add cache preload facility
This can do the lstat() storm in parallel, giving potentially much
improved performance for cold-cache cases or things like NFS that have
weak metadata caching.

Just use "read_cache_preload()" instead of "read_cache()" to force an
optimistic preload of the index stat data.  The function takes a
pathspec as its argument, allowing us to preload only the relevant
portion of the index.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-14 19:11:49 -08:00
Junio C Hamano
7b51b77dbc Merge branch 'np/pack-safer'
* np/pack-safer:
  t5303: fix printf format string for portability
  t5303: work around printf breakage in dash
  pack-objects: don't leak pack window reference when splitting packs
  extend test coverage for latest pack corruption resilience improvements
  pack-objects: allow "fixing" a corrupted pack without a full repack
  make find_pack_revindex() aware of the nasty world
  make check_object() resilient to pack corruptions
  make packed_object_info() resilient to pack corruptions
  make unpack_object_header() non fatal
  better validation on delta base object offsets
  close another possibility for propagating pack corruption
2008-11-12 22:26:35 -08:00
Junio C Hamano
ecbbfb15a4 Merge branch 'bc/maint-keep-pack'
* bc/maint-keep-pack:
  t7700: test that 'repack -a' packs alternate packed objects
  pack-objects: extend --local to mean ignore non-local loose objects too
  sha1_file.c: split has_loose_object() into local and non-local counterparts
  t7700: demonstrate mishandling of loose objects in an alternate ODB
  builtin-gc.c: use new pack_keep bitfield to detect .keep file existence
  repack: do not fall back to incremental repacking with [-a|-A]
  repack: don't repack local objects in packs with .keep file
  pack-objects: new option --honor-pack-keep
  packed_git: convert pack_local flag into a bitfield and add pack_keep
  t7700: demonstrate mishandling of objects in packs with a .keep file
2008-11-12 22:00:43 -08:00
Junio C Hamano
6cd3729eae Merge branch 'maint'
* maint:
  Start 1.6.0.5 cycle
  Fix pack.packSizeLimit and --max-pack-size handling
  checkout: Fix "initial checkout" detection
  Remove the period after the git-check-attr summary

Conflicts:
	RelNotes
2008-11-12 15:03:57 -08:00
Junio C Hamano
fa7b3c2f75 checkout: Fix "initial checkout" detection
Earlier commit 5521883 (checkout: do not lose staged removal, 2008-09-07)
tightened the rule to prevent switching branches from losing local
changes, so that staged removal of paths can be protected, while
attempting to keep a loophole to still allow a special case of switching
out of an un-checked-out state.

However, the loophole was made a bit too tight, and did not allow
switching from one branch (in an un-checked-out state) to check out
another branch.

The change to builtin-checkout.c in this commit loosens it to allow this,
by not insisting the original commit and the new commit to be the same.

It also introduces a new function, is_index_unborn (and an associated
macro, is_cache_unborn), to check if the repository is truly in an
un-checked-out state more reliably, by making sure that $GIT_INDEX_FILE
did not exist when populating the in-core index structure.  A few places
the earlier commit 5521883 added the check for the initial checkout
condition are updated to use this function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-12 14:16:50 -08:00
Jeff King
14d9c57896 define empty tree sha1 as a macro
This can potentially be used in a few places, so let's make
it available to all parts of the code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-12 12:52:21 -08:00
Brandon Casey
0f4dc14ac4 sha1_file.c: split has_loose_object() into local and non-local counterparts
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-12 10:29:22 -08:00
Brandon Casey
8d25931d6f packed_git: convert pack_local flag into a bitfield and add pack_keep
pack_keep will be set when a pack file has an associated .keep file.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-12 10:28:08 -08:00
Junio C Hamano
8b1981d32b Merge branch 'ar/maint-mksnpath' into maint
* ar/maint-mksnpath:
  Use git_pathdup instead of xstrdup(git_path(...))
  git_pathdup: returns xstrdup-ed copy of the formatted path
  Fix potentially dangerous use of git_path in ref.c
  Add git_snpath: a .git path formatting routine with output buffer
  Fix potentially dangerous uses of mkpath and git_path
  Fix mkpath abuse in dwim_ref and dwim_log of sha1_name.c
  Add mksnpath which allows you to specify the output buffer

Conflicts:
	builtin-revert.c
	rerere.c
2008-11-08 16:13:19 -08:00
Junio C Hamano
3b8572a429 Merge branch 'mv/maint-branch-m-symref' into maint
* mv/maint-branch-m-symref:
  update-ref --no-deref -d: handle the case when the pointed ref is packed
  git branch -m: forbid renaming of a symref
  Fix git update-ref --no-deref -d.
  rename_ref(): handle the case when the reflog of a ref does not exist
  Fix git branch -m for symrefs.
2008-11-08 16:07:37 -08:00
Junio C Hamano
a1a846a19e Merge branch 'ar/mksnpath'
* ar/mksnpath:
  Use git_pathdup instead of xstrdup(git_path(...))
  git_pathdup: returns xstrdup-ed copy of the formatted path
  Fix potentially dangerous use of git_path in ref.c
  Add git_snpath: a .git path formatting routine with output buffer
  Fix potentially dangerous uses of mkpath and git_path
  Fix potentially dangerous uses of mkpath and git_path
  Fix mkpath abuse in dwim_ref and dwim_log of sha1_name.c
  Add mksnpath which allows you to specify the output buffer

Conflicts:
	builtin-revert.c
2008-11-05 11:35:53 -08:00
Junio C Hamano
efcce2e1f0 Merge branch 'mv/maint-branch-m-symref'
* mv/maint-branch-m-symref:
  update-ref --no-deref -d: handle the case when the pointed ref is packed
  git branch -m: forbid renaming of a symref
  Fix git update-ref --no-deref -d.
  rename_ref(): handle the case when the reflog of a ref does not exist
  Fix git branch -m for symrefs.
2008-11-05 11:33:19 -08:00
Nicolas Pitre
09ded04b7e make unpack_object_header() non fatal
It is possible to have pack corruption in the object header.  Currently
unpack_object_header() simply die() on them instead of letting the caller
deal with that gracefully.

So let's have unpack_object_header() return an error instead, and find
a better name for unpack_object_header_gently() in that context.  All
callers of unpack_object_header() are ready for it.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-02 15:22:34 -08:00
Nicolas Pitre
0e8189e270 close another possibility for propagating pack corruption
Abstract
--------

With index v2 we have a per object CRC to allow quick and safe reuse of
pack data when repacking.  This, however, doesn't currently prevent a
stealth corruption from being propagated into a new pack when _not_
reusing pack data as demonstrated by the modification to t5302 included
here.

The Context
-----------

The Git database is all checksummed with SHA1 hashes.  Any kind of
corruption can be confirmed by verifying this per object hash against
corresponding data.  However this can be costly to perform systematically
and therefore this check is often not performed at run time when
accessing the object database.

First, the loose object format is entirely compressed with zlib which
already provide a CRC verification of its own when inflating data.  Any
disk corruption would be caught already in this case.

Then, packed objects are also compressed with zlib but only for their
actual payload.  The object headers and delta base references are not
deflated for obvious performance reasons, however this leave them
vulnerable to potentially undetected disk corruptions.  Object types
are often validated against the expected type when they're requested,
and deflated size must always match the size recorded in the object header,
so those cases are pretty much covered as well.

Where corruptions could go unnoticed is in the delta base reference.
Of course, in the OBJ_REF_DELTA case,  the odds for a SHA1 reference to
get corrupted so it actually matches the SHA1 of another object with the
same size (the delta header stores the expected size of the base object
to apply against) are virtually zero.  In the OBJ_OFS_DELTA case, the
reference is a pack offset which would have to match the start boundary
of a different base object but still with the same size, and although this
is relatively much more "probable" than in the OBJ_REF_DELTA case, the
probability is also about zero in absolute terms.  Still, the possibility
exists as demonstrated in t5302 and is certainly greater than a SHA1
collision, especially in the OBJ_OFS_DELTA case which is now the default
when repacking.

Again, repacking by reusing existing pack data is OK since the per object
CRC provided by index v2 guards against any such corruptions. What t5302
failed to test is a full repack in such case.

The Solution
------------

As unlikely as this kind of stealth corruption can be in practice, it
certainly isn't acceptable to propagate it into a freshly created pack.
But, because this is so unlikely, we don't want to pay the run time cost
associated with extra validation checks all the time either.  Furthermore,
consequences of such corruption in anything but repacking should be rather
visible, and even if it could be quite unpleasant, it still has far less
severe consequences than actively creating bad packs.

So the best compromize is to check packed object CRC when unpacking
objects, and only during the compression/writing phase of a repack, and
only when not streaming the result.  The cost of this is minimal (less
than 1% CPU time), and visible only with a full repack.

Someone with a stats background could provide an objective evaluation of
this, but I suspect that it's bad RAM that has more potential for data
corruptions at this point, even in those cases where this extra check
is not performed.  Still, it is best to prevent a known hole for
corruption when recreating object data into a new pack.

What about the streamed pack case?  Well, any client receiving a pack
must always consider that pack as untrusty and perform full validation
anyway, hence no such stealth corruption could be propagated to remote
repositoryes already.  It is therefore worthless doing local validation
in that case.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-02 15:22:15 -08:00
Junio C Hamano
86e67a088c Merge branch 'jk/maint-ls-files-other' into maint
* jk/maint-ls-files-other:
  refactor handling of "other" files in ls-files and status
2008-11-02 13:37:13 -08:00
Junio C Hamano
98b35e2c74 Merge branch 'ar/maint-mksnpath' into ar/mksnpath
* ar/maint-mksnpath:
  Use git_pathdup instead of xstrdup(git_path(...))
  git_pathdup: returns xstrdup-ed copy of the formatted path
  Fix potentially dangerous use of git_path in ref.c
  Add git_snpath: a .git path formatting routine with output buffer

Conflicts:
	builtin-revert.c
	refs.c
	rerere.c
2008-10-30 18:08:58 -07:00
Alex Riesen
aba13e7c05 git_pathdup: returns xstrdup-ed copy of the formatted path
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-30 17:30:55 -07:00
Alex Riesen
fe2d7776d5 Add git_snpath: a .git path formatting routine with output buffer
The function's purpose is to replace git_path where the buffer of
formatted path may not be reused by subsequent calls of the function
or will be copied anyway.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-30 17:00:14 -07:00
Junio C Hamano
356af64d84 Merge branch 'ar/maint-mksnpath' into HEAD
* ar/maint-mksnpath:
  Fix potentially dangerous uses of mkpath and git_path
  Fix mkpath abuse in dwim_ref and dwim_log of sha1_name.c
  Add mksnpath which allows you to specify the output buffer
2008-10-26 22:24:44 -07:00
Alex Riesen
108bebeab3 Add mksnpath which allows you to specify the output buffer
This is just vsnprintf's but additionally calls cleanup_path() on the
result. To be used as alternatives to mkpath() where the buffer for the
created path may not be reused by subsequent calls of the same formatting
function.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-26 22:08:58 -07:00
Miklos Vajna
eca35a25a9 Fix git branch -m for symrefs.
This had two problems with symrefs. First, it copied the actual sha1
instead of the "pointer", second it failed to remove the old ref after a
successful rename.

Given that till now delete_ref() always dereferenced symrefs, a new
parameters has been introduced to delete_ref() to allow deleting refs
without a dereference.

Signed-off-by: Miklos Vajna <vmiklos@frugalware.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-26 14:42:57 -07:00
Jeff King
f55527f802 rm: loosen safety valve for empty files
If a file is different between the working tree copy, the index, and the
HEAD, then we do not allow it to be deleted without --force.

However, this is overly tight in the face of "git add --intent-to-add":

  $ git add --intent-to-add file
  $ : oops, I don't actually want to stage that yet
  $ git rm --cached file
  error: 'empty' has staged content different from both the
  file and the HEAD (use -f to force removal)
  $ git rm -f --cached file

Unfortunately, there is currently no way to distinguish between an empty
file that has been added and an "intent to add" file. The ideal behavior
would be to disallow the former while allowing the latter.

This patch loosens the safety valve to allow the deletion only if we are
deleting the cached entry and the cached content is empty.  This covers
the intent-to-add situation, and assumes there is little harm in not
protecting users who have legitimately added an empty file.  In many
cases, the file will still be empty, in which case the safety valve does
not trigger anyway (since the content remains untouched in the working
tree). Otherwise, we do remove the fact that no content was staged, but
given that the content is by definition empty, it is not terribly
difficult for a user to recreate it.

However, we still document the desired behavior in the form of two
tests. One checks the correct removal of an intent-to-add file. The other
checks that we still disallow removal of empty files, but is marked as
expect_failure to indicate this compromise. If the intent-to-add feature
is ever extended to differentiate between normal empty files and
intent-to-add files, then the safety valve can be re-tightened.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-22 17:16:07 -07:00
Junio C Hamano
a157400c97 Merge branch 'jc/maint-co-track'
* jc/maint-co-track:
  Enhance hold_lock_file_for_{update,append}() API
  demonstrate breakage of detached checkout with symbolic link HEAD
  Fix "checkout --track -b newbranch" on detached HEAD

Conflicts:
	builtin-commit.c
2008-10-21 17:58:11 -07:00
Junio C Hamano
acd3b9eca8 Enhance hold_lock_file_for_{update,append}() API
This changes the "die_on_error" boolean parameter to a mere "flags", and
changes the existing callers of hold_lock_file_for_update/append()
functions to pass LOCK_DIE_ON_ERROR.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-19 12:35:37 -07:00
Junio C Hamano
e845e16ee6 Merge branch 'jk/maint-ls-files-other' into jk/fix-ls-files-other
* jk/maint-ls-files-other:
  refactor handling of "other" files in ls-files and status

Conflicts:
	read-cache.c
2008-10-17 13:03:52 -07:00
Jeff King
98fa473887 refactor handling of "other" files in ls-files and status
When the "git status" display code was originally converted
to C, we copied the code from ls-files to discover whether a
pathname returned by read_directory was an "other", or
untracked, file.

Much later, 5698454e updated the code in ls-files to handle
some new cases caused by gitlinks.  This left the code in
wt-status.c broken: it would display submodule directories
as untracked directories. Nobody noticed until now, however,
because unless status.showUntrackedFiles was set to "all",
submodule directories were not actually reported by
read_directory. So the bug was only triggered in the
presence of a submodule _and_ this config option.

This patch pulls the ls-files code into a new function,
cache_name_is_other, and uses it in both places. This should
leave the ls-files functionality the same and fix the bug
in status.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-17 12:46:59 -07:00
Nguyễn Thái Ngọc Duy
06aaaa0bf7 Extend index to save more flags
The on-disk format of index only saves 16 bit flags, nearly all have
been used. The last bit (CE_EXTENDED) is used to for future extension.

This patch extends index entry format to save more flags in future.
The new entry format will be used when CE_EXTENDED bit is 1.

Because older implementation may not understand CE_EXTENDED bit and
misread the new format, if there is any extended entry in index, index
header version will turn 3, which makes it incompatible for older git.
If there is none, header version will return to 2 again.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-10-12 13:21:54 -07:00
Shawn O. Pearce
ed187bd593 Merge branch 'dp/cywginstat'
* dp/cywginstat:
  cygwin: Use native Win32 API for stat
  mingw: move common functionality to win32.h
  add have_git_dir() function
2008-10-09 10:24:14 -07:00
Shawn O. Pearce
a3c76f2858 Merge branch 'jc/add-ita'
* jc/add-ita:
  git-add --intent-to-add (-N)
2008-10-09 10:21:25 -07:00
Nicolas Pitre
9126f0091f fix openssl headers conflicting with custom SHA1 implementations
On ARM I have the following compilation errors:

    CC fast-import.o
In file included from cache.h:8,
                 from builtin.h:6,
                 from fast-import.c:142:
arm/sha1.h:14: error: conflicting types for 'SHA_CTX'
/usr/include/openssl/sha.h:105: error: previous declaration of 'SHA_CTX' was here
arm/sha1.h:16: error: conflicting types for 'SHA1_Init'
/usr/include/openssl/sha.h:115: error: previous declaration of 'SHA1_Init' was here
arm/sha1.h:17: error: conflicting types for 'SHA1_Update'
/usr/include/openssl/sha.h:116: error: previous declaration of 'SHA1_Update' was here
arm/sha1.h:18: error: conflicting types for 'SHA1_Final'
/usr/include/openssl/sha.h:117: error: previous declaration of 'SHA1_Final' was here
make: *** [fast-import.o] Error 1

This is because openssl header files are always included in
git-compat-util.h since commit 684ec6c63c whenever NO_OPENSSL is not
set, which somehow brings in <openssl/sha1.h> clashing with the custom
ARM version.  Compilation of git is probably broken on PPC too for the
same reason.

Turns out that the only file requiring openssl/ssl.h and openssl/err.h
is imap-send.c.  But only moving those problematic includes there
doesn't solve the issue as it also includes cache.h which brings in the
conflicting local SHA1 header file.

As suggested by Jeff King, the best solution is to rename our references
to SHA1 functions and structure to something git specific, and define those
according to the implementation used.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-10-02 18:06:56 -07:00
Nanako Shiraishi
0433bcd9f0 config.c: make git_parse_long() static
This function is not used in any other file.

Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-10-02 18:03:35 -07:00
Dmitry Potapov
d2b0708e1a add have_git_dir() function
This function is used to learn whether git_dir is already set up or not.
It is necessary, because we want to read configuration in compat/cygwin.c

Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-09-30 14:30:06 -07:00
Junio C Hamano
d79796bcf0 push: receiver end advertises refs from alternate repositories
Earlier, when pushing into a repository that borrows from alternate object
stores, we followed the longstanding design decision not to trust refs in
the alternate repository that houses the object store we are borrowing
from.  If your public repository is borrowing from Linus's public
repository, you pushed into it long time ago, and now when you try to push
your updated history that is in sync with more recent history from Linus,
you will end up sending not just your own development, but also the
changes you acquired through Linus's tree, even though the objects needed
for the latter already exists at the receiving end.  This is because the
receiving end does not advertise that the objects only reachable from the
borrowed repository (i.e. Linus's) are already available there.

This solves the issue by making the receiving end advertise refs from
borrowed repositories.  They are not sent with their true names but with a
phoney name ".have" to make sure that the old senders will safely ignore
them (otherwise, the old senders will misbehave, trying to push matching
refs, and mirror push that deletes refs that only exist at the receiving
end).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-09 09:27:46 -07:00
Junio C Hamano
40c155ff14 push: prepare sender to receive extended ref information from the receiver
"git push" enhancement allows the receiving end to report not only its own
refs but refs in repositories it borrows from via the alternate object
store mechanism.  By telling the sender that objects reachable from these
extra refs are already complete in the receiving end, the number of
objects that need to be transfered can be cut down.

These entries are sent over the wire with string ".have", instead of the
actual names of the refs.  This string was chosen so that they are ignored
by older programs at the sending end.  If we sent some random but valid
looking refnames for these entries, "matching refs" rule (triggered when
running "git push" without explicit refspecs, where the sender learns what
refs the receiver has, and updates only the ones with the names of the
refs the sender also has) and "delete missing" rule (triggered when "git
push --mirror" is used, where the sender tells the receiver to delete the
refs it itself does not have) would try to update/delete them, which is
not what we want.

This prepares the send-pack (and "push" that runs native protocol) to
accept extended existing ref information and make use of it.  The ".have"
entries are excluded from ref matching rules, and are exempt from deletion
rule while pushing with --mirror option, but are still used for pack
generation purposes by providing more "bottom" range commits.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-09 09:27:46 -07:00
Junio C Hamano
90b4a71c49 is_directory(): a generic helper function
A simple "grep -e stat --and -e S_ISDIR" revealed there are many
open-coded implementations of this function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-09 09:27:45 -07:00
Junio C Hamano
394258190c git-add --intent-to-add (-N)
This adds "--intent-to-add" option to "git add".  This is to let the
system know that you will tell it the final contents to be staged later,
iow, just be aware of the presense of the path with the type of the blob
for now.  It is implemented by staging an empty blob as the content.

With this sequence:

    $ git reset --hard
    $ edit newfile
    $ git add -N newfile
    $ edit newfile oldfile
    $ git diff

the diff will show all changes relative to the current commit.  Then you
can do:

    $ git commit -a ;# commit everything

or

    $ git commit oldfile ;# only oldfile, newfile not yet added

to pretend you are working with an index-free system like CVS.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-31 16:22:05 -07:00
Junio C Hamano
b46f7e54fc Merge branch 'jc/add-addremove'
* jc/add-addremove:
  builtin-add.c: optimize -A option and "git add ."
  builtin-add.c: restructure the code for maintainability
2008-08-27 16:39:57 -07:00
Junio C Hamano
d6096f17d2 Merge branch 'maint'
* maint:
  unpack_trees(): protect the handcrafted in-core index from read_cache()
  git-p4: Fix one-liner in p4_write_pipe function.
  Completion: add missing '=' for 'diff --diff-filter'
  Fix 'git help help'
2008-08-23 18:28:37 -07:00
Junio C Hamano
913e0e99b6 unpack_trees(): protect the handcrafted in-core index from read_cache()
unpack_trees() rebuilds the in-core index from scratch by allocating a new
structure and finishing it off by copying the built one to the final
index.

The resulting in-core index is Ok for most use, but read_cache() does not
recognize it as such.  The function is meant to be no-op if you already
have loaded the index, until you call discard_cache().

This change the way read_cache() detects an already initialized in-core
index, by introducing an extra bit, and marks the handcrafted in-core
index as initialized, to avoid this problem.

A better fix in the longer term would be to change the read_cache() API so
that it will always discard and re-read from the on-disk index to avoid
confusion.  But there are higher level API that have relied on the current
semantics, and they and their users all need to get converted, which is
outside the scope of 'maint' track.

An example of such a higher level API is write_cache_as_tree(), which is
used by git-write-tree as well as later Porcelains like git-merge, revert
and cherry-pick.  In the longer term, we should remove read_cache() from
there and add one to cmd_write_tree(); other callers expect that the
in-core index they prepared is what gets written as a tree so no other
change is necessary for this particular codepath.

The original version of this patch marked the index by pointing an
otherwise wasted malloc'ed memory with o->result.alloc, but this version
uses Linus's idea to use a new "initialized" bit, which is conceptually
much cleaner.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-23 18:09:27 -07:00
Alex Riesen
9188ed8962 Extend "checkout --track" DWIM to support more cases
The code handles additionally "refs/remotes/<something>/name",
"remotes/<something>/name", and "refs/<namespace>/name".

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-22 17:18:26 -07:00
Junio C Hamano
0a8eb7bae5 Merge branch 'jc/index-extended-flags'
* jc/index-extended-flags:
  index: future proof for "extended" index entries
2008-08-20 23:41:47 -07:00
Junio C Hamano
16ce2e4c8f index: future proof for "extended" index entries
We do not have any more bits in the on-disk index flags word, but we would
need to have more in the future.  Use the last remaining bits as a signal
to tell us that the index entry we are looking at is an extended one.

Since we do not understand the extended format yet, we will just error out
when we see it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-17 00:22:45 -07:00
Dmitry Potapov
43df4f86e0 teach index_fd to work with pipes
index_fd can now work with file descriptors that are not normal files
but any readable file. If the given file descriptor is a regular file
then mmap() is used; for other files, strbuf_read is used.

The path parameter, which has been used as hint for filters, can be
NULL now to indicate that the file should be hashed literally without
any filter.

The index_pipe function is removed as redundant.

Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-03 13:14:35 -07:00
Alex Riesen
1ce4790bf5 Make use of stat.ctime configurable
A new configuration variable 'core.trustctime' is introduced to
allow ignoring st_ctime information when checking if paths
in the working tree has changed, because there are situations where
it produces too much false positives.  Like when file system crawlers
keep changing it when scanning and using the ctime for marking scanned
files.

The default is to notice ctime changes.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-28 23:26:25 -07:00
Petr Baudis
81dc2307d0 git-mv: Keep moved index entries inact
The rewrite of git-mv from a shell script to a builtin was perhaps
a little too straightforward: the git add and git rm queues were
emulated directly, which resulted in a rather complicated code and
caused an inconsistent behaviour when moving dirty index entries;
git mv would update the entry based on working tree state,
except in case of overwrites, where the new entry would still have
sha1 of the old file.

This patch introduces rename_index_entry_at() into the index toolkit,
which will rename an entry while removing any entries the new entry
might render duplicate. This is then used in git mv instead
of all the file queues, resulting in a major simplification
of the code and an inevitable change in git mv -n output format.

Also the code used to refuse renaming overwriting symlink with a regular
file and vice versa; there is no need for that.

A few new tests have been added to the testsuite to reflect this change.

Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-27 15:05:19 -07:00
Junio C Hamano
041aee31be builtin-add.c: restructure the code for maintainability
A private function add_files_to_cache() in builtin-add.c was borrowed by
checkout and commit re-implementors without getting properly refactored to
more library-ish place.  This does the refactoring.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-25 21:14:21 -07:00
Junio C Hamano
d14e7407b3 "needs update" considered harmful
"git update-index --refresh", "git reset" and "git add --refresh" have
reported paths that have local modifications as "needs update" since the
beginning of git.

Although this is logically correct in that you need to update the index at
that path before you can commit that change, it is now becoming more and
more clear, especially with the continuous push for user friendliness
since 1.5.0 series, that the message is suboptimal.  After all, the change
may be something the user might want to get rid of, and "updating" would
be absolutely a wrong thing to do if that is the case.

I prepared two alternatives to solve this.  Both aim to reword the message
to more neutral "locally modified".

This patch is a more intrusive variant that changes the message for only
Porcelain commands ("add" and "reset") while keeping the plumbing
"update-index" intact.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-20 17:21:32 -07:00
Junio C Hamano
fa4946b553 Merge branch 'maint'
* maint:
  fix usage string for git grep
  refresh-index: fix bitmask assignment

Conflicts:
	builtin-grep.c
2008-07-20 17:16:29 -07:00
Junio C Hamano
3f1b7b607a refresh-index: fix bitmask assignment
5fdeacb (Teach update-index about --ignore-submodules, 2008-05-14) added a
new refresh option flag but did not assign a unique bit for it correctly,
and broke "update-index --ignore-missing".

This fixes it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-20 00:00:46 -07:00
Junio C Hamano
fcab40a389 Merge branch 'mv/merge-in-c'
* mv/merge-in-c:
  reduce_heads(): protect from duplicate input
  reduce_heads(): thinkofix
  Add a new test for git-merge-resolve
  t6021: add a new test for git-merge-resolve
  Teach merge.log to "git-merge" again
  Build in merge
  Fix t7601-merge-pull-config.sh on AIX
  git-commit-tree: make it usable from other builtins
  Add new test case to ensure git-merge prepends the custom merge message
  Add new test case to ensure git-merge reduces octopus parents when possible
  Introduce reduce_heads()
  Introduce get_merge_bases_many()
  Add new test to ensure git-merge handles more than 25 refs.
  Introduce get_octopus_merge_bases() in commit.c
  git-fmt-merge-msg: make it usable from other builtins
  Move read_cache_unmerged() to read-cache.c
  Add new test to ensure git-merge handles pull.twohead and pull.octopus
  Move parse-options's skip_prefix() to git-compat-util.h
  Move commit_list_count() to commit.c
  Move split_cmdline() to alias.c

Conflicts:
	Makefile
	parse-options.c
2008-07-15 19:09:46 -07:00
Nicolas Pitre
ac9391093f restore legacy behavior for read_sha1_file()
Since commit 8eca0b47ff, it is possible
for read_sha1_file() to return NULL even with existing objects when they
are corrupted.  Previously a corrupted object would have terminated the
program immediately, effectively making read_sha1_file() return NULL
only when specified object is not found.

Let's restore this behavior for all users of read_sha1_file() and
provide a separate function with the ability to not terminate when
bad objects are encountered.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-14 23:35:32 -07:00
Junio C Hamano
17d778e710 Merge branch 'dr/ceiling'
* dr/ceiling:
  Eliminate an unnecessary chdir("..")
  Add support for GIT_CEILING_DIRECTORIES
  Fold test-absolute-path into test-path-utils
  Implement normalize_absolute_path

Conflicts:

	cache.h
	setup.c
2008-07-07 02:17:23 -07:00
Junio C Hamano
5e97f464df Merge branch 'db/no-git-config'
* db/no-git-config:
  Only use GIT_CONFIG in "git config", not other programs

Conflicts:

	Documentation/RelNotes-1.6.0.txt
2008-07-07 02:17:14 -07:00
Junio C Hamano
bb1ab2db08 Merge branch 'j6t/mingw'
* j6t/mingw: (38 commits)
  compat/pread.c: Add a forward declaration to fix a warning
  Windows: Fix ntohl() related warnings about printf formatting
  Windows: TMP and TEMP environment variables specify a temporary directory.
  Windows: Make 'git help -a' work.
  Windows: Work around an oddity when a pipe with no reader is written to.
  Windows: Make the pager work.
  When installing, be prepared that template_dir may be relative.
  Windows: Use a relative default template_dir and ETC_GITCONFIG
  Windows: Compute the fallback for exec_path from the program invocation.
  Turn builtin_exec_path into a function.
  Windows: Use a customized struct stat that also has the st_blocks member.
  Windows: Add a custom implementation for utime().
  Windows: Add a new lstat and fstat implementation based on Win32 API.
  Windows: Implement a custom spawnve().
  Windows: Implement wrappers for gethostbyname(), socket(), and connect().
  Windows: Work around incompatible sort and find.
  Windows: Implement asynchronous functions as threads.
  Windows: Disambiguate DOS style paths from SSH URLs.
  Windows: A rudimentary poll() emulation.
  Windows: Implement start_command().
  ...
2008-07-02 21:57:52 -07:00
Daniel Barkalow
dc87183189 Only use GIT_CONFIG in "git config", not other programs
For everything other than using "git config" to read or write a
git-style config file that isn't the current repo's config file,
GIT_CONFIG was actively detrimental. Rather than argue over which
programs are important enough to have work anyway, just fix all of
them at the root.

Also removes GIT_LOCAL_CONFIG, which would only be useful for programs
that do want to use global git-specific config, but not the repo's own
git-specific config, and want to use some other, presumably
git-specific config. Despite being documented, I can't find any sign that
it was ever used.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-01 02:35:49 -07:00
Miklos Vajna
e46bbcf6e8 Move read_cache_unmerged() to read-cache.c
builtin-read-tree has a read_cache_unmerged() which is useful for other
builtins, for example builtin-merge uses it as well. Move it to
read-cache.c to avoid code duplication.

Signed-off-by: Miklos Vajna <vmiklos@frugalware.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-30 22:45:51 -07:00
Miklos Vajna
0989fe9623 Move split_cmdline() to alias.c
split_cmdline() is currently used for aliases only, but later it can be
useful for other builtins as well. Move it to alias.c for now,
indicating that originally it's for aliases, but we'll have it in libgit
this way.

Signed-off-by: Miklos Vajna <vmiklos@frugalware.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-30 22:45:50 -07:00
Junio C Hamano
877f23ccb8 Teach "diff --check" about new blank lines at end
When a patch adds new blank lines at the end, "git apply --whitespace"
warns.  This teaches "diff --check" to do the same.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-26 22:07:26 -07:00
Junio C Hamano
8f8841e9c8 check_and_emit_line(): rename and refactor
The function name was too bland and not explicit enough as to what it is
checking.  Split it into two, and call the one that checks if there is a
whitespace breakage "ws_check()", and call the other one that checks and
emits the line after color coding "ws_check_emit()".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-26 18:13:50 -07:00
Junio C Hamano
abf7e0df17 Merge branch 'lt/config-fsync'
* lt/config-fsync:
  Add config option to enable 'fsync()' of object files
  Split up default "i18n" and "branch" config parsing into helper routines
  Split up default "user" config parsing into helper routine
  Split up default "core" config parsing into helper routine
2008-06-25 13:19:49 -07:00
Jeff King
2beebd22f4 clone: create intermediate directories of destination repo
The shell version used to use "mkdir -p" to create the repo
path, but the C version just calls "mkdir". Let's replicate
the old behavior. We have to create the git and worktree
leading dirs separately; while most of the time, the
worktree dir contains the git dir (as .git), the user can
override this using GIT_WORK_TREE.

We can reuse safe_create_leading_directories, but we need to
make a copy of our const buffer to do so. Since
merge-recursive uses the same pattern, we can factor this
out into a global function. This has two other cleanup
advantages for merge-recursive:

  1. mkdir_p wasn't a very good name. "mkdir -p foo/bar" actually
     creates bar, but this function just creates the leading
     directories.

  2. mkdir_p took a mode argument, but it was completely
     ignored.

Acked-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-25 11:44:15 -07:00
Nicolas Pitre
99093238bb optimize verify-pack a bit
Using find_pack_entry_one() to get object offsets is rather suboptimal
when nth_packed_object_offset() can be used directly.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-24 23:58:57 -07:00
Jeff King
8e21d63b02 clone: create intermediate directories of destination repo
The shell version used to use "mkdir -p" to create the repo
path, but the C version just calls "mkdir". Let's replicate
the old behavior. We have to create the git and worktree
leading dirs separately; while most of the time, the
worktree dir contains the git dir (as .git), the user can
override this using GIT_WORK_TREE.

We can reuse safe_create_leading_directories, but we need to
make a copy of our const buffer to do so. Since
merge-recursive uses the same pattern, we can factor this
out into a global function. This has two other cleanup
advantages for merge-recursive:

  1. mkdir_p wasn't a very good name. "mkdir -p foo/bar" actually
     creates bar, but this function just creates the leading
     directories.

  2. mkdir_p took a mode argument, but it was completely
     ignored.

Acked-by: Daniel Barkalow <barkalow@iabervon.org>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-24 23:23:21 -07:00
Nicolas Pitre
8eca0b47ff implement some resilience against pack corruptions
We should be able to fall back to loose objects or alternative packs when
a pack becomes corrupted.  This is especially true when an object exists
in one pack only as a delta but its base object is corrupted.  Currently
there is no way to retrieve the former object even if the later is
available in another pack or loose.

This patch allows for a delta to be resolved (with a performance cost)
using a base object from a source other than the pack where that delta
is located.  Same thing for non-delta objects: rather than failing
outright, a search is made in other packs or used loose when the
currently active pack has it but corrupted.

Of course git will become extremely noisy with error messages when that
happens.  However, if the operation succeeds nevertheless, a simple
'git repack -a -f -d' will "fix" the corrupted repository given that all
corrupted objects have a good duplicate somewhere in the object store,
possibly manually copied from another source.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-23 21:29:33 -07:00
Johannes Sixt
25fe217b86 Windows: Treat Windows style path names.
GIT's guts work with a forward slash as a path separators. We do not change
that. Rather we make sure that only "normalized" paths enter the depths
of the machinery.

We have to translate backslashes to forward slashes in the prefix and in
command line arguments. Fortunately, all of them are passed through
functions in setup.c.

A macro has_dos_drive_path() is defined that checks whether a path begins
with a drive letter+colon combination. This predicate is always false on
Unix. Another macro is_dir_sep() abstracts that a backslash is also a
directory separator on Windows.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
2008-06-23 13:30:22 +02:00
Junio C Hamano
6b8791982c Merge branch 'sn/static'
* sn/static:
  config.c: make git_env_bool() static
  environment.c: remove unused function
2008-06-22 14:34:09 -07:00
しらいしななこ
e4bffb5a1d config.c: make git_env_bool() static
This function is not used by any other file.

Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-19 17:07:41 -07:00
しらいしななこ
78d0f5d210 environment.c: remove unused function
get_refs_directory() is not used anywhere.

Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-19 17:07:22 -07:00
Linus Torvalds
044bbbcb63 Make git_dir a path relative to work_tree in setup_work_tree()
Once we find the absolute paths for git_dir and work_tree, we can make
git_dir a relative path since we know pwd will be work_tree. This should
save the kernel some time traversing the path to work_tree all the time
if git_dir is inside work_tree.

Daniel's patch didn't apply for me as-is, so I recreated it with some
differences, and here are the numbers from ten runs each.

There is some IO for me - probably due to more-or-less random flushing of
the journal - so the variation is bigger than I'd like, but whatever:

	Before:
		real    0m8.135s
		real    0m7.933s
		real    0m8.080s
		real    0m7.954s
		real    0m7.949s
		real    0m8.112s
		real    0m7.934s
		real    0m8.059s
		real    0m7.979s
		real    0m8.038s

	After:
		real    0m7.685s
		real    0m7.968s
		real    0m7.703s
		real    0m7.850s
		real    0m7.995s
		real    0m7.817s
		real    0m7.963s
		real    0m7.955s
		real    0m7.848s
		real    0m7.969s

Now, going by "best of ten" (on the assumption that the longer numbers
are all due to IO), I'm saying a 7.933s -> 7.685s reduction, and it does
seem to be outside of the noise (ie the "after" case never broke 8s, while
the "before" case did so half the time).

So looks like about 3% to me.

Doing it for a slightly smaller test-case (just the "arch" subdirectory)
gets more stable numbers probably due to not filling the journal with
metadata updates, so we have:

	Before:
		real    0m1.633s
		real    0m1.633s
		real    0m1.633s
		real    0m1.632s
		real    0m1.632s
		real    0m1.630s
		real    0m1.634s
		real    0m1.631s
		real    0m1.632s
		real    0m1.632s

	After:
		real    0m1.610s
		real    0m1.609s
		real    0m1.610s
		real    0m1.608s
		real    0m1.607s
		real    0m1.610s
		real    0m1.609s
		real    0m1.611s
		real    0m1.608s
		real    0m1.611s

where I'ld just take the averages and say 1.632 vs 1.610, which is just
over 1% peformance improvement.

So it's not in the noise, but it's not as big as I initially thought and
measured.

(That said, it obviously depends on how deep the working directory path is
too, and whether it is behind NFS or something else that might need to
cause more work to look up).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-19 16:44:21 -07:00
Linus Torvalds
aafe9fbaf4 Add config option to enable 'fsync()' of object files
As explained in the documentation[*] this is totally useless on
filesystems that do ordered/journalled data writes, but it can be a
useful safety feature on filesystems like HFS+ that only journal the
metadata, not the actual file contents.

It defaults to off, although we could presumably in theory some day
auto-enable it on a per-filesystem basis.

[*] Yes, I updated the docs for the thing.  Hell really _has_ frozen
    over, and the four horsemen are probably just beyond the horizon.
    EVERYBODY PANIC!

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-18 16:50:35 -07:00
Junio C Hamano
79c6dca413 sha1_file.c: simplify parse_pack_index()
It was implemented as a thin wrapper around an otherwise unused
helper function parse_pack_index_file().  The code becomes simpler
and easier to read by consolidating the two.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-16 22:19:00 -07:00
Junio C Hamano
6483925999 sha1_file.c: dead code removal
write_sha1_from_fd() and write_sha1_to_fd() were dead code nobody called,
neither the latter's helper repack_object() was.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-13 23:00:51 -07:00
Daniel Barkalow
1b9a9467f8 Use nonrelative paths instead of absolute paths for cloned repositories
Particularly for the "alternates" file, if one will be created, we
want a path that doesn't depend on the current directory, but we want
to retain any symlinks in the path as given and any in the user's view
of the current directory when the path was given.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-06 11:23:10 -07:00
Linus Torvalds
4c81b03e30 Make pack creation always fsync() the result
This means that we can depend on packs always being stable on disk,
simplifying a lot of the object serialization worries.  And unlike loose
objects, serializing pack creation IO isn't going to be a performance
killer.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-31 14:46:57 -07:00
Junio C Hamano
9bd81e4249 Merge branch 'js/config-cb'
* js/config-cb:
  Provide git_config with a callback-data parameter

Conflicts:

	builtin-add.c
	builtin-cat-file.c
2008-05-25 14:25:02 -07:00
Junio C Hamano
0166592495 Merge branch 'jc/add-n-u'
* jc/add-n-u:
  Make git add -n and git -u -n output consistent
  "git-add -n -u" should not add but just report

Conflicts:

	builtin-add.c
	builtin-mv.c
	cache.h
	read-cache.c
2008-05-25 14:03:50 -07:00
Junio C Hamano
b84c343c88 Merge branch 'db/clone-in-c'
* db/clone-in-c:
  Add test for cloning with "--reference" repo being a subset of source repo
  Add a test for another combination of --reference
  Test that --reference actually suppresses fetching referenced objects
  clone: fall back to copying if hardlinking fails
  builtin-clone.c: Need to closedir() in copy_or_link_directory()
  builtin-clone: fix initial checkout
  Build in clone
  Provide API access to init_db()
  Add a function to set a non-default work tree
  Allow for having for_each_ref() list extra refs
  Have a constant extern refspec for "--tags"
  Add a library function to add an alternate to the alternates file
  Add a lockfile function to append to a file
  Mark the list of refs to fetch as const

Conflicts:

	cache.h
	t/t5700-clone-reference.sh
2008-05-25 13:41:37 -07:00
Junio C Hamano
7e83003029 Merge branch 'js/ignore-submodule'
* js/ignore-submodule:
  Ignore dirty submodule states during rebase and stash
  Teach update-index about --ignore-submodules
  diff options: Introduce --ignore-submodules
2008-05-25 13:37:08 -07:00
Junio C Hamano
e5e9714a10 Merge branch 'bc/repack'
* bc/repack:
  Documentation/git-repack.txt: document new -A behaviour
  let pack-objects do the writing of unreachable objects as loose objects
  add a force_object_loose() function
  builtin-gc.c: deprecate --prune, it now really has no effect
  git-gc: always use -A when manually repacking
  repack: modify behavior of -A option to leave unreferenced objects unpacked

Conflicts:

	builtin-pack-objects.c
2008-05-23 16:06:01 -07:00
David Reiss
0454dd93bf Add support for GIT_CEILING_DIRECTORIES
Make git recognize a new environment variable that prevents it from
chdir'ing up into specified directories when looking for a GIT_DIR.
Useful for avoiding slow network directories.

For example, I use git in an environment where homedirs are automounted
and "ls /home/nonexistent" takes about 9 seconds.  Setting
GIT_CEILING_DIRS="/home" allows "git help -a" (for bash completion) and
"git symbolic-ref" (for my shell prompt) to run in a reasonable time.

Signed-off-by: David Reiss <dreiss@facebook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-23 14:15:01 -07:00
David Reiss
ae299be0e5 Implement normalize_absolute_path
normalize_absolute_path removes several oddities form absolute paths,
giving nice clean paths like "/dir/sub1/sub2".  Also add a test case
for this utility, based on a new test program (in the style of test-sha1).

Signed-off-by: David Reiss <dreiss@facebook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-23 14:11:20 -07:00
Junio C Hamano
9d880582ee Merge branch 'ar/add-unreadable'
* ar/add-unreadable:
  Add a config option to ignore errors for git-add
  Add a test for git-add --ignore-errors
  Add --ignore-errors to git-add to allow it to skip files with read errors
  Extend interface of add_files_to_cache to allow ignore indexing errors
  Make the exit code of add_file_to_index actually useful
2008-05-21 14:16:46 -07:00
Junio C Hamano
38ed1d89f7 "git-add -n -u" should not add but just report
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-21 12:04:41 -07:00
Johannes Schindelin
5fdeacb0ca Teach update-index about --ignore-submodules
Like with the diff machinery, update-index should sometimes just
ignore submodules (e.g. to determine a clean state before a rebase).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-15 16:12:43 -07:00
Junio C Hamano
b66ae7955c Merge branch 'sb/committer'
* sb/committer:
  commit: Show committer if automatic
  commit: Show author if different from committer
  Preparation to call determine_author_info from prepare_to_commit
2008-05-14 13:45:20 -07:00
Johannes Schindelin
ef90d6d420 Provide git_config with a callback-data parameter
git_config() only had a function parameter, but no callback data
parameter.  This assumes that all callback functions only modify
global variables.

With this patch, every callback gets a void * parameter, and it is hoped
that this will help the libification effort.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-14 12:34:44 -07:00
Nicolas Pitre
bbac73117e add a force_object_loose() function
This is meant to force the creation of a loose object even if it
already exists packed.  Needed for the next commit.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-13 22:42:33 -07:00
Alex Riesen
7ae02a30e8 Extend interface of add_files_to_cache to allow ignore indexing errors
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-12 20:54:52 -07:00
Junio C Hamano
dccb3a6acb Merge branch 'lt/core-optim'
* lt/core-optim:
  Optimize symlink/directory detection
  Avoid some unnecessary lstat() calls
  is_racy_timestamp(): do not check timestamp for gitlinks
  diff-lib.c: rename check_work_tree_entity()
  diff: a submodule not checked out is not modified
  Add t7506 to test submodule related functions for git-status
  t4027: test diff for submodule with empty directory
  Make git-add behave more sensibly in a case-insensitive environment
  When adding files to the index, add support for case-independent matches
  Make unpack-tree update removed files before any updated files
  Make branch merging aware of underlying case-insensitive filsystems
  Add 'core.ignorecase' option
  Make hash_name_lookup able to do case-independent lookups
  Make "index_name_exists()" return the cache_entry it found
  Move name hashing functions into a file of its own
  Make unpack_trees_options bit flags actual bitfields
2008-05-11 12:08:20 -07:00
Dustin Sallings
c998ae9baa Allow tracking branches to set up rebase by default.
Change cd67e4d4 introduced a new configuration parameter that told
pull to automatically perform a rebase instead of a merge.  This
change provides a configuration option to enable this feature
automatically when creating a new branch.

If the variable branch.autosetuprebase applies for a branch that's
being created, that branch will have branch.<name>.rebase set to true.

Signed-off-by: Dustin Sallings <dustin@spy.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-11 09:28:52 -07:00
Linus Torvalds
c40641b77b Optimize symlink/directory detection
This is the base for making symlink detection in the middle fo a pathname
saner and (much) more efficient.

Under various loads, we want to verify that the full path leading up to a
filename is a real directory tree, and that when we successfully do an
'lstat()' on a filename, we don't get a false positive due to a symlink in
the middle of the path that git should have seen as a symlink, not as a
normal path component.

The 'has_symlink_leading_path()' function already did this, and cached
a single level of symlink information, but didn't cache the _lack_ of a
symlink, so the normal behaviour was actually the wrong way around, and we
ended up doing an 'lstat()' on each path component to check that it was a
real directory.

This caches the last detected full directory and symlink entries, and
speeds up especially deep directory structures a lot by avoiding to
lstat() all the directories leading up to each entry in the index.

[ This can - and should - probably be extended upon so that we eventually
  never do a bare 'lstat()' on any path entries at *all* when checking the
  index, but always check the full path carefully. Right now we do not
  generally check the whole path for all our normal quick index
  revalidation.

  We should also make sure that we're careful about all the invalidation,
  ie when we remove a link and replace it by a directory we should
  invalidate the symlink cache if it matches (and vice versa for the
  directory cache).

  But regardless, the basic function needs to be sane to do that. The old
  'has_symlink_leading_path()' was not capable enough - or indeed the code
  readable enough - to really do that sanely. So I'm pushing this as not
  just an optimization, but as a base for further work. ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-10 18:16:31 -07:00
Linus Torvalds
d177cab048 Avoid some unnecessary lstat() calls
The commit sequence used to do

	if (file_exists(p->path))
		add_file_to_cache(p->path, 0);

where both "file_exists()" and "add_file_to_cache()" needed to do a
lstat() on the path to do their work.

This cuts down 'lstat()' calls for the partial commit case by two
for each path we know about (because we do this twice per path).

Just move the lstat() to the caller instead (that's all that
"file_exists()" really does), and pass the stat information down to the
add_to_cache() function.

This essentially makes 'add_to_index()' the core function that adds a path
to the index, getting the index pointer, the pathname and the stat
information as arguments. There are then shorthand helper functions that
use this core function:

 - 'add_to_cache()' is just 'add_to_index()' with the default index

 - 'add_file_to_cache/index()' is the same, but does the lstat() call
   itself, so you can pass just the pathname if you don't already have the
   stat information available.

So old users of the 'add_file_to_xyzzy()' are essentially left unchanged,
and this just exposes the more generic helper function that can take
existing stat information into account.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-10 18:16:30 -07:00
Junio C Hamano
380a742679 Merge branch 'lt/case-insensitive'
* lt/case-insensitive:
  Make git-add behave more sensibly in a case-insensitive environment
  When adding files to the index, add support for case-independent matches
  Make unpack-tree update removed files before any updated files
  Make branch merging aware of underlying case-insensitive filsystems
  Add 'core.ignorecase' option
  Make hash_name_lookup able to do case-independent lookups
  Make "index_name_exists()" return the cache_entry it found
  Move name hashing functions into a file of its own
  Make unpack_trees_options bit flags actual bitfields
2008-05-10 18:14:28 -07:00
Junio C Hamano
31a3c6bb45 Merge branch 'db/learn-HEAD'
* db/learn-HEAD:
  Make ls-remote http://... list HEAD, like for git://...
  Make walker.fetch_ref() take a struct ref.
2008-05-08 20:06:23 -07:00
Santi Béjar
bb1ae3f6ff commit: Show committer if automatic
To warn the user in case he/she might be using an unintended
committer identity.

Signed-off-by: Santi Béjar <sbejar@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-06 16:50:17 -07:00
Junio C Hamano
e2e2defc14 Merge branch 'lh/git-file'
* lh/git-file:
  Teach GIT-VERSION-GEN about the .git file
  Teach git-submodule.sh about the .git file
  Teach resolve_gitlink_ref() about the .git file
  Add platform-independent .git "symlink"
2008-05-05 19:16:16 -07:00
Daniel Barkalow
f225aeb278 Provide API access to init_db()
The caller first calls set_git_dir() to specify the GIT_DIR, and then
calls init_db() to initialize it. This also cleans up various parts of
the code to account for the fact that everything is done with GIT_DIR
set, so it's unnecessary to pass the specified directory around.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-04 17:41:45 -07:00
Daniel Barkalow
19757d80e5 Add a function to set a non-default work tree
This function may only be used before the work tree is used.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-04 17:41:44 -07:00
Daniel Barkalow
bef70b22ba Add a library function to add an alternate to the alternates file
This is in the core so that, if the alternates file has already been
read, the addition can be parsed and put into effect for the current
process.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-04 17:41:44 -07:00
Daniel Barkalow
ea3cd5c7c6 Add a lockfile function to append to a file
This takes care of copying the original contents into the replacement
file after the lock is held, so that concurrent additions can't miss
each other's changes.

[jc: munged to drop mmap in favor of copy_file.]

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-04 17:41:44 -07:00
Heikki Orsila
0104ca09e3 Make read_in_full() and write_in_full() consistent with xread() and xwrite()
xread() and xwrite() return ssize_t values as their native POSIX
counterparts read(2) and write(2).

To be consistent, read_in_full() and write_in_full() should also return
ssize_t values.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-29 23:11:57 -07:00
Daniel Barkalow
be885d96fe Make ls-remote http://... list HEAD, like for git://...
This makes a struct ref able to represent a symref, and makes http.c
able to recognize one, and makes transport.c look for "HEAD" as a ref
in the list, and makes it dereference symrefs for the resulting ref,
if any.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-26 17:36:18 -07:00
Heikki Orsila
06cbe85503 Make core.sharedRepository more generic
git init --shared=0xxx, where '0xxx' is an octal number, will create
a repository with file modes set to '0xxx'. Users with a safe umask
value (0077) can use this option to force file modes. For example,
'0640' is a group-readable but not group-writable regardless of
user's umask value. Values compatible with old Git versions are written
as they were before, for compatibility reasons. That is, "1" for
"group" and "2" for "everybody".

"git config core.sharedRepository 0xxx" is also handled.

Signed-off-by: Heikki Orsila <heikki.orsila@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-16 18:23:54 -07:00
Junio C Hamano
a53f2ec617 git_config_bool_or_int()
This new function can be used by config parsers to tell if a variable
is simply set, set to 1, or set to "true".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-12 18:38:40 -07:00
Lars Hjemli
b44ebb19e3 Add platform-independent .git "symlink"
This patch allows .git to be a regular textfile containing the path of
the real git directory (prefixed with "gitdir: "), which can be useful on
platforms lacking support for real symlinks.

Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:50 -07:00
Linus Torvalds
1102952b45 Make git-add behave more sensibly in a case-insensitive environment
This expands on the previous patch, and allows "git add" to sanely handle
a filename that has changed case, keeping the case in the index constant,
and avoiding aliases.

In particular, if you have an index entry called "File", but the
checked-out tree is case-corrupted and has an entry called "file"
instead, doing a

	git add .

(or naming "file" explicitly) will automatically notice that we have an
alias, and will replace the name "file" with the existing index
capitalization (ie "File").

However, if we actually have *both* a file called "File" and one called
"file", and they don't have the same lstat() information (ie we're on a
case-sensitive filesystem but have the "core.ignorecase" flag set), we
will error out if we try to add them both.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Linus Torvalds
0a9b88b7de Add 'core.ignorecase' option
..and start using it for directory entry traversal (ie "git status" will
not consider entries that match an existing entry case-insensitively to
be a new file)

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Linus Torvalds
cd2fef59ed Make hash_name_lookup able to do case-independent lookups
Right now nobody uses it, but "index_name_exists()" gets a flag so
you can enable it on a case-by-case basis.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Linus Torvalds
df292c791a Make "index_name_exists()" return the cache_entry it found
This allows verify_absent() in unpack_trees() to use the hash chains
rather than looking it up using the binary search.

Perhaps more importantly, it's also going to be useful for the next phase,
where we actually start looking at the cache entry when we do
case-insensitive lookups and checking the result.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Linus Torvalds
96872bc200 Move name hashing functions into a file of its own
It's really totally separate functionality, and if we want to start
doing case-insensitive hash lookups, I'd rather do it when it's
separated out.

It also renames "remove_index_entry()" to "remove_name_hash()", because
that really describes the thing better. It doesn't actually remove the
index entry, that's done by "remove_index_entry_at()", which is something
very different, despite the similarity in names.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-09 01:22:25 -07:00
Junio C Hamano
b85997d14d Merge branch 'lt/unpack-trees'
* lt/unpack-trees:
  unpack_trees(): fix diff-index regression.
  traverse_trees_recursive(): propagate merge errors up
  unpack_trees(): minor memory leak fix in unused destination index
  Make 'unpack_trees()' have a separate source and destination index
  Make 'unpack_trees()' take the index to work on as an argument
  Add 'const' where appropriate to index handling functions
  Fix tree-walking compare_entry() in the presense of --prefix
  Move 'unpack_trees()' over to 'traverse_trees()' interface
  Make 'traverse_trees()' traverse conflicting DF entries in parallel
  Add return value to 'traverse_tree()' callback
  Make 'traverse_tree()' use linked structure rather than 'const char *base'
  Add 'df_name_compare()' helper function
2008-03-11 22:13:44 -07:00
Junio C Hamano
5d921e2931 Merge branch 'jc/cherry-pick' (early part)
* 'jc/cherry-pick' (early part):
  expose a helper function peel_to_type().
  merge-recursive: split low-level merge functions out.

Conflicts:

	Makefile
	builtin-merge-recursive.c
	sha1_name.c
2008-03-11 02:05:12 -07:00
Linus Torvalds
d1f128b050 Add 'const' where appropriate to index handling functions
This is in an effort to make the source index of 'unpack_trees()' as
being const, and thus making the compiler help us verify that we only
access it for reading.

The constification also extended to some of the hashing helpers that get
called indirectly.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-09 00:43:48 -08:00
Linus Torvalds
0ab9e1e8cd Add 'df_name_compare()' helper function
This new helper is identical to base_name_compare(), except it compares
conflicting directory/file entries as equal in order to help handling DF
conflicts (thus the name).

Note that while a directory name compares as equal to a regular file
with the new helper, they then individually compare _differently_ to a
filename that has a dot after the basename (because '\0' < '.' < '/').

So a directory called "foo/" will compare equal to a file "foo", even
though "foo.c" will compare after "foo" and before "foo/"

This will be used by routines that want to traverse the git namespace
but then handle conflicting entries together when possible.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-09 00:43:46 -08:00
Junio C Hamano
eadbcd498a Merge branch 'mk/maint-parse-careful'
* mk/maint-parse-careful:
  receive-pack: use strict mode for unpacking objects
  index-pack: introduce checking mode
  unpack-objects: prevent writing of inconsistent objects
  unpack-object: cache for non written objects
  add common fsck error printing function
  builtin-fsck: move common object checking code to fsck.c
  builtin-fsck: reports missing parent commits
  Remove unused object-ref code
  builtin-fsck: move away from object-refs to fsck_walk
  add generic, type aware object chain walker

Conflicts:

	Makefile
	builtin-fsck.c
2008-03-02 15:11:07 -08:00
Junio C Hamano
60b188a984 Merge branch 'js/branch-track'
* js/branch-track:
  doc: documentation update for the branch track changes
  branch: optionally setup branch.*.merge from upstream local branches

Conflicts:

	Documentation/config.txt
	Documentation/git-branch.txt
	Documentation/git-checkout.txt
	builtin-branch.c
	cache.h
	t/t7201-co.sh
2008-02-27 13:02:57 -08:00
Junio C Hamano
5a4d707a6d Merge branch 'db/checkout'
* db/checkout: (21 commits)
  checkout: error out when index is unmerged even with -m
  checkout: show progress when checkout takes long time while switching branches
  Add merge-subtree back
  checkout: updates to tracking report
  builtin-checkout.c: Remove unused prefix arguments in switch_branches path
  checkout: work from a subdirectory
  checkout: tone down the "forked status" diagnostic messages
  Clean up reporting differences on branch switch
  builtin-checkout.c: fix possible usage segfault
  checkout: notice when the switched branch is behind or forked
  Build in checkout
  Move code to clean up after a branch change to branch.c
  Library function to check for unmerged index entries
  Use diff -u instead of diff in t7201
  Move create_branch into a library file
  Build-in merge-recursive
  Add "skip_unmerged" option to unpack_trees.
  Discard "deleted" cache entries after using them to update the working tree
  Send unpack-trees debugging output to stderr
  Add flag to make unpack_trees() not print errors.
  ...

Conflicts:

	Makefile
2008-02-27 12:53:26 -08:00
Junio C Hamano
d87aa32935 Merge branch 'jk/help-alias'
* jk/help-alias:
  help: respect aliases
  make alias lookup a public, procedural function
  help: use parseopt
2008-02-27 11:55:43 -08:00
Junio C Hamano
2db511fdbd Merge branch 'maint'
* maint:
  Documentation/git-am.txt: Pass -r in the example invocation of rm -f .dotest
  timezone_names[]: fixed the tz offset for New Zealand.
  filter-branch documentation: non-zero exit status in command abort the filter
  rev-parse: fix potential bus error with --parseopt option spec handling
  Use a single implementation and API for copy_file()
  Documentation/git-filter-branch: add a new msg-filter example
  Correct fast-export file mode strings to match fast-import standard
2008-02-26 00:14:22 -08:00
Martin Koegler
355885d531 add generic, type aware object chain walker
The requirements are:
* it may not crash on NULL pointers
* a callback function is needed, as index-pack/unpack-objects
  need to do different things
* the type information is needed to check the expected <-> real type
  and print better error messages

Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-25 23:57:34 -08:00
Daniel Barkalow
1468bd4783 Use a single implementation and API for copy_file()
Originally by Kristian Hï¿œgsberg; I fixed the conversion of rerere, which
had a different API.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-25 13:06:49 -08:00
Jeff King
94351118c0 make alias lookup a public, procedural function
This converts git_config_alias to the public alias_lookup
function. Because of the nature of our config parser, we
still have to rely on setting static data. However, that
interface is wrapped so that you can just say

  value = alias_lookup(key);

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-24 18:31:49 -08:00
Junio C Hamano
e38f892d18 Merge branch 'jc/apply-whitespace'
* jc/apply-whitespace:
  ws_fix_copy(): move the whitespace fixing function to ws.c
  apply: do not barf on patch with too large an offset
  core.whitespace: cr-at-eol
  git-apply --whitespace=fix: fix whitespace fuzz introduced by previous run
  builtin-apply.c: pass ws_rule down to match_fragment()
  builtin-apply.c: move copy_wsfix() function a bit higher.
  builtin-apply.c: do not feed copy_wsfix() leading '+'
  builtin-apply.c: simplify calling site to apply_line()
  builtin-apply.c: clean-up apply_one_fragment()
  builtin-apply.c: mark common context lines in lineinfo structure.
  builtin-apply.c: optimize match_beginning/end processing a bit.
  builtin-apply.c: make it more line oriented
  builtin-apply.c: push match-beginning/end logic down
  builtin-apply.c: restructure "offset" matching
  builtin-apply.c: refactor small part that matches context
2008-02-24 17:23:17 -08:00
Junio C Hamano
fe3403c320 ws_fix_copy(): move the whitespace fixing function to ws.c
This is used by git-apply but we can use it elsewhere by slightly
generalizing it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-23 16:59:16 -08:00
Linus Torvalds
eb7a2f1d50 Use helper function for copying index entry information
We used to just memcpy() the index entry when we copied the stat() and
SHA1 hash information, which worked well enough back when the index
entry was just an exact bit-for-bit representation of the information on
disk.

However, these days we actually have various management information in
the cache entry too, and we should be careful to not overwrite it when
we copy the stat information from another index entry.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-22 21:24:47 -08:00
Linus Torvalds
d070e3a31b Name hash fixups: export (and rename) remove_hash_entry
This makes the name hash removal function (which really just sets the
bit that disables lookups of it) available to external routines, and
makes read_cache_unmerged() use it when it drops an unmerged entry from
the index.

It's renamed to remove_index_entry(), and we drop the (unused) 'istate'
argument.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-22 21:24:47 -08:00
Linus Torvalds
a22c637124 Fix name re-hashing semantics
We handled the case of removing and re-inserting cache entries badly,
which is something that merging commonly needs to do (removing the
different stages, and then re-inserting one of them as the merged
state).

We even had a rather ugly special case for this failure case, where
replace_index_entry() basically turned itself into a no-op if the new
and the old entries were the same, exactly because the hash routines
didn't handle it on their own.

So what this patch does is to not just have the UNHASHED bit, but a
HASHED bit too, and when you insert an entry into the name hash, that
involves:

 - clear the UNHASHED bit, because now it's valid again for lookup
   (which is really all that UNHASHED meant)

 - if we're being lazy, we're done here (but we still want to clear the
   UNHASHED bit regardless of lazy mode, since we can become unlazy
   later, and so we need the UNHASHED bit to always be set correctly,
   even if we never actually insert the entry into the hash list)

 - if it was already hashed, we just leave it on the list

 - otherwise mark it HASHED and insert it into the list

this all means that unhashing and rehashing a name all just works
automatically.  Obviously, you cannot change the name of an entry (that
would be a serious bug), but nothing can validly do that anyway (you'd
have to allocate a new struct cache_entry anyway since the name length
could change), so that's not a new limitation.

The code actually gets simpler in many ways, although the lazy hashing
does mean that there are a few odd cases (ie something can be marked
unhashed even though it was never on the hash in the first place, and
isn't actually marked hashed!).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-22 21:24:47 -08:00
Jay Soffian
9ed36cfa35 branch: optionally setup branch.*.merge from upstream local branches
"git branch" and "git checkout -b" now honor --track option even when
the upstream branch is local.  Previously --track was silently ignored
when forking from a local branch.  Also the command did not error out
when --track was explicitly asked for but the forked point specified
was not an existing branch (i.e. when there is no way to set up the
tracking configuration), but now it correctly does.

The configuration setting branch.autosetupmerge can now be set to
"always", which is equivalent to using --track from the command line.
Setting branch.autosetupmerge to "true" will retain the former behavior
of only setting up branch.*.merge for remote upstream branches.

Includes test cases for the new functionality.

Signed-off-by: Jay Soffian <jaysoffian@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-19 21:17:45 -08:00
Junio C Hamano
8177631547 expose a helper function peel_to_type().
This helper function is the core of "$object^{type}" parser.
Now it is made available to callers outside sha1_name.c
2008-02-18 00:51:05 -08:00
Junio C Hamano
2ac4b4b222 Merge branch 'sp/safecrlf'
* sp/safecrlf:
  safecrlf: Add mechanism to warn about irreversible crlf conversions
2008-02-16 17:59:20 -08:00
Junio C Hamano
987e315a6b Merge branch 'jc/gitignore-ends-with-slash'
* jc/gitignore-ends-with-slash:
  gitignore: lazily find dtype
  gitignore(5): Allow "foo/" in ignore list to match directory "foo"
2008-02-16 17:57:06 -08:00
Junio C Hamano
fef1c4c0a0 Merge branch 'jk/noetcconfig'
* jk/noetcconfig:
  fix config reading in tests
  allow suppressing of global and system config

Conflicts:

	cache.h
2008-02-16 17:56:51 -08:00
Junio C Hamano
d5558581d2 Merge branch 'maint'
* maint:
  commit: discard index after setting up partial commit
  filter-branch: handle filenames that need quoting
  diff: Fix miscounting of --check output
  hg-to-git: fix parent analysis
  mailinfo: feed only one line to handle_filter() for QP input
  diff.c: add "const" qualifier to "char *cmd" member of "struct ll_diff_driver"
  Add "const" qualifier to "char *excludes_file".
  Add "const" qualifier to "char *editor_program".
  Add "const" qualifier to "char *pager_program".
  config: add 'git_config_string' to refactor string config variables.
  diff.c: remove useless check for value != NULL
  fast-import: check return value from unpack_entry()
  Validate nicknames of remote branches to prohibit confusing ones
  diff.c: replace a 'strdup' with 'xstrdup'.
  diff.c: fixup garding of config parser from value=NULL
2008-02-16 00:20:37 -08:00
Christian Couder
dfb068be8d Add "const" qualifier to "char *excludes_file".
Also use "git_config_string" to simplify "config.c" code
where "excludes_file" is set.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-15 21:24:54 -08:00
Christian Couder
ee9601e6be Add "const" qualifier to "char *editor_program".
Also use "git_config_string" to simplify "config.c" code
where "editor_program" is set.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-15 21:24:53 -08:00
Christian Couder
872da32d80 Add "const" qualifier to "char *pager_program".
Also use "git_config_string" to simplify "config.c" code
where "pager_program" is set.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-15 21:24:53 -08:00
Christian Couder
ea5105a5e3 config: add 'git_config_string' to refactor string config variables.
In many places we just check if a value from the config file is not
NULL, then we duplicate it and return 0. This patch introduces the new
'git_config_string' function to do that.

This function is also used to refactor some code in 'config.c'.
Refactoring other files is left for other patches.

Also not all the code in "config.c" is refactored, because the function
takes a "const char **" as its first parameter, but in many places a
"char *" is used instead of a "const char *". (And C does not allow
using a "char **" instead of a "const char **" without a warning.)

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-15 21:24:53 -08:00
Junio C Hamano
e0197c9aae Merge branch 'lt/in-core-index'
* lt/in-core-index:
  lazy index hashing
  Create pathname-based hash-table lookup into index
  read-cache.c: introduce is_racy_timestamp() helper
  read-cache.c: fix a couple more CE_REMOVE conversion
  Also use unpack_trees() in do_diff_cache()
  Make run_diff_index() use unpack_trees(), not read_tree()
  Avoid running lstat(2) on the same cache entry.
  index: be careful when handling long names
  Make on-disk index representation separate from in-core one
2008-02-11 16:46:20 -08:00
Junio C Hamano
40ea4ed903 Add config_error_nonbool() helper function
This is used to report misconfigured configuration file that does not
give any value to a non-boolean variable, e.g.

	[section]
		var

It is perfectly fine to say it if the section.var is a boolean (it means
true), but if a variable expects a string value it should be flagged as
a configuration error.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-11 13:11:36 -08:00
Daniel Barkalow
94a5728cfb Library function to check for unmerged index entries
It's small, but it was in three places already, so it should be in the
library.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
2008-02-09 23:16:51 -08:00
Jeff King
ab88c36321 allow suppressing of global and system config
The GIT_CONFIG_NOGLOBAL and GIT_CONFIG_NOSYSTEM environment
variables are magic undocumented switches that can be used
to ensure a totally clean environment. This is necessary for
running reliable tests, since those config files may contain
settings that change the outcome of tests.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-06 14:52:23 -08:00
Steffen Prohaska
21e5ad50fc safecrlf: Add mechanism to warn about irreversible crlf conversions
CRLF conversion bears a slight chance of corrupting data.
autocrlf=true will convert CRLF to LF during commit and LF to
CRLF during checkout.  A file that contains a mixture of LF and
CRLF before the commit cannot be recreated by git.  For text
files this is the right thing to do: it corrects line endings
such that we have only LF line endings in the repository.
But for binary files that are accidentally classified as text the
conversion can corrupt data.

If you recognize such corruption early you can easily fix it by
setting the conversion type explicitly in .gitattributes.  Right
after committing you still have the original file in your work
tree and this file is not yet corrupted.  You can explicitly tell
git that this file is binary and git will handle the file
appropriately.

Unfortunately, the desired effect of cleaning up text files with
mixed line endings and the undesired effect of corrupting binary
files cannot be distinguished.  In both cases CRLFs are removed
in an irreversible way.  For text files this is the right thing
to do because CRLFs are line endings, while for binary files
converting CRLFs corrupts data.

This patch adds a mechanism that can either warn the user about
an irreversible conversion or can even refuse to convert.  The
mechanism is controlled by the variable core.safecrlf, with the
following values:

 - false: disable safecrlf mechanism
 - warn: warn about irreversible conversions
 - true: refuse irreversible conversions

The default is to warn.  Users are only affected by this default
if core.autocrlf is set.  But the current default of git is to
leave core.autocrlf unset, so users will not see warnings unless
they deliberately chose to activate the autocrlf mechanism.

The safecrlf mechanism's details depend on the git command.  The
general principles when safecrlf is active (not false) are:

 - we warn/error out if files in the work tree can modified in an
   irreversible way without giving the user a chance to backup the
   original file.

 - for read-only operations that do not modify files in the work tree
   we do not not print annoying warnings.

There are exceptions.  Even though...

 - "git add" itself does not touch the files in the work tree, the
   next checkout would, so the safety triggers;

 - "git apply" to update a text file with a patch does touch the files
   in the work tree, but the operation is about text files and CRLF
   conversion is about fixing the line ending inconsistencies, so the
   safety does not trigger;

 - "git diff" itself does not touch the files in the work tree, it is
   often run to inspect the changes you intend to next "git add".  To
   catch potential problems early, safety triggers.

The concept of a safety check was originally proposed in a similar
way by Linus Torvalds.  Thanks to Dimitry Potapov for insisting
on getting the naked LF/autocrlf=true case right.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
2008-02-06 13:07:28 -08:00
Junio C Hamano
d6b8fc303b gitignore(5): Allow "foo/" in ignore list to match directory "foo"
A pattern "foo/" in the exclude list did not match directory
"foo", but a pattern "foo" did.  This attempts to extend the
exclude mechanism so that it would while not matching a regular
file or a symbolic link "foo".  In order to differentiate a
directory and non directory, this passes down the type of path
being checked to excluded() function.

A downside is that the recursive directory walk may need to run
lstat(2) more often on systems whose "struct dirent" do not give
the type of the entry; earlier it did not have to do so for an
excluded path, but we now need to figure out if a path is a
directory before deciding to exclude it.  This is especially bad
because an idea similar to the earlier CE_UPTODATE optimization
to reduce number of lstat(2) calls would by definition not apply
to the codepaths involved, as (1) directories will not be
registered in the index, and (2) excluded paths will not be in
the index anyway.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-05 00:46:49 -08:00
Junio C Hamano
b2979ff599 core.whitespace: cr-at-eol
This new error mode allows a line to have a carriage return at the
end of the line when checking and fixing trailing whitespace errors.

Some people like to keep CRLF line ending recorded in the repository,
and still want to take advantage of the automated trailing whitespace
stripping.  We still show ^M in the diff output piped to "less" to
remind them that they do have the CR at the end, but these carriage
return characters at the end are no longer flagged as errors.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-05 00:38:41 -08:00
Junio C Hamano
9cb76b8cdc lazy index hashing
This delays the hashing of index names until it becomes necessary for
the first time.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 23:01:13 -08:00
Linus Torvalds
cf558704fb Create pathname-based hash-table lookup into index
This creates a hash index of every single file added to the index.
Right now that hash index isn't actually used for much: I implemented a
"cache_name_exists()" function that uses it to efficiently look up a
filename in the index without having to do the O(logn) binary search,
but quite frankly, that's not why this patch is interesting.

No, the whole and only reason to create the hash of the filenames in the
index is that by modifying the hash function, you can fairly easily do
things like making it always hash equivalent names into the same bucket.

That, in turn, means that suddenly questions like "does this name exist
in the index under an _equivalent_ name?" becomes much much cheaper.

Guiding principles behind this patch:

 - it shouldn't be too costly. In fact, my primary goal here was to
   actually speed up "git commit" with a fully populated kernel tree, by
   being faster at checking whether a file already existed in the index. I
   did succeed, but only barely:

	Best before:
		[torvalds@woody linux]$ time git commit > /dev/null
		real    0m0.255s
		user    0m0.168s
		sys     0m0.088s

	Best after:

		[torvalds@woody linux]$ time ~/git/git commit > /dev/null
		real    0m0.233s
		user    0m0.144s
		sys     0m0.088s

   so some things are actually faster (~8%).

   Caveat: that's really the best case. Other things are invariably going
   to be slightly slower, since we populate that index cache, and quite
   frankly, few things really use it to look things up.

   That said, the cost is really quite small. The worst case is probably
   doing a "git ls-files", which will do very little except puopulate the
   index, and never actually looks anything up in it, just lists it.

	Before:
		[torvalds@woody linux]$ time git ls-files > /dev/null
		real    0m0.016s
		user    0m0.016s
		sys     0m0.000s

	After:
		[torvalds@woody linux]$ time ~/git/git ls-files > /dev/null
		real    0m0.021s
		user    0m0.012s
		sys     0m0.008s

   and while the thing has really gotten relatively much slower, we're
   still talking about something almost unmeasurable (eg 5ms). And that
   really should be pretty much the worst case.

   So we lose 5ms on one "benchmark", but win 22ms on another. Pick your
   poison - this patch has the advantage that it will _likely_ speed up
   the cases that are complex and expensive more than it slows down the
   cases that are already so fast that nobody cares. But if you look at
   relative speedups/slowdowns, it doesn't look so good.

 - It should be simple and clean

   The code may be a bit subtle (the reasons I do hash removal the way I
   do etc), but it re-uses the existing hash.c files, so it really is
   fairly small and straightforward apart from a few odd details.

Now, this patch on its own doesn't really do much, but I think it's worth
looking at, if only because if done correctly, the name hashing really can
make an improvement to the whole issue of "do we have a filename that
looks like this in the index already". And at least it gets real testing
by being used even by default (ie there is a real use-case for it even
without any insane filesystems).

NOTE NOTE NOTE! The current hash is a joke. I'm ashamed of it, I'm just
not ashamed of it enough to really care. I took all the numbers out of my
nether regions - I'm sure it's good enough that it works in practice, but
the whole point was that you can make a really much fancier hash that
hashes characters not directly, but by their upper-case value or something
like that, and thus you get a case-insensitive hash, while still keeping
the name and the index itself totally case sensitive.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-22 21:46:30 -08:00
Junio C Hamano
eadb583134 Avoid running lstat(2) on the same cache entry.
Aside from the lstat(2) done for work tree files, there are
quite many lstat(2) calls in refname dwimming codepath.  This
patch is not about reducing them.

 * It adds a new ce_flag, CE_UPTODATE, that is meant to mark the
   cache entries that record a regular file blob that is up to
   date in the work tree.  If somebody later walks the index and
   wants to see if the work tree has changes, they do not have
   to be checked with lstat(2) again.

 * fill_stat_cache_info() marks the cache entry it just added
   with CE_UPTODATE.  This has the effect of marking the paths
   we write out of the index and lstat(2) immediately as "no
   need to lstat -- we know it is up-to-date", from quite a lot
   fo callers:

    - git-apply --index
    - git-update-index
    - git-checkout-index
    - git-add (uses add_file_to_index())
    - git-commit (ditto)
    - git-mv (ditto)

 * refresh_cache_ent() also marks the cache entry that are clean
   with CE_UPTODATE.

 * write_index is changed not to write CE_UPTODATE out to the
   index file, because CE_UPTODATE is meant to be transient only
   in core.  For the same reason, CE_UPDATE is not written to
   prevent an accident from happening.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 12:44:31 -08:00
Junio C Hamano
7fec10b7f4 index: be careful when handling long names
We currently use lower 12-bit (masked with CE_NAMEMASK) in the
ce_flags field to store the length of the name in cache_entry,
without checking the length parameter given to
create_ce_flags().  This can make us store incorrect length.

Currently we are mostly protected by the fact that many
codepaths first copy the path in a variable of size PATH_MAX,
which typically is 4096 that happens to match the limit, but
that feels like a bug waiting to happen.  Besides, that would
not allow us to shorten the width of CE_NAMEMASK to use the bits
for new flags.

This redefines the meaning of the name length stored in the
cache_entry.  A name that does not fit is represented by storing
CE_NAMEMASK in the field, and the actual length needs to be
computed by actually counting the bytes in the name[] field.
This way, only the unusually long paths need to suffer.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 12:44:31 -08:00
Linus Torvalds
7a51ed66f6 Make on-disk index representation separate from in-core one
This converts the index explicitly on read and write to its on-disk
format, allowing the in-core format to contain more flags, and be
simpler.

In particular, the in-core format is now host-endian (as opposed to the
on-disk one that is network endian in order to be able to be shared
across machines) and as a result we can dispense with all the
htonl/ntohl on accesses to the cache_entry fields.

This will make it easier to make use of various temporary flags that do
not exist in the on-disk format.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-21 12:44:31 -08:00
Shawn O. Pearce
c9ced051c3 Fix random fast-import errors when compiled with NO_MMAP
fast-import was relying on the fact that on most systems mmap() and
write() are synchronized by the filesystem's buffer cache.  We were
relying on the ability to mmap() 20 bytes beyond the current end
of the file, then later fill in those bytes with a future write()
call, then read them through the previously obtained mmap() address.

This isn't always true with some implementations of NFS, but it is
especially not true with our NO_MMAP=YesPlease build time option used
on some platforms.  If fast-import was built with NO_MMAP=YesPlease
we used the malloc()+pread() emulation and the subsequent write()
call does not update the trailing 20 bytes of a previously obtained
"mmap()" (aka malloc'd) address.

Under NO_MMAP that behavior causes unpack_entry() in sha1_file.c to
be unable to read an object header (or data) that has been unlucky
enough to be written to the packfile at a location such that it
is in the trailing 20 bytes of a window previously opened on that
same packfile.

This bug has gone unnoticed for a very long time as it is highly data
dependent.  Not only does the object have to be placed at the right
position, but it also needs to be positioned behind some other object
that has been accessed due to a branch cache invalidation.  In other
words the stars had to align just right, and if you did run into
this bug you probably should also have purchased a lottery ticket.

Fortunately the workaround is a lot easier than the bug explanation.

Before we allow unpack_entry() to read data from a pack window
that has also (possibly) been modified through write() we force
all existing windows on that packfile to be closed.  By closing
the windows we ensure that any new access via the emulated mmap()
will reread the packfile, updating to the current file content.

This comes at a slight performance degredation as we cannot reuse
previously cached windows when we update the packfile.  But it
is a fairly minor difference as the window closes happen at only
two points:

 - When the packfile is finalized and its .idx is generated:

   At this stage we are getting ready to update the refs and any
   data access into the packfile is going to be random, and is
   going after only the branch tips (to ensure they are valid).
   Our existing windows (if any) are not likely to be positioned
   at useful locations to access those final tip commits so we
   probably were closing them before anyway.

 - When the branch cache missed and we need to reload:

   At this point fast-import is getting change commands for the next
   commit and it needs to go re-read a tree object it previously
   had written out to the packfile.  What windows we had (if any)
   are not likely to cover the tree in question so we probably were
   closing them before anyway.

We do try to avoid unnecessarily closing windows in the second case
by checking to see if the packfile size has increased since the
last time we called unpack_entry() on that packfile.  If the size
has not changed then we have not written additional data, and any
existing window is still vaild.  This nicely handles the cases where
fast-import is going through a branch cache reload and needs to read
many trees at once.  During such an event we are not likely to be
updating the packfile so we do not cycle the windows between reads.

With this change in place t9301-fast-export.sh (which was broken
by c3b0dec509) finally works again.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-17 22:39:20 -08:00
Brandon Casey
d6cf61bfd4 close_lock_file(): new function in the lockfile API
The lockfile API is a handy way to obtain a file that is cleaned
up if you die().  But sometimes you would need this sequence to
work:

 1. hold_lock_file_for_update() to get a file descriptor for
    writing;

 2. write the contents out, without being able to decide if the
    results should be committed or rolled back;

 3. do something else that makes the decision --- and this
    "something else" needs the lockfile not to have an open file
    descriptor for writing (e.g. Windows do not want a open file
    to be renamed);

 4. call commit_lock_file() or rollback_lock_file() as
    appropriately.

This adds close_lock_file() you can call between step 2 and 3 in
the above sequence.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-16 15:35:03 -08:00
Wincent Colaiuta
c1795bb08a Unify whitespace checking
This commit unifies three separate places where whitespace checking was
performed:

 - the whitespace checking previously done in builtin-apply.c is
extracted into a function in ws.c

 - the equivalent logic in "git diff" is removed

 - the emit_line_with_ws() function is also removed because that also
rechecks the whitespace, and its functionality is rolled into ws.c

The new function is called check_and_emit_line() and it does two things:
checks a line for whitespace errors and optionally emits it. The checking
is based on lines of content rather than patch lines (in other words, the
caller must strip the leading "+" or "-"); this was suggested by Junio on
the mailing list to allow for a future extension to "git show" to display
whitespace errors in blobs.

At the same time we teach it to report all classes of whitespace errors
found for a given line rather than reporting only the first found error.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-13 23:43:58 -08:00
Jeff King
6e9af863ee Support GIT_PAGER_IN_USE environment variable
When deciding whether or not to turn on automatic color
support, git_config_colorbool checks whether stdout is a
tty. However, because we run a pager, if stdout is not a
tty, we must check whether it is because we started the
pager. This used to be done by checking the pager_in_use
variable.

This variable was set only when the git program being run
started the pager; there was no way for an external program
running git indicate that it had already started a pager.
This patch allows a program to set GIT_PAGER_IN_USE to a
true value to indicate that even though stdout is not a tty,
it is because a pager is being used.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-11 00:42:05 -08:00
Junio C Hamano
4eb39e9bcc Merge branch 'jc/spht'
* jc/spht:
  Use gitattributes to define per-path whitespace rule
  core.whitespace: documentation updates.
  builtin-apply: teach whitespace_rules
  builtin-apply: rename "whitespace" variables and fix styles
  core.whitespace: add test for diff whitespace error highlighting
  git-diff: complain about >=8 consecutive spaces in initial indent
  War on whitespace: first, a bit of retreat.

Conflicts:

	cache.h
	config.c
	diff.c
2007-12-09 01:23:48 -08:00
Junio C Hamano
774751a8bc Re-fix "builtin-commit: fix --signoff"
An earlier fix to the said commit was incomplete; it mixed up the
meaning of the flag parameter passed to the internal fmt_ident()
function, so this corrects it.

git_author_info() and git_committer_info() can be told to issue a
warning when no usable user information is found, and optionally can be
told to error out.  Operations that actually use the information to
record a new commit or a tag will still error out, but the caller to
leave reflog record will just silently use bogus user information.

Not warning on misconfigured user information while writing a reflog
entry is somewhat debatable, but it is probably nicer to the users to
silently let it pass, because the only information you are losing is who
checked out the branch.

 * git_author_info() and git_committer_info() used to take 1 (positive
   int) to error out with a warning on misconfiguration; this is now
   signalled with a symbolic constant IDENT_ERROR_ON_NO_NAME.

 * These functions used to take -1 (negative int) to warn but continue;
   this is now signalled with a symbolic constant IDENT_WARN_ON_NO_NAME.

 * fmt_ident() function implements the above error reporting behaviour
   common to git_author_info() and git_committer_info().  A symbolic
   constant IDENT_NO_DATE can be or'ed in to the flag parameter to make
   it return only the "Name <email@address.xz>".

 * fmt_name() is a thin wrapper around fmt_ident() that always passes
   IDENT_ERROR_ON_NO_NAME and IDENT_NO_DATE.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-09 00:55:55 -08:00
Junio C Hamano
cf1b7869f0 Use gitattributes to define per-path whitespace rule
The `core.whitespace` configuration variable allows you to define what
`diff` and `apply` should consider whitespace errors for all paths in
the project (See gitlink:git-config[1]).  This attribute gives you finer
control per path.

For example, if you have these in the .gitattributes:

    frotz   whitespace
    nitfol  -whitespace
    xyzzy   whitespace=-trailing

all types of whitespace problems known to git are noticed in path 'frotz'
(i.e. diff shows them in diff.whitespace color, and apply warns about
them), no whitespace problem is noticed in path 'nitfol', and the
default types of whitespace problems except "trailing whitespace" are
noticed for path 'xyzzy'.  A project with mixed Python and C might want
to have:

    *.c    whitespace
    *.py   whitespace=-indent-with-non-tab

in its toplevel .gitattributes file.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-06 00:45:30 -08:00
Junio C Hamano
31cbb5d961 Merge branch 'kh/commit'
* kh/commit: (33 commits)
  git-commit --allow-empty
  git-commit: Allow to amend a merge commit that does not change the tree
  quote_path: fix collapsing of relative paths
  Make git status usage say git status instead of git commit
  Fix --signoff in builtin-commit differently.
  git-commit: clean up die messages
  Do not generate full commit log message if it is not going to be used
  Remove git-status from list of scripts as it is builtin
  Fix off-by-one error when truncating the diff out of the commit message.
  builtin-commit.c: export GIT_INDEX_FILE for launch_editor as well.
  Add a few more tests for git-commit
  builtin-commit: Include the diff in the commit message when verbose.
  builtin-commit: fix partial-commit support
  Fix add_files_to_cache() to take pathspec, not user specified list of files
  Export three helper functions from ls-files
  builtin-commit: run commit-msg hook with correct message file
  builtin-commit: do not color status output shown in the message template
  file_exists(): dangling symlinks do exist
  Replace "runstatus" with "status" in the tests
  t7501-commit: Add test for git commit <file> with dirty index.
  ...
2007-12-04 17:16:33 -08:00
Junio C Hamano
9bbe6db85f Merge branch 'sp/refspec-match'
* sp/refspec-match:
  refactor fetch's ref matching to use refname_match()
  push: use same rules as git-rev-parse to resolve refspecs
  add refname_match()
  push: support pushing HEAD to real branch name
2007-12-04 17:07:10 -08:00
Christian Couder
b319ce4c14 Trace and quote with argv: get rid of unneeded count argument.
Now that str_buf takes care of all the allocations, there is
no more gain to pass an argument count.

So this patch removes the "count" argument from:
	- "sq_quote_argv"
	- "trace_argv_printf"
and all the callers.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-03 22:11:53 -08:00
Junio C Hamano
d9ccfe7711 Fix --signoff in builtin-commit differently.
Introduce fmt_name() specifically meant for formatting the name and
email pair, to add signed-off-by value.  This reverts parts of
13208572fb (builtin-commit: fix --signoff)
so that an empty datestamp string given to fmt_ident() by mistake will
error out as before.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-02 23:35:46 -08:00
Junio C Hamano
b45563a229 rename: Break filepairs with different types.
When we consider if a path has been totally rewritten, we did not
touch changes from symlinks to files or vice versa.  But a change
that modifies even the type of a blob surely should count as a
complete rewrite.

While we are at it, modernise diffcore-break to be aware of gitlinks (we
do not want to touch them).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-12-02 02:24:46 -08:00
Junio C Hamano
fd200790dc Merge branch 'jk/send-pack'
* jk/send-pack: (24 commits)
  send-pack: cluster ref status reporting
  send-pack: fix "everything up-to-date" message
  send-pack: tighten remote error reporting
  make "find_ref_by_name" a public function
  Fix warning about bitfield in struct ref
  send-pack: assign remote errors to each ref
  send-pack: check ref->status before updating tracking refs
  send-pack: track errors for each ref
  git-push: add documentation for the newly added --mirror mode
  Add tests for git push'es mirror mode
  Update the tracking references only if they were succesfully updated on remote
  Add a test checking if send-pack updated local tracking branches correctly
  git-push: plumb in --mirror mode
  Teach send-pack a mirror mode
  send-pack: segfault fix on forced push
  Reteach builtin-ls-remote to understand remotes
  send-pack: require --verbose to show update of tracking refs
  receive-pack: don't mention successful updates
  more terse push output
  Build in ls-remote
  ...
2007-11-24 16:45:37 -08:00
Junio C Hamano
b6ec1d619f Fix add_files_to_cache() to take pathspec, not user specified list of files
This separates the logic to limit the extent of change to the
index by where you are (controlled by "prefix") and what you
specify from the command line (controlled by "pathspec").

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-22 17:05:05 -08:00
Junio C Hamano
ee425e4643 Export three helper functions from ls-files
This exports three helper functions from ls-files.

 * pathspec_match() checks if a given path matches a set of pathspecs
   and optionally records which pathspec was used.  This function used
   to be called "match()" but renamed to be a bit less vague.

 * report_path_error() takes a set of pathspecs and the record
   pathspec_match() above leaves, and gives error message.  This
   was split out of the main function of ls-files.

 * overlay_tree_on_cache() takes a tree-ish (typically "HEAD")
   and overlays it on the current in-core index.  By iterating
   over the resulting index, the caller can find out the paths
   in either the index or the HEAD.  This function used to be
   called "overlay_tree()" but renamed to be a bit more
   descriptive.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-22 17:05:05 -08:00
Steffen Prohaska
605b4978a1 refactor fetch's ref matching to use refname_match()
The old rules used by fetch were coded as a series of ifs.  The old
rules are:
1) match full refname if it starts with "refs/" or matches "HEAD"
2) verify that full refname starts with "refs/"
3) match abbreviated name in "refs/" if it starts with "heads/",
    "tags/", or "remotes/".
4) match abbreviated name in "refs/heads/"

This is replaced by the new rules
a) match full refname
b) match abbreviated name prefixed with "refs/"
c) match abbreviated name prefixed with "refs/heads/"

The details of the new rules are different from the old rules.  We no
longer verify that the full refname starts with "refs/".  The new rule
(a) matches any full string.  The old rules (1) and (2) were stricter.
Now, the caller is responsible for using sensible full refnames.  This
should be the case for the current code.  The new rule (b) is less
strict than old rule (3).  The new rule accepts abbreviated names that
start with a non-standard prefix below "refs/".

Despite this modifications the new rules should handle all cases as
expected.  Two tests are added to verify that fetch does not resolve
short tags or HEAD in remotes.

We may even think about loosening the rules a bit more and unify them
with the rev-parse rules.  This would be done by replacing
ref_ref_fetch_rules with ref_ref_parse_rules.  Note, the two new test
would break.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-18 18:39:01 -08:00
Steffen Prohaska
79803322c1 add refname_match()
We use at least two rulesets for matching abbreviated refnames with
full refnames (starting with 'refs/').  git-rev-parse and git-fetch
use slightly different rules.

This commit introduces a new function refname_match
(const char *abbrev_name, const char *full_name, const char **rules).

abbrev_name is expanded using the rules and matched against full_name.
If a match is found the function returns true.  rules is a NULL-terminate
list of format patterns with "%.*s", for example:

    const char *ref_rev_parse_rules[] = {
               "%.*s",
               "refs/%.*s",
               "refs/tags/%.*s",
               "refs/heads/%.*s",
               "refs/remotes/%.*s",
               "refs/remotes/%.*s/HEAD",
               NULL
    };

Asterisks are included in the format strings because this is the form
required in sha1_name.c.  Sharing the list with the functions there is
a good idea to avoid duplicating the rules.  Hopefully this
facilitates unified matching rules in the future.

This commit makes the rules used by rev-parse for resolving refs to
sha1s available for string comparison.  Before this change, the rules
were buried in get_sha1*() and dwim_ref().

A follow-up commit will refactor the rules used by fetch.

refname_match() will be used for matching refspecs in git-send-pack.

Thanks to Daniel Barkalow <barkalow@iabervon.org> for pointing
out that ref_matches_abbrev in remote.c solves a similar problem
and care should be taken to avoid confusion.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-18 18:39:00 -08:00
Jeff King
2a0fe89a97 send-pack: tighten remote error reporting
Previously, we set all ref pushes to 'OK', and then marked
them as errors if the remote reported so. This has the
problem that if the remote dies or fails to report a ref, we
just assume it was OK.

Instead, we use a new non-OK state to indicate that we are
expecting status (if the remote doesn't support the
report-status feature, we fall back on the old behavior).
Thus we can flag refs for which we expected a status, but
got none (conversely, we now also print a warning for refs
for which we get a status, but weren't expecting one).

This also allows us to simplify the receive_status exit
code, since each ref is individually marked with failure
until we get a success response. We can just print the usual
status table, so the user still gets a sense of what we were
trying to do when the failure happened.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-18 02:34:52 -08:00
Jeff King
cda69f481d make "find_ref_by_name" a public function
This was a static in remote.c, but is generally useful.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-18 02:34:34 -08:00
Shawn O. Pearce
9f8a15c734 Fix warning about bitfield in struct ref
cache.h:503: warning: type of bit-field 'force' is a GCC extension
cache.h:504: warning: type of bit-field 'merge' is a GCC extension
cache.h:505: warning: type of bit-field 'nonfastforward' is a GCC extension
cache.h:506: warning: type of bit-field 'deletion' is a GCC extension

So we change it to an 'unsigned int' which is not a GCC extension.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-18 02:23:25 -08:00
Jeff King
ca74c458a3 send-pack: assign remote errors to each ref
This lets us show remote errors (e.g., a denied hook) along
with the usual push output.

There is a slightly clever optimization in receive_status
that bears explanation. We need to correlate the returned
status and our ref objects, which naively could be an O(m*n)
operation. However, since the current implementation of
receive-pack returns the errors to us in the same order that
we sent them, we optimistically look for the next ref to be
looked up to come after the last one we have found. So it
should be an O(m+n) merge if the receive-pack behavior
holds, but we fall back to a correct but slower behavior if
it should change.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-17 12:10:50 -08:00
Jeff King
8736a84890 send-pack: track errors for each ref
Instead of keeping the 'ret' variable, we instead have a
status flag for each ref that tracks what happened to it.
We then print the ref status after all of the refs have
been examined.

This paves the way for three improvements:
  - updating tracking refs only for non-error refs
  - incorporating remote rejection into the printed status
  - printing errors in a different order than we processed
    (e.g., consolidating non-ff errors near the end with
    a special message)

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Alex Riesen <raa.lkml@gmail.com>
Acked-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-17 12:10:50 -08:00
Johannes Sixt
506b17b136 Introduce git_etc_gitconfig() that encapsulates access of ETC_GITCONFIG.
In a subsequent patch the path to the system-wide config file will be
computed. This is a preparation for that change. It turns all accesses
of ETC_GITCONFIG into function calls. There is no change in behavior.

As a consequence, config.c is the only file that needs the definition of
ETC_GITCONFIG. Hence, -DETC_GITCONFIG is removed from the CFLAGS and a
special build rule for config.c is introduced. As a side-effect, changing
the defintion of ETC_GITCONFIG (e.g. in config.mak) does not trigger a
complete rebuild anymore.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-14 15:18:39 -08:00
Johannes Schindelin
4723ee992c Close files opened by lock_file() before unlinking.
This is needed on Windows since open files cannot be unlinked.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-14 15:18:39 -08:00
Junio C Hamano
039bc64e88 core.excludesfile clean-up
There are inconsistencies in the way commands currently handle
the core.excludesfile configuration variable.  The problem is
the variable is too new to be noticed by anything other than
git-add and git-status.

 * git-ls-files does not notice any of the "ignore" files by
   default, as it predates the standardized set of ignore files.
   The calling scripts established the convention to use
   .git/info/exclude, .gitignore, and later core.excludesfile.

 * git-add and git-status know about it because they call
   add_excludes_from_file() directly with their own notion of
   which standard set of ignore files to use.  This is just a
   stupid duplication of code that need to be updated every time
   the definition of the standard set of ignore files is
   changed.

 * git-read-tree takes --exclude-per-directory=<gitignore>,
   not because the flexibility was needed.  Again, this was
   because the option predates the standardization of the ignore
   files.

 * git-merge-recursive uses hardcoded per-directory .gitignore
   and nothing else.  git-clean (scripted version) does not
   honor core.* because its call to underlying ls-files does not
   know about it.  git-clean in C (parked in 'pu') doesn't either.

We probably could change git-ls-files to use the standard set
when no excludes are specified on the command line and ignore
processing was asked, or something like that, but that will be a
change in semantics and might break people's scripts in a subtle
way.  I am somewhat reluctant to make such a change.

On the other hand, I think it makes perfect sense to fix
git-read-tree, git-merge-recursive and git-clean to follow the
same rule as other commands.  I do not think of a valid use case
to give an exclude-per-directory that is nonstandard to
read-tree command, outside a "negative" test in the t1004 test
script.

This patch is the first step to untangle this mess.

The next step would be to teach read-tree, merge-recursive and
clean (in C) to use setup_standard_excludes().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-14 15:08:04 -08:00
Junio C Hamano
c78a24986d Merge branch 'jc/maint-add-sync-stat'
* jc/maint-add-sync-stat:
  t2200: test more cases of "add -u"
  git-add: make the entry stat-clean after re-adding the same contents
  ce_match_stat, run_diff_files: use symbolic constants for readability

Conflicts:

	builtin-add.c
2007-11-14 14:15:40 -08:00
Junio C Hamano
a108e53861 Merge branch 'db/remote-builtin' into jk/send-pack
* db/remote-builtin:
  Reteach builtin-ls-remote to understand remotes
  Build in ls-remote
  Use built-in send-pack.
  Build-in send-pack, with an API for other programs to call.
  Build-in peek-remote, using transport infrastructure.
  Miscellaneous const changes and utilities

Conflicts:

	transport.c
2007-11-14 03:09:52 -08:00
Junio C Hamano
4bd5b7dacc ce_match_stat, run_diff_files: use symbolic constants for readability
ce_match_stat() can be told:

 (1) to ignore CE_VALID bit (used under "assume unchanged" mode)
     and perform the stat comparison anyway;

 (2) not to perform the contents comparison for racily clean
     entries and report mismatch of cached stat information;

using its "option" parameter.  Give them symbolic constants.

Similarly, run_diff_files() can be told not to report anything
on removed paths.  Also give it a symbolic constant for that.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-10 00:24:51 -08:00
Junio C Hamano
0380074315 Merge branch 'ds/maint-deflatebound'
* ds/maint-deflatebound:
  Improve accuracy of check for presence of deflateBound.
2007-11-07 18:17:20 -08:00
David Symonds
609a2289d7 Improve accuracy of check for presence of deflateBound.
ZLIB_VERNUM isn't defined in some zlib versions, so this patch does a proper
linking test in autoconf to see whether deflateBound exists in zlib. Also,
setting NO_DEFLATE_BOUND will also work for folk not using autoconf.

Signed-off-by: David Symonds <dsymonds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-07 17:06:14 -08:00
Mike Hommey
59f0f2f33a Refactor working tree setup
Create a setup_work_tree() that can be used from any command requiring
a working tree conditionally.

Signed-off-by: Mike Hommey <mh@glandium.org>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-05 22:47:57 -08:00
Daniel Barkalow
4577370e9b Miscellaneous const changes and utilities
The list of remote refs in struct transport should be const, because
builtin-fetch will get confused if it changes.

The url in git_connect should be const (and work on a copy) instead of
requiring the caller to copy it.

match_refs doesn't modify the refspecs it gets.

get_fetch_map and get_remote_ref don't change the list they get.

Allow transport get_refs_list methods to modify the struct transport.

Add a function to copy a list of refs, when a function needs a mutable
copy of a const list.

Add a function to check the type of a ref, as per the code in connect.c

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-02 22:40:43 -07:00
Junio C Hamano
459fa6d0fe git-diff: complain about >=8 consecutive spaces in initial indent
This introduces a new whitespace error type, "indent-with-non-tab".
The error is about starting a line with 8 or more SP, instead of
indenting it with a HT.

This is not enabled by default, as some projects employ an
indenting policy to use only SPs and no HTs.

The kernel folks and git contributors may want to enable this
detection with:

	[core]
		whitespace = indent-with-non-tab

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-02 17:58:09 -07:00
Junio C Hamano
a9cc857ada War on whitespace: first, a bit of retreat.
This introduces core.whitespace configuration variable that lets
you specify the definition of "whitespace error".

Currently there are two kinds of whitespace errors defined:

 * trailing-space: trailing whitespaces at the end of the line.

 * space-before-tab: a SP appears immediately before HT in the
   indent part of the line.

You can specify the desired types of errors to be detected by
listing their names (unique abbreviations are accepted)
separated by comma.  By default, these two errors are always
detected, as that is the traditional behaviour.  You can disable
detection of a particular type of error by prefixing a '-' in
front of the name of the error, like this:

	[core]
		whitespace = -trailing-space

This patch teaches the code to output colored diff with
DIFF_WHITESPACE color to highlight the detected whitespace
errors to honor the new configuration.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-02 17:58:08 -07:00
Johannes Sixt
98158e9cfd Change git_connect() to return a struct child_process instead of a pid_t.
This prepares the API of git_connect() and finish_connect() to operate on
a struct child_process. Currently, we just use that object as a placeholder
for the pid that we used to return. A follow-up patch will change the
implementation of git_connect() and finish_connect() to make full use
of the object.

Old code had early-return-on-error checks at the calling sites of
git_connect(), but since git_connect() dies on errors anyway, these checks
were removed.

[sp: Corrected style nit of "conn == NULL" to "!conn"]

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-10-21 01:30:39 -04:00
Shawn O. Pearce
2e13e5d892 Merge branch 'master' into db/fetch-pack
There's a number of tricky conflicts between master and
this topic right now due to the rewrite of builtin-push.
Junio must have handled these via rerere; I'd rather not
deal with them again so I'm pre-merging master into the
topic.  Besides this topic somehow started to depend on
the strbuf series that was in next, but is now in master.
It no longer compiles on its own without the strbuf API.

* master: (184 commits)
  Whip post 1.5.3.4 maintenance series into shape.
  Minor usage update in setgitperms.perl
  manual: use 'URL' instead of 'url'.
  manual: add some markup.
  manual: Fix example finding commits referencing given content.
  Fix wording in push definition.
  Fix some typos, punctuation, missing words, minor markup.
  manual: Fix or remove em dashes.
  Add a --dry-run option to git-push.
  Add a --dry-run option to git-send-pack.
  Fix in-place editing functions in convert.c
  instaweb: support for Ruby's WEBrick server
  instaweb: allow for use of auto-generated scripts
  Add 'git-p4 commit' as an alias for 'git-p4 submit'
  hg-to-git speedup through selectable repack intervals
  git-svn: respect Subversion's [auth] section configuration values
  gtksourceview2 support for gitview
  fix contrib/hooks/post-receive-email hooks.recipients error message
  Support cvs via git-shell
  rebase -i: use diff plumbing instead of porcelain
  ...

Conflicts:

	Makefile
	builtin-push.c
	rsh.c
2007-10-16 00:15:25 -04:00
Junio C Hamano
66d4035e10 Merge branch 'ph/strbuf'
* ph/strbuf: (44 commits)
  Make read_patch_file work on a strbuf.
  strbuf_read_file enhancement, and use it.
  strbuf change: be sure ->buf is never ever NULL.
  double free in builtin-update-index.c
  Clean up stripspace a bit, use strbuf even more.
  Add strbuf_read_file().
  rerere: Fix use of an empty strbuf.buf
  Small cache_tree_write refactor.
  Make builtin-rerere use of strbuf nicer and more efficient.
  Add strbuf_cmp.
  strbuf_setlen(): do not barf on setting length of an empty buffer to 0
  sq_quote_argv and add_to_string rework with strbuf's.
  Full rework of quote_c_style and write_name_quoted.
  Rework unquote_c_style to work on a strbuf.
  strbuf API additions and enhancements.
  nfv?asprintf are broken without va_copy, workaround them.
  Fix the expansion pattern of the pseudo-static path buffer.
  builtin-for-each-ref.c::copy_name() - do not overstep the buffer.
  builtin-apply.c: fix a tiny leak introduced during xmemdupz() conversion.
  Use xmemdupz() in many places.
  ...
2007-10-03 03:06:02 -07:00
Junio C Hamano
0341091a9e Merge branch 'jc/autogc'
* jc/autogc:
  git-gc --auto: run "repack -A -d -l" as necessary.
  git-gc --auto: restructure the way "repack" command line is built.
  git-gc --auto: protect ourselves from accumulated cruft
  git-gc --auto: add documentation.
  git-gc --auto: move threshold check to need_to_gc() function.
  repack -A -d: use --keep-unreachable when repacking
  pack-objects --keep-unreachable
  Export matches_pack_name() and fix its return value
  Invoke "git gc --auto" from commit, merge, am and rebase.
  Implement git gc --auto
2007-10-03 03:05:32 -07:00
Andy Parkins
856665f827 parse_date_format(): convert a format name to an enum date_mode
Factor out the code to parse --date=<format> parameter to revision
walkers into a separate function, parse_date_format().  This function
is passed a string and converts it to an enum date_format:

 - "relative"         => DATE_RELATIVE
 - "iso8601" or "iso" => DATE_ISO8601
 - "rfc2822"          => DATE_RFC2822
 - "short"            => DATE_SHORT
 - "local"            => DATE_LOCAL
 - "default"          => DATE_NORMAL

In the event that none of these strings is found, the function die()s.

Signed-off-by: Andy Parkins <andyparkins@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-29 20:31:59 -07:00
Carlos Rica
102c2338da Move make_cache_entry() from merge-recursive.c into read-cache.c
The function make_cache_entry() is too useful to be hidden away in
merge-recursive.  So move it to libgit.a (exposing it via cache.h).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-26 13:42:10 -07:00
Pierre Habouzit
19247e5510 nfv?asprintf are broken without va_copy, workaround them.
* drop nfasprintf.
* move nfvasprintf into imap-send.c back, and let it work on a 8k buffer,
  and die() in case of overflow. It should be enough for imap commands, if
  someone cares about imap-send, he's welcomed to fix it properly.
* replace nfvasprintf use in merge-recursive with a copy of the strbuf_addf
  logic, it's one place, we'll live with it.
  To ease the change, output_buffer string list is replaced with a strbuf ;)
* rework trace.c to call vsnprintf itself.  It's used to format strerror()s
  and git command names, it should never be more than a few octets long, let
  it work on a 8k static buffer with vsnprintf or die loudly.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
2007-09-20 23:17:22 -07:00
Daniel Barkalow
b888d61c83 Make fetch a builtin
Thanks to Johannes Schindelin for review and fixes, and Julian
Phillips for the original C translation.

This changes a few small bits of behavior:

branch.<name>.merge is parsed as if it were the lhs of a fetch
refspec, and does not have to exactly match the actual lhs of a
refspec, so long as it is a valid abbreviation for the same ref.

branch.<name>.merge is no longer ignored if the remote is configured
with a branches/* file. Neither behavior is useful, because there can
only be one ref that gets fetched, but this is more consistant.

Also, fetch prints different information to standard out.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-19 03:22:30 -07:00
Junio C Hamano
39bd2eb56a Merge branch 'master' into ph/strbuf
* master: (94 commits)
  Fixed update-hook example allow-users format.
  Documentation/git-svn: updated design philosophy notes
  t/t4014: test "am -3" with mode-only change.
  git-commit.sh: Shell script cleanup
  preserve executable bits in zip archives
  Fix lapsus in builtin-apply.c
  git-push: documentation and tests for pushing only branches
  git-svnimport: Use separate arguments in the pipe for git-rev-parse
  contrib/fast-import: add perl version of simple example
  contrib/fast-import: add simple shell example
  rev-list --bisect: Bisection "distance" clean up.
  rev-list --bisect: Move some bisection code into best_bisection.
  rev-list --bisect: Move finding bisection into do_find_bisection.
  Document ls-files --with-tree=<tree-ish>
  git-commit: partial commit of paths only removed from the index
  git-commit: Allow partial commit of file removal.
  send-email: make message-id generation a bit more robust
  git-apply: fix whitespace stripping
  git-gui: Disable native platform text selection in "lists"
  apply --index-info: fall back to current index for mode changes
  ...
2007-09-18 17:42:15 -07:00
Junio C Hamano
000dfd3f6e Export matches_pack_name() and fix its return value
The function sounds boolean; make it behave as one, not "0 for
success, non-zero for failure".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-17 12:25:26 -07:00
Pierre Habouzit
5ecd293d14 Rewrite convert_to_{git,working_tree} to use strbuf's.
* Now, those functions take an "out" strbuf argument, where they store their
  result if any. In that case, it also returns 1, else it returns 0.
* those functions support "in place" editing, in the sense that it's OK to
  call them this way:
    convert_to_git(path, sb->buf, sb->len, sb);
  When doable, conversions are done in place for real, else the strbuf
  content is just replaced with the new one, transparentely for the caller.

If you want to create a new filter working this way, being the accumulation
of filter1, filter2, ... filtern, then your meta_filter would be:

    int meta_filter(..., const char *src, size_t len, struct strbuf *sb)
    {
        int ret = 0;
        ret |= filter1(...., src, len, sb);
        if (ret) {
            src = sb->buf;
            len = sb->len;
        }
        ret |= filter2(...., src, len, sb);
        if (ret) {
            src = sb->buf;
            len = sb->len;
        }
        ....
        return ret | filtern(..., src, len, sb);
    }

That's why subfilters the convert_to_* functions called were also rewritten
to work this way.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-16 17:30:03 -07:00
Carlos Rica
6640f88165 Move make_cache_entry() from merge-recursive.c into read-cache.c
The function make_cache_entry() is too useful to be hidden away in
merge-recursive.  So move it to libgit.a (exposing it via cache.h).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-12 13:25:07 -07:00
Pierre Habouzit
fd17f5b5f7 Replace all read_fd use with strbuf_read, and get rid of it.
This brings builtin-stripspace, builtin-tag and mktag to use strbufs.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-10 12:50:58 -07:00
René Scharfe
89b4256cfb Remove unused function convert_sha1_file()
convert_sha1_file() became unused by the previous patch -- remove it.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-03 16:46:23 -07:00
Junio C Hamano
aecbf914c4 git-diff: resurrect the traditional empty "diff --git" behaviour
The warning message to suggest "Consider running git-status" from
"git-diff" that we experimented with during the 1.5.3 cycle turns
out to be a bad idea.  It robbed cache-dirty information from people
who valued it, while still asking users to run "update-index --refresh".
It was hoped that the new behaviour would at least have some educational
value, but not showing the cache-dirty paths like before meant that the
user would not even know easily which paths were cache-dirty, and it
made the need to refresh the index look like even more unnecessary chore.

This commit reinstates the traditional behaviour, but with a twist.

By default, the empty "diff --git" output is totally squelched out
from "git diff" output.  At the end of the command, it automatically
runs "update-index --refresh" as needed, without even bothering the
user.  In other words, people who do not care about the cache-dirtyness
do not even have to see the warning.

The traditional behaviour to see the stat-dirty output and to bypassing
the overhead of content comparison can be specified by setting the
configuration variable diff.autorefreshindex to false.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-31 23:30:14 -07:00
Alexandre Julliard
d616813d75 git-add: Add support for --refresh option.
This allows to refresh only a subset of the project files, based on
the specified pathspecs.

Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-13 12:58:38 -07:00
Junio C Hamano
55d1932bce Merge branch 'cr/tag'
* cr/tag:
  Teach "git stripspace" the --strip-comments option
  Make verify-tag a builtin.
  builtin-tag.c: Fix two memory leaks and minor notation changes.
  launch_editor(): Heed GIT_EDITOR and core.editor settings
  Make git tag a builtin.
2007-08-10 23:17:46 -07:00
Junio C Hamano
af3785dc5a Optimize "diff --cached" performance.
The read_tree() function is called only from the call chain to
run "git diff --cached" (this includes the internal call made by
git-runstatus to run_diff_index()).  The function vacates stage
without any funky "merge" magic.  The caller then goes and
compares stage #1 entries from the tree with stage #0 entries
from the original index.

When adding the cache entries this way, it used the general
purpose add_cache_entry().  This function looks for an existing
entry to replace or if there is none to find where to insert the
new entry, resolves D/F conflict and all the other things.

For the purpose of reading entries into an empty stage, none of
that processing is needed.  We can instead append everything and
then sort the result at the end.

This commit changes read_tree() to first make sure that there is
no existing cache entries at specified stage, and if that is the
case, it runs add_cache_entry() with ADD_CACHE_JUST_APPEND flag
(new), and then sort the resulting cache using qsort().

This new flag tells add_cache_entry() to omit all the checks
such as "Does this path already exist?  Does adding this path
remove other existing entries because it turns a directory to a
file?" and instead append the given cache entry straight at the
end of the active cache.  The caller of course is expected to
sort the resulting cache at the end before using the result.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-10 11:44:23 -07:00
Johannes Schindelin
e90fdc39b6 Clean up work-tree handling
The old version of work-tree support was an unholy mess, barely readable,
and not to the point.

For example, why do you have to provide a worktree, when it is not used?
As in "git status".  Now it works.

Another riddle was: if you can have work trees inside the git dir, why
are some programs complaining that they need a work tree?

IOW it is allowed to call

	$ git --git-dir=../ --work-tree=. bla

when you really want to.  In this case, you are both in the git directory
and in the working tree.  So, programs have to actually test for the right
thing, namely if they are inside a working tree, and not if they are
inside a git directory.

Also, GIT_DIR=../.git should behave the same as if no GIT_DIR was
specified, unless there is a repository in the current working directory.
It does now.

The logic to determine if a repository is bare, or has a work tree
(tertium non datur), is this:

--work-tree=bla overrides GIT_WORK_TREE, which overrides core.bare = true,
which overrides core.worktree, which overrides GIT_DIR/.. when GIT_DIR
ends in /.git, which overrides the directory in which .git/ was found.

In related news, a long standing bug was fixed: when in .git/bla/x.git/,
which is a bare repository, git formerly assumed ../.. to be the
appropriate git dir.  This problem was reported by Shawn Pearce to have
caused much pain, where a colleague mistakenly ran "git init" in "/" a
long time ago, and bare repositories just would not work.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 00:38:31 -07:00
Johannes Schindelin
d7ac12b25d Add set_git_dir() function
With the function set_git_dir() you can reset the path that will
be used for git_path(), git_dir() and friends.

The responsibility to close files and throw away information from the
old git_dir lies with the caller.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 00:38:31 -07:00
Johannes Schindelin
e5392c5146 Add is_absolute_path() and make_absolute_path()
This patch adds convenience functions to work with absolute paths.
The function is_absolute_path() should help the efforts to integrate
the MinGW fork.

Note that make_absolute_path() returns a pointer to a static buffer.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 00:38:30 -07:00
Johannes Schindelin
4d87b9c5db launch_editor(): Heed GIT_EDITOR and core.editor settings
In the commit 'Add GIT_EDITOR environment and core.editor
configuration variables', this was done for the shell scripts.
Port it over to builtin-tag's version of launch_editor(), which
is just about to be refactored into editor.c.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-21 16:51:14 -07:00
Carlos Rica
c4fba0a358 Rename read_pipe() with read_fd() and make its buffer nul-terminated.
The new name is closer to the purpose of the function.

A NUL-terminated buffer makes things easier when callers need that.
Since the function returns only the memory written with data,
almost always allocating more space than needed because final
size is unknown, an extra NUL terminating the buffer is harmless.
It is not included in the returned size, so the function
remains working as before.

Also, now the function allows the buffer passed to be NULL at first,
and alloc_nr is now used for growing the buffer, instead size=*2.

Signed-off-by: Carlos Rica <jasampler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-18 17:30:03 -07:00
Junio C Hamano
73013afd14 Make show_rfc2822_date() just another date output format.
These days, show_date() takes a date_mode parameter to specify
the output format, and a separate specialized function for dates
in E-mails does not make much sense anymore.

This retires show_rfc2822_date() function and make it just
another date output format.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-13 23:14:52 -07:00
Robin Rosenberg
ee8f838e03 Support output ISO 8601 format dates
Support output of full ISO 8601 style dates in e.g. git log
and other places that use interpolation for formatting.

Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-13 22:47:49 -07:00
Brian Downing
0b87b6e081 Add functions for parsing integers with size suffixes
Split out the nnn{k,m,g} parsing code from git_config_int into
git_parse_long, so command-line parameters can enjoy the same
functionality.  Also add get_parse_ulong for unsigned values.

Make git_config_int use git_parse_long, and add get_config_ulong
as well.

Signed-off-by: Brian Downing <bdowning@lavos.net>
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-12 14:32:35 -07:00
Brian Gernhardt
54adf3706c Add core.pager config variable.
This adds a configuration variable that performs the same function as,
but is overridden by, GIT_PAGER.

Signed-off-by: Brian Gernhardt <benji@silverinsanity.com>
Acked-by: Johannes E. Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-04 10:09:32 -07:00
Junio C Hamano
660d579d6f Merge branch 'jc/quote'
* jc/quote:
  Add core.quotepath configuration variable.
2007-07-01 14:57:51 -07:00
Junio C Hamano
0305b63654 Merge branch 'ei/worktree+filter'
* ei/worktree+filter:
  filter-branch: always export GIT_DIR if it is set
  setup_git_directory: fix segfault if repository is found in cwd
  test GIT_WORK_TREE
  extend rev-parse test for --is-inside-work-tree
  Use new semantics of is_bare/inside_git_dir/inside_work_tree
  introduce GIT_WORK_TREE to specify the work tree
  test git rev-parse
  rev-parse: introduce --is-bare-repository
  rev-parse: document --is-inside-git-dir
2007-07-01 13:10:42 -07:00
Theodore Ts'o
06f59e9f5d Don't fflush(stdout) when it's not helpful
This patch arose from a discussion started by Jim Meyering's patch
whose intention was to provide better diagnostics for failed writes.
Linus proposed a better way to do things, which also had the added
benefit that adding a fflush() to git-log-* operations and incremental
git-blame operations could improve interactive respose time feel, at
the cost of making things a bit slower when we aren't piping the
output to a downstream program.

This patch skips the fflush() calls when stdout is a regular file, or
if the environment variable GIT_FLUSH is set to "0".  This latter can
speed up a command such as:

GIT_FLUSH=0 strace -c -f -e write time git-rev-list HEAD | wc -l

a tiny amount.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-30 20:16:12 -07:00
Junio C Hamano
9378c16135 Add core.quotepath configuration variable.
We always quote "unusual" byte values in a pathname using
C-string style, to make it safer for parsing scripts that do not
handle NUL separated records well (or just too lazy to bother).
The absolute minimum bytes that need to be quoted for this
purpose are TAB, LF (and other control characters), double quote
and backslash.

However, we have also always quoted the bytes in high 8-bit
range; this was partly because we were lazy and partly because
we were being cautious.

This introduces an internal "quote_path_fully" variable, and
core.quotepath configuration variable to control it.  When set
to false, it does not quote bytes in high 8-bit range anymore
but passes them intact.

The variable defaults to "true" to retain the traditional
behaviour for now.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-24 15:11:42 -07:00
Junio C Hamano
9bee7aabcd Merge branch 'ei/oneline+add-empty'
* ei/oneline+add-empty:
  Fix ALLOC_GROW calls with obsolete semantics
  Fix ALLOC_GROW off-by-one
  builtin-add: simplify (and increase accuracy of) exclude handling
  dir_struct: add collect_ignored option
  Extend --pretty=oneline to cover the first paragraph,
  Lift 16kB limit of log message output
2007-06-22 23:32:19 -07:00
Jeff King
c927e6c69b Fix ALLOC_GROW off-by-one
The ALLOC_GROW macro will never let us fill the array completely,
instead allocating an extra chunk if that would be the case. This is
because the 'nr' argument was originally treated as "how much we do have
now" instead of "how much do we want". The latter makes much more
sense because you can grow by more than one item.

This off-by-one never resulted in an error because it meant we were
overly conservative about when to allocate. Any callers which passed
"how much we have now" need to be updated, or they will fail to allocate
enough.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-16 17:55:44 -07:00
Junio C Hamano
4175e9e3a8 More static
There still are quite a few symbols that ought to be static.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-13 02:02:10 -07:00
Junio C Hamano
4234a76167 Extend --pretty=oneline to cover the first paragraph,
so that an ugly commit message like this can be
handled sanely.

Currently, --pretty=oneline and --pretty=email (hence
format-patch) take and use only the first line of the commit log
message.  This changes them to:

 - Take the first paragraph, where the definition of the first
   paragraph is "skip all blank lines from the beginning, and
   then grab everything up to the next empty line".

 - Replace all line breaks with a whitespace.

This change would not affect a well-behaved commit message that
adheres to the convention of "single line summary, a blank line,
and then body of message", as its first paragraph always
consists of a single line.  Commit messages from different
culture, such as the ones imported from CVS/SVN, can however get
chomped with the existing behaviour at the first linebreak in
the middle of sentence right now, which would become much easier
to see with this change.

The Subject: and --pretty=oneline output would become very long
and unsightly for non-conforming commits, but their messages are
already ugly anyway, and thischange at least avoids the loss of
information.

The Subject: line from a multi-line paragraph is folded using
RFC2822 line folding rules at the places where line breaks were
in the original.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-13 00:41:21 -07:00
Jeff King
6815e56933 refactor dir_add_name
This is in preparation for keeping two entry lists in the
dir object.

This patch adds and uses the ALLOC_GROW() macro, which
implements the commonly used idiom of growing a dynamic
array using the alloc_nr function (not just in dir.c, but
everywhere).

We also move creation of a dir_entry to dir_entry_new.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-12 23:00:31 -07:00
Junio C Hamano
a6080a0a44 War on whitespace
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept in to our source files over time.  There are a few files that need
to have trailing whitespaces (most notably, test vectors).  The results
still passes the test, and build result in Documentation/ area is unchanged.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-07 00:04:01 -07:00
Matthias Lederhofer
892c41b98a introduce GIT_WORK_TREE to specify the work tree
setup_gdg is used as abbreviation for setup_git_directory_gently.

The work tree can be specified using the environment variable
GIT_WORK_TREE and the config option core.worktree (the environment
variable has precendence over the config option).  Additionally
there is a command line option --work-tree which sets the
environment variable.

setup_gdg does the following now:

GIT_DIR unspecified
repository in .git directory
    parent directory of the .git directory is used as work tree,
    GIT_WORK_TREE is ignored

GIT_DIR unspecified
repository in cwd
    GIT_DIR is set to cwd
    see the cases with GIT_DIR specified what happens next and
    also see the note below

GIT_DIR specified
GIT_WORK_TREE/core.worktree unspecified
    cwd is used as work tree

GIT_DIR specified
GIT_WORK_TREE/core.worktree specified
    the specified work tree is used

Note on the case where GIT_DIR is unspecified and repository is in cwd:
    GIT_WORK_TREE is used but is_inside_git_dir is always true.
    I did it this way because setup_gdg might be called multiple
    times (e.g. when doing alias expansion) and in successive calls
    setup_gdg should do the same thing every time.

Meaning of is_bare/is_inside_work_tree/is_inside_git_dir:

(1) is_bare_repository
    A repository is bare if core.bare is true or core.bare is
    unspecified and the name suggests it is bare (directory not
    named .git).  The bare option disables a few protective
    checks which are useful with a working tree.  Currently
    this changes if a repository is bare:
        updates of HEAD are allowed
        git gc packs the refs
        the reflog is disabled by default

(2) is_inside_work_tree
    True if the cwd is inside the associated working tree (if there
    is one), false otherwise.

(3) is_inside_git_dir
    True if the cwd is inside the git directory, false otherwise.
    Before this patch is_inside_git_dir was always true for bare
    repositories.

When setup_gdg finds a repository git_config(git_default_config) is
always called.  This ensure that is_bare_repository makes use of
core.bare and does not guess even though core.bare is specified.

inside_work_tree and inside_git_dir are set if setup_gdg finds a
repository.  The is_inside_work_tree and is_inside_git_dir functions
will die if they are called before a successful call to setup_gdg.

Signed-off-by: Matthias Lederhofer <matled@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-06 16:07:53 -07:00
Junio C Hamano
17c2929aa2 Merge branch 'sp/pack'
* sp/pack:
  Style nit - don't put space after function names
  Ensure the pack index is opened before access
  Simplify index access condition in count-objects, pack-redundant
  Test for recent rev-parse $abbrev_sha1 regression
  rev-parse: Identify short sha1 sums correctly.
  Attempt to delay prepare_alt_odb during get_sha1
  Micro-optimize prepare_alt_odb
  Lazily open pack index files on demand
2007-06-02 12:18:51 -07:00
Junio C Hamano
bd724be4be Merge branch 'maint'
* maint:
  git-config: Improve documentation of git-config file handling
  git-config: Various small fixes to asciidoc documentation
  decode_85(): fix missing return.
  fix signed range problems with hex conversions
2007-05-31 00:15:14 -07:00
Junio C Hamano
8e29f903eb Merge branch 'maint-1.5.1' into maint
* maint-1.5.1:
  git-config: Improve documentation of git-config file handling
  git-config: Various small fixes to asciidoc documentation
  decode_85(): fix missing return.
  fix signed range problems with hex conversions
2007-05-31 00:09:26 -07:00
Linus Torvalds
192a6be2a7 fix signed range problems with hex conversions
Make hexval_table[] "const".  Also make sure that the accessor
function hexval() does not access the table with out-of-range
values by declaring its parameter "unsigned char", instead of
"unsigned int".

With this, gcc can just generate:

	movzbl  (%rdi), %eax
	movsbl  hexval_table(%rax),%edx
	movzbl  1(%rdi), %eax
	movsbl  hexval_table(%rax),%eax
	sall    $4, %edx
	orl     %eax, %edx

for the code to generate a byte from two hex characters.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-30 15:01:37 -07:00
Junio C Hamano
322bcd9a9a Merge branch 'db/remote'
* db/remote:
  Move refspec pattern matching to match_refs().
  Update local tracking refs when pushing
  Add handlers for fetch-side configuration of remotes.
  Move refspec parser from connect.c and cache.h to remote.{c,h}
  Move remote parsing into a library file out of builtin-push.
2007-05-29 01:24:20 -07:00
Shawn O. Pearce
d079837eee Lazily open pack index files on demand
In some repository configurations the user may have many packfiles,
but all of the recent commits/trees/tags/blobs are likely to
be in the most recent packfile (the one with the newest mtime).
It is therefore common to be able to complete an entire operation
by accessing only one packfile, even if there are 25 packfiles
available to the repository.

Rather than opening and mmaping the corresponding .idx file for
every pack found, we now only open and map the .idx when we suspect
there might be an object of interest in there.

Of course we cannot known in advance which packfile contains an
object, so we still need to scan the entire packed_git list to
locate anything.  But odds are users want to access objects in the
most recently created packfiles first, and that may be all they
ever need for the current operation.

Junio observed in b867092f that placing recent packfiles before
older ones can slightly improve access times for recent objects,
without degrading it for historical object access.

This change improves upon Junio's observations by trying even harder
to avoid the .idx files that we won't need.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-26 20:28:08 -07:00
Martin Waitz
302b9282c9 rename dirlink to gitlink.
Unify naming of plumbing dirlink/gitlink concept:

git ls-files -z '*.[ch]' |
xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;'

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-21 23:34:54 -07:00
Daniel Barkalow
6b62816cb1 Move refspec parser from connect.c and cache.h to remote.{c,h}
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-20 21:32:56 -07:00
Junio C Hamano
45bde46bfb Merge branch 'dh/pack'
* dh/pack:
  Custom compression levels for objects and packs
2007-05-20 02:19:19 -07:00
Junio C Hamano
e223249a13 Merge branch 'mst/connect'
* mst/connect:
  connect: display connection progress
2007-05-20 02:18:50 -07:00
Junio C Hamano
cc93020f52 Merge branch 'np/pack'
* np/pack:
  deprecate the new loose object header format
  make "repack -f" imply "pack-objects --no-reuse-object"
  allow for undeltified objects not to be reused
2007-05-20 02:18:43 -07:00
René Scharfe
5e6cfc80e2 git-archive: convert archive entries like checkouts do
As noted by Johan Herland, git-archive is a kind of checkout and needs
to apply any checkout filters that might be configured.

This patch adds the convenience function convert_sha1_file which returns
a buffer containing the object's contents, after converting, if necessary
(i.e. it's a combination of read_sha1_file and convert_to_working_tree).
Direct calls to read_sha1_file in git-archive are then replaced by calls
to convert_sha1_file.

Since convert_sha1_file expects its path argument to be NUL-terminated --
a convention it inherits from convert_to_working_tree -- the patch also
changes the path handling in archive-tar.c to always NUL-terminate the
string.  It used to solely rely on the len field of struct strbuf before.

archive-zip.c already NUL-terminates the path and thus needs no such
change.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-18 16:36:45 -07:00
Michael S. Tsirkin
7841ce7985 connect: display connection progress
Make git notify the user about host resolution/connection attempts.
This is useful both as a progress indicator on slow links, and helps
reassure the user there are no firewall problems.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-16 12:48:18 -07:00
Junio C Hamano
f859c846e9 Add has_symlink_leading_path() function.
When we are applying a patch that creates a blob at a path, or
when we are switching from a branch that does not have a blob at
the path to another branch that has one, we need to make sure
that there is nothing at the path in the working tree, as such a
file is a local modification made by the user that would be lost
by the operation.

Normally, lstat() on the path and making sure ENOENT is returned
is good enough for that purpose.  However there is a twist.  We
may be creating a regular file arch/x86_64/boot/Makefile, while
removing an existing symbolic link at arch/x86_64/boot that
points at existing ../i386/boot directory that has Makefile in
it.  We always first check without touching filesystem and then
perform the actual operation, so when we verify the new file,
arch/x86_64/boot/Makefile, does not exist, we haven't removed
the symbolic link arc/x86_64/boot symbolic link yet.  lstat() on
the file sees through the symbolic link and reports the file is
there, which is not what we want.

The function has_symlink_leading_path() function takes a path,
and sees if any of the leading directory component is a symbolic
link.

When files in a new directory are created, we tend to process
them together because both index and tree are sorted.  The
function takes advantage of this and allows the caller to cache
and reuse which symbolic link on the filesystem caused the
function to return true.

The calling sequence would be:

	char last_symlink[PATH_MAX];

        *last_symlink = '\0';
        for each index entry {
		if (!lose)
			continue;
		if (lstat(it))
			if (errno == ENOENT)
				; /* happy */
			else
				error;
		else if (has_symlink_leading_path(it, last_symlink))
			; /* happy */
		else
			error; /* would lose local changes */
		unlink_entry(it, last_symlink);
	}

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-11 22:11:07 -07:00
Dana How
960ccca680 Custom compression levels for objects and packs
Add config variables pack.compression and core.loosecompression ,
and switch --compression=level to pack-objects.

Loose objects will be compressed using core.loosecompression if set,
else core.compression if set, else Z_BEST_SPEED.
Packed objects will be compressed using --compression=level if seen,
else pack.compression if set, else core.compression if set,
else Z_DEFAULT_COMPRESSION.  This is the "pack compression level".

Loose objects added to a pack undeltified will be recompressed
to the pack compression level if it is unequal to the current
loose compression level by the preceding rules,  or if the loose
object was written while core.legacyheaders = true.  Newly
deltified loose objects are always compressed to the current
pack compression level.

Previously packed objects added to a pack are recompressed
to the current pack compression level exactly when their
deltification status changes,  since the previous pack data
cannot be reused.

In either case,  the --no-reuse-object switch from the first
patch below will always force recompression to the current pack
compression level,  instead of assuming the pack compression level
hasn't changed and pack data can be reused when possible.

This applies on top of the following patches from Nicolas Pitre:
[PATCH] allow for undeltified objects not to be reused
[PATCH] make "repack -f" imply "pack-objects --no-reuse-object"

Signed-off-by: Dana L. How <danahow@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-10 15:23:09 -07:00
Nicolas Pitre
726f852b0e deprecate the new loose object header format
Now that we encourage and actively preserve objects in a packed form
more agressively than we did at the time the new loose object format and
core.legacyheaders were introduced, that extra loose object format
doesn't appear to be worth it anymore.

Because the packing of loose objects has to go through the delta match
loop anyway, and since most of them should end up being deltified in
most cases, there is really little advantage to have this parallel loose
object format as the CPU savings it might provide is rather lost in the
noise in the end.

This patch gets rid of core.legacyheaders, preserve the legacy format as
the only writable loose object format and deprecate the other one to
keep things simpler.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-10 15:22:33 -07:00
Junio C Hamano
a7b02ccf9a Add --date={local,relative,default}
This adds --date={local,relative,default} option to log family of commands,
to allow displaying timestamps in user's local timezone, relative time, or
the default format.

Existing --relative-date option is a synonym of --date=relative; we could
probably deprecate it in the long run.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-25 21:39:43 -07:00
Luiz Fernando N. Capitulino
efbc583126 entry.c: Use const qualifier for 'struct checkout' parameters
Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-25 13:15:59 -07:00
Junio C Hamano
da94faf671 Merge branch 'jc/the-index'
* jc/the-index:
  Make read-cache.c "the_index" free.
  Move index-related variables into a structure.
2007-04-24 22:13:22 -07:00
Martin Koegler
a0cd87a570 add get_sha1_with_mode
get_sha1_with_mode basically behaves as get_sha1. It has an additional
parameter for storing the mode of the object.

If the mode can not be determined, it stores S_IFINVALID.

Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-24 00:08:49 -07:00
Martin Koegler
40689ae1ef Add S_IFINVALID mode
S_IFINVALID is used to signal, that no mode information is available.

Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-24 00:08:49 -07:00
Junio C Hamano
4aab5b46f4 Make read-cache.c "the_index" free.
This makes all low-level functions defined in read-cache.c to
take an explicit index_state structure as their first parameter,
to specify which index to work on.  These functions
traditionally operated on "the_index" and were named foo_cache();
the counterparts this patch introduces are called foo_index().

The traditional foo_cache() functions are made into macros that
give "the_index" to their corresponding foo_index() functions.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-22 22:53:54 -07:00
Junio C Hamano
228e94f935 Move index-related variables into a structure.
This defines a index_state structure and moves index-related
global variables into it.  Currently there is one instance of
it, the_index, and everybody accesses it, so there is no code
change.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-22 22:53:54 -07:00
Junio C Hamano
42c4b58059 Merge branch 'lt/objalloc'
* 'lt/objalloc':
  Clean up object creation to use more common code
  Use proper object allocators for unknown object nodes too
2007-04-21 17:42:02 -07:00
Junio C Hamano
a2d7c6c620 Merge branch 'jc/attr'
* 'jc/attr': (28 commits)
  lockfile: record the primary process.
  convert.c: restructure the attribute checking part.
  Fix bogus linked-list management for user defined merge drivers.
  Simplify calling of CR/LF conversion routines
  Document gitattributes(5)
  Update 'crlf' attribute semantics.
  Documentation: support manual section (5) - file formats.
  Simplify code to find recursive merge driver.
  Counto-fix in merge-recursive
  Fix funny types used in attribute value representation
  Allow low-level driver to specify different behaviour during internal merge.
  Custom low-level merge driver: change the configuration scheme.
  Allow the default low-level merge driver to be configured.
  Custom low-level merge driver support.
  Add a demonstration/test of customized merge.
  Allow specifying specialized merge-backend per path.
  merge-recursive: separate out xdl_merge() interface.
  Allow more than true/false to attributes.
  Document git-check-attr
  Change attribute negation marker from '!' to '-'.
  ...
2007-04-21 17:38:00 -07:00
Junio C Hamano
afb5b6a24b Merge branch 'lt/gitlink'
* lt/gitlink:
  Tests for core subproject support
  Expose subprojects as special files to "git diff" machinery
  Fix some "git ls-files -o" fallout from gitlinks
  Teach "git-read-tree -u" to check out submodules as a directory
  Teach git list-objects logic to not follow gitlinks
  Fix gitlink index entry filesystem matching
  Teach "git-read-tree -u" to check out submodules as a directory
  Teach git list-objects logic not to follow gitlinks
  Don't show gitlink directories when we want "other" files
  Teach git-update-index about gitlinks
  Teach directory traversal about subprojects
  Fix thinko in subproject entry sorting
  Teach core object handling functions about gitlinks
  Teach "fsck" not to follow subproject links
  Add "S_IFDIRLNK" file mode infrastructure for git links
  Add 'resolve_gitlink_ref()' helper function
  Avoid overflowing name buffer in deep directory structures
  diff-lib: use ce_mode_from_stat() rather than messing with modes manually
2007-04-21 17:21:10 -07:00
Junio C Hamano
99ebd06c18 Merge branch 'np/pack'
* np/pack: (27 commits)
  document --index-version for index-pack and pack-objects
  pack-objects: remove obsolete comments
  pack-objects: better check_object() performances
  add get_size_from_delta()
  pack-objects: make in_pack_header_size a variable of its own
  pack-objects: get rid of create_final_object_list()
  pack-objects: get rid of reuse_cached_pack
  pack-objects: clean up list sorting
  pack-objects: rework check_delta_limit usage
  pack-objects: equal objects in size should delta against newer objects
  pack-objects: optimize preferred base handling a bit
  clean up add_object_entry()
  tests for various pack index features
  use test-genrandom in tests instead of /dev/urandom
  simple random data generator for tests
  validate reused pack data with CRC when possible
  allow forcing index v2 and 64-bit offset treshold
  pack-redundant.c: learn about index v2
  show-index.c: learn about index v2
  sha1_file.c: learn about index version 2
  ...
2007-04-21 17:20:50 -07:00
Junio C Hamano
5e635e3960 lockfile: record the primary process.
The usual process flow is the main process opens and holds the lock to
the index, does its thing, perhaps spawning children during the course,
and then writes the resulting index out by releaseing the lock.

However, the lockfile interface uses atexit(3) to clean it up, without
regard to who actually created the lock.  This typically leads to a
confusing behaviour of lock being released too early when the child
exits, and then the parent process when it calls commit_lockfile()
finds that it cannot unlock it.

This fixes the problem by recording who created and holds the lock, and
upon atexit(3) handler, child simply ignores the lockfile the parent
created.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-21 11:55:23 -07:00
Alex Riesen
ac78e54804 Simplify calling of CR/LF conversion routines
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-20 23:24:34 -07:00
Junio C Hamano
17bee1947a Merge branch 'maint'
* maint:
  Use const qualifier for 'sha1' parameter in delete_ref function
2007-04-17 22:17:29 -07:00
Carlos Rica
1401f46bb4 Use const qualifier for 'sha1' parameter in delete_ref function
delete_ref function does not change the 'sha1' parameter. Non-const pointer
causes a compiler warning if you call to the function using a const argument.

Signed-off-by: Carlos Rica <jasampler@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-17 22:00:18 -07:00
Linus Torvalds
100c5f3b0b Clean up object creation to use more common code
This replaces the fairly odd "created_object()" function that did _most_
of the object setup with a more complete "create_object()" function that
also has a more natural calling convention.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-16 23:36:16 -07:00
Linus Torvalds
2c1cbec1e2 Use proper object allocators for unknown object nodes too
We used to use a different allocator scheme for when we didn't know the
object type.  That meant that objects that were created without any
up-front knowledge of the type would not go through the same allocation
paths as normal object allocations, and would miss out on the statistics.

But perhaps more importantly than the statistics (that are useful when
looking at memory usage but not much else), if we want to make the
object hash tables use a denser object pointer representation, we need
to make sure that they all go through the same blocking allocator.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-16 23:36:11 -07:00
Nicolas Pitre
54dab52ae8 add get_size_from_delta()
... which consists of existing code split out of packed_delta_info()
for other callers to use it as well.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-16 17:43:31 -07:00
Junio C Hamano
f48fd68887 attribute macro support
This adds "attribute macros" (for lack of better name).  So far,
we have low-level attributes such as crlf and diff, which are
defined in operational terms --- setting or unsetting them on a
particular path directly affects what is done to the path.  For
example, in order to decline diffs or crlf conversions on a
binary blob, no diffs on PostScript files, and treat all other
files normally, you would have something like these:

	*		diff crlf
	*.ps		!diff
	proprietary.o	!diff !crlf

That is fine as the operation goes, but gets unwieldy rather
rapidly, when we start adding more low-level attributes that are
defined in operational terms.  A near-term example of such an
attribute would be 'merge-3way' which would control if git
should attempt the usual 3-way file-level merge internally, or
leave merging to a specialized external program of user's
choice.  When it is added, we do _not_ want to force the users
to update the above to:

	*		diff crlf merge-3way
	*.ps		!diff
	proprietary.o	!diff !crlf !merge-3way

The way this patch solves this issue is to realize that the
attributes the user is assigning to paths are not defined in
terms of operations but in terms of what they are.

All of the three low-level attributes usually make sense for
most of the files that sane SCM users have git operate on (these
files are typically called "text').  Only a few cases, such as
binary blob, need exception to decline the "usual treatment
given to text files" -- and people mark them as "binary".

So this allows the $GIT_DIR/info/alternates and .gitattributes
at the toplevel of the project to also specify attributes that
assigns other attributes.  The syntax is '[attr]' followed by an
attribute name followed by a list of attribute names:

	[attr] binary	!diff !crlf !merge-3way

When "binary" attribute is set to a path, if the path has not
got diff/crlf/merge-3way attribute set or unset by other rules,
this rule unsets the three low-level attributes.

It is expected that the user level .gitattributes will be
expressed mostly in terms of attributes based on what the files
are, and the above sample would become like this:

	(built-in attribute configuration)
	[attr] binary	!diff !crlf !merge-3way
	*		diff crlf merge-3way

	(project specific .gitattributes)
	proprietary.o	binary

	(user preference $GIT_DIR/info/attributes)
	*.ps		!diff

There are a few caveats.

 * As described above, you can define these macros only in
   $GIT_DIR/info/attributes and toplevel .gitattributes.

 * There is no attempt to detect circular definition of macro
   attributes, and definitions are evaluated from bottom to top
   as usual to fill in other attributes that have not yet got
   values.  The following would work as expected:

	[attr] text	diff crlf
	[attr] ps	text !diff
	*.ps	ps

   while this would most likely not (I haven't tried):

	[attr] ps	text !diff
	[attr] text	diff crlf
	*.ps	ps

 * When a macro says "[attr] A B !C", saying that a path does
   not have attribute A does not let you tell anything about
   attributes B or C.  That is, given this:

	[attr] text	diff crlf
	[attr] ps	text !diff
	*.txt !ps

  path hello.txt, which would match "*.txt" pattern, would have
  "ps" attribute set to zero, but that does not make text
  attribute of hello.txt set to false (nor diff attribute set to
  true).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-15 15:43:06 -07:00
Junio C Hamano
d0bfd026a8 Add basic infrastructure to assign attributes to paths
This adds the basic infrastructure to assign attributes to
paths, in a way similar to what the exclusion mechanism does
based on $GIT_DIR/info/exclude and .gitignore files.

An attribute is just a simple string that does not contain any
whitespace.  They can be specified in $GIT_DIR/info/attributes
file, and .gitattributes file in each directory.

Each line in these files defines a pattern matching rule.
Similar to the exclusion mechanism, a later match overrides an
earlier match in the same file, and entries from .gitattributes
file in the same directory takes precedence over the ones from
parent directories.  Lines in $GIT_DIR/info/attributes file are
used as the lowest precedence default rules.

A line is either a comment (an empty line, or a line that begins
with a '#'), or a rule, which is a whitespace separated list of
tokens.  The first token on the line is a shell glob pattern.
The rest are names of attributes, each of which can optionally
be prefixed with '!'.  Such a line means "if a path matches this
glob, this attribute is set (or unset -- if the attribute name
is prefixed with '!').  For glob matching, the same "if the
pattern does not have a slash in it, the basename of the path is
matched with fnmatch(3) against the pattern, otherwise, the path
is matched with the pattern with FNM_PATHNAME" rule as the
exclusion mechanism is used.

This does not define what an attribute means.  Tying an
attribute to various effects it has on git operation for paths
that have it will be specified separately.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-14 08:57:06 -07:00
Junio C Hamano
566f5b217d Merge branch 'maint'
* maint:
  GIT 1.5.1.1
  cvsserver: Fix handling of diappeared files on update
  fsck: do not complain on detached HEAD.
  (encode_85, decode_85): Mark source buffer pointer as "const".
2007-04-11 18:43:01 -07:00
Jim Meyering
f981577202 (encode_85, decode_85): Mark source buffer pointer as "const".
Signed-off-by: Jim Meyering <jim@meyering.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-11 00:51:20 -07:00
Linus Torvalds
9eec4795d4 Add "S_IFDIRLNK" file mode infrastructure for git links
This just adds the basic helper functions to recognize and work with git
tree entries that are links to other git repositories ("subprojects").
They still aren't actually connected up to any of the code-paths, but
now all the infrastructure is in place.

The next commit will start actually adding actual subproject support.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-10 13:46:58 -07:00
Nicolas Pitre
57059091fa get rid of num_packed_objects()
The coming index format change doesn't allow for the number of objects
to be determined from the size of the index file directly.  Instead, Let's
initialize a field in the packed_git structure with the object count when
the index is validated since the count is always known at that point.

While at it let's reorder some struct packed_git fields to avoid padding
due to needed 64-bit alignment for some of them.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-10 12:48:14 -07:00
Junio C Hamano
68faf68938 A new merge stragety 'subtree'.
This merge strategy largely piggy-backs on git-merge-recursive.
When merging trees A and B, if B corresponds to a subtree of A,
B is first adjusted to match the tree structure of A, instead of
reading the trees at the same level.  This adjustment is also
done to the common ancestor tree.

If you are pulling updates from git-gui repository into git.git
repository, the root level of the former corresponds to git-gui/
subdirectory of the latter.  The tree object of git-gui's toplevel
is wrapped in a fake tree object, whose sole entry has name 'git-gui'
and records object name of the true tree, before being used by
the 3-way merge code.

If you are merging the other way, only the git-gui/ subtree of
git.git is extracted and merged into git-gui's toplevel.

The detection of corresponding subtree is done by comparing the
pathnames and types in the toplevel of the tree.

Heuristics galore!  That's the git way ;-).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-07 02:29:40 -07:00
Junio C Hamano
ee9693e246 Merge branch 'jc/index-output'
* jc/index-output:
  git-read-tree --index-output=<file>
  _GIT_INDEX_OUTPUT: allow plumbing to output to an alternative index file.

Conflicts:

	builtin-apply.c
2007-04-07 02:26:24 -07:00
Junio C Hamano
fd1c3bf053 Rename add_file_to_index() to add_file_to_cache()
This function was not called "add_file_to_cache()" only because
an ancient program, update-cache, used that name as an internal
function name that does something slightly different.  Now that
is gone, we can take over the better name.

The plan is to name all functions that operate on the default
index xxx_cache().  Later patches create a variant of them that
take an explicit parameter xxx_index(), and then turn
xxx_cache() functions into macros that use "the_index".

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-05 15:07:16 -07:00
Junio C Hamano
0424138d57 Fix bogus error message from merge-recursive error path
This error message should not usually trigger, but the function
make_cache_entry() called by add_cacheinfo() can return early
without calling into refresh_cache_entry() that sets cache_errno.

Also the error message had a wrong function name reported, and
it did not say anything about which path failed either.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-05 15:07:16 -07:00
Nicolas Pitre
d72308e01c clean up and optimize nth_packed_object_sha1() usage
Let's avoid the open coded pack index reference in pack-object and use
nth_packed_object_sha1() instead.  This will help encapsulating index
format differences in one place.

And while at it there is no reason to copy SHA1's over and over while a
direct pointer to it in the index will do just fine.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-05 14:59:47 -07:00
Junio C Hamano
5e7f56ac33 git-read-tree --index-output=<file>
This corrects the interface mistake of the previous one, and
gives a command line parameter to the only plumbing command that
currently needs it: "git-read-tree".

We can add the calls to set_alternate_index_output() to other
plumbing commands that update the index if/when needed.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-03 23:44:32 -07:00
Junio C Hamano
30ca07a249 _GIT_INDEX_OUTPUT: allow plumbing to output to an alternative index file.
When defined, this allows plumbing commands that update the
index (add, apply, checkout-index, merge-recursive, mv,
read-tree, rm, update-index, and write-tree) to write their
resulting index to an alternative index file while holding a
lock to the original index file.  With this, git-commit that
jumps the index does not have to make an extra copy of the index
file, and more importantly, it can do the update while holding
the lock on the index.

However, I think the interface to let an environment variable
specify the output is a mistake, as shown in the documentation.
If a curious user has the environment variable set to something
other than the file GIT_INDEX_FILE points at, almost everything
will break.  This should instead be a command line parameter to
tell these plumbing commands to write the result in the named
file, to prevent stupid mistakes.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-03 23:44:32 -07:00
Nicolas Pitre
ce9fbf16e0 index-pack: use hash_sha1_file()
Use hash_sha1_file() instead of duplicating code to compute object SHA1.
While at it make it accept a const pointer.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-20 22:09:57 -07:00
Shawn O. Pearce
18bdec1118 Limit the size of the new delta_base_cache
The new configuration variable core.deltaBaseCacheLimit allows the
user to control how much memory they are willing to give to Git for
caching base objects of deltas.  This is not normally meant to be
a user tweakable knob; the "out of the box" settings are meant to
be suitable for almost all workloads.

We default to 16 MiB under the assumption that the cache is not
meant to consume all of the user's available memory, and that the
cache's main purpose was to cache trees, for faster path limiters
during revision traversal.  Since trees tend to be relatively small
objects, this relatively small limit should still allow a large
number of objects.

On the other hand we don't want the cache to start storing 200
different versions of a 200 MiB blob, as this could easily blow
the entire address space of a 32 bit process.

We evict OBJ_BLOB from the cache first (credit goes to Junio) as
we want to favor OBJ_TREE within the cache.  These are the objects
that have the highest inflate() startup penalty, as they tend to
be small and thus don't have that much of a chance to ammortize
that penalty over the entire data.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-18 22:43:37 -07:00
Junio C Hamano
3635a18770 Merge branch 'sp/run-command'
* sp/run-command:
  Use run_command within send-pack
  Use run_command within receive-pack to invoke index-pack
  Use run_command within merge-index
  Use run_command for proxy connections
  Use RUN_GIT_CMD to run push backends
  Correct new compiler warnings in builtin-revert
  Replace fork_with_pipe in bundle with run_command
  Teach run-command to redirect stdout to /dev/null
  Teach run-command about stdout redirection
2007-03-18 22:21:06 -07:00
Nicolas Pitre
4287307833 [PATCH] clean up pack index handling a bit
Especially with the new index format to come, it is more appropriate
to encapsulate more into check_packed_git_idx() and assume less of the
index format in struct packed_git.

To that effect, the index_base is renamed to index_data with void * type
so it is not used directly but other pointers initialized with it. This
allows for a couple pointer cast removal, as well as providing a better
generic name to grep for when adding support for new index versions or
formats.

And index_data is declared const too while at it.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-16 21:27:36 -07:00
Junio C Hamano
c49b260e99 Merge branch 'jc/repack'
* jc/repack:
  prepare_packed_git(): sort packs by age and localness.
2007-03-14 02:08:48 -07:00
Shawn O. Pearce
1a8f27413b Correct new compiler warnings in builtin-revert
The new builtin-revert code introduces a few new compiler errors
when I'm building with my stricter set of checks enabled in CFLAGS.
These all just stem from trying to store a constant string into
a non-const char*.  Simple fix, make the variables const char*.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-12 23:40:18 -07:00
Junio C Hamano
b867092fec prepare_packed_git(): sort packs by age and localness.
When accessing objects, we first look for them in packs that
are linked together in the reverse order of discovery.

Since younger packs tend to contain more recent objects, which
are more likely to be accessed often, and local packs tend to
contain objects more relevant to our specific projects, sort the
list of packs before starting to access them.  In addition,
favoring local packs over the ones borrowed from alternates can
be a win when alternates are mounted on network file systems.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-11 00:04:05 -08:00
Paolo Bonzini
0746d19a82 git-branch, git-checkout: autosetup for remote branch tracking
In order to track and build on top of a branch 'topic' you track from
your upstream repository, you often would end up doing this sequence:

  git checkout -b mytopic origin/topic
  git config --add branch.mytopic.remote origin
  git config --add branch.mytopic.merge refs/heads/topic

This would first fork your own 'mytopic' branch from the 'topic'
branch you track from the 'origin' repository; then it would set up two
configuration variables so that 'git pull' without parameters does the
right thing while you are on your own 'mytopic' branch.

This commit adds a --track option to git-branch, so that "git
branch --track mytopic origin/topic" performs the latter two actions
when creating your 'mytopic' branch.

If the configuration variable branch.autosetupmerge is set to true, you
do not have to pass the --track option explicitly; further patches in
this series allow setting the variable with a "git remote add" option.
The configuration variable is off by default, and there is a --no-track
option to countermand it even if the variable is set.

Signed-off-by: Paolo Bonzini  <bonzini@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-10 23:41:58 -08:00
Junio C Hamano
8509fed75d Merge branch 'jc/fsck'
* jc/fsck:
  fsck: exit with non-zero status upon errors
  unpack_sha1_file(): detect corrupt loose object files.
  fsck: fix broken loose object check.
2007-03-10 23:10:26 -08:00
Shawn O. Pearce
c4001d92be Use off_t when we really mean a file offset.
Not all platforms have declared 'unsigned long' to be a 64 bit value,
but we want to support a 64 bit packfile (or close enough anyway)
in the near future as some projects are getting large enough that
their packed size exceeds 4 GiB.

By using off_t, the POSIX type that is declared to mean an offset
within a file, we support whatever maximum file size the underlying
operating system will handle.  For most modern systems this is up
around 2^60 or higher.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 11:06:25 -08:00
Shawn O. Pearce
326bf39677 Use uint32_t for all packed object counts.
As we permit up to 2^32-1 objects in a single packfile we cannot
use a signed int to represent the object offset within a packfile,
after 2^31-1 objects we will start seeing negative indexes and
error out or compute bad addresses within the mmap'd index.

This is a minor cleanup that does not introduce any significant
logic changes.  It is roach free.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 11:02:33 -08:00
Shawn O. Pearce
3a55602eec General const correctness fixes
We shouldn't attempt to assign constant strings into char*, as the
string is not writable at runtime.  Likewise we should always be
treating unsigned values as unsigned values, not as signed values.

Most of these are very straightforward.  The only exception is the
(unnecessary) xstrdup/free in builtin-branch.c for the detached
head case.  Since this is a user-level interactive type program
and that particular code path is executed no more than once, I feel
that the extra xstrdup call is well worth the easy elimination of
this warning.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 10:47:10 -08:00
Junio C Hamano
7efbff7531 unpack_sha1_file(): detect corrupt loose object files.
We did not detect broken loose object files, either when
underlying inflate() signalled the breakage, nor inflate()
finished and we had garbage trailing at the end.  We do better
now.

We also make unpack_sha1_file() a static function to
sha1_file.c, since it is not used by anybody outside.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-05 00:55:19 -08:00
Johannes Sixt
78a8d641c1 Add core.symlinks to mark filesystems that do not support symbolic links.
Some file systems that can host git repositories and their working copies
do not support symbolic links. But then if the repository contains a symbolic
link, it is impossible to check out the working copy.

This patch enables partial support of symbolic links so that it is possible
to check out a working copy on such a file system.  A new flag
core.symlinks (which is true by default) can be set to false to indicate
that the filesystem does not support symbolic links. In this case, symbolic
links that exist in the trees are checked out as small plain files, and
checking in modifications of these files preserve the symlink property in
the database (as long as an entry exists in the index).

Of course, this does not magically make symbolic links work on such defective
file systems; hence, this solution does not help if the working copy relies
on that an entry is a real symbolic link.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-02 16:58:05 -08:00
Junio C Hamano
8ab3e18586 Merge branch 'js/commit-format'
* js/commit-format:
  show_date(): rename the "relative" parameter to "mode"
  Actually make print_wrapped_text() useful
  pretty-formats: add 'format:<string>'
2007-03-02 00:37:12 -08:00
Junio C Hamano
53bca91a7d index_fd(): pass optional path parameter as hint for blob conversion
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-28 12:00:00 -08:00
Junio C Hamano
edaec3fbe8 index_fd(): use enum object_type instead of type name string.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-28 12:00:00 -08:00
Nicolas Pitre
fef742c4ed make sure enum object_type is signed
This allows for keeping the common idiom which consists of using
negative values to signal error conditions by ensuring that the enum
will be a signed type.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-27 21:37:46 -08:00
Johannes Schindelin
f8493ec09b show_date(): rename the "relative" parameter to "mode"
Now, show_date() can print three different kinds of dates: normal,
relative and short (%Y-%m-%s) dates.

To achieve this, the "int relative" was changed to "enum date_mode
mode", which has three states: DATE_NORMAL, DATE_RELATIVE and
DATE_SHORT.

Since existing users of show_date() only call it with relative_date
being either 0 or 1, and DATE_NORMAL and DATE_RELATIVE having these
values, no behaviour is changed.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-27 17:29:37 -08:00
Nicolas Pitre
21666f1aae convert object type handling from a string to a number
We currently have two parallel notation for dealing with object types
in the code: a string and a numerical value.  One of them is obviously
redundent, and the most used one requires more stack space and a bunch
of strcmp() all over the place.

This is an initial step for the removal of the version using a char array
found in object reading code paths.  The patch is unfortunately large but
there is no sane way to split it in smaller parts without breaking the
system.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-27 01:34:21 -08:00
Junio C Hamano
ef1a5c2fa8 Merge branches 'lt/crlf' and 'jc/apply-config'
* lt/crlf:
  Teach core.autocrlf to 'git apply'
  t0020: add test for auto-crlf
  Make AutoCRLF ternary variable.
  Lazy man's auto-CRLF

* jc/apply-config:
  t4119: test autocomputing -p<n> for traditional diff input.
  git-apply: guess correct -p<n> value for non-git patches.
  git-apply: notice "diff --git" patch again
  Fix botched "leak fix"
  t4119: add test for traditional patch and different p_value
  apply: fix memory leak in prefix_one()
  git-apply: require -p<n> when working in a subdirectory.
  git-apply: do not lose cwd when run from a subdirectory.
  Teach 'git apply' to look at $HOME/.gitconfig even outside of a repository
  Teach 'git apply' to look at $GIT_DIR/config
2007-02-22 21:34:36 -08:00
Junio C Hamano
185c975faa Do not take mode bits from index after type change.
When we do not trust executable bit from lstat(2), we copied
existing ce_mode bits without checking if the filesystem object
is a regular file (which is the only thing we apply the "trust
executable bit" business) nor if the blob in the index is a
regular file (otherwise, we should do the same as registering a
new regular file, which is to default non-executable).

Noticed by Johannes Sixt.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-16 22:56:06 -08:00
Linus Torvalds
6c510bee20 Lazy man's auto-CRLF
It currently does NOT know about file attributes, so it does its
conversion purely based on content. Maybe that is more in the "git
philosophy" anyway, since content is king, but I think we should try to do
the file attributes to turn it off on demand.

Anyway, BY DEFAULT it is off regardless, because it requires a

	[core]
		AutoCRLF = true

in your config file to be enabled. We could make that the default for
Windows, of course, the same way we do some other things (filemode etc).

But you can actually enable it on UNIX, and it will cause:

 - "git update-index" will write blobs without CRLF
 - "git diff" will diff working tree files without CRLF
 - "git checkout" will write files to the working tree _with_ CRLF

and things work fine.

Funnily, it actually shows an odd file in git itself:

	git clone -n git test-crlf
	cd test-crlf
	git config core.autocrlf true
	git checkout
	git diff

shows a diff for "Documentation/docbook-xsl.css". Why? Because we have
actually checked in that file *with* CRLF! So when "core.autocrlf" is
true, we'll always generate a *different* hash for it in the index,
because the index hash will be for the content _without_ CRLF.

Is this complete? I dunno. It seems to work for me. It doesn't use the
filename at all right now, and that's probably a deficiency (we could
certainly make the "is_binary()" heuristics also take standard filename
heuristics into account).

I don't pass in the filename at all for the "index_fd()" case
(git-update-index), so that would need to be passed around, but this
actually works fine.

NOTE NOTE NOTE! The "is_binary()" heuristics are totally made-up by yours
truly. I will not guarantee that they work at all reasonable. Caveat
emptor. But it _is_ simple, and it _is_ safe, since it's all off by
default.

The patch is pretty simple - the biggest part is the new "convert.c" file,
but even that is really just basic stuff that anybody can write in
"Teaching C 101" as a final project for their first class in programming.
Not to say that it's bug-free, of course - but at least we're not talking
about rocket surgery here.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-14 11:19:22 -08:00
Johannes Schindelin
eb3a48221f log --reflog: use dwim_log
Since "git log origin/master" uses dwim_log() to match
"refs/remotes/origin/master", it makes sense to do that for
"git log --reflog", too.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-08 17:48:22 -08:00
Junio C Hamano
d66b37bb19 Add pretend_sha1_file() interface.
The new interface allows an application to temporarily hash a
small number of objects and pretend that they are available in
the object store without actually writing them.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-05 14:55:11 -08:00
Junio C Hamano
798123af21 Rename get_ident() to fmt_ident() and make it available to outside
This makes the functionality of ident.c::get_ident() available to
other callers.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-04 17:50:14 -08:00
Nicolas Pitre
8b5157e407 add logref support to git-symbolic-ref
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-28 02:16:46 -08:00
Junio C Hamano
01754769ab Don't force everybody to call setup_ident().
Back when only handful commands that created commit and tag were
the only users of committer identity information, it made sense
to explicitly call setup_ident() to pre-fill the default value
from the gecos information.  But it is much simpler for programs
to make the call automatic when get_ident() is called these days,
since many more programs want to use the information when updating
the reflog.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-28 01:58:50 -08:00
Junio C Hamano
cb280e1075 Allow non-developer to clone, checkout and fetch more easily.
The code that uses committer_info() in reflog can barf and die
whenever it is asked to update a ref.  And I do not think
calling ignore_missing_committer_name() upfront like recent
receive-pack did in the aplication is a reasonable workaround.

What the patch does.

 - git_committer_info() takes one parameter.  It used to be "if
   this is true, then die() if the name is not available due to
   bad GECOS, otherwise issue a warning once but leave the name
   empty".  The reason was because we wanted to prevent bad
   commits from being made by git-commit-tree (and its
   callers).  The value 0 is only used by "git var -l".

   Now it takes -1, 0 or 1.  When set to -1, it does not
   complain but uses the pw->pw_name when name is not
   available.  Existing 0 and 1 values mean the same thing as
   they used to mean before.  0 means issue warnings and leave
   it empty, 1 means barf and die.

 - ignore_missing_committer_name() and its existing caller
   (receive-pack, to set the reflog) have been removed.

 - git-format-patch, to come up with the phoney message ID when
   asked to thread, now passes -1 to git_committer_info().  This
   codepath uses only the e-mail part, ignoring the name.  It
   used to barf and die.  The other call in the same program
   when asked to add signed-off-by line based on committer
   identity still passes 1 to make sure it barfs instead of
   adding a bogus s-o-b line.

 - log_ref_write in refs.c, to come up with the name to record
   who initiated the ref update in the reflog, passes -1.  It
   used to barf and die.

The last change means that git-update-ref, git-branch, and
commit walker backends can now be used in a repository with
reflog by somebody who does not have the user identity required
to make a commit.  They all used to barf and die.

I've run tests and all of them seem to pass, and also tried "git
clone" as a user whose GECOS is empty -- git clone works again
now (it was broken when reflog was enabled by default).

But this definitely needs extra sets of eyeballs.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-25 21:16:58 -08:00
Johannes Schindelin
68025633e3 Do not verify filenames in a bare repository
For example, it makes no sense to check the presence of a file
named "HEAD" when calling "git log HEAD" in a bare repository.

Noticed by Han-Wen Nienhuys.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
2007-01-20 19:10:26 -08:00
Junio C Hamano
e86eb6668e dwim_ref(): Separate name-to-ref DWIM code out.
I'll be using this in another function to figure out what to
pass to resolve_ref().

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-19 17:57:53 -08:00
Junio C Hamano
b18b00a661 Use fixed-size integers for .idx file I/O
This attempts to finish what Simon started in the previous commit.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-18 14:11:50 -08:00
Chris Wedgwood
276bc2caab cache.h; fix a couple of prototypes
Trivial patch.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-16 22:46:57 -08:00
Shawn O. Pearce
e6e2bd6201 Remove read_or_die in favor of better error messages.
Originally I introduced read_or_die for the purpose of reading
the pack header and trailer, and I was too lazy to print proper
error messages.

Linus Torvalds <torvalds@osdl.org>:
> For a read error, at the very least you have to say WHICH FILE
> couldn't be read, because it's usually a matter of some file just
> being too short, not some system-wide problem.

and of course Linus is right. Make it so.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-14 00:42:41 -08:00
Junio C Hamano
e861ce1692 Merge branch 'jc/bare'
* jc/bare:
  Disallow working directory commands in a bare repository.
  git-fetch: allow updating the current branch in a bare repository.
  Introduce is_bare_repository() and core.bare configuration variable
  Move initialization of log_all_ref_updates
2007-01-11 16:50:36 -08:00
Junio C Hamano
c388761c15 Merge branch 'jc/detached-head'
* jc/detached-head:
  git-checkout: handle local changes sanely when detaching HEAD
  git-checkout: safety check for detached HEAD checks existing refs
  git-checkout: fix branch name output from the command
  git-checkout: safety when coming back from the detached HEAD state.
  git-checkout: rewording comments regarding detached HEAD.
  git-checkout: do not warn detaching HEAD when it is already detached.
  Detached HEAD (experimental)
  git-branch: show detached HEAD
  git-status: show detached HEAD
2007-01-11 16:47:34 -08:00
Andy Whitcroft
93d26e4cb9 short i/o: fix calls to read to use xread or read_in_full
We have a number of badly checked read() calls.  Often we are
expecting read() to read exactly the size we requested or fail, this
fails to handle interrupts or short reads.  Add a read_in_full()
providing those semantics.  Otherwise we at a minimum need to check
for EINTR and EAGAIN, where this is appropriate use xread().

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-08 15:44:47 -08:00
Andy Whitcroft
e08140568a short i/o: clean up the naming for the write_{in,or}_xxx family
We recently introduced a write_in_full() which would either write
the specified object or emit an error message and fail.  In order
to fix the read side we now want to introduce a read_in_full()
but without an error emit.  This patch cleans up the naming
of this family of calls:

1) convert the existing write_or_whine() to write_or_whine_pipe()
   to better indicate its pipe specific nature,
2) convert the existing write_in_full() calls to write_or_whine()
   to better indicate its nature,
3) introduce a write_in_full() providing a write or fail semantic,
   and
4) convert write_or_whine() and write_or_whine_pipe() to use
   write_in_full().

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-08 15:44:47 -08:00
Junio C Hamano
c847f53712 Detached HEAD (experimental)
This allows "git checkout v1.4.3" to dissociate the HEAD of
repository from any branch.  After this point, "git branch"
starts reporting that you are not on any branch.  You can go
back to an existing branch by saying "git checkout master", for
example.

This is still experimental.  While I think it makes sense to
allow commits on top of detached HEAD, it is rather dangerous
unless you are careful in the current form.  Next "git checkout
master" will obviously lose what you have done, so we might want
to require "git checkout -f" out of a detached HEAD if we find
that the HEAD commit is not an ancestor of any other branches.
There is no such safety valve implemented right now.

On the other hand, the reason the user did not start the ad-hoc
work on a new branch with "git checkout -b" was probably because
the work was of a throw-away nature, so the convenience of not
having that safety valve might be even better.  The user, after
accumulating some commits on top of a detached HEAD, can always
create a new branch with "git checkout -b" not to lose useful
work done while the HEAD was detached.

We'll see.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-08 03:02:11 -08:00
Junio C Hamano
7d1864ce67 Introduce is_bare_repository() and core.bare configuration variable
This removes the old is_bare_git_dir(const char *) to ask if a
directory, if it is a GIT_DIR, is a bare repository, and
replaces it with is_bare_repository(void *).  The function looks
at core.bare configuration variable if exists but uses the old
heuristics: if it is ".git" or ends with "/.git", then it does
not look like a bare repository, otherwise it does.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-07 21:36:35 -08:00
Junio C Hamano
cf2999eb4c Merge branch 'sp/mmap'
* sp/mmap: (27 commits)
  Spell default packedgitlimit slightly differently
  Increase packedGit{Limit,WindowSize} on 64 bit systems.
  Update packedGit config option documentation.
  mmap: set FD_CLOEXEC for file descriptors we keep open for mmap()
  pack-objects: fix use of use_pack().
  Fix random segfaults in pack-objects.
  Cleanup read_cache_from error handling.
  Replace mmap with xmmap, better handling MAP_FAILED.
  Release pack windows before reporting out of memory.
  Default core.packdGitWindowSize to 1 MiB if NO_MMAP.
  Test suite for sliding window mmap implementation.
  Create pack_report() as a debugging aid.
  Support unmapping windows on 'temporary' packfiles.
  Improve error message when packfile mmap fails.
  Ensure core.packedGitWindowSize cannot be less than 2 pages.
  Load core configuration in git-verify-pack.
  Fully activate the sliding window pack access.
  Unmap individual windows rather than entire files.
  Document why header parsing won't exceed a window.
  Loop over pack_windows when inflating/accessing data.
  ...

Conflicts:

	cache.h
	pack-check.c
2007-01-07 00:12:47 -08:00
Junio C Hamano
e27e609bbf Merge branch 'maint'
* maint:
  pack-check.c::verify_packfile(): don't run SHA-1 update on huge data
  Fix infinite loop when deleting multiple packed refs.
2007-01-04 22:28:21 -08:00
Junio C Hamano
1084b845d9 Fix infinite loop when deleting multiple packed refs.
It was stupid to link the same element twice to lock_file_list
and end up in a loop, so we certainly need a fix.

But it is not like we are taking a lock on multiple files in
this case.  It is just that we leave the linked element on the
list even after commit_lock_file() successfully removes the
cruft.

We cannot remove the list element in commit_lock_file(); if we
are interrupted in the middle of list manipulation, the call to
remove_lock_file_on_signal() will happen with a broken list
structure pointed by lock_file_list, which would cause the cruft
to remain, so not removing the list element is the right thing
to do.  Instead we should be reusing the element already on the
list.

There is already a code for that in lock_file() function in
lockfile.c.  The code checks lk->next and the element is linked
only when it is not already on the list -- which is incorrect
for the last element on the list (which has NULL in its next
field), but if you read the check as "is this element already on
the list?" it actually makes sense.  We do not want to link it
on the list again, nor we would want to set up signal/atexit
over and over.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-03 01:22:35 -08:00
Andy Whitcroft
825cee7b28 send pack check for failure to send revisions list
When passing the revisions list to pack-objects we do not check for
errors nor short writes.  Introduce a new write_in_full which will
handle short writes and report errors to the caller.  Use this to
short cut the send on failure, allowing us to wait for and report
the child in case the failure is its fault.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-02 23:33:21 -08:00
Shawn O. Pearce
a53128b601 Create pack_report() as a debugging aid.
Much like the alloc_report() function can be useful to report on
object allocation statistics while debugging the new pack_report()
function can be useful to report on the behavior of the mmap window
code used for packfile access.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:45 -08:00
Shawn O. Pearce
60bb8b1453 Fully activate the sliding window pack access.
This finally turns on the sliding window behavior for packfile data
access by mapping limited size windows and chaining them under the
packed_git->windows list.

We consider a given byte offset to be within the window only if there
would be at least 20 bytes (one hash worth of data) accessible after
the requested offset.  This range selection relates to the contract
that use_pack() makes with its callers, allowing them to access
one hash or one object header without needing to call use_pack()
for every byte of data obtained.

In the worst case scenario we will map the same page of data twice
into memory: once at the end of one window and once again at the
start of the next window.  This duplicate page mapping will happen
only when an object header or a delta base reference is spanned
over the end of a window and is always limited to just one page of
duplication, as no sane operating system will ever have a page size
smaller than a hash.

I am assuming that the possible wasted page of virtual address
space is going to perform faster than the alternatives, which
would be to copy the object header or ref delta into a temporary
buffer prior to parsing, or to check the window range on every byte
during header parsing.  We may decide to revisit this decision in
the future since this is just a gut instinct decision and has not
actually been proven out by experimental testing.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:45 -08:00
Shawn O. Pearce
03e79c88aa Replace use_packed_git with window cursors.
Part of the implementation concept of the sliding mmap window for
pack access is to permit multiple windows per pack to be mapped
independently.  Since the inuse_cnt is associated with the mmap and
not with the file, this value is in struct pack_window and needs to
be incremented/decremented for each pack_window accessed by any code.

To faciliate that implementation we need to replace all uses of
use_packed_git() and unuse_packed_git() with a different API that
follows struct pack_window objects rather than struct packed_git.

The way this works is when we need to start accessing a pack for
the first time we should setup a new window 'cursor' by declaring
a local and setting it to NULL:

  struct pack_windows *w_curs = NULL;

To obtain the memory region which contains a specific section of
the pack file we invoke use_pack(), supplying the address of our
current window cursor:

  unsigned int len;
  unsigned char *addr = use_pack(p, &w_curs, offset, &len);

the returned address `addr` will be the first byte at `offset`
within the pack file.  The optional variable len will also be
updated with the number of bytes remaining following the address.

Multiple calls to use_pack() with the same window cursor will
update the window cursor, moving it from one window to another
when necessary.  In this way each window cursor variable maintains
only one struct pack_window inuse at a time.

Finally before exiting the scope which originally declared the window
cursor we must invoke unuse_pack() to unuse the current window (which
may be different from the one that was first obtained from use_pack):

  unuse_pack(&w_curs);

This implementation is still not complete with regards to multiple
windows, as only one window per pack file is supported right now.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:44 -08:00
Shawn O. Pearce
9bc879c1ce Refactor how we open pack files to prepare for multiple windows.
To efficiently support mmaping of multiple regions of the same pack
file we want to keep the pack's file descriptor open while we are
actively working with that pack.  So we are now keeping that file
descriptor in packed_git.pack_fd and closing it only after we unmap
the last window.

This is going to increase the number of file descriptors that are
in use at once, however that will be bounded by the total number of
pack files present and therefore should not be very high.  It is
a small tradeoff which we may need to revisit after some testing
can be done on various repositories and systems.

For code clarity we also want to seperate out the implementation
of how we open a pack file from the implementation which locates
a suitable window (or makes a new one) from the given pack file.
Since this is a rather large delta I'm taking advantage of doing
it now, in a fairly isolated change.

When we open a pack file we need to examine the header and trailer
without having a mmap in place, as we may only need to mmap
the middle section of this particular pack.  Consequently the
verification code has been refactored to make use of the new
read_or_die function.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:44 -08:00
Shawn O. Pearce
75025ccdb7 Create read_or_die utility routine.
Like write_or_die read_or_die reads the entire length requested
or it kills the current process with a die call.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:44 -08:00
Shawn O. Pearce
2dc3a23409 Use off_t for index and pack file lengths.
Since the index_size and pack_size members of struct packed_git
are the lengths of those corresponding files we should use the
off_t size of the operating system to store these file lengths,
rather than an unsigned long.  This would help in the future should
we ever resurrect Junio's 64 bit index implementation.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:44 -08:00
Shawn O. Pearce
c41ee586dc Refactor packed_git to prepare for sliding mmap windows.
The idea behind the sliding mmap window pack reader implementation
is to have multiple mmap regions active against the same pack file,
thereby allowing the process to mmap in only the active/hot sections
of the pack and reduce overall virtual address space usage.

To implement this we need to refactor the mmap related data
(pack_base, pack_use_cnt) out of struct packed_git and move them
into a new struct pack_window.

We are refactoring the code to support a single struct pack_window
per packfile, thereby emulating the prior behavior of mmap'ing the
entire pack file.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:44 -08:00
Shawn O. Pearce
77ccc5bbd1 Introduce new config option for mmap limit.
Rather than hardcoding the maximum number of bytes which can be
mmapped from pack files we should make this value configurable,
allowing the end user to increase or decrease this limit on a
per-repository basis depending on the size of the repository
and the capabilities of their operating system.

In general users should not need to manually tune such a low-level
setting within the core code, but being able to artifically limit
the number of bytes which we can mmap at once from pack files will
make it easier to craft test cases for the new mmap sliding window
implementation.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:44 -08:00
Shawn O. Pearce
4d703a1a90 Replace unpack_entry_gently with unpack_entry.
The unpack_entry_gently function currently has only two callers:
the delta base resolution in sha1_file.c and the main loop of
pack-check.c.  Both of these must change to using unpack_entry
directly when we implement sliding window mmap logic, so I'm doing
it earlier to help break down the change set.

This may cause a slight performance decrease for delta base
resolution as well as for pack-check.c's verify_packfile(), as
the pack use counter will be incremented and decremented for every
object that is unpacked.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:44 -08:00
Junio C Hamano
d2c11a38c4 UTF-8: introduce i18n.logoutputencoding.
It is plausible for somebody to want to view the commit log in a
different encoding from i18n.commitencoding -- the project's
policy may be UTF-8 and the user may be using a commit message
hook to run iconv to conform to that policy (and either not have
i18n.commitencoding to default to UTF-8 or have it explicitly
set to UTF-8).  Even then, Latin-1 may be more convenient for
the usual pager and the terminal the user uses.

The new variable i18n.logoutputencoding is used in preference to
i18n.commitencoding to decide what encoding to recode the log
output in when git-log and friends formats the commit log message.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-27 16:41:33 -08:00
Junio C Hamano
d4ebc36c5e Use preprocessor constants for environment variable names.
We broke the discipline Linus set up to allow compiler help us
avoid typos in environment names in the early days of git over
time.  This defines a handful preprocessor constants for
environment variable names used in relatively core parts of the
system.

I've left out variable names specific to subsystems such as HTTP
and SSL as I do not think they are big problems.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-19 01:51:51 -08:00
Junio C Hamano
bdf17a02fd Merge branch 'js/branch-config'
* js/branch-config:
  git-branch: rename config vars branch.<branch>.*, too
  add a function to rename sections in the config
2006-12-17 18:27:17 -08:00
Shawn O. Pearce
a7f196a746 Default GIT_COMMITTER_NAME to login name in recieve-pack.
If GIT_COMMITTER_NAME is not available in receive-pack but reflogs
are enabled we would normally die out with an error message asking
the user to correct their environment settings.

Now that reflogs are enabled by default in (what we guessed to be)
non-bare Git repositories this may cause problems for some users
who don't have their full name in the gecos field and who don't
have access to the remote system to correct the problem.

So rather than die()'ing out in receive-pack when we try to log a
ref change and have no committer name we default to the username,
as obtained from the host's password database.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-17 01:14:44 -08:00
Johannes Schindelin
0667fcfb62 add a function to rename sections in the config
Given a config like this:

	# A config
	[very.interesting.section]
		not

The command

	$ git repo-config --rename-section very.interesting.section bla.1

will lead to this config:

	# A config
	[bla "1"]
		not

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-16 13:28:20 -08:00
Shawn O. Pearce
0bee591869 Enable reflogs by default in any repository with a working directory.
New and experienced Git users alike are finding out too late that
they forgot to enable reflogs in the current repository, and cannot
use the information stored within it to recover from an incorrectly
entered command such as `git reset --hard HEAD^^^` when they really
meant HEAD^^ (aka HEAD~2).

So enable reflogs by default in all future versions of Git, unless
the user specifically disables it with:

  [core]
    logAllRefUpdates = false

in their .git/config or ~/.gitconfig.

We only enable reflogs in repositories that have a working directory
associated with them, as shared/bare repositories do not have
an easy means to prune away old log entries, or may fail logging
entirely if the user's gecos information is not valid during a push.
This heuristic was suggested on the mailing list by Junio.

Documentation was also updated to indicate the new default behavior.
We probably should start to teach usuing the reflog to recover
from mistakes in some of the tutorial material, as new users are
likely to make a few along the way and will feel better knowing
they can recover from them quickly and easily, without fsck-objects'
lost+found features.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-15 22:31:01 -08:00
Nicolas Pitre
da093d3750 improve fetch-pack's handling of kept packs
Since functions in fetch-clone.c were only used from fetch-pack.c,
its content has been merged with fetch-pack.c.  This allows for better
coupling of features with much simpler implementations.

One new thing is that the (abscence of) --thin also enforce it on
index-pack now, such that index-pack will abort if a thin pack was
_not_ asked for.

The -k or --keep, when provided twice, now causes the fetched pack
to be left as a kept pack just like receive-pack currently does.
Eventually this will be used to close a race against concurrent
repacking.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-03 00:24:07 -08:00
Shawn Pearce
fc04c412d8 Teach receive-pack how to keep pack files based on object count.
Since keeping a pushed pack or exploding it into loose objects
should be a local repository decision this teaches receive-pack
to decide if it should call unpack-objects or index-pack --stdin
--fix-thin based on the setting of receive.unpackLimit and the
number of objects contained in the received pack.

If the number of objects (hdr_entries) in the received pack is
below the value of receive.unpackLimit (which is 5000 by default)
then we unpack-objects as we have in the past.

If the hdr_entries >= receive.unpackLimit then we call index-pack and
ask it to include our pid and hostname in the .keep file to make it
easier to identify why a given pack has been kept in the repository.

Currently this leaves every received pack as a kept pack.  We really
don't want that as received packs will tend to be small.  Instead we
want to delete the .keep file automatically after all refs have
been updated.  That is being left as room for future improvement.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-03 00:24:07 -08:00
Junio C Hamano
58a1e0e83b Merge branch 'lj/refs'
* lj/refs: (63 commits)
  Fix show-ref usagestring
  t3200: git-branch testsuite update
  sha1_name.c: avoid compilation warnings.
  Make git-branch a builtin
  ref-log: fix D/F conflict coming from deleted refs.
  git-revert with conflicts to behave as git-merge with conflicts
  core.logallrefupdates thinko-fix
  git-pack-refs --all
  core.logallrefupdates create new log file only for branch heads.
  Remove bashism from t3210-pack-refs.sh
  ref-log: allow ref@{count} syntax.
  pack-refs: call fflush before fsync.
  pack-refs: use lockfile as everybody else does.
  git-fetch: do not look into $GIT_DIR/refs to see if a tag exists.
  lock_ref_sha1_basic does not remove empty directories on BSD
  Do not create tag leading directories since git update-ref does it.
  Check that a tag exists using show-ref instead of looking for the ref file.
  Use git-update-ref to delete a tag instead of rm()ing the ref file.
  Fix refs.c;:repack_without_ref() clean-up path
  Clean up "git-branch.sh" and add remove recursive dir test cases.
  ...
2006-11-01 08:48:50 -08:00
Shawn Pearce
6fb75bed5c Move deny_non_fast_forwards handling completely into receive-pack.
The 'receive.denynonfastforwards' option has nothing to do with
the repository format version.  Since receive-pack already uses
git_config to initialize itself before executing any updates we
can use the normal configuration strategy and isolate the receive
specific variables away from the core variables.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-30 19:35:16 -08:00
Junio C Hamano
05eb811aa1 Merge branch 'np/pack'
* np/pack:
  add the capability for index-pack to read from a stream
  index-pack: compare only the first 20-bytes of the key.
  git-repack: repo.usedeltabaseoffset
  pack-objects: document --delta-base-offset option
  allow delta data reuse even if base object is a preferred base
  zap a debug remnant
  let the GIT native protocol use offsets to delta base when possible
  make pack data reuse compatible with both delta types
  make git-pack-objects able to create deltas with offset to base
  teach git-index-pack about deltas with offset to base
  teach git-unpack-objects about deltas with offset to base
  introduce delta objects with offset to base
2006-10-22 22:51:42 -07:00
Rene Scharfe
8f9777801d Make write_sha1_file_prepare() static
There are no callers of write_sha1_file_prepare() left outside of
sha1_file.c, so make it static.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-14 11:49:59 -07:00
Rene Scharfe
abdc3fc842 Add hash_sha1_file()
Most callers of write_sha1_file_prepare() are only interested in the
resulting hash but don't care about the returned file name or the header.
This patch adds a simple wrapper named hash_sha1_file() which does just
that, and converts potential callers.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-14 11:49:52 -07:00
Junio C Hamano
2958d9b5db Merge branch 'master' into lj/refs
* master: (72 commits)
  runstatus: do not recurse into subdirectories if not needed
  grep: fix --fixed-strings combined with expression.
  grep: free expressions and patterns when done.
  Corrected copy-and-paste thinko in ignore executable bit test case.
  An illustration of rev-list --parents --pretty=raw
  Allow git-checkout when on a non-existant branch.
  gitweb: Decode long title for link tooltips
  git-svn: Fix fetch --no-ignore-externals with GIT_SVN_NO_LIB=1
  Ignore executable bit when adding files if filemode=0.
  Remove empty ref directories that prevent creating a ref.
  Use const for interpolate arguments
  git-archive: update documentation
  Deprecate merge-recursive.py
  gitweb: fix over-eager application of esc_html().
  Allow '(no author)' in git-svn's authors file.
  Allow 'svn fetch' on '(no date)' revisions in Subversion.
  git-repack: allow git-repack to run in subdirectory
  Remove upload-tar and make git-tar-tree a thin wrapper to git-archive
  git-tar-tree: Move code for git-archive --format=tar to archive-tar.c
  git-tar-tree: Remove duplicate git_config() call
  ...
2006-09-27 22:23:12 -07:00
Junio C Hamano
ac5409e420 update-ref: -d flag and ref creation safety.
This adds -d flag to update-ref to allow safe deletion of ref.
Before deleting it, the command checks if the given <oldvalue>
still matches the value the caller thought the ref contained.

Similarly, it also accepts 0{40} or an empty string as <oldvalue>
to allow safe creation of a new ref.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-27 02:01:42 -07:00
Nicolas Pitre
eb32d236df introduce delta objects with offset to base
This adds a new object, namely OBJ_OFS_DELTA, renames OBJ_DELTA to
OBJ_REF_DELTA to better make the distinction between those two delta
objects, and adds support for the handling of those new delta objects
in sha1_file.c only.

The OBJ_OFS_DELTA contains a relative offset from the delta object's
position in a pack instead of the 20-byte SHA1 reference to identify
the base object.  Since the base is likely to be not so far away, the
relative offset is more likely to have a smaller encoding on average
than an absolute offset.  And for those delta objects the base must
always be stored first because there is no way to know the distance of
later objects when streaming a pack.  Hence this relative offset is
always meant to be negative.

The offset encoding is slightly denser than the one used for object
size -- credits to <linux@horizon.com> (whoever this is) for bringing
it to my attention.

This allows for pack size reduction between 3.2% (Linux-2.6) to over 5%
(linux-historic).  Runtime pack access should be faster too since delta
replay does skip a search in the pack index for each delta in a chain.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-27 00:11:59 -07:00
Nicolas Pitre
43057304c0 many cleanups to sha1_file.c
Those cleanups are mainly to set the table for the support of deltas
with base objects referenced by offsets instead of sha1.  This means
that many pack lookup functions are converted to take a pack/offset
tuple instead of a sha1.

This eliminates many struct pack_entry usages since this structure
carried redundent information in many cases, and it increased stack
footprint needlessly for a couple recursively called functions that used
to declare a local copy of it for every recursion loop.

In the process, packed_object_info_detail() has been reorganized as well
so to look much saner and more amenable to deltas with offset support.

Finally the appropriate adjustments have been made to functions that
depend on the above changes.  But there is no functionality changes yet
simply some code refactoring at this point.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-23 01:51:33 -07:00
Junio C Hamano
8da1977554 Tell between packed, unpacked and symbolic refs.
This adds a "int *flag" parameter to resolve_ref() and makes
for_each_ref() family to call callback function with an extra
"int flag" parameter.  They are used to give two bits of
information (REF_ISSYMREF and REF_ISPACKED) about the ref.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-20 22:02:01 -07:00
Johannes Schindelin
11031d7e9f add receive.denyNonFastforwards config variable
If receive.denyNonFastforwards is set to true, git-receive-pack will deny
non fast-forwards, i.e. forced updates. Most notably, a push to a repository
which has that flag set will fail.

As a first user, 'git-init-db --shared' sets this flag, since in a shared
setup, you are most unlikely to want forced pushes to succeed.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-20 16:15:45 -07:00
Junio C Hamano
e49521b56d Make hexval() available to others.
builtin-mailinfo.c has its own hexval implementaiton but it can
share the table-lookup one recently implemented in sha1_file.c

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-20 16:08:14 -07:00
Linus Torvalds
ed378ec7e8 Make ref resolution saner
The old code used to totally mix up the notion of a ref-name and the path
that that ref was associated with.  That was not only horribly ugly (a
number of users got the path, and then wanted to try to turn it back into
a ref-name again), but it fundamnetally doesn't work at all once we do any
setup where a ref doesn't have a 1:1 relationship with a particular
pathname.

This fixes things up so that we use the ref-name throughout, and only
turn it into a pathname once we actually look it up in the filesystem.
That makes a lot of things much clearer and more straightforward.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-17 19:09:11 -07:00
Junio C Hamano
4405fb77f4 Merge branch 'jc/pack'
* jc/pack:
  pack-objects: document --revs, --unpacked and --all.
  pack-objects --unpacked=<existing pack> option.
  pack-objects: further work on internal rev-list logic.
  pack-objects: run rev-list equivalent internally.
  Separate object listing routines out of rev-list
2006-09-17 18:32:03 -07:00
Franck Bui-Huu
f42a5c4eb0 connect.c: finish_connect(): allow null pid parameter
git_connect() can return 0 if we use git protocol for example.
Users of this function don't know and don't care if a process
had been created or not, and to avoid them to check it before
calling finish_connect() this patch allows finish_connect() to
take a null pid. And in that case return 0.

[jc: updated function signature of git_connect() with a comment on
 its return value. ]

Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-12 22:30:32 -07:00
Junio C Hamano
106d710bc1 pack-objects --unpacked=<existing pack> option.
Incremental repack without -a essentially boils down to:

	rev-list --objects --unpacked --all |
        pack-objects $new_pack

which picks up all loose objects that are still live and creates
a new pack.

This implements --unpacked=<existing pack> option to tell the
revision walking machinery to pretend as if objects in such a
pack are unpacked for the purpose of object listing.  With this,
we could say:

	rev-list --objects --unpacked=$active_pack --all |
	pack-objects $new_pack

instead, to mean "all live loose objects but pretend as if
objects that are in this pack are also unpacked".  The newly
created pack would be perfect for updating $active_pack by
replacing it.

Since pack-objects now knows how to do the rev-list's work
itself internally, you can also write the above example by:

	pack-objects --unpacked=$active_pack --all $new_pack </dev/null

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-07 02:46:03 -07:00
Junio C Hamano
72518e9c26 more lightweight revalidation while reusing deflated stream in packing
When copying from an existing pack and when copying from a loose
object with new style header, the code makes sure that the piece
we are going to copy out inflates well and inflate() consumes
the data in full while doing so.

The check to see if the xdelta really apply is quite expensive
as you described, because you would need to have the image of
the base object which can be represented as a delta against
something else.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-03 21:09:18 -07:00
Christian Couder
6ce4e61f1b Trace into a file or an open fd and refactor tracing code.
If GIT_TRACE is set to an absolute path (starting with a
'/' character), we interpret this as a file path and we
trace into it.

Also if GIT_TRACE is set to an integer value greater than
1 and lower than 10, we interpret this as an open fd value
and we trace into it.

Note that this behavior is not compatible with the
previous one.

We also trace whole messages using one write(2) call to
make sure messages from processes do net get mixed up in
the middle.

This patch makes it possible to get trace information when
running "make test".

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-02 14:47:53 -07:00
Junio C Hamano
839837b953 Constness tightening for move/link_temp_to_file()
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-01 00:24:06 -07:00
Junio C Hamano
1e49cb8ad4 Merge branch 'js/c-merge-recursive'
* js/c-merge-recursive: (21 commits)
  discard_cache(): discard index, even if no file was mmap()ed
  merge-recur: do not die unnecessarily
  merge-recur: try to merge older merge bases first
  merge-recur: if there is no common ancestor, fake empty one
  merge-recur: do not setenv("GIT_INDEX_FILE")
  merge-recur: do not call git-write-tree
  merge-recursive: fix rename handling
  .gitignore: git-merge-recur is a built file.
  merge-recur: virtual commits shall never be parsed
  merge-recur: use the unpack_trees() interface instead of exec()ing read-tree
  merge-recur: fix thinko in unique_path()
  Makefile: git-merge-recur depends on xdiff libraries.
  merge-recur: Explain why sha_eq() and struct stage_data cannot go
  merge-recur: Cleanup last mixedCase variables...
  merge-recur: Fix compiler warning with -pedantic
  merge-recur: Remove dead code
  merge-recur: Get rid of debug code
  merge-recur: Convert variable names to lower_case
  Cumulative update of merge-recursive in C
  recur vs recursive: help testing without touching too many stuff.
  ...

This is an evil merge that removes TEST script from the toplevel.
2006-08-27 20:33:46 -07:00
Linus Torvalds
9a8e35e987 Relative timestamps in git log
I noticed that I was looking at the kernel gitweb output at some point
rather than just do "git log", simply because I liked seeing the
simplified date-format, ie the "5 days ago" rather than a full date.

This adds infrastructure to do that for "git log" too. It does NOT add the
actual flag to enable it, though, so right now this patch is a no-op, but
it should now be easy to add a command line flag (and possibly a config
file option) to just turn on the "relative" date format.

The exact cut-off points when it switches from days to weeks etc are
totally arbitrary, but are picked somewhat to avoid the "1 weeks ago"
thing (by making it show "10 days ago" rather than "1 week", or "70
minutes ago" rather than "1 hour ago").

[jc: with minor fix and tweak around "month" and "week" area.]

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-26 19:12:03 -07:00
Junio C Hamano
a7f051987c Merge branch 'gl/cleanup'
* gl/cleanup:
  Convert memset(hash,0,20) to hashclr(hash).
  Convert memcpy(a,b,20) to hashcpy(a,b).
2006-08-26 01:06:22 -07:00
Pierre Habouzit
c5fba16c50 git_dir holds pointers to local strings, hence MUST be const.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-23 18:47:38 -07:00
Junio C Hamano
a8e0d16d85 Convert memset(hash,0,20) to hashclr(hash).
In the same spirit as hashcmp() and hashcpy().

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-23 13:57:23 -07:00
Shawn Pearce
e702496e43 Convert memcpy(a,b,20) to hashcpy(a,b).
This abstracts away the size of the hash values when copying them
from memory location to memory location, much as the introduction
of hashcmp abstracted away hash value comparsion.

A few call sites were using char* rather than unsigned char* so
I added the cast rather than open hashcpy to be void*.  This is a
reasonable tradeoff as most call sites already use unsigned char*
and the existing hashcmp is also declared to be unsigned char*.

[jc: Splitted the patch to "master" part, to be followed by a
 patch for merge-recursive.c which is not in "master" yet.

 Fixed the cast in the latter hunk to combine-diff.c which was
 wrong in the original.

 Also converted ones left-over in combine-diff.c, diff-lib.c and
 upload-pack.c ]

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-23 13:53:10 -07:00
Rene Scharfe
7230e6d042 Add write_or_die(), a helper function
The little helper write_or_die() won't come back with bad news about
full disks or broken pipes.  It either succeeds or terminates the
program, making additional error handling unnecessary.

This patch adds the new function and uses it to replace two similar
ones (the one in tar-tree originally has been copied from cat-file
btw.).  I chose to add the fd parameter which both lacked to make
write_or_die() just as flexible as write() and thus suitable for
lib-ification.

There is a regression: error messages emitted by this function don't
show the program name, while the replaced two functions did.  That's
acceptable, I think; a lot of other functions do the same.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-21 20:22:23 -07:00
David Rientjes
a89fccd281 Do not use memcmp(sha1_1, sha1_2, 20) with hardcoded length.
Introduces global inline:

	hashcmp(const unsigned char *sha1, const unsigned char *sha2)

Uses memcmp for comparison and returns the result based on the length of
the hash name (a future runtime decision).

Acked-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-17 14:23:53 -07:00
David Rientjes
0bef57ee44 make inline is_null_sha1 global
Replace sha1 comparisons to null_sha1 with a global inline (which previously an
unused static inline in builtin-apply.c)

[jc: with a fix from Jonas Fonseca.]

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-15 15:06:03 -07:00
Junio C Hamano
647377c4c9 Merge branch 'jc/pack-objects' 2006-08-12 19:33:16 -07:00
Junio C Hamano
eed94a570e Merge branch 'master' into js/c-merge-recursive
Adjust to hold_lock_file_for_update() change on the master.
2006-08-12 18:35:14 -07:00
Junio C Hamano
40aaae88ad Better error message when we are unable to lock the index file
Most of the callers except the one in refs.c use the function to
update the index file.  Among the index writers, everybody
except write-tree dies if they cannot open it for writing.

This gives the function an extra argument, to tell it to die
when it cannot create a new file as the lockfile.

The only caller that does not have to die is write-tree, because
updating the index for the cache-tree part is optional and not
being able to do so does not affect the correctness.  I think we
do not have to be so careful and make the failure into die() the
same way as other callers, but that would be a different patch.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-12 17:08:25 -07:00
Matthias Lederhofer
aa086eb813 pager: config variable pager.color
enable/disable colored output when the pager is in use

Signed-off-by: Matthias Lederhofer <matled@gmx.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-31 15:32:24 -07:00
Junio C Hamano
c1a788acee Merge branch 'js/read-tree' into js/c-merge-recursive
* js/read-tree: (107 commits)
  read-tree: move merge functions to the library
  read-trees: refactor the unpack_trees() part
  tar-tree: illustrate an obscure feature better
  git.c: allow alias expansion without a git directory
  setup_git_directory_gently: do not barf when GIT_DIR is given.
  Build on Debian GNU/kFreeBSD
  Call setup_git_directory() much earlier
  Call setup_git_directory() early
  Display an error from update-ref if target ref name is invalid.
  Fix http-fetch
  t4103: fix binary patch application test.
  git-apply -R: binary patches are irreversible for now.
  Teach git-apply about '-R'
  Makefile: ssh-pull.o depends on ssh-fetch.c
  log and diff family: honor config even from subdirectories
  git-reset: detect update-ref error and report it.
  lost-found: use fsck-objects --full
  Teach git-http-fetch the --stdin switch
  Teach git-local-fetch the --stdin switch
  Make pull() support fetching multiple targets at once
  ...
2006-07-30 23:42:10 -07:00
Johannes Schindelin
11be42a476 Make git-mv a builtin
This also moves add_file_to_index() to read-cache.c. Oh, and while
touching builtin-add.c, it also removes a duplicate git_config() call.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-26 13:36:36 -07:00
Johannes Schindelin
8fd2cb4069 Extract helper bits from c-merge-recursive work
This backports the pieces that are not uncooked from the merge-recursive
WIP we have seen earlier, to be used in git-mv rewritten in C.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-26 13:36:36 -07:00
Junio C Hamano
bb6b8e4f87 sha1_file.c: expose map_sha1_file() interface.
This exposes map_sha1_file() interface to mmap a loose object file,
and legacy_loose_object() function, split from unpack_sha1_header().

They will be used in the next patch to reuse the deflated data from
new-style loose object files when generating packs.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-25 14:15:48 -07:00
Linus Torvalds
93821bd97a sha1_file: add the ability to parse objects in "pack file format"
The pack-file format is slightly different from the traditional git
object format, in that it has a much denser binary header encoding.
The traditional format uses an ASCII string with type and length
information, which is somewhat wasteful.

A new object format starts with uncompressed binary header
followed by compressed payload -- this will allow us later to
copy the payload straight to packfiles.

Obviously they cannot be read by older versions of git, so for
now new object files are created with the traditional format.
core.legacyheaders configuration item, when set to false makes
the code write in new format for people to experiment with.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-13 23:11:56 -07:00
Johannes Schindelin
6d297f8137 Status update on merge-recursive in C
This is just an update for people being interested. Alex and me were
busy with that project for a few days now. While it has progressed nicely,
there are quite a couple TODOs in merge-recursive.c, just search for "TODO".

For impatient people: yes, it passes all the tests, and yes, according
to the evil test Alex did, it is faster than the Python script.

But no, it is not yet finished. Biggest points are:

- there are still three external calls
- in the end, it should not be necessary to write the index more than once
  (just before exiting)
- a lot of things can be refactored to make the code easier and shorter

BTW we cannot just plug in git-merge-tree yet, because git-merge-tree
does not handle renames at all.

This patch is meant for testing, and as such,

- it compile the program to git-merge-recur
- it adjusts the scripts and tests to use git-merge-recur instead of
  git-merge-recursive
- it provides "TEST", a script to execute the tests regarding -recursive
- it inlines the changes to read-cache.c (read_cache_from(), discard_cache()
  and refresh_cache_entry())

Brought to you by Alex Riesen and Dscho

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-13 23:10:19 -07:00
Linus Torvalds
38d3874ddc Make the unpacked object header functions static to sha1_file.c
Nobody else uses them, and I'm going to start changing them.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-11 12:58:53 -07:00
Junio C Hamano
85fb65ed6e "git -p cmd" to page anywhere
This allows you to say:

	git -p diff v2.6.16-rc5..

and the command pipes the output of any git command to your pager.

[jc: this resurrects a month old RFC patch with improvement
 suggested by Linus to call it --paginate instead of --less.]

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-09 03:27:03 -07:00
Linus Torvalds
2718ff098a Improve git-peek-remote
This makes git-peek-remote able to basically do everything that
git-ls-remote does (but obviously just for the native protocol, so no
http[s]: or rsync: support).

The default behaviour is the same, but you can now give a mixture of
"--refs", "--tags" and "--heads" flags, where "--refs" forces
git-peek-remote to only show real refs (ie none of the fakey tag lookups,
but also not the special pseudo-refs like HEAD and MERGE_HEAD).

The "--tags" and "--heads" flags respectively limit the output to just
regular tags and heads, of course.

You can still also ask to limit them by name too.

You can combine the flags, so

	git peek-remote --refs --tags .

will show all local _true_ tags, without the generated tag lookups
(compare the output without the "--refs" flag).

And "--tags --heads" will show both tags and heads, but will avoid (for
example) any special refs outside of the standard locations.

I'm also planning on adding a "--ignore-local" flag that allows us to ask
it to ignore any refs that we already have in the local tree, but that's
an independent thing.

All this is obviously gearing up to making "git fetch" cheaper.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-04 14:50:35 -07:00
Joachim B Haga
12f6c308d5 Make zlib compression level configurable, and change default.
With the change in default, "git add ." on kernel dir is about
twice as fast as before, with only minimal (0.5%) change in
object size. The speed difference is even more noticeable
when committing large files, which is now up to 8 times faster.

The configurability is through setting core.compression = [-1..9]
which maps to the zlib constants; -1 is the default, 0 is no
compression, and 1..9 are various speed/size tradeoffs, 9
being slowest.

Signed-off-by: Joachim B Haga (cjhaga@fys.uio.no)
Acked-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-03 13:55:11 -07:00
Junio C Hamano
801235c5e6 diff --color: use $GIT_DIR/config
This lets you use something like this in your $GIT_DIR/config
file.

	[diff]
		color = auto

	[diff.color]
		new = blue
		old = yellow
		frag = reverse

When diff.color is set to "auto", colored diff is enabled when
the standard output is the terminal.  Other choices are "always",
and "never".  Usual boolean true/false can also be used.

The colormap entries can specify colors for the following slots:

	plain	- lines that appear in both old and new file (context)
	meta	- diff --git header and extended git diff headers
	frag	- @@ -n,m +l,k @@ lines (hunk header)
	old	- lines deleted from old file
	new	- lines added to new file

The following color names can be used:

	normal, bold, dim, l, blink, reverse, reset,
	black, red, green, yellow, blue, magenta, cyan,
	white

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-25 00:39:13 -07:00
Peter Eriksen
817151e61a Rename safe_strncpy() to strlcpy().
This cleans up the use of safe_strncpy() even more.  Since it has the
same semantics as strlcpy() use this name instead.  Also move the
definition from inside path.c to its own file compat/strlcpy.c, and use
it conditionally at compile time, since some platforms already has
strlcpy().  It's included in the same way as compat/setenv.c.

Signed-off-by: Peter Eriksen <s022018@student.dtu.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-24 23:16:25 -07:00
Junio C Hamano
583b7ea31b upload-pack/fetch-pack: support side-band communication
This implements a protocol extension between fetch-pack and
upload-pack to allow stderr stream from upload-pack (primarily
used for the progress bar display) to be passed back.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-21 02:50:32 -07:00
Linus Torvalds
855419f764 Add specialized object allocator
This creates a simple specialized object allocator for basic
objects.

This avoids wasting space with malloc overhead (metadata and
extra alignment), since the specialized allocator knows the
alignment, and that objects, once allocated, are never freed.

It also allows us to track some basic statistics about object
allocations. For example, for the mozilla import, it shows
object usage as follows:

     blobs:   627629 (14710 kB)
     trees:  1119035 (34969 kB)
   commits:   196423  (8440 kB)
      tags:     1336    (46 kB)

and the simpler allocator shaves off about 2.5% off the memory
footprint off a "git-rev-list --all --objects", and is a bit
faster too.

[ Side note: this concludes the series of "save memory in object storage".
  The thing is, there simply isn't much more to be saved on the objects.

  Doing "git-rev-list --all --objects" on the mozilla archive has a final
  total RSS of 131498 pages for me: that's about 513MB. Of that, the
  object overhead is now just 56MB, the rest is going somewhere else (put
  another way: the fact that this patch shaves off 2.5% of the total
  memory overhead, considering that objects are now not much more than 10%
  of the total shows how big the wasted space really was: this makes
  object allocations much more memory- and time-efficient).

  I haven't looked at where the rest is, but I suspect the bulk of it is
  just the pack-file loading. It may be that we should pack the tree
  objects separately from the blob objects: for git-rev-list --objects, we
  don't actually ever need to even look at the blobs, but since trees and
  blobs are interspersed in the pack-file, we end up not being dense in
  the tree accesses, so we end up looking at more pages than we strictly
  need to.

  So with a 535MB pack-file, it's entirely possible - even likely - that
  most of the remaining RSS is just the mmap of the pack-file itself. We
  don't need to map in _all_ of it, but we do end up mapping a fair
  amount. ]

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-19 18:42:21 -07:00
Junio C Hamano
d9faecac64 Merge branch 'jc/shared'
* jc/shared:
  shared repository: optionally allow reading to "others".
2006-06-18 20:19:09 -07:00
Peter Eriksen
bfbd0bb6ec Implement safe_strncpy() as strlcpy() and use it more.
Signed-off-by: Peter Eriksen <s022018@student.dtu.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-16 22:45:12 -07:00
Junio C Hamano
94df2506ed shared repository: optionally allow reading to "others".
This enhances core.sharedrepository to have additionally
specify that read and exec permissions to be given to others as
well.  It is useful when serving a repository via gitweb and
git-daemon that runs as a user outside the project group.

The configuration item can take the following values:

    [core]
	sharedrepository 	 ; the same as "group"
	sharedrepository = true  ; ditto
	sharedrepository = 1	 ; ditto
	sharedrepository = group ; allow rwx to group
	sharedrepository = all   ; allow rwx to group, allow rx to other
	sharedrepository = umask ; not shared - use umask

It also extends "git init-db" to take "--shared=all" and friends
from the command line.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-10 01:31:31 -07:00
Junio C Hamano
021b6e4549 Make index file locking code reusable to others.
The framework to create lockfiles that are removed at exit is
first used to reliably write the index file, but it is
applicable to other things, so stop calling it "cache_file".

This also rewords a few remaining error message that called the
index file "cache file".

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-06 14:30:58 -07:00
Junio C Hamano
f0679f474a Merge branch 'sp/reflog'
* sp/reflog:
  fetch.c: do not pass uninitialized lock to unlock_ref().
  Test that git-branch -l works.
  Verify git-commit provides a reflog message.
  Enable ref log creation in git checkout -b.
  Create/delete branch ref logs.
  Include ref log detail in commit, reset, etc.
  Change order of -m option to update-ref.
  Correct force_write bug in refs.c
  Change 'master@noon' syntax to 'master@{noon}'.
  Log ref updates made by fetch.
  Force writing ref if it doesn't exist.
  Added logs/ directory to repository layout.
  General ref log reading improvements.
  Fix ref log parsing so it works properly.
  Support 'master@2 hours ago' syntax
  Log ref updates to logs/refs/<ref>
  Convert update-ref to use ref_lock API.
  Improve abstraction of ref lock/write.
2006-06-03 23:59:03 -07:00
Junio C Hamano
3f69d405d7 Merge branch 'jc/cache-tree'
* jc/cache-tree: (26 commits)
  builtin-rm: squelch compiler warnings.
  git-write-tree writes garbage on sparc64
  Fix crash when reading the empty tree
  fsck-objects: do not segfault on missing tree in cache-tree
  cache-tree: a bit more debugging support.
  read-tree: invalidate cache-tree entry when a new index entry is added.
  Fix test-dump-cache-tree in one-tree disappeared case.
  fsck-objects: mark objects reachable from cache-tree
  cache-tree: replace a sscanf() by two strtol() calls
  cache-tree.c: typefix
  test-dump-cache-tree: validate the cached data as well.
  cache_tree_update: give an option to update cache-tree only.
  read-tree: teach 1-way merege and plain read to prime cache-tree.
  read-tree: teach 1 and 2 way merges about cache-tree.
  update-index: when --unresolve, smudge the relevant cache-tree entries.
  test-dump-cache-tree: report number of subtrees.
  cache-tree: sort the subtree entries.
  Teach fsck-objects about cache-tree.
  index: make the index file format extensible.
  cache-tree: protect against "git prune".
  ...

Conflicts:

	Makefile, builtin.h, git.c: resolved the same way as in next.
2006-05-28 22:57:47 -07:00
Junio C Hamano
a5c8a98ca7 Merge branch 'master' into sp/reflog
* master: (90 commits)
  fetch.c: remove an unused variable and dead code.
  Clean up sha1 file writing
  Builtin git-cat-file
  builtin format-patch: squelch content-type for 7-bit ASCII
  CMIT_FMT_EMAIL: Q-encode Subject: and display-name part of From: fields.
  add more informative error messages to git-mktag
  remove the artificial restriction tagsize < 8kb
  git-rebase: use canonical A..B syntax to format-patch
  git-format-patch: now built-in.
  fmt-patch: Support --attach
  fmt-patch: understand old <his> notation
  Teach fmt-patch about --keep-subject
  Teach fmt-patch about --numbered
  fmt-patch: implement -o <dir>
  fmt-patch: output file names to stdout
  Teach fmt-patch to write individual files.
  built-in tar-tree and remote tar-tree
  Builtin git-diff-files, git-diff-index, git-diff-stages, and git-diff-tree.
  Builtin git-show-branch.
  Builtin git-apply.
  ...
2006-05-24 16:49:24 -07:00
Junio C Hamano
a861b58bbf Merge branch 'be/tag'
* be/tag:
  add more informative error messages to git-mktag
  remove the artificial restriction tagsize < 8kb
2006-05-24 12:20:48 -07:00
Junio C Hamano
73f0a1577b Merge branch 'js/fmt-patch'
This makes "git format-patch" a built-in.

* js/fmt-patch:
  git-rebase: use canonical A..B syntax to format-patch
  git-format-patch: now built-in.
  fmt-patch: Support --attach
  fmt-patch: understand old <his> notation
  Teach fmt-patch about --keep-subject
  Teach fmt-patch about --numbered
  fmt-patch: implement -o <dir>
  fmt-patch: output file names to stdout
  Teach fmt-patch to write individual files.
  Use RFC2822 dates from "git fmt-patch".
  git-fmt-patch: thinkofix to show [PATCH] properly.
  rename internal format-patch wip
  Minor tweak on subject line in --pretty=email
  Tentative built-in format-patch.
2006-05-24 12:19:47 -07:00
Junio C Hamano
376bb3a352 Merge branch 'lt/dirwalk'
This makes 'git add' and 'git rm' built-ins.

* lt/dirwalk:
  Add builtin "git rm" command
  Move pathspec matching from builtin-add.c into dir.c
  Prevent bogus paths from being added to the index.
  builtin-add: fix unmatched pathspec warnings.
  Remove old "git-add.sh" remnants
  builtin-add: warn on unmatched pathspecs
  Do "git add" as a builtin
  Clean up git-ls-file directory walking library interface
  libify git-ls-files directory traversal
2006-05-24 11:04:16 -07:00
Björn Engelmann
e7332f96b3 remove the artificial restriction tagsize < 8kb
Signed-off-by: Björn Engelmann <BjEngelmann@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-23 13:38:29 -07:00
Junio C Hamano
328b710d80 Merge branch 'master' into js/fmt-patch
* master: (119 commits)
  diff family: add --check option
  Document that "git add" only adds non-ignored files.
  Add a conversion tool to migrate remote information into the config
  fetch, pull: ask config for remote information
  Fix build procedure for builtin-init-db
  read-tree -m -u: do not overwrite or remove untracked working tree files.
  apply --cached: do not check newly added file in the working tree
  Implement a --dry-run option to git-quiltimport
  Implement git-quiltimport
  Revert "builtin-grep: workaround for non GNU grep."
  builtin-grep: workaround for non GNU grep.
  builtin-grep: workaround for non GNU grep.
  git-am: use apply --cached
  apply --cached: apply a patch without using working tree.
  apply --numstat: show new name, not old name.
  Documentation/Makefile: create tarballs for the man pages and html files
  Allow pickaxe and diff-filter options to be used by git log.
  Libify the index refresh logic
  Builtin git-init-db
  Remove unnecessary local in get_ref_sha1.
  ...
2006-05-21 01:34:54 -07:00
Junio C Hamano
93872e0700 Merge branch 'lt/dirwalk' into jc/dirwalk-n-cache-tree
This commit is what this branch is all about.  It records the
evil merge needed to adjust built-in git-add and git-rm for
the cache-tree extension.

* lt/dirwalk:
  Add builtin "git rm" command
  Move pathspec matching from builtin-add.c into dir.c
  Prevent bogus paths from being added to the index.
  builtin-add: fix unmatched pathspec warnings.
  Remove old "git-add.sh" remnants
  builtin-add: warn on unmatched pathspecs
  Do "git add" as a builtin
  Clean up git-ls-file directory walking library interface
  libify git-ls-files directory traversal

Conflicts:

	Makefile
	builtin.h
	git.c
	update-index.c
2006-05-20 01:52:19 -07:00
Junio C Hamano
283c8eef6c Merge branch 'jc/cache-tree' into jc/dirwalk-n-cache-tree
* jc/cache-tree: (24 commits)
  Fix crash when reading the empty tree
  fsck-objects: do not segfault on missing tree in cache-tree
  cache-tree: a bit more debugging support.
  read-tree: invalidate cache-tree entry when a new index entry is added.
  Fix test-dump-cache-tree in one-tree disappeared case.
  fsck-objects: mark objects reachable from cache-tree
  cache-tree: replace a sscanf() by two strtol() calls
  cache-tree.c: typefix
  test-dump-cache-tree: validate the cached data as well.
  cache_tree_update: give an option to update cache-tree only.
  read-tree: teach 1-way merege and plain read to prime cache-tree.
  read-tree: teach 1 and 2 way merges about cache-tree.
  update-index: when --unresolve, smudge the relevant cache-tree entries.
  test-dump-cache-tree: report number of subtrees.
  cache-tree: sort the subtree entries.
  Teach fsck-objects about cache-tree.
  index: make the index file format extensible.
  cache-tree: protect against "git prune".
  Add test-dump-cache-tree
  Use cache-tree in update-index.
  ...
2006-05-20 00:56:11 -07:00
Linus Torvalds
405e5b2fe0 Libify the index refresh logic
This cleans up and libifies the "git update-index --[really-]refresh"
functionality. This will be eventually required for eventually doing the
"commit" and "status" commands as built-ins.

It really just moves "refresh_index()" from update-index.c to
read-cache.c, but it also has to change the calling convention so that the
function uses a "unsigned int flags" argument instead of various static
flags variables for passing down the information about whether to be quiet
or not, and allow unmerged entries etc.

That actually cleans up update-index.c too, since it turns out that all
those flags were really specific to that one function of the index update,
so they shouldn't have had file-scope visibility even before.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-19 15:59:18 -07:00
Junio C Hamano
6858d49492 Merge part of 'js/fmt-patch' for RFC2822 dates into 'sp/reflog'
An earlier patch from Shawn Pearce dependes on a change that is
only in "next".  I do not want to make this series hostage to
the yet-to-graduate js/fmt-patch branch, but let's try fixing it
by merging the early parts of the branch to see what happens.

Right now, 'sp/reflog' will not be in "next" for now, so I won't
have to regret this -- if this merge causes problem down the road
merging I can always rebuild the topic branch ;-).
2006-05-19 15:25:57 -07:00
Linus Torvalds
8dcf39c46e Prevent bogus paths from being added to the index.
With this one, it's now a fatal error to try to add a pathname
that cannot be added with "git add", i.e.

	[torvalds@g5 git]$ git add .git/config
	fatal: unable to add .git/config to index

and

	[torvalds@g5 git]$ git add foo/../bar
	fatal: unable to add foo/../bar to index

instead of the old "Ignoring path xyz" warning that would end up
silently succeeding on any other paths.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-18 12:07:31 -07:00
Shawn Pearce
6de08ae688 Log ref updates to logs/refs/<ref>
If config parameter core.logAllRefUpdates is true or the log
file already exists then append a line to ".git/logs/refs/<ref>"
whenever git-update-ref <ref> is executed.  Each log line contains
the following information:

  oldsha1 <SP> newsha1 <SP> committer <LF>

where committer is the current user, date, time and timezone in
the standard GIT ident format.  If the caller is unable to append
to the log file then git-update-ref will fail without updating <ref>.

An optional message may be included in the log line with the -m flag.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-17 17:36:36 -07:00
Junio C Hamano
c66b6c067e Merge branch 'master' into js/fmt-patch
* master: (109 commits)
  t1300-repo-config: two new config parsing tests.
  Another config file parsing fix.
  update-index: plug memory leak from prefix_path()
  checkout-index: plug memory leak from prefix_path()
  update-index --unresolve: work from a subdirectory.
  pack-object: squelch eye-candy on non-tty
  core.prefersymlinkrefs: use symlinks for .git/HEAD
  repo-config: trim white-space before comment
  Fix for config file section parsing.
  Clarify git-cherry documentation.
  Update git-unpack-objects documentation.
  Fix up docs where "--" isn't displayed correctly.
  Several trivial documentation touch ups.
  git-svn 1.0.0
  git-svn: documentation updates
  delta: stricter constness
  Makefile: do not link rev-list any specially.
  builtin-push: --all and --tags _are_ explicit refspecs
  builtin-log/whatchanged/show: make them official.
  show-branch: omit uninteresting merges.
  ...
2006-05-06 14:42:59 -07:00
Junio C Hamano
0660626caf binary diff: further updates.
This updates the user interface and generated diff data format.

 * "diff --binary" is used to signal that we want an e-mailable
   binary patch.  It implies --full-index and -p.

 * "apply --allow-binary-replacement" acquired a short synonym
   "apply --binary".

 * After the "GIT binary patch\n" header line there is a token
   to record which binary patch mechanism was used, so that we
   can extend it later.  Currently there are two mechanisms
   defined: "literal" and "delta".  The former records the
   deflated postimage and the latter records the deflated delta
   from the preimage to postimage.

   For purely implementation convenience, I added the deflated
   length after these "literal/delta" tokens (otherwise the
   decoding side needs to guess and reallocate the buffer while
   inflating).  Improvement patches are very welcomed.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-05 15:24:32 -07:00
Junio C Hamano
051308f6e9 binary patch.
This adds "binary patch" to the diff output and teaches apply
what to do with them.

On the diff generation side, traditionally, we said "Binary
files differ\n" without giving anything other than the preimage
and postimage object name on the index line.  This was good
enough for applying a patch generated from your own repository
(very useful while rebasing), because the postimage would be
available in such a case.  However, this was not useful when the
recipient of such a patch via e-mail were to apply it, even if
the preimage was available.

This patch allows the diff to generate "binary" patch when
operating under --full-index option.  The binary patch follows
the usual extended git diff headers, and looks like this:

	"GIT binary patch\n"
	<length byte><data>"\n"
	...
	"\n"

Each line is prefixed with a "length-byte", whose value is upper
or lowercase alphabet that encodes number of bytes that the data
on the line decodes to (1..52 -- 'A' means 1, 'B' means 2, ...,
'Z' means 26, 'a' means 27, ...).  <data> is 1 or more groups of
5-byte sequence, each of which encodes up to 4 bytes in base85
encoding.  Because 52 / 4 * 5 = 65 and we have the length byte,
an output line is capped to 66 characters.  The payload is the
same diff-delta as we use in the packfiles.

On the consumption side, git-apply now can decode and apply the
binary patch when --allow-binary-replacement is given, the diff
was generated with --full-index, and the receiving repository
has the preimage blob, which is the same condition as it always
required when accepting an "Binary files differ\n" patch.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-05 15:24:32 -07:00
Junio C Hamano
9f0bb90d16 core.prefersymlinkrefs: use symlinks for .git/HEAD
When inspecting a project whose build infrastructure used to
assume that .git/HEAD is a symlink ref, core.prefersymlinkrefs
in the config file of such a project would help to bisect its
history.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-02 20:09:56 -07:00
Junio C Hamano
2a38704323 Use RFC2822 dates from "git fmt-patch".
Still Work-in-progress git fmt-patch (should it be known as
format-patch-ng?) is matched with the fix made by Huw Davies
in 262a6ef76a commit to use
RFC2822 date format.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-01 01:44:33 -07:00
Junio C Hamano
e7afa1115b Merge branch 'master' into jc/cache-tree
* master:
  t0000-basic: more commit-tree tests.
  commit-tree.c: check_valid() microoptimization.
  Fix filename verification when in a subdirectory
  rebase: typofix.
  socksetup: don't return on set_reuse_addr() error
2006-04-26 18:32:45 -07:00
Junio C Hamano
ea92f41ff9 revision parsing: make "rev -- paths" checks stronger.
If you don't have a "--" marker, then:

 - all of the arguments we are going to assume are pathspecs
   must exist in the working tree.

 - none of the arguments we parsed as revisions could be
   interpreted as a filename.

so that there really isn't any possibility of confusion in case
somebody does have a revision that looks like a pathname too.

The former rule has been in effect; this implements the latter.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-26 17:08:44 -07:00
Linus Torvalds
e23d0b4a4a Fix filename verification when in a subdirectory
When we are in a subdirectory of a git archive, we need to take the prefix
of that subdirectory into accoung when we verify filename arguments.

Noted by Matthias Lederhofer

This also uses the improved error reporting for all the other git commands
that use the revision parsing interfaces, not just git-rev-parse. Also, it
makes the error reporting for mixed filenames and argument flags clearer
(you cannot put flags after the start of the pathname list).

[jc: with fix to a trivial typo noticed by Timo Hirvonen]

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-26 12:16:21 -07:00
Junio C Hamano
bad68ec924 index: make the index file format extensible.
... and move the cache-tree data into it.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-24 21:24:13 -07:00
Junio C Hamano
1af1c2b63d read-cache/write-cache: optionally return cache checksum SHA1.
read_cache_1() and write_cache_1() takes an extra parameter
*sha1 that returns the checksum of the index file when non-NULL.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-23 16:57:40 -07:00
Junio C Hamano
1b0c7174a1 tree/diff header cleanup.
Introduce tree-walk.[ch] and move "struct tree_desc" and
associated functions from various places.

Rename DIFF_FILE_CANON_MODE(mode) macro to canon_mode(mode) and
move it to cache.h.  This macro returns the canonicalized
st_mode value in the host byte order for files, symlinks and
directories -- to be compared with a tree_desc entry.
create_ce_mode(mode) in cache.h is similar but is intended to be
used for index entries (so it does not work for directories) and
returns the value in the network byte order.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-03-29 23:54:13 -08:00
Junio C Hamano
2f8acdb38e core.warnambiguousrefs: warns when "name" is used and both "name" branch and tag exists.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-03-20 23:34:17 -08:00
Shawn Pearce
de84f99c12 Add --temp and --stage=all options to checkout-index.
Sometimes it is convient for a Porcelain to be able to checkout all
unmerged files in all stages so that an external merge tool can be
executed by the Porcelain or the end-user.  Using git-unpack-file
on each stage individually incurs a rather high penalty due to the
need to fork for each file version obtained.  git-checkout-index -a
--stage=all will now do the same thing, but faster.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-03-05 00:58:13 -08:00
Junio C Hamano
21dbe12c76 Merge branch 'lt/rev-list'
* lt/rev-list:
  setup_revisions(): handle -n<n> and -<n> internally.
  git-log (internal): more options.
  git-log (internal): add approxidate.
  Rip out merge-order and make "git log <paths>..." work again.
  Tie it all together: "git log"
  Introduce trivial new pager.c helper infrastructure
  git-rev-list libification: rev-list walking
  Splitting rev-list into revisions lib, end of beginning.
  rev-list split: minimum fixup.
  First cut at libifying revlist generation
2006-03-04 13:21:17 -08:00
Linus Torvalds
f67b45f862 Introduce trivial new pager.c helper infrastructure
This introduces the new function

	void setup_pager(void);

to set up output to be written through a pager applocation.

All in preparation for doing the simple scripts in C.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-28 14:49:32 -08:00
Junio C Hamano
2ae1c53b51 apply --whitespace: configuration option.
The new configuration option apply.whitespace can take one of
"warn", "error", "error-all", or "strip".  When git-apply is run
to apply the patch to the index, they are used as the default
value if there is no command line --whitespace option.

Andrew can now tell people who feed him git trees to update to
this version and say:

	git repo-config apply.whitespace error

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-27 14:47:45 -08:00
Timo Hirvonen
962554c616 Use setenv(), fix warnings
- Fix -Wundef -Wold-style-definition warnings
  - Make pll_free() static

[jc: original patch by Timo had another unrelated bits:

  - Use setenv() instead of putenv()

 I'm postponing that part for now.]

Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-26 15:06:45 -08:00
Junio C Hamano
ee072260db Merge branch 'jc/nostat'
* jc/nostat:
  cache_name_compare() compares name and stage, nothing else.
  "assume unchanged" git: documentation.
  ls-files: split "show-valid-bit" into a different option.
  "Assume unchanged" git: --really-refresh fix.
  ls-files: debugging aid for CE_VALID changes.
  "Assume unchanged" git: do not set CE_VALID with --refresh
  "Assume unchanged" git
2006-02-21 22:33:21 -08:00
Junio C Hamano
749be728d4 Delay "empty ident" errors until they really matter.
Previous one warned people upfront to encourage fixing their
environment early, but some people just use repositories and git
tools read-only without making any changes, and in such a case
there is not much point insisting on them having a usable ident.

This round attempts to move the error until either "git-var"
asks for the ident explicitly or "commit-tree" wants to use it.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-18 20:31:05 -08:00