Commit Graph

265 Commits

Author SHA1 Message Date
Junio C Hamano
cf2999eb4c Merge branch 'sp/mmap'
* sp/mmap: (27 commits)
  Spell default packedgitlimit slightly differently
  Increase packedGit{Limit,WindowSize} on 64 bit systems.
  Update packedGit config option documentation.
  mmap: set FD_CLOEXEC for file descriptors we keep open for mmap()
  pack-objects: fix use of use_pack().
  Fix random segfaults in pack-objects.
  Cleanup read_cache_from error handling.
  Replace mmap with xmmap, better handling MAP_FAILED.
  Release pack windows before reporting out of memory.
  Default core.packdGitWindowSize to 1 MiB if NO_MMAP.
  Test suite for sliding window mmap implementation.
  Create pack_report() as a debugging aid.
  Support unmapping windows on 'temporary' packfiles.
  Improve error message when packfile mmap fails.
  Ensure core.packedGitWindowSize cannot be less than 2 pages.
  Load core configuration in git-verify-pack.
  Fully activate the sliding window pack access.
  Unmap individual windows rather than entire files.
  Document why header parsing won't exceed a window.
  Loop over pack_windows when inflating/accessing data.
  ...

Conflicts:

	cache.h
	pack-check.c
2007-01-07 00:12:47 -08:00
Junio C Hamano
e27e609bbf Merge branch 'maint'
* maint:
  pack-check.c::verify_packfile(): don't run SHA-1 update on huge data
  Fix infinite loop when deleting multiple packed refs.
2007-01-04 22:28:21 -08:00
Junio C Hamano
1084b845d9 Fix infinite loop when deleting multiple packed refs.
It was stupid to link the same element twice to lock_file_list
and end up in a loop, so we certainly need a fix.

But it is not like we are taking a lock on multiple files in
this case.  It is just that we leave the linked element on the
list even after commit_lock_file() successfully removes the
cruft.

We cannot remove the list element in commit_lock_file(); if we
are interrupted in the middle of list manipulation, the call to
remove_lock_file_on_signal() will happen with a broken list
structure pointed by lock_file_list, which would cause the cruft
to remain, so not removing the list element is the right thing
to do.  Instead we should be reusing the element already on the
list.

There is already a code for that in lock_file() function in
lockfile.c.  The code checks lk->next and the element is linked
only when it is not already on the list -- which is incorrect
for the last element on the list (which has NULL in its next
field), but if you read the check as "is this element already on
the list?" it actually makes sense.  We do not want to link it
on the list again, nor we would want to set up signal/atexit
over and over.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-03 01:22:35 -08:00
Andy Whitcroft
825cee7b28 send pack check for failure to send revisions list
When passing the revisions list to pack-objects we do not check for
errors nor short writes.  Introduce a new write_in_full which will
handle short writes and report errors to the caller.  Use this to
short cut the send on failure, allowing us to wait for and report
the child in case the failure is its fault.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-02 23:33:21 -08:00
Shawn O. Pearce
a53128b601 Create pack_report() as a debugging aid.
Much like the alloc_report() function can be useful to report on
object allocation statistics while debugging the new pack_report()
function can be useful to report on the behavior of the mmap window
code used for packfile access.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:45 -08:00
Shawn O. Pearce
60bb8b1453 Fully activate the sliding window pack access.
This finally turns on the sliding window behavior for packfile data
access by mapping limited size windows and chaining them under the
packed_git->windows list.

We consider a given byte offset to be within the window only if there
would be at least 20 bytes (one hash worth of data) accessible after
the requested offset.  This range selection relates to the contract
that use_pack() makes with its callers, allowing them to access
one hash or one object header without needing to call use_pack()
for every byte of data obtained.

In the worst case scenario we will map the same page of data twice
into memory: once at the end of one window and once again at the
start of the next window.  This duplicate page mapping will happen
only when an object header or a delta base reference is spanned
over the end of a window and is always limited to just one page of
duplication, as no sane operating system will ever have a page size
smaller than a hash.

I am assuming that the possible wasted page of virtual address
space is going to perform faster than the alternatives, which
would be to copy the object header or ref delta into a temporary
buffer prior to parsing, or to check the window range on every byte
during header parsing.  We may decide to revisit this decision in
the future since this is just a gut instinct decision and has not
actually been proven out by experimental testing.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:45 -08:00
Shawn O. Pearce
03e79c88aa Replace use_packed_git with window cursors.
Part of the implementation concept of the sliding mmap window for
pack access is to permit multiple windows per pack to be mapped
independently.  Since the inuse_cnt is associated with the mmap and
not with the file, this value is in struct pack_window and needs to
be incremented/decremented for each pack_window accessed by any code.

To faciliate that implementation we need to replace all uses of
use_packed_git() and unuse_packed_git() with a different API that
follows struct pack_window objects rather than struct packed_git.

The way this works is when we need to start accessing a pack for
the first time we should setup a new window 'cursor' by declaring
a local and setting it to NULL:

  struct pack_windows *w_curs = NULL;

To obtain the memory region which contains a specific section of
the pack file we invoke use_pack(), supplying the address of our
current window cursor:

  unsigned int len;
  unsigned char *addr = use_pack(p, &w_curs, offset, &len);

the returned address `addr` will be the first byte at `offset`
within the pack file.  The optional variable len will also be
updated with the number of bytes remaining following the address.

Multiple calls to use_pack() with the same window cursor will
update the window cursor, moving it from one window to another
when necessary.  In this way each window cursor variable maintains
only one struct pack_window inuse at a time.

Finally before exiting the scope which originally declared the window
cursor we must invoke unuse_pack() to unuse the current window (which
may be different from the one that was first obtained from use_pack):

  unuse_pack(&w_curs);

This implementation is still not complete with regards to multiple
windows, as only one window per pack file is supported right now.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:44 -08:00
Shawn O. Pearce
9bc879c1ce Refactor how we open pack files to prepare for multiple windows.
To efficiently support mmaping of multiple regions of the same pack
file we want to keep the pack's file descriptor open while we are
actively working with that pack.  So we are now keeping that file
descriptor in packed_git.pack_fd and closing it only after we unmap
the last window.

This is going to increase the number of file descriptors that are
in use at once, however that will be bounded by the total number of
pack files present and therefore should not be very high.  It is
a small tradeoff which we may need to revisit after some testing
can be done on various repositories and systems.

For code clarity we also want to seperate out the implementation
of how we open a pack file from the implementation which locates
a suitable window (or makes a new one) from the given pack file.
Since this is a rather large delta I'm taking advantage of doing
it now, in a fairly isolated change.

When we open a pack file we need to examine the header and trailer
without having a mmap in place, as we may only need to mmap
the middle section of this particular pack.  Consequently the
verification code has been refactored to make use of the new
read_or_die function.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:44 -08:00
Shawn O. Pearce
75025ccdb7 Create read_or_die utility routine.
Like write_or_die read_or_die reads the entire length requested
or it kills the current process with a die call.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:44 -08:00
Shawn O. Pearce
2dc3a23409 Use off_t for index and pack file lengths.
Since the index_size and pack_size members of struct packed_git
are the lengths of those corresponding files we should use the
off_t size of the operating system to store these file lengths,
rather than an unsigned long.  This would help in the future should
we ever resurrect Junio's 64 bit index implementation.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:44 -08:00
Shawn O. Pearce
c41ee586dc Refactor packed_git to prepare for sliding mmap windows.
The idea behind the sliding mmap window pack reader implementation
is to have multiple mmap regions active against the same pack file,
thereby allowing the process to mmap in only the active/hot sections
of the pack and reduce overall virtual address space usage.

To implement this we need to refactor the mmap related data
(pack_base, pack_use_cnt) out of struct packed_git and move them
into a new struct pack_window.

We are refactoring the code to support a single struct pack_window
per packfile, thereby emulating the prior behavior of mmap'ing the
entire pack file.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:44 -08:00
Shawn O. Pearce
77ccc5bbd1 Introduce new config option for mmap limit.
Rather than hardcoding the maximum number of bytes which can be
mmapped from pack files we should make this value configurable,
allowing the end user to increase or decrease this limit on a
per-repository basis depending on the size of the repository
and the capabilities of their operating system.

In general users should not need to manually tune such a low-level
setting within the core code, but being able to artifically limit
the number of bytes which we can mmap at once from pack files will
make it easier to craft test cases for the new mmap sliding window
implementation.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:44 -08:00
Shawn O. Pearce
4d703a1a90 Replace unpack_entry_gently with unpack_entry.
The unpack_entry_gently function currently has only two callers:
the delta base resolution in sha1_file.c and the main loop of
pack-check.c.  Both of these must change to using unpack_entry
directly when we implement sliding window mmap logic, so I'm doing
it earlier to help break down the change set.

This may cause a slight performance decrease for delta base
resolution as well as for pack-check.c's verify_packfile(), as
the pack use counter will be incremented and decremented for every
object that is unpacked.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:44 -08:00
Junio C Hamano
d2c11a38c4 UTF-8: introduce i18n.logoutputencoding.
It is plausible for somebody to want to view the commit log in a
different encoding from i18n.commitencoding -- the project's
policy may be UTF-8 and the user may be using a commit message
hook to run iconv to conform to that policy (and either not have
i18n.commitencoding to default to UTF-8 or have it explicitly
set to UTF-8).  Even then, Latin-1 may be more convenient for
the usual pager and the terminal the user uses.

The new variable i18n.logoutputencoding is used in preference to
i18n.commitencoding to decide what encoding to recode the log
output in when git-log and friends formats the commit log message.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-27 16:41:33 -08:00
Junio C Hamano
d4ebc36c5e Use preprocessor constants for environment variable names.
We broke the discipline Linus set up to allow compiler help us
avoid typos in environment names in the early days of git over
time.  This defines a handful preprocessor constants for
environment variable names used in relatively core parts of the
system.

I've left out variable names specific to subsystems such as HTTP
and SSL as I do not think they are big problems.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-19 01:51:51 -08:00
Junio C Hamano
bdf17a02fd Merge branch 'js/branch-config'
* js/branch-config:
  git-branch: rename config vars branch.<branch>.*, too
  add a function to rename sections in the config
2006-12-17 18:27:17 -08:00
Shawn O. Pearce
a7f196a746 Default GIT_COMMITTER_NAME to login name in recieve-pack.
If GIT_COMMITTER_NAME is not available in receive-pack but reflogs
are enabled we would normally die out with an error message asking
the user to correct their environment settings.

Now that reflogs are enabled by default in (what we guessed to be)
non-bare Git repositories this may cause problems for some users
who don't have their full name in the gecos field and who don't
have access to the remote system to correct the problem.

So rather than die()'ing out in receive-pack when we try to log a
ref change and have no committer name we default to the username,
as obtained from the host's password database.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-17 01:14:44 -08:00
Johannes Schindelin
0667fcfb62 add a function to rename sections in the config
Given a config like this:

	# A config
	[very.interesting.section]
		not

The command

	$ git repo-config --rename-section very.interesting.section bla.1

will lead to this config:

	# A config
	[bla "1"]
		not

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-16 13:28:20 -08:00
Shawn O. Pearce
0bee591869 Enable reflogs by default in any repository with a working directory.
New and experienced Git users alike are finding out too late that
they forgot to enable reflogs in the current repository, and cannot
use the information stored within it to recover from an incorrectly
entered command such as `git reset --hard HEAD^^^` when they really
meant HEAD^^ (aka HEAD~2).

So enable reflogs by default in all future versions of Git, unless
the user specifically disables it with:

  [core]
    logAllRefUpdates = false

in their .git/config or ~/.gitconfig.

We only enable reflogs in repositories that have a working directory
associated with them, as shared/bare repositories do not have
an easy means to prune away old log entries, or may fail logging
entirely if the user's gecos information is not valid during a push.
This heuristic was suggested on the mailing list by Junio.

Documentation was also updated to indicate the new default behavior.
We probably should start to teach usuing the reflog to recover
from mistakes in some of the tutorial material, as new users are
likely to make a few along the way and will feel better knowing
they can recover from them quickly and easily, without fsck-objects'
lost+found features.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-15 22:31:01 -08:00
Nicolas Pitre
da093d3750 improve fetch-pack's handling of kept packs
Since functions in fetch-clone.c were only used from fetch-pack.c,
its content has been merged with fetch-pack.c.  This allows for better
coupling of features with much simpler implementations.

One new thing is that the (abscence of) --thin also enforce it on
index-pack now, such that index-pack will abort if a thin pack was
_not_ asked for.

The -k or --keep, when provided twice, now causes the fetched pack
to be left as a kept pack just like receive-pack currently does.
Eventually this will be used to close a race against concurrent
repacking.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-03 00:24:07 -08:00
Shawn Pearce
fc04c412d8 Teach receive-pack how to keep pack files based on object count.
Since keeping a pushed pack or exploding it into loose objects
should be a local repository decision this teaches receive-pack
to decide if it should call unpack-objects or index-pack --stdin
--fix-thin based on the setting of receive.unpackLimit and the
number of objects contained in the received pack.

If the number of objects (hdr_entries) in the received pack is
below the value of receive.unpackLimit (which is 5000 by default)
then we unpack-objects as we have in the past.

If the hdr_entries >= receive.unpackLimit then we call index-pack and
ask it to include our pid and hostname in the .keep file to make it
easier to identify why a given pack has been kept in the repository.

Currently this leaves every received pack as a kept pack.  We really
don't want that as received packs will tend to be small.  Instead we
want to delete the .keep file automatically after all refs have
been updated.  That is being left as room for future improvement.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-03 00:24:07 -08:00
Junio C Hamano
58a1e0e83b Merge branch 'lj/refs'
* lj/refs: (63 commits)
  Fix show-ref usagestring
  t3200: git-branch testsuite update
  sha1_name.c: avoid compilation warnings.
  Make git-branch a builtin
  ref-log: fix D/F conflict coming from deleted refs.
  git-revert with conflicts to behave as git-merge with conflicts
  core.logallrefupdates thinko-fix
  git-pack-refs --all
  core.logallrefupdates create new log file only for branch heads.
  Remove bashism from t3210-pack-refs.sh
  ref-log: allow ref@{count} syntax.
  pack-refs: call fflush before fsync.
  pack-refs: use lockfile as everybody else does.
  git-fetch: do not look into $GIT_DIR/refs to see if a tag exists.
  lock_ref_sha1_basic does not remove empty directories on BSD
  Do not create tag leading directories since git update-ref does it.
  Check that a tag exists using show-ref instead of looking for the ref file.
  Use git-update-ref to delete a tag instead of rm()ing the ref file.
  Fix refs.c;:repack_without_ref() clean-up path
  Clean up "git-branch.sh" and add remove recursive dir test cases.
  ...
2006-11-01 08:48:50 -08:00
Shawn Pearce
6fb75bed5c Move deny_non_fast_forwards handling completely into receive-pack.
The 'receive.denynonfastforwards' option has nothing to do with
the repository format version.  Since receive-pack already uses
git_config to initialize itself before executing any updates we
can use the normal configuration strategy and isolate the receive
specific variables away from the core variables.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-30 19:35:16 -08:00
Junio C Hamano
05eb811aa1 Merge branch 'np/pack'
* np/pack:
  add the capability for index-pack to read from a stream
  index-pack: compare only the first 20-bytes of the key.
  git-repack: repo.usedeltabaseoffset
  pack-objects: document --delta-base-offset option
  allow delta data reuse even if base object is a preferred base
  zap a debug remnant
  let the GIT native protocol use offsets to delta base when possible
  make pack data reuse compatible with both delta types
  make git-pack-objects able to create deltas with offset to base
  teach git-index-pack about deltas with offset to base
  teach git-unpack-objects about deltas with offset to base
  introduce delta objects with offset to base
2006-10-22 22:51:42 -07:00
Rene Scharfe
8f9777801d Make write_sha1_file_prepare() static
There are no callers of write_sha1_file_prepare() left outside of
sha1_file.c, so make it static.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-14 11:49:59 -07:00
Rene Scharfe
abdc3fc842 Add hash_sha1_file()
Most callers of write_sha1_file_prepare() are only interested in the
resulting hash but don't care about the returned file name or the header.
This patch adds a simple wrapper named hash_sha1_file() which does just
that, and converts potential callers.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-14 11:49:52 -07:00
Junio C Hamano
2958d9b5db Merge branch 'master' into lj/refs
* master: (72 commits)
  runstatus: do not recurse into subdirectories if not needed
  grep: fix --fixed-strings combined with expression.
  grep: free expressions and patterns when done.
  Corrected copy-and-paste thinko in ignore executable bit test case.
  An illustration of rev-list --parents --pretty=raw
  Allow git-checkout when on a non-existant branch.
  gitweb: Decode long title for link tooltips
  git-svn: Fix fetch --no-ignore-externals with GIT_SVN_NO_LIB=1
  Ignore executable bit when adding files if filemode=0.
  Remove empty ref directories that prevent creating a ref.
  Use const for interpolate arguments
  git-archive: update documentation
  Deprecate merge-recursive.py
  gitweb: fix over-eager application of esc_html().
  Allow '(no author)' in git-svn's authors file.
  Allow 'svn fetch' on '(no date)' revisions in Subversion.
  git-repack: allow git-repack to run in subdirectory
  Remove upload-tar and make git-tar-tree a thin wrapper to git-archive
  git-tar-tree: Move code for git-archive --format=tar to archive-tar.c
  git-tar-tree: Remove duplicate git_config() call
  ...
2006-09-27 22:23:12 -07:00
Junio C Hamano
ac5409e420 update-ref: -d flag and ref creation safety.
This adds -d flag to update-ref to allow safe deletion of ref.
Before deleting it, the command checks if the given <oldvalue>
still matches the value the caller thought the ref contained.

Similarly, it also accepts 0{40} or an empty string as <oldvalue>
to allow safe creation of a new ref.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-27 02:01:42 -07:00
Nicolas Pitre
eb32d236df introduce delta objects with offset to base
This adds a new object, namely OBJ_OFS_DELTA, renames OBJ_DELTA to
OBJ_REF_DELTA to better make the distinction between those two delta
objects, and adds support for the handling of those new delta objects
in sha1_file.c only.

The OBJ_OFS_DELTA contains a relative offset from the delta object's
position in a pack instead of the 20-byte SHA1 reference to identify
the base object.  Since the base is likely to be not so far away, the
relative offset is more likely to have a smaller encoding on average
than an absolute offset.  And for those delta objects the base must
always be stored first because there is no way to know the distance of
later objects when streaming a pack.  Hence this relative offset is
always meant to be negative.

The offset encoding is slightly denser than the one used for object
size -- credits to <linux@horizon.com> (whoever this is) for bringing
it to my attention.

This allows for pack size reduction between 3.2% (Linux-2.6) to over 5%
(linux-historic).  Runtime pack access should be faster too since delta
replay does skip a search in the pack index for each delta in a chain.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-27 00:11:59 -07:00
Nicolas Pitre
43057304c0 many cleanups to sha1_file.c
Those cleanups are mainly to set the table for the support of deltas
with base objects referenced by offsets instead of sha1.  This means
that many pack lookup functions are converted to take a pack/offset
tuple instead of a sha1.

This eliminates many struct pack_entry usages since this structure
carried redundent information in many cases, and it increased stack
footprint needlessly for a couple recursively called functions that used
to declare a local copy of it for every recursion loop.

In the process, packed_object_info_detail() has been reorganized as well
so to look much saner and more amenable to deltas with offset support.

Finally the appropriate adjustments have been made to functions that
depend on the above changes.  But there is no functionality changes yet
simply some code refactoring at this point.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-23 01:51:33 -07:00
Junio C Hamano
8da1977554 Tell between packed, unpacked and symbolic refs.
This adds a "int *flag" parameter to resolve_ref() and makes
for_each_ref() family to call callback function with an extra
"int flag" parameter.  They are used to give two bits of
information (REF_ISSYMREF and REF_ISPACKED) about the ref.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-20 22:02:01 -07:00
Johannes Schindelin
11031d7e9f add receive.denyNonFastforwards config variable
If receive.denyNonFastforwards is set to true, git-receive-pack will deny
non fast-forwards, i.e. forced updates. Most notably, a push to a repository
which has that flag set will fail.

As a first user, 'git-init-db --shared' sets this flag, since in a shared
setup, you are most unlikely to want forced pushes to succeed.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-20 16:15:45 -07:00
Junio C Hamano
e49521b56d Make hexval() available to others.
builtin-mailinfo.c has its own hexval implementaiton but it can
share the table-lookup one recently implemented in sha1_file.c

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-20 16:08:14 -07:00
Linus Torvalds
ed378ec7e8 Make ref resolution saner
The old code used to totally mix up the notion of a ref-name and the path
that that ref was associated with.  That was not only horribly ugly (a
number of users got the path, and then wanted to try to turn it back into
a ref-name again), but it fundamnetally doesn't work at all once we do any
setup where a ref doesn't have a 1:1 relationship with a particular
pathname.

This fixes things up so that we use the ref-name throughout, and only
turn it into a pathname once we actually look it up in the filesystem.
That makes a lot of things much clearer and more straightforward.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-17 19:09:11 -07:00
Junio C Hamano
4405fb77f4 Merge branch 'jc/pack'
* jc/pack:
  pack-objects: document --revs, --unpacked and --all.
  pack-objects --unpacked=<existing pack> option.
  pack-objects: further work on internal rev-list logic.
  pack-objects: run rev-list equivalent internally.
  Separate object listing routines out of rev-list
2006-09-17 18:32:03 -07:00
Franck Bui-Huu
f42a5c4eb0 connect.c: finish_connect(): allow null pid parameter
git_connect() can return 0 if we use git protocol for example.
Users of this function don't know and don't care if a process
had been created or not, and to avoid them to check it before
calling finish_connect() this patch allows finish_connect() to
take a null pid. And in that case return 0.

[jc: updated function signature of git_connect() with a comment on
 its return value. ]

Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-12 22:30:32 -07:00
Junio C Hamano
106d710bc1 pack-objects --unpacked=<existing pack> option.
Incremental repack without -a essentially boils down to:

	rev-list --objects --unpacked --all |
        pack-objects $new_pack

which picks up all loose objects that are still live and creates
a new pack.

This implements --unpacked=<existing pack> option to tell the
revision walking machinery to pretend as if objects in such a
pack are unpacked for the purpose of object listing.  With this,
we could say:

	rev-list --objects --unpacked=$active_pack --all |
	pack-objects $new_pack

instead, to mean "all live loose objects but pretend as if
objects that are in this pack are also unpacked".  The newly
created pack would be perfect for updating $active_pack by
replacing it.

Since pack-objects now knows how to do the rev-list's work
itself internally, you can also write the above example by:

	pack-objects --unpacked=$active_pack --all $new_pack </dev/null

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-07 02:46:03 -07:00
Junio C Hamano
72518e9c26 more lightweight revalidation while reusing deflated stream in packing
When copying from an existing pack and when copying from a loose
object with new style header, the code makes sure that the piece
we are going to copy out inflates well and inflate() consumes
the data in full while doing so.

The check to see if the xdelta really apply is quite expensive
as you described, because you would need to have the image of
the base object which can be represented as a delta against
something else.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-03 21:09:18 -07:00
Christian Couder
6ce4e61f1b Trace into a file or an open fd and refactor tracing code.
If GIT_TRACE is set to an absolute path (starting with a
'/' character), we interpret this as a file path and we
trace into it.

Also if GIT_TRACE is set to an integer value greater than
1 and lower than 10, we interpret this as an open fd value
and we trace into it.

Note that this behavior is not compatible with the
previous one.

We also trace whole messages using one write(2) call to
make sure messages from processes do net get mixed up in
the middle.

This patch makes it possible to get trace information when
running "make test".

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-02 14:47:53 -07:00
Junio C Hamano
839837b953 Constness tightening for move/link_temp_to_file()
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-01 00:24:06 -07:00
Junio C Hamano
1e49cb8ad4 Merge branch 'js/c-merge-recursive'
* js/c-merge-recursive: (21 commits)
  discard_cache(): discard index, even if no file was mmap()ed
  merge-recur: do not die unnecessarily
  merge-recur: try to merge older merge bases first
  merge-recur: if there is no common ancestor, fake empty one
  merge-recur: do not setenv("GIT_INDEX_FILE")
  merge-recur: do not call git-write-tree
  merge-recursive: fix rename handling
  .gitignore: git-merge-recur is a built file.
  merge-recur: virtual commits shall never be parsed
  merge-recur: use the unpack_trees() interface instead of exec()ing read-tree
  merge-recur: fix thinko in unique_path()
  Makefile: git-merge-recur depends on xdiff libraries.
  merge-recur: Explain why sha_eq() and struct stage_data cannot go
  merge-recur: Cleanup last mixedCase variables...
  merge-recur: Fix compiler warning with -pedantic
  merge-recur: Remove dead code
  merge-recur: Get rid of debug code
  merge-recur: Convert variable names to lower_case
  Cumulative update of merge-recursive in C
  recur vs recursive: help testing without touching too many stuff.
  ...

This is an evil merge that removes TEST script from the toplevel.
2006-08-27 20:33:46 -07:00
Linus Torvalds
9a8e35e987 Relative timestamps in git log
I noticed that I was looking at the kernel gitweb output at some point
rather than just do "git log", simply because I liked seeing the
simplified date-format, ie the "5 days ago" rather than a full date.

This adds infrastructure to do that for "git log" too. It does NOT add the
actual flag to enable it, though, so right now this patch is a no-op, but
it should now be easy to add a command line flag (and possibly a config
file option) to just turn on the "relative" date format.

The exact cut-off points when it switches from days to weeks etc are
totally arbitrary, but are picked somewhat to avoid the "1 weeks ago"
thing (by making it show "10 days ago" rather than "1 week", or "70
minutes ago" rather than "1 hour ago").

[jc: with minor fix and tweak around "month" and "week" area.]

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-26 19:12:03 -07:00
Junio C Hamano
a7f051987c Merge branch 'gl/cleanup'
* gl/cleanup:
  Convert memset(hash,0,20) to hashclr(hash).
  Convert memcpy(a,b,20) to hashcpy(a,b).
2006-08-26 01:06:22 -07:00
Pierre Habouzit
c5fba16c50 git_dir holds pointers to local strings, hence MUST be const.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-23 18:47:38 -07:00
Junio C Hamano
a8e0d16d85 Convert memset(hash,0,20) to hashclr(hash).
In the same spirit as hashcmp() and hashcpy().

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-23 13:57:23 -07:00
Shawn Pearce
e702496e43 Convert memcpy(a,b,20) to hashcpy(a,b).
This abstracts away the size of the hash values when copying them
from memory location to memory location, much as the introduction
of hashcmp abstracted away hash value comparsion.

A few call sites were using char* rather than unsigned char* so
I added the cast rather than open hashcpy to be void*.  This is a
reasonable tradeoff as most call sites already use unsigned char*
and the existing hashcmp is also declared to be unsigned char*.

[jc: Splitted the patch to "master" part, to be followed by a
 patch for merge-recursive.c which is not in "master" yet.

 Fixed the cast in the latter hunk to combine-diff.c which was
 wrong in the original.

 Also converted ones left-over in combine-diff.c, diff-lib.c and
 upload-pack.c ]

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-23 13:53:10 -07:00
Rene Scharfe
7230e6d042 Add write_or_die(), a helper function
The little helper write_or_die() won't come back with bad news about
full disks or broken pipes.  It either succeeds or terminates the
program, making additional error handling unnecessary.

This patch adds the new function and uses it to replace two similar
ones (the one in tar-tree originally has been copied from cat-file
btw.).  I chose to add the fd parameter which both lacked to make
write_or_die() just as flexible as write() and thus suitable for
lib-ification.

There is a regression: error messages emitted by this function don't
show the program name, while the replaced two functions did.  That's
acceptable, I think; a lot of other functions do the same.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-21 20:22:23 -07:00
David Rientjes
a89fccd281 Do not use memcmp(sha1_1, sha1_2, 20) with hardcoded length.
Introduces global inline:

	hashcmp(const unsigned char *sha1, const unsigned char *sha2)

Uses memcmp for comparison and returns the result based on the length of
the hash name (a future runtime decision).

Acked-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-17 14:23:53 -07:00
David Rientjes
0bef57ee44 make inline is_null_sha1 global
Replace sha1 comparisons to null_sha1 with a global inline (which previously an
unused static inline in builtin-apply.c)

[jc: with a fix from Jonas Fonseca.]

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-15 15:06:03 -07:00
Junio C Hamano
647377c4c9 Merge branch 'jc/pack-objects' 2006-08-12 19:33:16 -07:00