This creates a simple specialized object allocator for basic
objects.
This avoids wasting space with malloc overhead (metadata and
extra alignment), since the specialized allocator knows the
alignment, and that objects, once allocated, are never freed.
It also allows us to track some basic statistics about object
allocations. For example, for the mozilla import, it shows
object usage as follows:
blobs: 627629 (14710 kB)
trees: 1119035 (34969 kB)
commits: 196423 (8440 kB)
tags: 1336 (46 kB)
and the simpler allocator shaves off about 2.5% off the memory
footprint off a "git-rev-list --all --objects", and is a bit
faster too.
[ Side note: this concludes the series of "save memory in object storage".
The thing is, there simply isn't much more to be saved on the objects.
Doing "git-rev-list --all --objects" on the mozilla archive has a final
total RSS of 131498 pages for me: that's about 513MB. Of that, the
object overhead is now just 56MB, the rest is going somewhere else (put
another way: the fact that this patch shaves off 2.5% of the total
memory overhead, considering that objects are now not much more than 10%
of the total shows how big the wasted space really was: this makes
object allocations much more memory- and time-efficient).
I haven't looked at where the rest is, but I suspect the bulk of it is
just the pack-file loading. It may be that we should pack the tree
objects separately from the blob objects: for git-rev-list --objects, we
don't actually ever need to even look at the blobs, but since trees and
blobs are interspersed in the pack-file, we end up not being dense in
the tree accesses, so we end up looking at more pages than we strictly
need to.
So with a 535MB pack-file, it's entirely possible - even likely - that
most of the remaining RSS is just the mmap of the pack-file itself. We
don't need to map in _all_ of it, but we do end up mapping a fair
amount. ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This enhances core.sharedrepository to have additionally
specify that read and exec permissions to be given to others as
well. It is useful when serving a repository via gitweb and
git-daemon that runs as a user outside the project group.
The configuration item can take the following values:
[core]
sharedrepository ; the same as "group"
sharedrepository = true ; ditto
sharedrepository = 1 ; ditto
sharedrepository = group ; allow rwx to group
sharedrepository = all ; allow rwx to group, allow rx to other
sharedrepository = umask ; not shared - use umask
It also extends "git init-db" to take "--shared=all" and friends
from the command line.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The framework to create lockfiles that are removed at exit is
first used to reliably write the index file, but it is
applicable to other things, so stop calling it "cache_file".
This also rewords a few remaining error message that called the
index file "cache file".
Signed-off-by: Junio C Hamano <junkio@cox.net>
* sp/reflog:
fetch.c: do not pass uninitialized lock to unlock_ref().
Test that git-branch -l works.
Verify git-commit provides a reflog message.
Enable ref log creation in git checkout -b.
Create/delete branch ref logs.
Include ref log detail in commit, reset, etc.
Change order of -m option to update-ref.
Correct force_write bug in refs.c
Change 'master@noon' syntax to 'master@{noon}'.
Log ref updates made by fetch.
Force writing ref if it doesn't exist.
Added logs/ directory to repository layout.
General ref log reading improvements.
Fix ref log parsing so it works properly.
Support 'master@2 hours ago' syntax
Log ref updates to logs/refs/<ref>
Convert update-ref to use ref_lock API.
Improve abstraction of ref lock/write.
* jc/cache-tree: (26 commits)
builtin-rm: squelch compiler warnings.
git-write-tree writes garbage on sparc64
Fix crash when reading the empty tree
fsck-objects: do not segfault on missing tree in cache-tree
cache-tree: a bit more debugging support.
read-tree: invalidate cache-tree entry when a new index entry is added.
Fix test-dump-cache-tree in one-tree disappeared case.
fsck-objects: mark objects reachable from cache-tree
cache-tree: replace a sscanf() by two strtol() calls
cache-tree.c: typefix
test-dump-cache-tree: validate the cached data as well.
cache_tree_update: give an option to update cache-tree only.
read-tree: teach 1-way merege and plain read to prime cache-tree.
read-tree: teach 1 and 2 way merges about cache-tree.
update-index: when --unresolve, smudge the relevant cache-tree entries.
test-dump-cache-tree: report number of subtrees.
cache-tree: sort the subtree entries.
Teach fsck-objects about cache-tree.
index: make the index file format extensible.
cache-tree: protect against "git prune".
...
Conflicts:
Makefile, builtin.h, git.c: resolved the same way as in next.
* master: (90 commits)
fetch.c: remove an unused variable and dead code.
Clean up sha1 file writing
Builtin git-cat-file
builtin format-patch: squelch content-type for 7-bit ASCII
CMIT_FMT_EMAIL: Q-encode Subject: and display-name part of From: fields.
add more informative error messages to git-mktag
remove the artificial restriction tagsize < 8kb
git-rebase: use canonical A..B syntax to format-patch
git-format-patch: now built-in.
fmt-patch: Support --attach
fmt-patch: understand old <his> notation
Teach fmt-patch about --keep-subject
Teach fmt-patch about --numbered
fmt-patch: implement -o <dir>
fmt-patch: output file names to stdout
Teach fmt-patch to write individual files.
built-in tar-tree and remote tar-tree
Builtin git-diff-files, git-diff-index, git-diff-stages, and git-diff-tree.
Builtin git-show-branch.
Builtin git-apply.
...
This makes "git format-patch" a built-in.
* js/fmt-patch:
git-rebase: use canonical A..B syntax to format-patch
git-format-patch: now built-in.
fmt-patch: Support --attach
fmt-patch: understand old <his> notation
Teach fmt-patch about --keep-subject
Teach fmt-patch about --numbered
fmt-patch: implement -o <dir>
fmt-patch: output file names to stdout
Teach fmt-patch to write individual files.
Use RFC2822 dates from "git fmt-patch".
git-fmt-patch: thinkofix to show [PATCH] properly.
rename internal format-patch wip
Minor tweak on subject line in --pretty=email
Tentative built-in format-patch.
This makes 'git add' and 'git rm' built-ins.
* lt/dirwalk:
Add builtin "git rm" command
Move pathspec matching from builtin-add.c into dir.c
Prevent bogus paths from being added to the index.
builtin-add: fix unmatched pathspec warnings.
Remove old "git-add.sh" remnants
builtin-add: warn on unmatched pathspecs
Do "git add" as a builtin
Clean up git-ls-file directory walking library interface
libify git-ls-files directory traversal
* master: (119 commits)
diff family: add --check option
Document that "git add" only adds non-ignored files.
Add a conversion tool to migrate remote information into the config
fetch, pull: ask config for remote information
Fix build procedure for builtin-init-db
read-tree -m -u: do not overwrite or remove untracked working tree files.
apply --cached: do not check newly added file in the working tree
Implement a --dry-run option to git-quiltimport
Implement git-quiltimport
Revert "builtin-grep: workaround for non GNU grep."
builtin-grep: workaround for non GNU grep.
builtin-grep: workaround for non GNU grep.
git-am: use apply --cached
apply --cached: apply a patch without using working tree.
apply --numstat: show new name, not old name.
Documentation/Makefile: create tarballs for the man pages and html files
Allow pickaxe and diff-filter options to be used by git log.
Libify the index refresh logic
Builtin git-init-db
Remove unnecessary local in get_ref_sha1.
...
This commit is what this branch is all about. It records the
evil merge needed to adjust built-in git-add and git-rm for
the cache-tree extension.
* lt/dirwalk:
Add builtin "git rm" command
Move pathspec matching from builtin-add.c into dir.c
Prevent bogus paths from being added to the index.
builtin-add: fix unmatched pathspec warnings.
Remove old "git-add.sh" remnants
builtin-add: warn on unmatched pathspecs
Do "git add" as a builtin
Clean up git-ls-file directory walking library interface
libify git-ls-files directory traversal
Conflicts:
Makefile
builtin.h
git.c
update-index.c
* jc/cache-tree: (24 commits)
Fix crash when reading the empty tree
fsck-objects: do not segfault on missing tree in cache-tree
cache-tree: a bit more debugging support.
read-tree: invalidate cache-tree entry when a new index entry is added.
Fix test-dump-cache-tree in one-tree disappeared case.
fsck-objects: mark objects reachable from cache-tree
cache-tree: replace a sscanf() by two strtol() calls
cache-tree.c: typefix
test-dump-cache-tree: validate the cached data as well.
cache_tree_update: give an option to update cache-tree only.
read-tree: teach 1-way merege and plain read to prime cache-tree.
read-tree: teach 1 and 2 way merges about cache-tree.
update-index: when --unresolve, smudge the relevant cache-tree entries.
test-dump-cache-tree: report number of subtrees.
cache-tree: sort the subtree entries.
Teach fsck-objects about cache-tree.
index: make the index file format extensible.
cache-tree: protect against "git prune".
Add test-dump-cache-tree
Use cache-tree in update-index.
...
This cleans up and libifies the "git update-index --[really-]refresh"
functionality. This will be eventually required for eventually doing the
"commit" and "status" commands as built-ins.
It really just moves "refresh_index()" from update-index.c to
read-cache.c, but it also has to change the calling convention so that the
function uses a "unsigned int flags" argument instead of various static
flags variables for passing down the information about whether to be quiet
or not, and allow unmerged entries etc.
That actually cleans up update-index.c too, since it turns out that all
those flags were really specific to that one function of the index update,
so they shouldn't have had file-scope visibility even before.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
An earlier patch from Shawn Pearce dependes on a change that is
only in "next". I do not want to make this series hostage to
the yet-to-graduate js/fmt-patch branch, but let's try fixing it
by merging the early parts of the branch to see what happens.
Right now, 'sp/reflog' will not be in "next" for now, so I won't
have to regret this -- if this merge causes problem down the road
merging I can always rebuild the topic branch ;-).
With this one, it's now a fatal error to try to add a pathname
that cannot be added with "git add", i.e.
[torvalds@g5 git]$ git add .git/config
fatal: unable to add .git/config to index
and
[torvalds@g5 git]$ git add foo/../bar
fatal: unable to add foo/../bar to index
instead of the old "Ignoring path xyz" warning that would end up
silently succeeding on any other paths.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
If config parameter core.logAllRefUpdates is true or the log
file already exists then append a line to ".git/logs/refs/<ref>"
whenever git-update-ref <ref> is executed. Each log line contains
the following information:
oldsha1 <SP> newsha1 <SP> committer <LF>
where committer is the current user, date, time and timezone in
the standard GIT ident format. If the caller is unable to append
to the log file then git-update-ref will fail without updating <ref>.
An optional message may be included in the log line with the -m flag.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* master: (109 commits)
t1300-repo-config: two new config parsing tests.
Another config file parsing fix.
update-index: plug memory leak from prefix_path()
checkout-index: plug memory leak from prefix_path()
update-index --unresolve: work from a subdirectory.
pack-object: squelch eye-candy on non-tty
core.prefersymlinkrefs: use symlinks for .git/HEAD
repo-config: trim white-space before comment
Fix for config file section parsing.
Clarify git-cherry documentation.
Update git-unpack-objects documentation.
Fix up docs where "--" isn't displayed correctly.
Several trivial documentation touch ups.
git-svn 1.0.0
git-svn: documentation updates
delta: stricter constness
Makefile: do not link rev-list any specially.
builtin-push: --all and --tags _are_ explicit refspecs
builtin-log/whatchanged/show: make them official.
show-branch: omit uninteresting merges.
...
This updates the user interface and generated diff data format.
* "diff --binary" is used to signal that we want an e-mailable
binary patch. It implies --full-index and -p.
* "apply --allow-binary-replacement" acquired a short synonym
"apply --binary".
* After the "GIT binary patch\n" header line there is a token
to record which binary patch mechanism was used, so that we
can extend it later. Currently there are two mechanisms
defined: "literal" and "delta". The former records the
deflated postimage and the latter records the deflated delta
from the preimage to postimage.
For purely implementation convenience, I added the deflated
length after these "literal/delta" tokens (otherwise the
decoding side needs to guess and reallocate the buffer while
inflating). Improvement patches are very welcomed.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds "binary patch" to the diff output and teaches apply
what to do with them.
On the diff generation side, traditionally, we said "Binary
files differ\n" without giving anything other than the preimage
and postimage object name on the index line. This was good
enough for applying a patch generated from your own repository
(very useful while rebasing), because the postimage would be
available in such a case. However, this was not useful when the
recipient of such a patch via e-mail were to apply it, even if
the preimage was available.
This patch allows the diff to generate "binary" patch when
operating under --full-index option. The binary patch follows
the usual extended git diff headers, and looks like this:
"GIT binary patch\n"
<length byte><data>"\n"
...
"\n"
Each line is prefixed with a "length-byte", whose value is upper
or lowercase alphabet that encodes number of bytes that the data
on the line decodes to (1..52 -- 'A' means 1, 'B' means 2, ...,
'Z' means 26, 'a' means 27, ...). <data> is 1 or more groups of
5-byte sequence, each of which encodes up to 4 bytes in base85
encoding. Because 52 / 4 * 5 = 65 and we have the length byte,
an output line is capped to 66 characters. The payload is the
same diff-delta as we use in the packfiles.
On the consumption side, git-apply now can decode and apply the
binary patch when --allow-binary-replacement is given, the diff
was generated with --full-index, and the receiving repository
has the preimage blob, which is the same condition as it always
required when accepting an "Binary files differ\n" patch.
Signed-off-by: Junio C Hamano <junkio@cox.net>
When inspecting a project whose build infrastructure used to
assume that .git/HEAD is a symlink ref, core.prefersymlinkrefs
in the config file of such a project would help to bisect its
history.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Still Work-in-progress git fmt-patch (should it be known as
format-patch-ng?) is matched with the fix made by Huw Davies
in 262a6ef76a commit to use
RFC2822 date format.
Signed-off-by: Junio C Hamano <junkio@cox.net>
* master:
t0000-basic: more commit-tree tests.
commit-tree.c: check_valid() microoptimization.
Fix filename verification when in a subdirectory
rebase: typofix.
socksetup: don't return on set_reuse_addr() error
If you don't have a "--" marker, then:
- all of the arguments we are going to assume are pathspecs
must exist in the working tree.
- none of the arguments we parsed as revisions could be
interpreted as a filename.
so that there really isn't any possibility of confusion in case
somebody does have a revision that looks like a pathname too.
The former rule has been in effect; this implements the latter.
Signed-off-by: Junio C Hamano <junkio@cox.net>
When we are in a subdirectory of a git archive, we need to take the prefix
of that subdirectory into accoung when we verify filename arguments.
Noted by Matthias Lederhofer
This also uses the improved error reporting for all the other git commands
that use the revision parsing interfaces, not just git-rev-parse. Also, it
makes the error reporting for mixed filenames and argument flags clearer
(you cannot put flags after the start of the pathname list).
[jc: with fix to a trivial typo noticed by Timo Hirvonen]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
read_cache_1() and write_cache_1() takes an extra parameter
*sha1 that returns the checksum of the index file when non-NULL.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Introduce tree-walk.[ch] and move "struct tree_desc" and
associated functions from various places.
Rename DIFF_FILE_CANON_MODE(mode) macro to canon_mode(mode) and
move it to cache.h. This macro returns the canonicalized
st_mode value in the host byte order for files, symlinks and
directories -- to be compared with a tree_desc entry.
create_ce_mode(mode) in cache.h is similar but is intended to be
used for index entries (so it does not work for directories) and
returns the value in the network byte order.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Sometimes it is convient for a Porcelain to be able to checkout all
unmerged files in all stages so that an external merge tool can be
executed by the Porcelain or the end-user. Using git-unpack-file
on each stage individually incurs a rather high penalty due to the
need to fork for each file version obtained. git-checkout-index -a
--stage=all will now do the same thing, but faster.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* lt/rev-list:
setup_revisions(): handle -n<n> and -<n> internally.
git-log (internal): more options.
git-log (internal): add approxidate.
Rip out merge-order and make "git log <paths>..." work again.
Tie it all together: "git log"
Introduce trivial new pager.c helper infrastructure
git-rev-list libification: rev-list walking
Splitting rev-list into revisions lib, end of beginning.
rev-list split: minimum fixup.
First cut at libifying revlist generation
This introduces the new function
void setup_pager(void);
to set up output to be written through a pager applocation.
All in preparation for doing the simple scripts in C.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The new configuration option apply.whitespace can take one of
"warn", "error", "error-all", or "strip". When git-apply is run
to apply the patch to the index, they are used as the default
value if there is no command line --whitespace option.
Andrew can now tell people who feed him git trees to update to
this version and say:
git repo-config apply.whitespace error
Signed-off-by: Junio C Hamano <junkio@cox.net>
- Fix -Wundef -Wold-style-definition warnings
- Make pll_free() static
[jc: original patch by Timo had another unrelated bits:
- Use setenv() instead of putenv()
I'm postponing that part for now.]
Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* jc/nostat:
cache_name_compare() compares name and stage, nothing else.
"assume unchanged" git: documentation.
ls-files: split "show-valid-bit" into a different option.
"Assume unchanged" git: --really-refresh fix.
ls-files: debugging aid for CE_VALID changes.
"Assume unchanged" git: do not set CE_VALID with --refresh
"Assume unchanged" git
Previous one warned people upfront to encourage fixing their
environment early, but some people just use repositories and git
tools read-only without making any changes, and in such a case
there is not much point insisting on them having a usable ident.
This round attempts to move the error until either "git-var"
asks for the ident explicitly or "commit-tree" wants to use it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
It used to be that "git-unpack-objects" would give nice percentages, but
now that we don't unpack the initial clone pack any more, it doesn't. And
I'd love to do that nice percentage view in the pack objects downloader
too, but the thing doesn't even read the pack header, much less know how
much it's going to get, so I was lazy and didn't.
Instead, it at least prints out how much data it's gotten, and what the
packing speed is. Which makes the user realize that it's actually doing
something useful instead of sitting there silently (and if the recipient
knows how large the final result is, he can at least make a guess about
when it migt be done).
So with this patch, I get something like this on my DSL line:
[torvalds@g5 ~]$ time git clone master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 clone-test
Packing 188543 objects
48.398MB (154 kB/s)
where even the speed approximation seems to be roughtly correct (even
though my algorithm is a truly stupid one, and only really gives "speed in
the last half second or so").
Anyway, _something_ like this is definitely needed. It could certainly be
better (if it showed the same kind of thing that git-unpack-objects did,
that would be much nicer, but would require parsing the object stream as
it comes in). But this is big step forward, I think.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds "assume unchanged" logic, started by this message in the list
discussion recently:
<Pine.LNX.4.64.0601311807470.7301@g5.osdl.org>
This is a workaround for filesystems that do not have lstat()
that is quick enough for the index mechanism to take advantage
of. On the paths marked as "assumed to be unchanged", the user
needs to explicitly use update-index to register the object name
to be in the next commit.
You can use two new options to update-index to set and reset the
CE_VALID bit:
git-update-index --assume-unchanged path...
git-update-index --no-assume-unchanged path...
These forms manipulate only the CE_VALID bit; it does not change
the object name recorded in the index file. Nor they add a new
entry to the index.
When the configuration variable "core.ignorestat = true" is set,
the index entries are marked with CE_VALID bit automatically
after:
- update-index to explicitly register the current object name to the
index file.
- when update-index --refresh finds the path to be up-to-date.
- when tools like read-tree -u and apply --index update the working
tree file and register the current object name to the index file.
The flag is dropped upon read-tree that does not check out the index
entry. This happens regardless of the core.ignorestat settings.
Index entries marked with CE_VALID bit are assumed to be
unchanged most of the time. However, there are cases that
CE_VALID bit is ignored for the sake of safety and usability:
- while "git-read-tree -m" or git-apply need to make sure
that the paths involved in the merge do not have local
modifications. This sacrifices performance for safety.
- when git-checkout-index -f -q -u -a tries to see if it needs
to checkout the paths. Otherwise you can never check
anything out ;-).
- when git-update-index --really-refresh (a new flag) tries to
see if the index entry is up to date. You can start with
everything marked as CE_VALID and run this once to drop
CE_VALID bit for paths that are modified.
Most notably, "update-index --refresh" honours CE_VALID and does
not actively stat, so after you modified a file in the working
tree, update-index --refresh would not notice until you tell the
index about it with "git-update-index path" or "git-update-index
--no-assume-unchanged path".
This version is not expected to be perfect. I think diff
between index and/or tree and working files may need some
adjustment, and there probably needs other cases we should
automatically unmark paths that are marked to be CE_VALID.
But the basics seem to work, and ready to be tested by people
who asked for this feature.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The minimum length of abbreviated object name was hardcoded in
different places to be 4, risking inconsistencies in the future.
Also there were three different "default abbreviation
precision". Use two C preprocessor symbols to clean up this
mess.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes read_tree_recursive and read_tree take a struct tree
instead of a buffer. It also move the declaration of read_tree into
tree.h (where struct tree is defined), and updates ls-tree and
diff-index (the only places that presently use read_tree*()) to use
the new versions.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When overriding DT_* macro detection with NO_D_TYPE_IN_DIRENT (recent
Cygwin build problem, which hopefully is already fixed in their CVS
snapshot version), we define DTYPE() macro to return just "we do not
know", but still needed to use DT_* macro to avoid ifdef in the code
we use them. If the platform defines DT_* macro but with unusable
d_type, this would have resulted in us redefining these preprocessor
symbols.
Admittedly, that would be just a couple of compilation warnings, and
on Cygwin at least this particular problem is transitory (the problem
is already fixed in their CVS snapshot version), so this is a low
priority fix.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The recent Cygwin defines DT_UNKNOWN although it does not have d_type
in struct dirent. Give an option to tell us not to use d_type on such
platforms. Hopefully this problem will be transient.
Signed-off-by: Junio C Hamano <junkio@cox.net>
ISO C99 (and GCC 3.x or later) lets you write a flexible array
at the end of a structure, like this:
struct frotz {
int xyzzy;
char nitfol[]; /* more */
};
GCC 2.95 and 2.96 let you to do this with "char nitfol[0]";
unfortunately this is not allowed by ISO C90.
This declares such construct like this:
struct frotz {
int xyzzy;
char nitfol[FLEX_ARRAY]; /* more */
};
and git-compat-util.h defines FLEX_ARRAY to 0 for gcc 2.95 and
empty for others.
If you are using a C90 C compiler, you should be able
to override this with CFLAGS=-DFLEX_ARRAY=1 from the
command line of "make".
Signed-off-by: Junio C Hamano <junkio@cox.net>
If the config variable 'core.sharedrepository' is set, the directories
$GIT_DIR/objects/
$GIT_DIR/objects/??
$GIT_DIR/objects/pack
$GIT_DIR/refs
$GIT_DIR/refs/heads
$GIT_DIR/refs/heads/tags
are set group writable (and g+s, since the git group may be not the primary
group of all users).
Since all files are written as lock files first, and then moved to
their destination, they do not have to be group writable. Indeed, if
this leads to problems you found a bug.
Note that -- as in my first attempt -- the config variable is set in the
function which checks the repository format. If this were done in
git_default_config instead, a lot of programs would need to be modified
to call git_config(git_default_config) first.
[jc: git variables should be in environment.c unless there is a
compelling reason to do otherwise.]
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Split out the functions that deal with the socketpair after
finishing git protocol handshake to receive the packed data into
a separate file, and use it in fetch-pack to keep/explode the
received pack data. We earlier had something like that on
clone-pack side once, but the list discussion resulted in the
decision that it makes sense to always keep the pack for
clone-pack, so unpacking option is not enabled on the clone-pack
side, but we later still could do so easily if we wanted to with
this change.
Signed-off-by: Junio C Hamano <junkio@cox.net>
In order to support getting data into git with scripts, this adds a
--stdin option to git-hash-object, which will make it read from stdin.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This attempts to clean up the way various compatibility
functions are defined and used.
- A new header file, git-compat-util.h, is introduced. This
looks at various NO_XXX and does necessary function name
replacements, equivalent of -Dstrcasestr=gitstrcasestr in the
Makefile.
- Those function name replacements are removed from the Makefile.
- Common features such as usage(), die(), xmalloc() are moved
from cache.h to git-compat-util.h; cache.h includes
git-compat-util.h itself.
Signed-off-by: Junio C Hamano <junkio@cox.net>
- prefix_filename() is like prefix_path() but can be used to
name any file on the filesystem, not the files that might go
into the index file.
- setup_git_directory_gently() tries to find the GIT_DIR, but does
not die() if called outside a git repository.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is to hold what the project-local rule as to the
charset/encoding for the commit log message is. Lack of it
defaults to utf-8.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes init-db repository version aware.
It checks if an existing config file says the repository being
reinitialized is of a wrong version and aborts before doing
further harm.
When copying the templates, it makes sure the they are of the
right repository format version. Otherwise the templates are
ignored with an warning message.
It copies the templates before creating the HEAD, and if the
config file is copied from the template directory, reads it,
primarily to pick up the value of core.symrefsonly.
It changes the way the result of the filemode reliability test
is written to the configuration file using git_config_set().
The test is done even if the config file was copied from the
templates.
And finally, our own repository format version is written to the
config file.
Signed-off-by: Junio C Hamano <junkio@cox.net>
var.c::git_var read function did not have to return writable
strings; make it and the functions it points at return const char *
instead.
ident.c::get_ident() did not need to be global, so make it
static.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Make some functions static and convert func() function prototypes to to
func(void). Fix declaration after statement, missing declaration and
redundant declaration warnings.
Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
... namely
--replace-all, to replace any amount of matching lines, not just 0 or 1,
--get, to get the value of one key,
--get-all, the multivar version of --get, and
--unset-all, which deletes all matching lines from .git/config
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This patch provides the work-horse of the user-relative paths feature,
using Linus' idea of a blind chdir() and getcwd() which makes it
remarkably simple.
Signed-off-by: Andreas Ericsson <ae@op5.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The function git_config_set() does exactly what you think it does.
Given a key (in the form "core.filemode") and a value, it sets the
key to the value. Example:
git_config_set("core.filemode", "true");
The function git_config_set_multivar() is meant for setting variables which
can have several values for the same key. Example:
[diff]
twohead = resolve
twohead = recarsive
the typo in the second line can be replaced by
git_config_set_multivar("diff.twohead", "recursive", "^recar");
The third argument of the function is a POSIX extended regex which has to
match the value. If there is no key/value pair with a matching value, a new
key/value pair is added.
These commands are also capable of unsetting (deleting) entries:
git_config_set_multivar("diff.twohead", NULL, "sol");
will delete the entry
twohead = resolve
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Ok. This is the insane patch to do this.
It really isn't very careful, and the reason I call it "approxidate()"
will become obvious when you look at the code. It is very liberal in what
it accepts, to the point where sometimes the results may not make a whole
lot of sense.
It accepts "last week" as a date string, by virtue of "last" parsing as
the number 1, and it totally ignoring superfluous fluff like "ago", so
"last week" ends up being exactly the same thing as "1 week ago". Fine so
far.
It has strange side effects: "last december" will actually parse as "Dec
1", which actually _does_ turn out right, because it will then notice that
it's not December yet, so it will decide that you must be talking about a
date last year. So it actually gets it right, but it's kind of for the
"wrong" reasons.
It also accepts the numbers 1..10 in string format ("one" .. "ten"), so
you can do "ten weeks ago" or "ten hours ago" and it will do the right
thing.
But it will do some really strange thigns too: the string "this will last
forever", will not recognize anyting but "last", which is recognized as
"1", which since it doesn't understand anything else it will think is the
day of the month. So if you do
gitk --since="this will last forever"
the date will actually parse as the first day of the current month.
And it will parse the string "now" as "now", but only because it doesn't
understand it at all, and it makes everything relative to "now".
Similarly, it doesn't actually parse the "ago" or "from now", so "2 weeks
ago" is exactly the same as "2 weeks from now". It's the current date
minus 14 days.
But hey, it's probably better (and certainly faster) than depending on GNU
date. So now you can portably do things like
gitk --since="two weeks and three days ago"
git log --since="July 5"
git-whatchanged --since="10 hours ago"
git log --since="last october"
and it will actually do exactly what you thought it would do (I think). It
will count 17 days backwards, and it will do so even if you don't have GNU
date installed.
(I don't do "last monday" or similar yet, but I can extend it to that too
if people want).
It was kind of fun trying to write code that uses such totally relaxed
"understanding" of dates yet tries to get it right for the trivial cases.
The result should be mixed with a few strange preprocessor tricks, and be
submitted for the IOCCC ;)
Feel free to try it out, and see how many strange dates it gets right. Or
wrong.
And if you find some interesting (and valid - not "interesting" as in
"strange", but "interesting" as in "I'd be interested in actually doing
this) thing it gets wrong - usually by not understanding it and silently
just doing some strange things - please holler.
Now, as usual this certainly hasn't been getting a lot of testing. But my
code always works, no?
Linus
Signed-off-by: Junio C Hamano <junkio@cox.net>
A while ago, a rename-detection limit logic was implemented as a
response to this thread:
http://marc.theaimsgroup.com/?l=git&m=112413080630175
where gitweb was found to be using a lot of time and memory to
detect renames on huge commits. git-diff family takes -l<num>
flag, and if the number of paths that are rename destination
candidates (i.e. new paths with -M, or modified paths with -C)
are larger than that number, skips rename/copy detection even
when -M or -C is specified on the command line.
This commit makes the rename detection limit easier to use. You
can have:
[diff]
renamelimit = 30
in your .git/config file to specify the default rename detection
limit. You can override this from the command line; giving 0
means 'unlimited':
git diff -M -l0
We might want to change the default behaviour, when you do not
have the configuration, to limit it to say 20 paths or so. This
would also help the diffstat generation after a big 'git pull'.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This allows you to force git to avoid symlinks for refs. Just add
something like
[core]
symrefsonly = true
to .git/config.
Don´t forget to "git checkout your_branch", or it does not do anything...
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This patch implements the client side of backward compatible upload-pack
protocol extension, <20051027141619.0e8029f2.vsu@altlinux.ru> by Sergey.
The updated server can append "server_capabilities" which is supposed
to be a string containing space separated features of the server, after
one of elements in the initial list of SHA1-refname line, hidden with
an embedded NUL.
After get_remote_heads(), check if the server supports the feature like
if (server_supports("multi_ack"))
do_something();
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-pack-objects can reuse pack files stored in $GIT_DIR/pack-cache
directory, when a necessary pack is found. This is hopefully useful
when upload-pack (called from git-daemon) is expected to receive
requests for the same set of objects many times (e.g full cloning
request of any project, or updates from the set of heads previous day
to the latest for a slow moving project).
Currently git-pack-objects does *not* keep pack files it creates for
reusing. It might be useful to add --update-cache option to it,
which would allow it store pack files it created in the pack-cache
directory, and prune rarely used ones from it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This allows the remote side (most notably, upload-pack) to show
additional information without affecting the downloader. Peek-remote
does not ignore them -- this is to make it useful for Pasky's
automatic tag following.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Do our own ctype.h, just to get the sane semantics: we want
locale-independence, _and_ we want the right signed behaviour. Plus we
only use a very small subset of ctype.h anyway (isspace, isalpha,
isdigit and isalnum).
Signed-off-by: Junio C Hamano <junkio@cox.net>
If we want to re-pack just local packfiles, we need to know whether a
particular object is local or not.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This starts using the "user.name" and "user.email" config variables if
they exist as the default name and email when committing. This means
that you don't have to use the GIT_COMMITTER_EMAIL environment variable
to override your email - you can just edit the config file instead.
The patch looks bigger than it is because it makes the default name and
email information non-static and renames it appropriately. And it moves
the common git environment variables into a new library file, so that
you can link against libgit.a and get the git environment without having
to link in zlib and libcrypt.
In short, most of it is renaming and moving, the real change core is
just a few new lines in "git_default_config()" that copies the user
config values to the new base.
It also changes "git-var -l" to list the config variables.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-show-branch acquires two new options. --sha1-name to name
commits using the unique prefix of their object names, and
--no-name to not to show names at all.
This was outlined in <7vk6gpyuyr.fsf@assigned-by-dhcp.cox.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The http commit walker cannot use the same temporary file
creation code because it needs to use predictable temporary
filename for partial fetch continuation purposes, but the code
to move the temporary file to the final location should be
usable from the ordinary object creation codepath.
Export move_temp_to_file from sha1_file.c and use it, while
losing the custom relink_or_rename function from http-fetch.c.
Also the temporary object file creation part needs to make sure
the leading path exists, in preparation of the really lazy
fan-out directory creation.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a first cut at a very simple parser for a git config file.
The format of the file is a simple ini-file like thing, with simple
variable/value pairs. You can (and should) make the variables have a
simple single-level scope, ie a valid file looks something like this:
#
# This is the config file, and
# a '#' or ';' character indicates
# a comment
#
; core variables
[core]
; Don't trust file modes
filemode = false
; Our diff algorithm
[diff]
external = "/usr/local/bin/gnu-diff -u"
renames = true
which parses into three variables: "core.filemode" is associated with the
string "false", and "diff.external" gets the appropriate quoted value.
Right now we only react to one variable: "core.filemode" is a boolean that
decides if we should care about the 0100 (user-execute) bit of the stat
information. Even that is just a parsing demonstration - this doesn't
actually implement that st_mode compare logic itself.
Different programs can react to different config options, although they
should always fall back to calling "git_default_config()" on any config
option name that they don't recognize.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Since some platforms do not support mmap() at all, and others do only just
so, this patch introduces the option to fake mmap() and munmap() by
malloc()ing and read()ing explicitely.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
This adds more cruft to diff --git header to record the blob SHA1 and
the mode the patch/diff is intended to be applied against, to help the
receiving end fall back on a three-way merge. The new header looks
like this:
diff --git a/apply.c b/apply.c
index 7be5041..8366082 100644
--- a/apply.c
+++ b/apply.c
@@ -14,6 +14,7 @@
// files that are being modified, but doesn't apply the patch
// --stat does just a diffstat, and doesn't actually apply
+// --show-index-info shows the old and new index info for...
...
Upon receiving such a patch, if the patch did not apply cleanly to the
target tree, the recipient can try to find the matching old objects in
her object database and create a temporary tree, apply the patch to
that temporary tree, and attempt a 3-way merge between the patched
temporary tree and the target tree using the original temporary tree
as the common ancestor.
The patch lifts the code to compute the hash for an on-filesystem
object from update-index.c and makes it available to the diff output
routine.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds the counterpart of git-update-ref that lets you read
and create "symbolic refs". By default it uses a symbolic link
to represent ".git/HEAD -> refs/heads/master", but it can be compiled
to use the textfile symbolic ref.
The places that did 'readlink .git/HEAD' and 'ln -s refs/heads/blah
.git/HEAD' have been converted to use new git-symbolic-ref command, so
that they can deal with either implementation.
Signed-off-by: Junio C Hamano <junio@twinsun.com>
Symbolic refs are understood by resolve_ref(), so existing read_ref()
users will automatically understand them as well.
Signed-off-by: Junio C Hamano <junio@twinsun.com>
This extends the ref reading to understand a "symbolic ref": a ref file
that starts with "ref: " and points to another ref file, and thus
introduces the notion of ref aliases.
This is in preparation of allowing HEAD to eventually not be a symlink,
but one of these symbolic refs instead.
[jc: Linus originally required the prefix to be "ref: " five bytes
and nothing else, but I changed it to allow and strip any number of
leading whitespaces to match what update-ref.c does.]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a long overdue clean-up to the code for parsing and passing
diff options. It also tightens some constness issues.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Right now we don't return any error value at all from parse_date(), and if
we can't parse it, we just silently leave the result buffer unchanged.
That's fine for the current user, which will always default to the current
date, but it's a crappy interface, and we might well be better off with an
error message rather than just the default date.
So let's change the thing to return a negative value if an error occurs,
and the length of the result otherwise (snprintf behaviour: if the buffer
is too small, it returns how big it _would_ have been).
[ I started looking at this in case we could support date-based revision
names. Looks ugly. Would have to parse relative dates.. ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Add -m/--modified to show files that have been modified wrt. the index.
[jc: The original came from Brian Gerst on Sep 1st but it only checked
if the paths were cache dirty without actually checking the files were
modified. I also added the usage string and a new test.]
Signed-off-by: Junio C Hamano <junkio@cox.net>
The git port (9418) is officially listed by IANA now.
So document it.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
We have deprecated the old environment variable names for quite a
while and now it's time to remove them. Gone are:
SHA1_FILE_DIRECTORIES AUTHOR_DATE AUTHOR_EMAIL AUTHOR_NAME
COMMIT_AUTHOR_EMAIL COMMIT_AUTHOR_NAME SHA1_FILE_DIRECTORY
Signed-off-by: Junio C Hamano <junkio@cox.net>
Hi. This patch contains the following possible cleanups:
* Make some needlessly global functions in local-pull.c static
* Change 'char *' to 'const char *' where appropriate
Signed-off-by: Peter Hagervall <hager@cs.umu.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This reverts 6c5f9baa3b commit, whose
change breaks gcc-2.95.
Not that I ignore portability to compilers that are properly C99, but
keeping compilation with GCC working is more important, at least for
now. We would probably end up declaring with "name[1]" and teach the
allocator to subtract one if we really aimed for portability, but that
is left for later rounds.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Omitting the first branch in ?: is a GNU extension. Cute,
but not supported by other compilers. Replaced mostly
by explicit tests. Calls to getenv() simply are repeated
on non-GNU compilers.
Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu>
It cannot be checked with #ifndef, if you really think about what it
does which cannot be done only with the preprocessor. My thinko.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Not all programs necessarily have a pathspec array of pathnames, some of
them (like git-update-cache) want to do things one file at a time. So
export the single-path interface too.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
We always show the diff as an absolute path, but pathnames to diff are
taken relative to the current working directory (and if no pathnames are
given, the default ends up being all of the current working directory).
Note that "../xyz" also works, so you can do
cd linux/drivers/char
git diff ../block
and it will generate a diff of the linux/drivers/block changes.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Note that the pack file has to be in the usual location if it gets
installed later.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
It was a mistake to use GIT_ALTERNATE_OBJECT_DIRECTORIES
environment variable to specify what alternate object pools to
look for missing objects when working with an object database.
It is not a property of the process running the git commands,
but a property of the object database that is partial and needs
other object pools to complete the set of objects it lacks.
This patch allows you to have $GIT_OBJECT_DIRECTORY/info/alternates
whose contents is in exactly the same format as the environment
variable, to let an object database name alternate object pools
it depends on.
Signed-off-by: Junio C Hamano <junkio@cox.net>
GCC's format __attribute__ is good for checking errors, especially
with -Wformat=2 parameter. This fixes most of the reported problems
against 2005-08-09 snapshot.
Per discussion with people interested in binary packaging,
change the default template location from /etc/git-core to
/usr/share/git-core hierarchy. If a user wants to run git
before installing for whatever reason, in addition to adding
$src to the PATH environment variable, git-init-db can be run
with --template=$src/templates/blt/ parameter.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This allows git-send-pack to push local refs to a destination
repository under different names.
Here is the name mapping rules for refs.
* If there is no ref mapping on the command line:
- if '--all' is specified, it is equivalent to specifying
<local> ":" <local> for all the existing local refs on the
command line
- otherwise, it is equivalent to specifying <ref> ":" <ref> for
all the refs that exist on both sides.
* <name> is just a shorthand for <name> ":" <name>
* <src> ":" <dst>
push ref that matches <src> to ref that matches <dst>.
- It is an error if <src> does not match exactly one of local
refs.
- It is an error if <dst> matches more than one remote refs.
- If <dst> does not match any remote refs, either
- it has to start with "refs/"; <dst> is used as the
destination literally in this case.
- <src> == <dst> and the ref that matched the <src> must not
exist in the set of remote refs; the ref matched <src>
locally is used as the name of the destination.
For example,
- "git-send-pack --all <remote>" works exactly as before;
- "git-send-pack <remote> master:upstream" pushes local master
to remote ref that matches "upstream". If there is no such
ref, it is an error.
- "git-send-pack <remote> master:refs/heads/upstream" pushes
local master to remote refs/heads/upstream, even when
refs/heads/upstream does not exist.
- "git-send-pack <remote> master" into an empty remote
repository pushes the local ref/heads/master to the remote
ref/heads/master.
Signed-off-by: Junio C Hamano <junkio@cox.net>
A template mechanism to populate newly initialized repository
with default set of files is introduced. Use it to ship example
hooks that can be used for update and post update checks, as
Josef Weidendorfer suggests.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This causes ssh-pull to request objects in prefetch() and read then in
fetch(), such that it reduces the unpipelined round-trip time.
This also makes sha1_write_from_fd() support having a buffer of data
which it accidentally read from the fd after the object; this was
formerly not a problem, because it would always get a short read at
the end of an object, because the next object had not been
requested. This is no longer true.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds support for reading an uninstalled index, and installing a
pack file that was added while the program was running, as well as
functions for determining where to put the file.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Introduce a new file $GIT_DIR/info/grafts (or $GIT_GRAFT_FILE)
which is a list of "fake commit parent records". Each line of
this file is a commit ID, followed by parent commit IDs, all
40-byte hex SHA1 separated by a single SP in between. The
records override the parent information we would normally read
from the commit objects, allowing both adding "fake" parents
(i.e. grafting), and pretending as if a commit is not a child of
some of its real parents (i.e. cauterizing).
Signed-off-by: Junio C Hamano <junkio@cox.net>
The git-update-server-info command prepares informational files
to help clients discover the contents of a repository, and pull
from it via a dumb transport protocols. Currently, the
following files are produced.
- The $repo/info/refs file lists the name of heads and tags
available in the $repo/refs/ directory, along with their
SHA1. This can be used by git-ls-remote command running on
the client side.
- The $repo/info/rev-cache file describes the commit ancestry
reachable from references in the $repo/refs/ directory. This
file is in an append-only binary format to make the server
side friendly to rsync mirroring scheme, and can be read by
git-show-rev-cache command.
- The $repo/objects/info/pack file lists the name of the packs
available, the interdependencies among them, and the head
commits and tags contained in them. Along with the other two
files, this is designed to help clients to make smart pull
decisions.
The git-receive-pack command is changed to invoke it at the end,
so just after a push to a public repository finishes via "git
push", the server info is automatically updated.
In addition, building of the rev-cache file can be done by a
standalone git-build-rev-cache command separately.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Moving these functions allows all of the logic for figuring out what
these values are to be shared between programs.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
... and make git-diff-files use it too. This all _should_ make the
diffcore-pathspec.c phase unnecessary, since the diff'ers now all do the
path matching early interally.
Useful for pulling stuff off a dedicated server. Instead of connecting
with ssh or just starting a local pipeline, we connect over TCP to the
other side and try to see if there's a git server listening.
Of course, since I haven't written the git server yet, that will never
happen. But the server really just needs to listen on a port, and
execute a "git-upload-pack" when somebody connects.
(It should read one packet-line, which should be of the format
"git-upload-pack directoryname\n"
and eventually we migth have other commands the server might accept).
Add write_sha1_to_fd(), which writes an object to a file descriptor. This
includes support for unpacking it and recompressing it.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch makes the first half of write_sha1_file() and
index_fd() externally visible, to allow callers to compute the
object ID without actually storing it in the object database.
[JC demangled the whitespaces himself because he liked the patch
so much, and reworked the interface to index_fd() slightly,
taking suggestion from Linus and of his own.]
Signed-off-by: Bryan Larsen <bryan.larsen@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The function write_one_ref() is passed the list of refs received
from the other end, which was obtained by directory traversal
under $GIT_DIR/refs; this can contain paths other than what
git-init-db prepares and would fail to clone when there is
such.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
"git_path()" returns a static pathname pointer into the git directory
using a printf-like format specifier.
"head_ref()" works like "for_each_ref()", except for just the HEAD.
This implements show_pack_info() function used in verify-pack
command when -v flag is used to obtain something like
unpack-objects used to give when it was first written.
It shows the following for each non-deltified object found in
the pack:
SHA1 type size offset
For deltified objects, it shows this instead:
SHA1 type size offset depth base_sha1
In order to get the output in the order that appear in the pack
file for debugging purposes, you can do this:
$ git-verify-pack -v packfile | sort -n -k 4,4
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nico pointed out that having verify_pack.c and verify-pack.c was
confusing. Rename verify_pack.c to pack-check.c as suggested,
and enhances the verification done quite a bit.
- Built-in sha1_file unpacking knows that a base object of a
deltified object _must_ be in the same pack, and takes
advantage of that fact.
- Earlier verify-pack command only checked the SHA1 sum for the
entire pack file and did not look into its contents. It now
checks everything idx file claims to have unpacks correctly.
- It now has a hook to give more detailed information for
objects contained in the pack under -v flag.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It's not working yet, but it's at the point where I want to be able to
track my changes. The theory of operation is that this is the "remote"
side of a "git push". It can tell us what references the remote side
has, receives out reference update commands and a pack-file, and can
execute the unpacking command.
Given a list of <pack>.idx files, this command validates the
index file and the corresponding .pack file for consistency.
This patch also uses the same validation mechanism in fsck-cache
when the --full flag is used.
During normal operation, sha1_file.c verifies that a given .idx
file matches the .pack file by comparing the SHA1 checksum
stored in .idx file and .pack file as a minimum sanity check.
We may further want to check the pack signature and version when
we map the pack, but that would be a separate patch.
Earlier, errors to map a pack file was not flagged fatal but led
to a random fatal error later. This version explicitly die()s
when such an error is detected.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The commands git-fsck-cache and probably git-*-pull needs to have a way
to enumerate objects contained in packed GIT archives and alternate
object pools. This commit exposes the data structure used to keep track
of them from sha1_file.c, and adds a couple of accessor interface
functions for use by the enhanced git-fsck-cache command.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This lets us eliminate one use of map_sha1_file() outside
sha1_file.c, to bring us one step closer to the packed GIT.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Packed delta files created by git-pack-objects seems to be the
way to go, and existing "delta" object handling code has exposed
the object representation details to too many places. Remove it
while we refactor code to come up with a proper interface in
sha1_file.c.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
An earlier change to optimize directory-file conflict check
broke what "read-tree --emu23" expects. This is fixed by this
commit.
(1) Introduces an explicit flag to tell add_cache_entry() not to
check for conflicts and use it when reading an existing tree
into an empty stage --- by definition this case can never
introduce such conflicts.
(2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name()
aware of the cache stages, and flag conflict only with paths
in the same stage.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make 'sha1' parameters const where possible
Signed-off-by: Jason McMullan <jason.mcmullan@timesys.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds code to read a hash out of a specified file under
{GIT_DIR}/refs/, and to write such files atomically and optionally with an
compare and lock.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add a "-u" flag to update the tree as a result of a merge.
Right now this code is way too anal about things, and fails merges it
shouldn't, but let me fix up the different cases and this will allow for
much smoother merging even in the presense of dirty data in the working
tree.
This adds sha1_file_size() helper function and uses it in the
rename/copy similarity estimator. The helper function handles
deltified object as well.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When a remote repository is deltified, we need to get the
objects that a deltified object we want to obtain is based upon.
The initial parts of each retrieved SHA1 file is inflated and
inspected to see if it is deltified, and its base object is
asked from the remote side when it is. Since this partial
inflation and inspection has a small performance hit, it can
optionally be skipped by giving -d flag to git-*-pull commands.
This flag should be used only when the remote repository is
known to have no deltified objects.
Rsync transport does not have this problem since it fetches
everything the remote side has.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make a separate helper for parsing the header of an object file
(really carefully) and for unpacking the rest. This means that
anybody who uses the "unpack_sha1_header()" interface can easily
look at the header and decide to unpack the rest too, without
doing any extra work.
It's for people who aren't necessarily interested in the whole
unpacked file, but do want to know the header information (size,
type, etc..)
For example, the delta code can use this to figure out whether
an object is already a delta object, and what it is a delta
against, without actually bothering to unpack all of the actual
data in the delta.
Add <limits.h> to the include files handled by "cache.h", and remove
extraneous #include directives from various .c files. The rule is that
"cache.h" gets all the basic stuff, so that we'll have as few system
dependencies as possible.
This one compares two pathnames that may be partial basenames, not
full paths. We need to get the path sorting right, since a directory
name will sort as if it had the final '/' at the end.
With -u flag, git-checkout-cache picks up the stat information
from newly created file and updates the cache. This removes the
need to run git-update-cache --refresh immediately after running
git-checkout-cache.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- Raw hashes should be unsigned char.
- String functions want signed char.
- Hash and compress functions want unsigned char.
Signed-off By: Brian Gerst <bgerst@didntduck.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
xmalloc() and xrealloc() now take their sizes as size_t-type arguments.
Introduced complementary xcalloc().
Signed-off-by: Brad Roberts <braddr@puremagic.com>
Signed-off-by: Petr Baudis <pasky@ucw.cz>
It is kind of surprising that this was missed in the last round,
but the work tree scanner in git-ls-files was still deliberately
ignoring symlinks. This patch fixes it, so that --others will
correctly report unregistered symlinks.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Petr Baudis <pasky@ucw.cz>
This allows git to be built even with linkers which are not smart enough
to join those symbols, and makes this correct C. Pointed out by several
people.
During the mailing list discussion on renaming GIT_ environment
variables, people felt that having one environment that lets the
user (or Porcelain) specify both SHA1_FILE_DIRECTORY (now
GIT_OBJECT_DIRECTORY) and GIT_INDEX_FILE for the default layout
would be handy. This change introduces GIT_DIR environment
variable, from which the defaults for GIT_INDEX_FILE and
GIT_OBJECT_DIRECTORY are derived. When GIT_DIR is not defined,
it defaults to ".git". GIT_INDEX_FILE defaults to
"$GIT_DIR/index" and GIT_OBJECT_DIRECTORY defaults to
"$GIT_DIR/objects".
Special thanks for ideas and discussions go to Petr Baudis and
Daniel Barkalow. Bugs are mine ;-)
Signed-off-by: Junio C Hamano <junkio@cox.net>
H. Peter Anvin mentioned that using SHA1_whatever as an
environment variable name is not nice and we should instead use
names starting with "GIT_" prefix to avoid conflicts. Here is
what this patch does:
* Renames the following environment variables:
New name Old Name
GIT_AUTHOR_DATE AUTHOR_DATE
GIT_AUTHOR_EMAIL AUTHOR_EMAIL
GIT_AUTHOR_NAME AUTHOR_NAME
GIT_COMMITTER_EMAIL COMMIT_AUTHOR_EMAIL
GIT_COMMITTER_NAME COMMIT_AUTHOR_NAME
GIT_ALTERNATE_OBJECT_DIRECTORIES SHA1_FILE_DIRECTORIES
GIT_OBJECT_DIRECTORY SHA1_FILE_DIRECTORY
* Introduces a compatibility macro, gitenv(), which does an
getenv() and if it fails calls gitenv_bc(), which in turn
picks up the value from old name while giving a warning about
using an old name.
* Changes all users of the environment variable to fetch
environment variable with the new name using gitenv().
* Updates the documentation and scripts shipped with Linus GIT
distribution.
The transition plan is as follows:
* We will keep the backward compatibility list used by gitenv()
for now, so the current scripts and user environments
continue to work as before. The users will get warnings when
they have old name but not new name in their environment to
the stderr.
* The Porcelain layers should start using new names. However,
just in case it ends up calling old Plumbing layer
implementation, they should also export old names, taking
values from the corresponding new names, during the
transition period.
* After a transition period, we would drop the compatibility
support and drop gitenv(). Revert the callers to directly
call getenv() but keep using the new names.
The last part is probably optional and the transition
duration needs to be set to a reasonable value.
Signed-off-by: Junio C Hamano <junkio@cox.net>
When "path" exists as a file or a symlink in the index, an
attempt to add "path/file" is refused because it results in file
vs directory conflict. Similarly when "path/file1",
"path/file2", etc. exist, an attempt to add "path" as a file or
a symlink is refused. With git-update-cache --replace, these
existing entries that conflict with the entry being added are
automatically removed from the cache, with warning messages.
Signed-off-by: Junio C Hamano <junkio@cox.net>
SHA1_FILE_DIRECTORIES environment variable is a colon separated paths
used when looking for SHA1 files not found in the usual place for
reading. Creating a new SHA1 file does not use this alternate object
database location mechanism. This is useful to archive older, rarely
used objects into separate directories.
Signed-off-by: Junio C Hamano <junkio@cox.net>
It didn't properly mark all cache updates as being dirty, and
causes merge errors due to that. In particular, it didn't notice
when a file was force-removed.
Besides, it was ugly as hell. I've put in place a slightly cleaner
version, but I've not enabled the optimization because I don't
want to be burned again.
Allow to store and track symlink in the repository. A symlink is stored
the same way as a regular file, only with the appropriate mode bits set.
The symlink target is therefore stored in a blob object.
This will hopefully make our udev repository fully functional. :)
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make it much safer: we write to a temporary file, and then link that
temporary file to the final destination. This avoids all the nasty
races if several people write the same object at the same time.
It should also result in nicer on-disk layout, since it means that
objects all get created in the same subdirectory. That makes a lot
of block allocation algorithms happier, since the objects will now
be allocated from the same zone.
A new command, git-write-blob, is introduced. This registers
the contents of any file on the filesystem as a blob in the
object database and reports its SHA1 to the standard output.
To implement it, the patch promotes index_fd() from a static
function in update-cache.c to extern and moves it to a library
source, sha1_file.c.
This command is used to update git-merge-one-file-script so that
it does not smudge the work tree.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This allows the programs to use various simplified versions of
the SHA1 names, eg just say "HEAD" for the SHA1 pointed to by
the .git/HEAD file etc.
For example, this commit has been done with
git-commit-tree $(git-write-tree) -p HEAD
instead of the traditional "$(cat .git/HEAD)" syntax.
...since everything out there is either strange (libc mktime has issues
with timezones) or introduces unnecessary dependencies for people (libcurl).
This goes back to the old date parsing, but moves it out into a file of
its own, and does the "struct tm" to "seconds since epoch" handling by
hand.
I grepped through the tz-database and it seems there's one "country"
left that has non-60-minute DST: Lord Howe Island. All others dropped
that before 1970.
This patch renames read_tree_with_tree_or_commit_sha1() to
read_object_with_reference() and extends it to automatically
dereference not just "commit" objects but "tag" objects. With
this patch, you can say e.g.:
ls-tree $tag
read-tree -m $(merge-base $tag $HEAD) $tag $HEAD
diff-cache $tag
diff-tree $tag $HEAD
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Introduce xmalloc and xrealloc to die gracefully with a descriptive
message when out of memory, rather than taking a SIGSEGV.
Signed-off-by: Christopher Li<chrislgit@chrisli.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adds two functions: one to check if an object is present in the local
database, and one to add an object to the local database by reading it
from a file descriptor and checking its hash.
Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This one is about a million times simpler, and much more likely to be
correct too.
Instead of trying to match up a tree object against the index, we just
read in the tree object side-by-side into the index, and just walk the
resulting index file. This was what all the read-tree cleanups were
all getting to.
This one includes the Mozilla SHA1 implementation sent in by Edgar Toernig.
It's dual-licenced under MPL-1.1 or GPL, so in the context of git, we
obviously use the GPL version.
Side note: the Mozilla SHA1 implementation is about twice as fast as the
default openssl one on my G5, but the default openssl one has optimized
x86 assembly language on x86. So choose wisely.
We use that to specify alternative index files, which can be useful
if you want to (for example) generate a temporary index file to do
some specific operation that you don't want to mess with your main
one with.
It defaults to the regular ".git/index" if it hasn't been specified.
This patch implements read_tree_with_tree_or_commit_sha1(),
which can be used when you are interested in reading an unpacked
raw tree data but you do not know nor care if the SHA1 you
obtained your user is a tree ID or a commit ID. Before this
function's introduction, you would have called read_sha1_file(),
examined its type, parsed it to call read_sha1_file() again if
it is a commit, and verified that the resulting object is a
tree. Instead, this function does that for you. It returns
NULL if the given SHA1 is not either a tree or a commit.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adds --stage option to show-files command. It shows
file-mode, SHA1, stage and pathname. Record separator follows
the usual convention of -z option as before.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The ce_namelen field has been renamed to ce_flags and split into
the top 2-bit unused, next 2-bit stage number and the lowest
12-bit name-length, stored in the network byte order. A new
macro create_ce_flags() is defined to synthesize this value from
length and stage, but it forgets to turn the value into the
network byte order. Here is a fix.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This allows using a git tree over NFS with different byte order, and
makes it possible to just copy a fully populated repository and have
the end result immediately usable (needing just a refresh to update
the stat information).
Now there is error() for "library" errors and die() for fatal "application"
errors. usage() is now used strictly only for usage errors.
Signed-off-by: Petr Baudis <pasky@ucw.cz>
It now requires the "--add" flag before you add any new files, and
a "--remove" file if you want to mark files for removal. And giving
it the "--refresh" flag makes it just update all the files that it
already knows about.
It's got some debugging printouts etc still in it, but testing on the
kernel seems to show that it does indeed fix the issue with huge tree
files for each commit.
It finds the cache entry position for a given name, and is
generally useful. Sure, everybody can just scan the active
cache array, but since it's sorted, you actually want to
search it with a binary search, so let's not duplicate that
logic all over the place.
Patches from Dave Jones and Ingo Molnar, but since I don't have any
infrastructure in place to use the old patch applicator scripts I
am trying to build up, I ended up fixing the thing by hand instead.
Credit where credit is due, though. Nice to see that people are
taking a look at the project even in this early stage.