The dereference() function to peel a tree-ish and find the underlying
tree expects arithmetic to (void *) to work on byte addresses. We
should be reading the text of objects through a char * anyway.
Noticed-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Lazy fast-import frontend authors that want to rely on the backend to
keep track of the content of the imported trees _almost_ have what
they need in the 'cat-blob' command (v1.7.4-rc0~30^2~3, 2010-11-28).
But it is not quite enough, since
(1) cat-blob can be used to retrieve the content of files, but
not their mode, and
(2) using cat-blob requires the frontend to keep track of a name
(mark number or object id) for each blob to be retrieved
Introduce an 'ls' command to complement cat-blob and take care of the
remaining needs. The 'ls' command finds what is at a given path
within a given tree-ish (tag, commit, or tree):
'ls' SP <dataref> SP <path> LF
or in fast-import's active commit:
'ls' SP <path> LF
The response is a single line sent through the cat-blob channel,
imitating ls-tree output. So for example:
FE> ls :1 Documentation
gfi> 040000 tree 9e6c2b599341d28a2a375f8207507e0a2a627fe9 Documentation
FE> ls 9e6c2b599341d28a2a375f8207507e0a2a627fe9 git-fast-import.txt
gfi> 100644 blob 4f92954396e3f0f97e75b6838a5635b583708870 git-fast-import.txt
FE> ls :1 RelNotes
gfi> 120000 blob b942e49944 RelNotes
FE> cat-blob b942e49944
gfi> b942e49944 blob 32
gfi> Documentation/RelNotes/1.7.4.txt
The most interesting parts of the reply are the first word, which is
a 6-digit octal mode (regular file, executable, symlink, directory,
or submodule), and the part from the second space to the tab, which is
a <dataref> that can be used in later cat-blob, ls, and filemodify (M)
commands to refer to the content (blob, tree, or commit) at that path.
If there is nothing there, the response is "missing some/path".
The intent is for this command to be used to read files from the
active commit, so a frontend can apply patches to them, and to copy
files and directories from previous revisions.
For example, proposed updates to svn-fe use this command in place of
its internal representation of the repository directory structure.
This simplifies the frontend a great deal and means support for
resuming an import in a separate fast-import run (i.e., incremental
import) is basically free.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Sverre Rabbelier <srabbelier@gmail.com>
Here is a 'feature' command for streams to use to require support for
the notemodify (N) command.
When the 'feature' facility was introduced (v1.7.0-rc0~95^2~4,
2009-12-04), the notes import feature was old news (v1.6.6-rc0~21^2~8,
2009-10-09) and it was not obvious it deserved to be a named feature.
But now that is clear, since all major non-git fast-import backends
lack support for it.
Details: on git version with this patch applied, any "feature notes"
command in the features/options section at the beginning of a stream
will be treated as a no-op. On fast-import implementations without
the feature (and older git versions), the command instead errors out
with a message like
This version of fast-import does not support feature notes.
So by declaring use of notes at the beginning of a stream, frontends
can avoid wasting time and other resources when the backend does not
support notes. (This would be especially important for backends that
do not support rewinding history after a botched import.)
Improved-by: Thomas Rast <trast@student.ethz.ch>
Improved-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
rebase -i: clarify in-editor documentation of "exec"
tests: sanitize more git environment variables
fast-import: treat filemodify with empty tree as delete
rebase: give a better error message for bogus branch
rebase: use explicit "--" with checkout
Conflicts:
t/t9300-fast-import.sh
Normal git processes do not allow one to build a tree with an empty
subtree entry without trying hard at it. This is in keeping with the
general UI philosophy: git tracks content, not empty directories.
v1.7.3-rc0~75^2 (2010-06-30) changed that by making it easy to include
an empty subtree in fast-import's active commit:
M 040000 4b825dc642 subdir
One can trigger this by reading an empty tree (for example, the tree
corresponding to an empty root commit) and trying to move it to a
subtree. It is better and more closely analogous to 'git read-tree
--prefix' to treat such commands as requests to remove the subtree.
Noticed-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a frontend uses a marks file to ensure its state persists between
runs, it may represent "clean slate" when bootstrapping with "no marks
yet". In such a case, feeding the last state with --import-marks and
saving the state after the current run with --export-marks would be a
natural thing to do.
The --import-marks option however errors out when the specified marks file
doesn't exist; this makes bootstrapping a bit difficult. The location of
the marks file becomes backend-dependent when --relative-marks is in
effect, and the frontend cannot check for the existence of the file in
such a case.
The --import-marks-if-exists option does the same thing as --import-marks
but does not flag an error if the named file does not exist yet to help
these frontends.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jn/fast-import-blob-access:
t9300: avoid short reads from dd
t9300: remove unnecessary use of /dev/stdin
fast-import: Allow cat-blob requests at arbitrary points in stream
fast-import: let importers retrieve blobs
fast-import: clarify documentation of "feature" command
fast-import: stricter parsing of integer options
Conflicts:
fast-import.c
* jj/icase-directory:
Support case folding in git fast-import when core.ignorecase=true
Support case folding for git add when core.ignorecase=true
Add case insensitivity support when using git ls-files
Add case insensitivity support for directories when using git status
Case insensitivity support for .gitignore via core.ignorecase
Add string comparison functions that respect the ignore_case variable.
Makefile & configure: add a NO_FNMATCH_CASEFOLD flag
Makefile & configure: add a NO_FNMATCH flag
Conflicts:
Makefile
config.mak.in
configure.ac
fast-import.c
The new rule: a "cat-blob" can be inserted wherever a comment is
allowed, which means at the start of any line except in the middle of
a "data" command.
This saves frontends from having to loop over everything they want to
commit in the next commit and cat-ing the necessary objects in
advance.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
New objects written by fast-import are not available immediately.
Until a checkpoint has been started and finishes writing the pack
index, any new blobs will not be accessible using standard git tools.
So introduce a new way to access them: a "cat-blob" command in the
command stream requests for fast-import to print a blob to stdout or a
file descriptor specified by the argument to --cat-blob-fd. The value
for cat-blob-fd cannot be specified in the stream because that would
be a layering violation: the decision of where to direct a stream has
to be made when fast-import is started anyway, so we might as well
make the stream format is independent of that detail.
Output uses the same format as "git cat-file --batch".
Thanks to Sverre Rabbelier and Sam Vilain for guidance in designing
the protocol.
Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: David Barr <david.barr@cordelta.com>
Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Check the result from strtoul to avoid accepting arguments like
--depth=-1 and --active-branches=foo,bar,baz.
Requested-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jn/fast-import-fix:
fast-import: do not clear notes in do_change_note_fanout()
t9300 (fast-import): another test for the "replace root" feature
fast-import: tighten M 040000 syntax
fast-import: filemodify after M 040000 <tree> "" crashes
It can be tedious to wait for a multi-million-revision import.
Unfortunately it is hard to spy on the import because fast-import
works by continuously streaming out objects, without updating the pack
index or refs until a checkpoint command or the end of the stream.
So allow the impatient operator to request checkpoints by sending a
signal, like so:
killall -USR1 git-fast-import
When receiving such a signal, fast-import would schedule a checkpoint
to take place after the current top-level command (usually a "commit"
or "blob" request) finishes.
Caveats: just like ordinary checkpoint commands, such requests slow
down the import. Switching to a new pack at a suboptimal moment is
also likely to result in a less dense initial collection of packs.
That's the price.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
More often than not, find_object is called for recently inserted objects.
Optimise for this case by inserting new entries at the start of the chain.
This doesn't affect the cost of new inserts but reduces the cost of find
and insert for existing object entries.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 5edde51 (fast-import: filemodify after M 040000 <tree> ""
crashes, 2010-10-17) taught fast-import to load trees from the
object db as needed when it is time to access them.
But it went too far. In change_note_fanout(), an empty,
not-loaded tree is not meant to destroy notes, so calling
load_tree() at that point is exactly the wrong thing to do.
Kudos to Johan Herland for t9301, which caught this failure.
Reported-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When tree_content_set() is asked to modify the path "foo/bar/",
it first recurses like so:
tree_content_set(root, "foo/bar/", sha1, S_IFDIR) ->
tree_content_set(root:foo, "bar/", ...) ->
tree_content_set(root:foo/bar, "", ...)
And as a side-effect of 2794ad5 (fast-import: Allow filemodify to set
the root, 2010-10-10), this last call is accepted and changes
the tree entry for root:foo/bar to refer to the specified tree.
That seems safe enough but let's reject the new syntax (we never meant
to support it) and make it harder for frontends to introduce pointless
incompatibilities with git fast-import 1.7.3.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Until M 040000 <tree> "" syntax was introduced in commit 2794ad5
(fast-import: Allow filemodify to set the root, 2010-10-10), it
was impossible for the root entry to refer to an unloaded tree.
Update various functions to take that possibility into account.
Otherwise
M 040000 <tree> ""
M 100644 :1 "foo"
and similar commands (using D, C, or R after resetting the root
tree) segfault.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
v1.7.3-rc0~75^2 (Teach fast-import to import subtrees named by tree id,
2010-06-30) has a shortcoming - it doesn't allow the root to be set.
Extend this behaviour by allowing the root to be referenced as the
empty path, "".
For a command (like filter-branch --subdirectory-filter) that wants
to commit a lot of trees that already exist in the object db, writing
undeltified objects as loose files only to repack them later can
involve a significant amount of overhead.
(23% slow-down observed on Linux 2.6.35, worse on Mac OS X 10.6)
Fortunately we have fast-import (which is one of the only git commands
that will write to a pack directly) but there is not an advertised way
to tell fast-import to commit a given tree without unpacking it.
This patch changes that, by allowing
M 040000 <tree id> ""
as a filemodify line in a commit to reset to a particular tree without
any need to parse it. For example,
M 040000 4b825dc642 ""
is a synonym for the deleteall command and the fast-import equivalent of
git read-tree 4b825dc642
Signed-off-by: David Barr <david.barr@cordelta.com>
Commit-message-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Sverre Rabbelier <srabbelier@gmail.com>
Tested-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When core.ignorecase=true, imported file paths will be folded to match
existing directory case.
Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* en/d-f-conflict-fix:
merge-recursive: Avoid excessive output for and reprocessing of renames
merge-recursive: Fix multiple file rename across D/F conflict
t6031: Add a testcase covering multiple renames across a D/F conflict
merge-recursive: Fix typo
Mark tests that use symlinks as needing SYMLINKS prerequisite
t/t6035-merge-dir-to-symlink.sh: Remove TODO on passing test
fast-import: Improve robustness when D->F changes provided in wrong order
fast-export: Fix output order of D/F changes
merge_recursive: Fix renames across paths below D/F conflicts
merge-recursive: Fix D/F conflicts
Add a rename + D/F conflict testcase
Add additional testcases for D/F conflicts
Conflicts:
merge-recursive.c
dump_marks_helper() has a bug when dumping marks larger than 2^20-1,
i.e., when the sparse array has more than two levels. The bug was
that the 'base' counter was being shifted by 20 bits at level 3, and
then again by 10 bits at level 2, rather than a total shift of 20 bits
in this argument to the recursive call:
(base + k) << m->shift
There are two ways to fix this correctly, the elegant:
(base + k) << 10
and the one I chose due to edit distance:
base + (k << m->shift)
Signed-off-by: Raja R Harinath <harinath@hurrynot.org>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When older versions of fast-export came across a directory changing to a
symlink (or regular file), it would output the changes in the form
M 120000 :239821 dir-changing-to-symlink
D dir-changing-to-symlink/filename1
When fast-import sees the first line, it deletes the directory named
dir-changing-to-symlink (and any files below it) and creates a symlink in
its place. When fast-import came across the second line, it was previously
trying to remove the file and relevant leading directories in
tree_content_remove(), and as a side effect it would delete the symlink
that was just created. This resulted in the symlink silently missing from
the resulting repository.
To improve robustness, we ignore file deletions underneath directory names
that correspond to non-directories. This can also be viewed as a minor
optimization: since there cannot be a file and a directory with the same
name in the same directory, the file clearly can't exist so nothing needs
to be done to delete it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To simulate the svn cp command, it would be very useful to be
replace an arbitrary file in the current revision by an
arbitrary directory from a previous one. Modify the filemodify
command to allow that:
M 040000 <tree id> pathname
This would be most useful in combination with a facility to
print the commit ids for new revisions as they are written.
Cc: Shawn O. Pearce <spearce@spearce.org>
Cc: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* gv/portable:
test-lib: use DIFF definition from GIT-BUILD-OPTIONS
build: propagate $DIFF to scripts
Makefile: Tru64 portability fix
Makefile: HP-UX 10.20 portability fixes
Makefile: HPUX11 portability fixes
Makefile: SunOS 5.6 portability fix
inline declaration does not work on AIX
Allow disabling "inline"
Some platforms lack socklen_t type
Make NO_{INET_NTOP,INET_PTON} configured independently
Makefile: some platforms do not have hstrerror anywhere
git-compat-util.h: some platforms with mmap() lack MAP_FAILED definition
test_cmp: do not use "diff -u" on platforms that lack one
fixup: do not unconditionally disable "diff -u"
tests: use "test_cmp", not "diff", when verifying the result
Do not use "diff" found on PATH while building and installing
enums: omit trailing comma for portability
Makefile: -lpthread may still be necessary when libc has only pthread stubs
Rewrite dynamic structure initializations to runtime assignment
Makefile: pass CPPFLAGS through to fllow customization
Conflicts:
Makefile
wt-status.h
Without this patch at least IBM VisualAge C 5.0 (I have 5.0.2) on AIX
5.1 fails to compile git.
enum style is inconsistent already, with some enums declared on one
line, some over 3 lines with the enum values all on the middle line,
sometimes with 1 enum value per line... and independently of that the
trailing comma is sometimes present and other times absent, often
mixing with/without trailing comma styles in a single file, and
sometimes in consecutive enum declarations.
Clearly, omitting the comma is the more portable style, and this patch
changes all enum declarations to use the portable omitted dangling
comma style consistently.
Signed-off-by: Gary V. Vaughan <gary@thewrittenword.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The following function is duplicated:
encode_header
Move this function to sha1_file.c and rename it 'encode_in_pack_object_header',
as suggested by Junio C Hamano
Signed-off-by: Michael Lukashov <michael.lukashov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This let diff_delta() abort early if it is going to bust the given
size limit. Also, only objects larger than 20 bytes are considered
as objects smaller than that are most certainly going to produce
larger deltas than the original object due to the additional headers.
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that fast-import is creating packs with index version 2, there is
no point limiting the pack size by default. A pack split will still
happen if off_t is not sufficiently large to hold large offsets.
While updating the doc, let's remove the "packfiles fit on CDs"
suggestion. Pack files created by fast-import are still suboptimal and
a 'git repack -a -f -d' or even 'git gc --aggressive' would be a pretty
good idea before considering storage on CDs.
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows for the creation of pack index version 2 with its object
CRC and the possibility for a pack to be larger than 4 GB.
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is in preparation for using write_idx_file(). Also, by using
sha1write() we get some buffering to reduces the number of write
syscalls, and the written data is SHA1 summed which allows for the extra
data integrity validation check performed in fixup_pack_header_footer()
(details on this in commit abeb40e5aa).
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar in spirit to 07cf0f2 (make --max-pack-size argument to 'git
pack-object' count in bytes, 2010-02-03) which made the option by the same
name to pack-objects, this counts the pack size limit in bytes.
In order not to cause havoc with people used to the previous megabyte
scale an integer smaller than 8192 is interpreted in megabytes but the
user gets a warning. Also a minimum size of 1 MiB is enforced to avoid an
explosion of pack files.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Manual merge made at 844ad3d (Merge branch 'sp/maint-fast-import-large-blob'
into sp/fast-import-large-blob, 2010-02-01) did not correctly reflect the change
of unit in which this variable's value is counted from its previous version.
Now it counts in bytes, not in megabytes.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
* sp/maint-fast-import-large-blob:
fast-import: Stream very large blobs directly to pack
bash: don't offer remote transport helpers as subcommands
Conflicts:
fast-import.c
If a blob is larger than the configured big-file-threshold, instead
of reading it into a single buffer obtained from malloc, stream it
onto the end of the current pack file. Streaming the larger objects
into the pack avoids the 4+ GiB memory footprint that occurs when
fast-import is processing 2+ GiB blobs.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'jh/notes' (early part):
Add more testcases to test fast-import of notes
Rename t9301 to t9350, to make room for more fast-import tests
fast-import: Proper notes tree manipulation
* sr/gfi-options:
fast-import: add (non-)relative-marks feature
fast-import: allow for multiple --import-marks= arguments
fast-import: test the new option command
fast-import: add option command
fast-import: add feature command
fast-import: put marks reading in its own function
fast-import: put option parsing code in separate functions