When interpreting a config value, the config parser reads in 1+ space
character(s) and puts -one- space character in the buffer as soon as
the first non-space character is encountered (if not inside quotes).
Unfortunately the buffer size check lacks the extra space character
which gets inserted at the next non-space character, resulting in
a crash with a specially crafted config entry.
The unit test now uses Java to compile a platform independent
.NET framework to output the test string in C# :o)
Read: Thanks to Johannes Sixt for the correct printf call
which replaces the perl invocation.
Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In particular, pushing the "path_name()" call _into_ the show() function
would seem to allow
- more clarity into who "owns" the name (ie now when we free the name in
the show_object callback, it's because we generated it ourselves by
calling path_name())
- not calling path_name() at all, either because we don't care about the
name in the first place, or because we are actually happy walking the
linked list of "struct name_path *" and the last component.
Now, I didn't do that latter optimization, because it would require some
more coding, but especially looking at "builtin-pack-objects.c", we really
don't even want the whole pathname, we really would be better off with the
list of path components.
Why? We use that name for two things:
- add_preferred_base_object(), which actually _wants_ to traverse the
path, and now does it by looking for '/' characters!
- for 'name_hash()', which only cares about the last 16 characters of a
name, so again, generating the full name seems to be just unnecessary
work.
Anyway, so I didn't look any closer at those things, but it did convince
me that the "show_object()" calling convention was crazy, and we're
actually better off doing _less_ in list-objects.c, and giving people
access to the internal data structures so that they can decide whether
they want to generate a path-name or not.
This patch does that, and then for people who did use the name (even if
they might do something more clever in the future), it just does the
straightforward "name = path_name(path, component); .. free(name);" thing.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Here's a less trivial thing, and slightly more dubious one.
I was looking at that "struct object_array objects", and wondering why we
do that. I have honestly totally forgotten. Why not just call the "show()"
function as we encounter the objects? Rather than add the objects to the
object_array, and then at the very end going through the array and doing a
'show' on all, just do things more incrementally.
Now, there are possible downsides to this:
- the "buffer using object_array" _can_ in theory result in at least
better I-cache usage (two tight loops rather than one more spread out
one). I don't think this is a real issue, but in theory..
- this _does_ change the order of the objects printed. Instead of doing a
"process_tree(revs, commit->tree, &objects, NULL, "");" in the loop
over the commits (which puts all the root trees _first_ in the object
list, this patch just adds them to the list of pending objects, and
then we'll traverse them in that order (and thus show each root tree
object together with the objects we discover under it)
I _think_ the new ordering actually makes more sense, but the object
ordering is actually a subtle thing when it comes to packing
efficiency, so any change in order is going to have implications for
packing. Good or bad, I dunno.
- There may be some reason why we did it that odd way with the object
array, that I have simply forgotten.
Anyway, now that we don't buffer up the objects before showing them
that may actually result in lower memory usage during that whole
traverse_commit_list() phase.
This is seriously not very deeply tested. It makes sense to me, it seems
to pass all the tests, it looks ok, but...
Does anybody remember why we did that "object_array" thing? It used to be
an "object_list" a long long time ago, but got changed into the array due
to better memory usage patterns (those linked lists of obejcts are
horrible from a memory allocation standpoint). But I wonder why we didn't
do this back then. Maybe there's a reason for it.
Or maybe there _used_ to be a reason, and no longer is.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On Wed, 8 Apr 2009, Björn Steinbrink wrote:
>
> The name of the processed object was duplicated for passing it to
> add_object(), but that already calls path_name, which allocates a new
> string anyway. So the memory allocated by the xstrdup calls just went
> nowhere, leaking memory.
Ack, ack.
There's another easy 5% or so for the built-in object walker: once we've
created the hash from the name, the name isn't interesting any more, and
so something trivial like this can help a bit.
Does it matter? Probably not on its own. But a few more memory saving
tricks and it might all make a difference.
Linus
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test was added recently (5a688fe, "core.sharedrepository = 0mode"
should set, not loosen; 2009-03-28). It checked the result of a sed
invocation for emptyness, but in some cases it forgot to print anything
at all, so that those checks would never be false.
Due to this mistake, it went unnoticed that the files in objects/info are
not necessarily 0440, but can also be 0660. Because the 0mode setting
tries to guarantee that the files are accessible only to the people they
are meant to be used by, we should only make sure that they are readable
by the user and the group when the configuration is set to 0660. It is a
separate matter from the core.shredrepository settings that w-bit from
immutable object files under objects/[0-9a-f][0-9a-f] directories should
be dropped.
COMMIT_EDITMSG is still world-readable, but it (and any transient files
that are meant for repositories with a work tree) does not matter. If you
are working on a shared machine and on a sekrit stuff, the root of the
work tree would be with mode 0700 (or 0750 to allow peeking by other
people in the group), and that would mean that .git/COMMIT_EDITMSG in such
a repository would not be readable by the strangers anyway.
Also, in the real-world use case, .git/COMMIT_EDITMSG will be given to an
arbitrary editor the user happens to use, and we have no guarantee what it
does (e.g. it may create a new file with umask and replace, it may rewrite
in place, it may leave an editor backup file but use umask to create it,
etc.), and the protection of the file lies majorly on the protection of
the root of the work tree.
This test cannot be run on Windows; it requires POSIXPERM when merged to
'master'.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/shared-literally:
t1301: loosen test for forced modes
set_shared_perm(): sometimes we know what the final mode bits should look like
move_temp_to_file(): do not forget to chmod() in "Coda hack" codepath
Move chmod(foo, 0444) into move_temp_to_file()
"core.sharedrepository = 0mode" should set, not loosen
* jc/maint-1.6.0-keep-pack:
pack-objects: don't loosen objects available in alternate or kept packs
t7700: demonstrate repack flaw which may loosen objects unnecessarily
Remove --kept-pack-only option and associated infrastructure
pack-objects: only repack or loosen objects residing in "local" packs
git-repack.sh: don't use --kept-pack-only option to pack-objects
t7700-repack: add two new tests demonstrating repacking flaws
is_kept_pack(): final clean-up
Simplify is_kept_pack()
Consolidate ignore_packed logic more
has_sha1_kept_pack(): take "struct rev_info"
has_sha1_pack(): refactor "pretend these packs do not exist" interface
git-repack: resist stray environment variable
Conflicts:
t/t7700-repack.sh
The name of the processed object was duplicated for passing it to
add_object(), but that already calls path_name, which allocates a new
string anyway. So the memory allocated by the xstrdup calls just went
nowhere, leaking memory.
This reduces the RSS usage for a "rev-list --all --objects" by about 10% on
the gentoo repo (fully packed) as well as linux-2.6.git:
gentoo:
| old | new
----------------|-------------------------------
RSS | 1537284 | 1388408
VSZ | 1816852 | 1667952
time elapsed | 1:49.62 | 1:48.99
min. page faults| 417178 | 379919
linux-2.6.git:
| old | new
----------------|-------------------------------
RSS | 324452 | 292996
VSZ | 491792 | 460376
time elapsed | 0:14.53 | 0:14.28
min. page faults| 89360 | 81613
Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Otherwise, git complains about not finding a branch to pull from in
'branch..merge', which is hardly understandable. While we're there,
reword the sentences slightly.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-1.6.1:
Documentation: clarify .gitattributes search
git-checkout.txt: clarify that <branch> applies when no path is given.
git-checkout.txt: fix incorrect statement about HEAD and index
Most of the time when we give branch name in the message, we quote it
inside a pair of single-quotes. git-checkout uses double-quotes; this
patch corrects the inconsistency.
Signed-off-by: Jari Aalto <jari.aalto@cante.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-1.6.0:
Documentation: clarify .gitattributes search
git-checkout.txt: clarify that <branch> applies when no path is given.
git-checkout.txt: fix incorrect statement about HEAD and index
Use the term "toplevel of the work tree" in gitattributes.txt and
gitignore.txt to define the limits of the search for those files.
Signed-off-by: Jason Merrill <jason@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Otherwise, the sentence "Defaults to HEAD." can be mis-read to mean
that "git checkout -- hello.c" checks-out from HEAD.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The command "git checkout" checks out from the index by default, not
HEAD (the introducing comment were correct, but the detailled
explanation added below were not).
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Actually, you have to set the -b option after the add command.
Signed-off-by: Julien Danjou <julien@danjou.info>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously we ignored the result of calling add_interactive,
which meant that if an error occurred we simply committed
whatever happened to be in the index.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This simplifies the code without changing the semantics and removes
the unhelpful "needs $sha1" part of the conflicting submodule message.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When merging merge bases during a recursive merge we do not want to
leave any unmerged entries. Otherwise we cannot create a temporary
tree for the recursive merge to work with.
We failed to do so in case of a submodule conflict between merge
bases, causing a NULL pointer dereference in the next step of the
recursive merge.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While 'git checkout <submodule>' should not update the submodule's
working directory, it should update the index. This is in line with
how submodules are handled in the rest of Git.
While at it, test 'git reset [<commit>] <submodule>', too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* dm/maint-docco:
Documentation: Remove spurious uses of "you" in git-bisect.txt.
Documentation: minor grammatical fix in git-check-ref-format.txt
Documentation: minor grammatical fixes in git-check-attr.txt
Documentation: minor grammatical fixes in git-cat-file.txt
Documentation: minor grammatical fixes and rewording in git-bundle.txt
Documentation: remove some uses of the passive voice in git-bisect.txt
Documentation: reword example text in git-bisect.txt.
Documentation: reworded the "Description" section of git-bisect.txt.
Documentation: minor grammatical fixes in git-branch.txt.
Documentation: minor grammatical fixes in git-blame.txt.
Documentation: reword the "Description" section of git-bisect.txt.
Documentation: minor grammatical fixes in git-archive.txt.
Previously the code did a simple prefix match, which means that a path in
a directory "frotz/" would have matched with pathspec "f".
Signed-off-by: Junio C Hamano <gitster@pobox.com>