Some of the --pretty=format placeholders expansions are expensive to
calculate. This is made worse by the current code's use of
interpolate(), which requires _all_ placeholders are to be prepared
up front.
One way to speed this up is to check which placeholders are present
in the format string and to prepare only the expansions that are
needed. That still leaves the allocation overhead of interpolate().
Another way is to use a callback based approach together with the
strbuf library to keep allocations to a minimum and avoid string
copies. That's what this patch does. It introduces a new strbuf
function, strbuf_expand().
The function takes a format string, list of placeholder strings,
a user supplied function 'fn', and an opaque pointer 'context'
to tell 'fn' what thingy to operate on.
The function 'fn' is expected to accept a strbuf, a parsed
placeholder string and the 'context' pointer, and append the
interpolated value for the 'context' thingy, according to the
format specified by the placeholder.
Thanks to Pierre Habouzit for his suggestion to use strchrnul() and
the code surrounding its callsite. And thanks to Junio for most of
this commit message. :)
Here my measurements of most of Paul Mackerras' test cases that
highlighted the performance problem (best of three runs):
(master)
$ time git log --pretty=oneline >/dev/null
real 0m0.390s
user 0m0.340s
sys 0m0.040s
(master)
$ time git log --pretty=raw >/dev/null
real 0m0.434s
user 0m0.408s
sys 0m0.016s
(master)
$ time git log --pretty="format:%H {%P} %ct" >/dev/null
real 0m1.347s
user 0m0.080s
sys 0m1.256s
(interp_find_active -- Dscho)
$ time ./git log --pretty="format:%H {%P} %ct" >/dev/null
real 0m0.694s
user 0m0.020s
sys 0m0.672s
(strbuf_expand -- this patch)
$ time ./git log --pretty="format:%H {%P} %ct" >/dev/null
real 0m0.395s
user 0m0.352s
sys 0m0.028s
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* make strbuf_read_file take a size hint (works like strbuf_read)
* use it in a couple of places.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For that purpose, the ->buf is always initialized with a char * buf living
in the strbuf module. It is made a char * so that we can sloppily accept
things that perform: sb->buf[0] = '\0', and because you can't pass "" as an
initializer for ->buf without making gcc unhappy for very good reasons.
strbuf_init/_detach/_grow have been fixed to trust ->alloc and not ->buf
anymore.
as a consequence strbuf_detach is _mandatory_ to detach a buffer, copying
->buf isn't an option anymore, if ->buf is going to escape from the scope,
and eventually be free'd.
API changes:
* strbuf_setlen now always works, so just make strbuf_reset a convenience
macro.
* strbuf_detatch takes a size_t* optional argument (meaning it can be
NULL) to copy the buffer's len, as it was needed for this refactor to
make the code more readable, and working like the callers.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
strbuf_setlen() expect to be able to NUL terminate the buffer,
but a completely empty strbuf could have an empty buffer with 0
allocation; both the assert() and the assignment for NUL
termination would fail.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add strbuf_remove, change strbuf_insert:
As both are special cases of strbuf_splice, implement them as such.
gcc is able to do the math and generate almost optimal code this way.
Add strbuf_swap:
Exchange the values of its arguments.
Use it in fast-import.c
Also fix spacing issues in strbuf.h
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
read_line is now strbuf_getline, and is a first class citizen, it returns 0
when reading a line worked, EOF else.
The ->eof marker was used non-locally by fast-import.c, mimic the same
behaviour using a static int in "read_next_command", that now returns -1 on
EOF, and avoids to call strbuf_getline when it's in EOF state.
Also no longer automagically strbuf_release the buffer, it's counter
intuitive and breaks fast-import in a very subtle way.
Note: being at EOF implies that command_buf.len == 0.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* strbuf_splice replace a portion of the buffer with another.
* strbuf_attach replace a strbuf buffer with the given one, that should be
malloc'ed. Then it enforces strbuf's invariants. If alloc > len, then this
function has negligible cost, else it will perform a realloc, possibly
with a cost.
Also some style issues are fixed now.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Add strbuf_rtrim to remove trailing spaces.
* Add strbuf_insert to insert data at a given position.
* Off-by one fix in strbuf_addf: strbuf_avail() does not counts the final
\0 so the overflow test for snprintf is the strict comparison. This is
not critical as the growth mechanism chosen will always allocate _more_
memory than asked, so the second test will not fail. It's some kind of
miracle though.
* Add size extension hints for strbuf_init and strbuf_read. If 0, default
applies, else:
+ initial buffer has the given size for strbuf_init.
+ first growth checks it has at least this size rather than the
default 8192.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The gory details are explained in strbuf.h. The change of semantics this
patch enforces is that the embeded buffer has always a '\0' character after
its last byte, to always make it a C-string. The offs-by-one changes are all
related to that very change.
A strbuf can be used to store byte arrays, or as an extended string
library. The `buf' member can be passed to any C legacy string function,
because strbuf operations always ensure there is a terminating \0 at the end
of the buffer, not accounted in the `len' field of the structure.
A strbuf can be used to generate a string/buffer whose final size is not
really known, and then "strbuf_detach" can be used to get the built buffer,
and keep the wrapping "strbuf" structure usable for further work again.
Other interesting feature: strbuf_grow(sb, size) ensure that there is
enough allocated space in `sb' to put `size' new octets of data in the
buffer. It helps avoiding reallocating data for nothing when the problem the
strbuf helps to solve has a known typical size.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- Raw hashes should be unsigned char.
- String functions want signed char.
- Hash and compress functions want unsigned char.
Signed-off By: Brian Gerst <bgerst@didntduck.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch introduces a new program, diff-tree-helper. It reads
output from diff-cache and diff-tree, and produces a patch file.
The diff format customization can be done the same way the
show-diff uses; the same external diff interface introduced by
the previous patch to drive diff from show-diff is used so this
is not surprising.
It is used like the following examples:
$ diff-cache --cached -z <tree> | diff-tree-helper -z -R paths...
$ diff-tree -r -z <tree1> <tree2> | diff-tree-helper -z paths...
- As usual, the use of the -z flag is recommended in the script
to pass NUL-terminated filenames through the pipe between
commands.
- The -R flag is used to generate reverse diff. It does not
matter for diff-tree case, but it is sometimes useful to get
a patch in the desired direction out of diff-cache.
- The paths parameters are used to restrict the paths that
appears in the output. Again this is useful to use with
diff-cache, which, unlike diff-tree, does not take such paths
restriction parameters.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>