Attempts to reduce the stack footprint of sha1_object_info()
and unpack_entry() codepaths.
* tr/packed-object-info-wo-recursion:
sha1_file: remove recursion in unpack_entry
Refactor parts of in_delta_base_cache/cache_or_unpack_entry
sha1_file: remove recursion in packed_object_info
MSYS bash interprets the slash in the argument core.commentchar="/"
as root directory and mangles it into a Windows style path. Use a
different core.commentchar to dodge the issue.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git fast-import expects an extra newline after the commit message data,
but we are adding it only on hg-git compat mode, which is why the
bidirectionality tests pass.
We should add it unconditionally.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The earlier logic to warn against "git add subdir" that is run
without "-A" or "--no-all" was only to check any <pathspec> given
exactly spells a directory name that (still) exists on the
filesystem. This had number of problems:
* "git add '*dir'" (note that the wildcard is hidden from the
shell) would not trigger the warning.
* "git add '*.py'" would behave differently between the current
version of Git and Git 2.0 for the same reason as "subdir", but
would not trigger the warning.
* "git add dir" for a submodule "dir" would just update the index
entry for the submodule "dir" without ever recursing into it, and
use of "-A" or "--no-all" would matter. But the logic only
checks the directory-ness of "dir" and gives an unnecessary
warning.
Rework the logic to detect the case where the behaviour will be
different in Git 2.0, and issue a warning only when it matters.
Even with the code before this warning, "git add subdir" will have
to traverse the directory in order to find _new_ files the index
does not know about _anyway_, so we can do this check without adding
an extra pass to find if <pathspec> matches any removed file.
This essentially updates the "add_files_to_cache()" public API to
"update_files_in_cache()" API that is internal to "git add", because
with the "--all" option, the function is no longer about "adding"
paths to the cache, but is also used to remove them.
There are other callers of the former from "checkout" (used when
"checkout -m" prepares the temporary tree that represents the local
modifications to be merged) and "commit" ("commit --include" that
picks up local changes in addition to what is in the index). Since
ADD_CACHE_IGNORE_ERRORS (aka "--no-all") is not used by either of
them, once dust settles after Git 2.0 and the warning becomes
unnecessary, we may want to unify these two functions again.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Highlight that CONFIG_SYSTEM and /etc/gitweb.conf are meant to be
the fallback configuration file in BUGS section of gitweb.conf
documentation. This will hopefully help people who expect them to
be a common default, which unfortunately came later in the history.
split_ident_line() can leave us with the pointers date_begin, date_end,
tz_begin and tz_end all set to NULL. Check them before use and supply
the same fallback values as in the case of a negative return code from
split_ident_line().
The "(unknown)" is not actually shown in the output, though, because it
will be converted to a number (zero) eventually.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Centralize the parsing of the date and time zone strings in the new
helper function show_ident_date() and make sure it checks the pointers
provided by split_ident_line() for NULL before use.
Reported-by: Ivan Lyapunov <dront78@gmail.com>
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "cat-file -p" prints commits, it shows them in their
raw format, since git's format is already human-readable.
For tags, however, we print the whole thing raw except for
one thing: we convert the timestamp on the tagger line into a
human-readable date.
This dates all the way back to a0f15fa (Pretty-print tagger
dates, 2006-03-01). At that time there was no other way to
pretty-print a tag. These days, however, neither of those
matters much. The normal way to pretty-print a tag is with
"git show", which is much more flexible than "cat-file -p".
Commit a0f15fa also built "verify-tag --verbose" (and
subsequently "tag -v") around the "cat-file -p" output.
However, that behavior was lost in commit 62e09ce (Make git
tag a builtin, 2007-07-20), and we went back to printing
the raw tag contents. Nobody seems to have noticed the bug
since then (and it is arguably a saner behavior anyway, as
it shows the actual bytes for which we verified the
signature).
Let's drop the tagger-date formatting for "cat-file -p". It
makes us more consistent with cat-file's commit
pretty-printer, and as a bonus, we can drop the hand-rolled
tag parsing code in cat-file (which happened to behave
inconsistently with the tag pretty-printing code elsewhere).
This is a change of output format, so it's possible that
some callers could considered this a regression. However,
the original behavior was arguably a bug (due to the
inconsistency with commits), likely nobody was relying on it
(even we do not use it ourselves these days), and anyone
relying on the "-p" pretty-printer should be able to expect
a change in the output format (i.e., while "cat-file" is
plumbing, the output format of "-p" was never guaranteed to
be stable).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The has_cr_in_index() function is an almost 1:1 copy of
read_blob_data_from_index() with some additions. Use the
latter instead of using copy-pasted code.
Signed-off-by: Lukas Fleischer <git@cryptocrack.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows for optionally getting the size of the returned data and
will be used in a follow-up patch.
Signed-off-by: Lukas Fleischer <git@cryptocrack.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extract the read_index_data() function from attr.c and move it to
read-cache.c; rename it to read_blob_data_from_index() and update
the function signature of it to align better with index/cache API
functions.
This allows for reusing the function in convert.c later.
Signed-off-by: Lukas Fleischer <git@cryptocrack.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we die from an async thread, we do not actually exit the
program, but just kill the thread. This confuses the static
counter in usage.c's default die_is_recursing function; it
updates the counter once for the thread death, and then when
the main program calls die() itself, it erroneously thinks
we are recursing. The end result is that we print "recursion
detected in die handler" instead of the real error in such a
case (the easiest way to trigger this is having a remote
connection hang up while running a sideband demultiplexer).
This patch solves it by using a per-thread counter when the
async_die function is installed; we detect recursion in each
thread (including the main one), but they do not step on
each other's toes.
Other threaded code does not need to worry about this, as
they do not install specialized die handlers; they just let
a die() from a sub-thread take down the whole program.
Since we are overriding the default recursion-check
function, there is an interesting corner case that is not a
problem, but bears some explanation. Imagine the main thread
calls die(), and then in the die_routine starts an async
call. We will switch to using thread-local storage, which
starts at 0, for the main thread's counter, even though
the original counter was actually at 1. That's OK, though,
for two reasons:
1. It would miss only the first level of recursion, and
would still find recursive failures inside the async
helper.
2. We do not currently and are not likely to start doing
anything as heavyweight as starting an async routine
from within a die routine or helper function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When any git code calls die or die_errno, we use a counter
to detect recursion into the die functions from any of the
helper functions. However, such a simple counter is not good
enough for threaded programs, which may call die from a
sub-thread, killing only the sub-thread (but incrementing
the counter for everyone).
Rather than try to deal with threads ourselves here, let's
just allow callers to plug in their own recursion-detection
function. This is similar to how we handle the die routine
(the caller plugs in a die routine which may kill only the
sub-thread).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
External projects have been known to parse the output of
"git version". Help prevent future authors from changing
its format by adding a comment to its implementation.
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you try this:
1. Install Git for Windows (from the msysgit project)
2. Put
[core]
autocrlf = false
eol = native
in your .gitconfig.
3. Clone a project with
*.txt text
in its .gitattributes.
Then with current git, any text files checked out have LF line
endings, instead of the expected CRLF.
Cc: Johannes Schindelin <johannes.schindelin@gmx.de>
Cc: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
read_revisions_from_stdin() has passed pointers to its read buffer
down to handle_revision_arg() since its inception way back in 42cabc3
(Teach rev-list an option to read revs from the standard input.,
2006-09-05). Even back then, this was a bug: through
add_pending_object, the argument was recorded in the object_array's
'name' field.
Fix it by making a copy whenever read_revisions_from_stdin() passes an
argument down the callchain. The other caller runs handle_revision_arg()
on argv[], where it would be redundant to make a copy.
Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because we reuse curl handles for multiple requests, the
setup of a handle happens in two stages: stable, global
setup and per-request setup. The lifecycle of a handle is
something like:
1. get_curl_handle; do basic global setup that will last
through the whole program (e.g., setting the user
agent, ssl options, etc)
2. get_active_slot; set up a per-request baseline (e.g.,
clearing the read/write functions, making it a GET
request, etc)
3. perform the request with curl_*_perform functions
4. goto step 2 to perform another request
Breaking it down this way means we can avoid doing global
setup from step (1) repeatedly, but we still finish step (2)
with a predictable baseline setup that callers can rely on.
Until commit 6d052d7 (http: add HTTP_KEEP_ERROR option,
2013-04-05), setting curl's FAILONERROR option was a global
setup; we never changed it. However, 6d052d7 introduced an
option where some requests might turn off FAILONERROR. Later
requests using the same handle would have the option
unexpectedly turned off, which meant they would not notice
http failures at all.
This could easily be seen in the test-suite for the
"half-auth" cases of t5541 and t5551. The initial requests
turned off FAILONERROR, which meant it was erroneously off
for the rpc POST. That worked fine for a successful request,
but meant that we failed to react properly to the HTTP 401
(instead, we treated whatever the server handed us as a
successful message body).
The solution is simple: now that FAILONERROR is a
per-request setting, we move it to get_active_slot to make
sure it is reset for each request.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the current transport-helper code, refs without namespaced refspecs don't
work correctly, so let's always use them.
Some people reported issues with 'git clone --mirror', and this fixes them, as
well as possibly others.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff --diff-algorithm algo" is also understood as "git diff
--diff-algorithm=algo".
* jk/diff-algo-finishing-touches:
diff: allow unstuck arguments with --diff-algorithm
git-merge(1): document diff-algorithm option to merge-recursive
The new core.commentchar configuration was not applied to a few
places.
* rt/commentchar-fmt-merge-msg:
fmt-merge-msg: use core.commentchar in tag signatures completely
fmt-merge-msg: respect core.commentchar in people credits
"git bundle" did not like a bundle created using a commit without
any message as its one of the prerequistes.
* lf/bundle-with-tip-wo-message:
bundle: Accept prerequisites without commit messages
"git show-branch" was not prepared to show a very long run of
ancestor operators e.g. foobar^2~2^2^2^2...^2~4 correctly.
* jk/show-branch-strbuf:
show-branch: use strbuf instead of static buffer
Improve error reporting from the http transfer clients.
* jk/http-error-messages:
http: drop http_error function
remote-curl: die directly with http error messages
http: re-word http error message
http: simplify http_error helper function
remote-curl: consistently report repo url for http errors
remote-curl: always show friendlier 404 message
remote-curl: let servers override http 404 advice
remote-curl: show server content on http errors
http: add HTTP_KEEP_ERROR option
Closing (not redirecting to /dev/null) the standard error stream is
not a very smart thing to do. Later open may return file
descriptor #2 for unrelated purpose, and error reporting code may
write into them.
* tr/perl-keep-stderr-open:
t9700: do not close STDERR
perl: redirect stderr to /dev/null instead of closing
'git-status --ignored' still scans the work tree twice to collect
untracked and ignored files, respectively.
fill_directory / read_directory already supports collecting untracked and
ignored files in a single directory scan. However, the DIR_COLLECT_IGNORED
flag to enable this has some git-add specific side-effects (e.g. it
doesn't recurse into ignored directories, so listing ignored files with
--untracked=all doesn't work).
The DIR_SHOW_IGNORED flag doesn't list untracked files and returns ignored
files in dir_struct.entries[] (instead of dir_struct.ignored[] as
DIR_COLLECT_IGNORED). DIR_SHOW_IGNORED is used all throughout git.
We don't want to break the existing API, so lets introduce a new flag
DIR_SHOW_IGNORED_TOO that lists untracked as well as ignored files similar
to DIR_COLLECT_FILES, but will recurse into sub-directories based on the
other flags as DIR_SHOW_IGNORED does.
In dir.c::read_directory_recursive, add ignored files to either
dir_struct.entries[] or dir_struct.ignored[] based on the flags. Also move
the DIR_COLLECT_IGNORED case here so that filling result lists is in a
common place.
In wt-status.c::wt_status_collect_untracked, use the new flag and read
results from dir_struct.ignored[]. Remove the extra fill_directory call.
builtin/check-ignore.c doesn't call fill_directory, setting the git-add
specific DIR_COLLECT_IGNORED flag has no effect here. Remove for clarity.
Update API documentation to reflect the changes.
Performance: with this patch, 'git-status --ignored' is typically as fast
as 'git-status'.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git-status --ignored' recursively scans directories up to three times:
1. To collect untracked files.
2. To collect ignored files.
3. When collecting ignored files, to check that an untracked directory
that potentially contains ignored files doesn't also contain untracked
files (i.e. isn't already listed as untracked).
Let's get rid of case 3 first.
Currently, read_directory_recursive returns a boolean whether a directory
contains the requested files or not (actually, it returns the number of
files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies
what we're looking for.
To be able to test for both untracked and ignored files in a single scan,
we need to return a bit more info, and the result must be independent of
the DIR_SHOW_IGNORED flag.
Reuse the path_treatment enum as return value of read_directory_recursive.
Split path_handled in two separate values path_excluded and path_untracked
that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't
need an extra value path_untracked_and_excluded, as directories with both
untracked and ignored files should be listed as untracked.
Rename path_ignored to path_none for clarity (i.e. "don't treat that path"
in contrast to "the path is ignored and should be treated according to
DIR_SHOW_IGNORED").
Replace enum directory_treatment with path_treatment. That's just another
enum with the same meaning, no need to translate back and forth.
In treat_directory, get rid of the extra read_directory_recursive call and
all the DIR_SHOW_IGNORED-specific code.
In read_directory_recursive, decide whether to dir_add_name path_excluded
or path_untracked paths based on the DIR_SHOW_IGNORED flag.
The return value of read_directory_recursive is the maximum path_treatment
of all files and sub-directories. In the check_only case, abort when we've
reached the most significant value (path_untracked).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Checking if a file is in the index is much faster (hashtable lookup) than
checking if the file is excluded (linear search over exclude patterns).
Skip is_excluded checks for files: move the cache_name_exists check from
treat_file to treat_one_path and return early if the file is tracked.
This can safely be done as all other code paths also return path_ignored
for tracked files, and dir_add_ignored skips tracked files as well.
There's just one line left in treat_file, so move this to treat_one_path
as well.
Here's some performance data for git-status from the linux and WebKit
repos (best of 10 runs on a Debian Linux on SSD, core.preloadIndex=true):
| status | status --ignored
| linux | WebKit | linux | WebKit
-------+-------+--------+-------+---------
before | 0.218 | 1.583 | 0.321 | 2.579
after | 0.156 | 0.988 | 0.202 | 1.279
gain | 1.397 | 1.602 | 1.589 | 2.016
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The is_excluded and is_path_excluded APIs are very similar, except for a
few noteworthy differences:
is_excluded doesn't handle ignored directories, results for paths within
ignored directories are incorrect. This is probably based on the premise
that recursive directory scans should stop at ignored directories, which
is no longer true (in certain cases, read_directory_recursive currently
calls is_excluded *and* is_path_excluded to get correct ignored state).
is_excluded caches parsed .gitignore files of the last directory in struct
dir_struct. If the directory changes, it finds a common parent directory
and is very careful to drop only as much state as necessary. On the other
hand, is_excluded will also read and parse .gitignore files in already
ignored directories, which are completely irrelevant.
is_path_excluded correctly handles ignored directories by checking if any
component in the path is excluded. As it uses is_excluded internally, this
unfortunately forces is_excluded to drop and re-read all .gitignore files,
as there is no common parent directory for the root dir.
is_path_excluded tracks state in a separate struct path_exclude_check,
which is essentially a wrapper of dir_struct with two more fields. However,
as is_path_excluded also modifies dir_struct, it is not possible to e.g.
use multiple path_exclude_check structures with the same dir_struct in
parallel. The additional structure just unnecessarily complicates the API.
Teach is_excluded / prep_exclude about ignored directories: whenever
entering a new directory, first check if the entire directory is excluded.
Remember the excluded state in dir_struct. Don't traverse into already
ignored directories (i.e. don't read irrelevant .gitignore files).
Directories could also be excluded by exclude patterns specified on the
command line or .git/info/exclude, so we cannot simply skip prep_exclude
entirely if there's no .gitignore file name (dir_struct.exclude_per_dir).
Move this check to just before actually reading the file.
is_path_excluded is now equivalent to is_excluded, so we can simply
redirect to it (the public API is cleaned up in the next patch).
The performance impact of the additional ignored check per directory is
hardly noticeable when reading directories recursively (e.g. 'git status').
However, performance of git commands using the is_path_excluded API (e.g.
'git ls-files --cached --ignored --exclude-standard') is greatly improved
as this no longer re-reads .gitignore files on each call.
Here's some performance data from the linux and WebKit repos (best of 10
runs on a Debian Linux on SSD, core.preloadIndex=true):
| ls-files -ci | status | status --ignored
| linux | WebKit | linux | WebKit | linux | WebKit
-------+-------+--------+-------+--------+-------+---------
before | 0.506 | 6.539 | 0.212 | 1.555 | 0.323 | 2.541
after | 0.080 | 1.191 | 0.218 | 1.583 | 0.321 | 2.579
gain | 6.325 | 5.490 | 0.972 | 0.982 | 1.006 | 0.985
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The notion of "ignored tracked" directories introduced in 721ac4ed "dir.c:
Make git-status --ignored more consistent" has a few unwanted side effects:
- git-clean -d -X: deletes ignored tracked directories. git-clean should
never delete tracked content.
- git-ls-files --ignored --other --directory: lists ignored tracked
directories instead of "other" directories.
- git-status --ignored: lists ignored tracked directories while contained
files may be listed as modified. Paths listed by git-status should be
disjoint (except in long format where a path may be listed in both the
staged and unstaged section).
Additionally, the current behaviour violates documentation in gitignore(5)
("Specifies intentionally *untracked* files to ignore") and Documentation/
technical/api-directory-listing.txt ("DIR_SHOW_OTHER_DIRECTORIES: Include
a directory that is *not tracked*.").
In dir.c::treat_directory, remove the special handling of ignored tracked
directories, so that the DIR_SHOW_OTHER_DIRECTORIES flag only affects
"other" (i.e. untracked) directories. In dir.c::dir_add_name, check that
added paths are untracked even if DIR_SHOW_IGNORED is set.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git-status --ignored path/' doesn't list ignored files and directories
within 'path' if some component of 'path' is classified as untracked.
Disable the DIR_SHOW_OTHER_DIRECTORIES flag while traversing leading
directories. This prevents treat_leading_path() with DIR_SHOW_IGNORED flag
from aborting at the top level untracked directory.
As a side effect, this also eliminates a recursive directory scan per
leading directory level, as treat_directory() can no longer call
read_directory_recursive() when called from treat_leading_path().
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git-status --ignored' lists empty untracked directories as ignored, even
though they don't have any ignored files.
When checking if a directory is already listed as untracked (i.e. shouldn't
be listed as ignored as well), don't assume that the directory has only
ignored files if it doesn't have untracked files, as the directory may be
empty.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git-ls-files --ignored --directories' hides empty directories even though
--no-empty-directory was not specified.
Treat the DIR_HIDE_EMPTY_DIRECTORIES flag independently from
DIR_SHOW_IGNORED to make all git-ls-files options work as expected.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git-status --ignored' lists ignored tracked directories without any
ignored files if a tracked file happens to match an exclude pattern.
Always exclude tracked files.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git-status --ignored' lists both the ignored directory and the ignored
files if the files are in a tracked sub directory.
When recursing into sub directories in read_directory_recursive, pass on
the check_only parameter so that we don't accidentally add the files.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git-status --ignored' drops ignored directories if they contain untracked
files in an untracked sub directory.
Fix it by getting exact (recursive) excluded status in treat_directory.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The exact definition of "refspec" can be found in git-fetch and
git-push manpages. So don't duplicate this here in the glossary.
Actually the definition of "pathspec" should be moved to a separate
file akin to the way it's done with "refspec". But this will only be
wortwhile when there's more to say about it. So for the time being
just improve the first sentence a little bit; fix the indentation of
the first paragraph after the bullet list and remove the one-item
list of magic signatures with its - for the user - unnecessary
introduction of "magic word 'top'".
Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use "SHA-1" instead of "SHA1" whenever we talk about the hash function.
When used as a programming symbol, we keep "SHA1".
Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The name of the hash function is "SHA-1", not "SHA1".
Also to people who look up "object name" in the glossary,
the details of which hash function is applied on what to
compute "object name" is not important but the fact that the
name is meant to be an unique identifier for the contents
stored in the object is.
Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Otherwise when using 'git branch -vv' it's hard to see them among so
much output.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When bisect successfully finds a single revision, the first bad commit
should be shown to human readers of 'git bisect log'.
This resolves the apparent disconnect between the bisection result and
the log when a bug reporter says "I know that the first bad commit is
$rev, as you can see from $(git bisect log)".
Signed-off-by: Torstein Hegge <hegge@resisty.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>