When we look up a sha1 object for reading via parse_object() =>
read_sha1_file() => read_object() callpath, we first check
packfiles, and then loose objects. If we still haven't found it, we
re-scan the list of packfiles in `objects/pack`. This final step
ensures that we can co-exist with a simultaneous repack process
which creates a new pack and then prunes the old object.
This extra re-scan usually does not have a performance impact for
two reasons:
1. If an object is missing, then typically the re-scan will find a
new pack, then no more misses will occur. Or if it truly is
missing, then our next step is usually to die().
2. Re-scanning is cheap enough that we do not even notice.
However, these do not always hold. The assumption in (1) is that the
caller is expecting to find the object. This is usually the case,
but the call to `parse_object` in `everything_local` does not follow
this pattern. It is looking to see whether we have objects that the
remote side is advertising, not something we expect to
have. Therefore if we are fetching from a remote which has many refs
pointing to objects we do not have, we may end up re-scanning the
pack directory many times.
Even with this extra re-scanning, the impact is often not noticeable
due to (2); we just readdir() the packs directory and skip any packs
that are already loaded. However, if there are a large number of
packs, even enumerating the directory can be expensive, especially
if we do it repeatedly.
Having this many packs is a good sign the user should run `git gc`,
but it would still be nice to avoid having to scan the directory at
all.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We generally try to run "gc --auto" after any commands that
might introduce a large number of new objects. An obvious
place to do so is after running "fetch", which may introduce
new loose objects or packs (depending on the size of the
fetch).
While an active developer repository will probably
eventually trigger a "gc --auto" on another action (e.g.,
git-rebase), there are two good reasons why it is nicer to
do it at fetch time:
1. Read-only repositories which track an upstream (e.g., a
continuous integration server which fetches and builds,
but never makes new commits) will accrue loose objects
and small packs, but never coalesce them into a more
efficient larger pack.
2. Fetching is often already perceived to be slow to the
user, since they have to wait on the network. It's much
more pleasant to include a potentially slow auto-gc as
part of the already-long network fetch than in the
middle of productive work with git-rebase or similar.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python 2.4 lacks the following features:
subprocess.check_call
struct.pack_into
Take a cue from 460d1026 and provide an implementation of the
CalledProcessError exception. Then replace the calls to
subproccess.check_call with calls to subprocess.call that check the return
status and raise a CalledProcessError exception if necessary.
The struct.pack_into in t/9802 can be converted into a single struct.pack
call which is available in Python 2.4.
Signed-off-by: Brandon Casey <bcasey@nvidia.com>
Acked-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python 2.5 and older do not accept None as the first argument to
translate() and complain with:
TypeError: expected a character buffer object
As suggested by Pete Wyckoff, let's just replace the call to translate()
with a regex search which should be more clear and more portable.
This allows git-p4 to be used with Python 2.5.
Signed-off-by: Brandon Casey <bcasey@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Usually a commit that makes it to logmsg_reencode will have
been parsed, and the commit->buffer struct member will be
valid. However, some code paths will free commit buffers
after having used them (for example, the log traversal
machinery will do so to keep memory usage down).
Most of the time this is fine; log should only show a commit
once, and then exits. However, there are some code paths
where this does not work. At least two are known:
1. A commit may be shown as part of a regular ref, and
then it may be shown again as part of a submodule diff
(e.g., if a repo contains refs to both the superproject
and subproject).
2. A notes-cache commit may be shown during "log --all",
and then later used to access a textconv cache during a
diff.
Lazily loading in logmsg_reencode does not necessarily catch
all such cases, but it should catch most of them. Users of
the commit buffer tend to be either parsing for structure
(in which they will call parse_commit, and either we will
already have parsed, or we will load commit->buffer lazily
there), or outputting (either to the user, or fetching a
part of the commit message via format_commit_message). In
the latter case, we should always be using logmsg_reencode
anyway (and typically we do so via the pretty-print
machinery).
If there are any cases that this misses, we can fix them up
to use logmsg_reencode (or handle them on a case-by-case
basis if that is inappropriate).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The logmsg_reencode function will return the reencoded
commit buffer, or NULL if reencoding failed or no reencoding
was necessary. Since every caller then ends up checking for NULL
and just using the commit's original buffer, anyway, we can
be a bit more helpful and just return that buffer when we
would have returned NULL.
Since the resulting string may or may not need to be freed,
we introduce a logmsg_free, which checks whether the buffer
came from the commit object or not (callers either
implemented the same check already, or kept two separate
pointers, one to mark the buffer to be used, and one for the
to-be-freed string).
Pushing this logic into logmsg_* simplifies the callers, and
will let future patches lazily load the commit buffer in a
single place.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When git-commit is asked to reuse a commit message via "-c",
we call read_commit_message, which looks up the commit and
hands back either the re-encoded result, or a copy of the
original. We make a copy in the latter case so that the
ownership semantics of the return value are clear (in either
case, it can be freed).
However, since we return a "const char *", and since the
resulting buffer's lifetime is the same as that of the whole
program, we never bother to free it at all.
Let's just drop the copy. That saves us a copy in the common
case. While it does mean we leak in the re-encode case, it
doesn't matter, since we are relying on program exit to free
the memory anyway.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace our use of fnmatch(3) with a more feature-rich wildmatch.
A handful patches at the bottom have been moved to nd/wildmatch to
graduate as part of that branch, before this series solidifies.
We may want to mark USE_WILDMATCH as an experimental curiosity a
bit more clearly (i.e. should not be enabled in production
environment, because it will make the behaviour between builds
unpredictable).
* nd/retire-fnmatch:
Makefile: add USE_WILDMATCH to use wildmatch as fnmatch
wildmatch: advance faster in <asterisk> + <literal> patterns
wildmatch: make a special case for "*/" with FNM_PATHNAME
test-wildmatch: add "perf" command to compare wildmatch and fnmatch
wildmatch: support "no FNM_PATHNAME" mode
wildmatch: make dowild() take arbitrary flags
wildmatch: rename constants and update prototype
Describe tools for automation that were invented since this
document was originally written.
* jc/doc-maintainer:
howto/maintain: document "### match next" convention in jch/pu branch
howto/maintain: mark titles for asciidoc
Documentation: update "howto maintain git"
An earlier conversion from fgets() to strbuf_getline() in the
codepath to read from /etc/mailname to learn the default host-part
of the ident e-mail address forgot that strbuf_getline() stores the
line at the beginning of the buffer just like fgets().
The "username@" the caller has prepared in the strbuf, expecting the
function to append the host-part to it, was lost because of this.
Reported-by: Mihai Rusu <dizzy@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that "already exists" errors are given only when a push tries to
update an existing ref in refs/tags/ hierarchy, we can say "the
tag", instead of "the destination reference", and that is far easier
to understand.
Pointed out by Chris Rorvick.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is harmless in Python 2, which sees the parentheses as redundant
grouping, but is required for Python 3. Since this is the only change
required to make this script just run under Python 3 without needing
2to3 it seems worthwhile.
The case of an empty print must be handled specially because in that
case Python 2 will interpret '()' as an empty tuple and print it as
'()'; inserting an empty string fixes this.
Signed-off-by: John Keeping <john@keeping.me.uk>
Acked-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Python 3 forbids unbuffered I/O in text mode. Change the reading of
stdin in git-remote-testpy so that we read the lines as bytes and then
decode them a line at a time.
This allows us to keep the I/O unbuffered in order to avoid
reintroducing the bug fixed by commit 7fb8e16 (git-remote-testgit: fix
race when spawning fast-import).
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Under Python 3 'hasher.update(...)' must take a byte string and not a
unicode string. Explicitly encode the argument to this method to hex
bytes so that we don't need to worry about failures to encode that might
occur if we chose a textual encoding.
This changes the directory used by git-remote-testpy for its git mirror
of the remote repository, but this tool should not have any serious
users as it is used primarily to test the Python remote helper
framework.
The use of encode() moves the required Python version forward to 2.0.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The changes to allow this script to run with Python 3 are minimal and do
not affect its functionality on the versions of Python 2 that are
already supported (2.4 onwards).
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using the approach detailed in the Python documentation[1], run 2to3 on
the code as part of the build if building with Python 3.
The code itself requires no changes to convert cleanly.
[1] http://docs.python.org/3.3/howto/pyporting.html#during-installation
Signed-off-by: John Keeping <john@keeping.me.uk>
Acked-by: Sverre Rabbelier <srabbelier@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When different version of python are used to build via distutils, the
behaviour can change. Detect changes in version and pass --force in
this case.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When you have random build artifacts in your build directory, left
behind by running "make" while on another branch, the "git help -a"
command run by __git_list_all_commands in the completion script that
is being tested does not have a way to know that they are not part
of the subcommands this build will ship. Such extra subcommands may
come from the user's $PATH. They will interfere with the tests that
expect a certain prefix to uniquely expand to a known completion.
Instrument the completion script and give it a way for us to tell
what (subset of) subcommands we are going to ship.
Also add a test to "git --help <prefix><TAB>" expansion. It needs
to show not just commands but some selected documentation pages.
Based on an idea by Jeff King.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we push to update an existing ref, if:
* the object at the tip of the remote is not a commit; or
* the object we are pushing is not a commit,
it won't be correct to suggest to fetch, integrate and push again,
as the old and new objects will not "merge". We should explain that
the push must be forced when there is a non-committish object is
involved in such a case.
If we do not have the current object at the tip of the remote, we do
not even know that object, when fetched, is something that can be
merged. In such a case, suggesting to pull first just like
non-fast-forward case may not be technically correct, but in
practice, most such failures are seen when you try to push your work
to a branch without knowing that somebody else already pushed to
update the same branch since you forked, so "pull first" would work
as a suggestion most of the time. And if the object at the tip is
not a commit, "pull first" will fail, without making any permanent
damage. As a side effect, it also makes the error message the user
will get during the next "push" attempt easier to understand, now
the user is aware that a non-commit object is involved.
In these cases, the current code already rejects such a push on the
client end, but we used the same error and advice messages as the
ones used when rejecting a non-fast-forward push, i.e. pull from
there and integrate before pushing again.
Introduce new rejection reasons and reword the messages
appropriately.
[jc: with help by Peff on message details]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
First compute the reason why this push would fail if done without
"--force", and then fail it by assigning that reason when the push
was not forced (or if there is no reason to require force, allow it
to succeed).
Record the fact that the push was forced in the forced_update field
only when the push would have failed without the option.
The code becomes shorter, less repetitive and easier to read this
way, especially given that the set of rejection reasons will be
extended in a later patch.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "nonfastforward" and "update" fields are only used while
deciding what value to assign to the "status" locally in a single
function. Remove them from the "struct ref".
The "requires_force" field is not used to decide if the proposed
update requires a --force option to succeed, or to record such a
decision made elsewhere. It is used by status reporting code that
the particular update was "forced". Rename it to "forced_update",
and move the code to assign to it around to further clarify how it
is used and what it is used for.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Take the expected SHA-1 digest in a variable, and use it instead of
hardcoding when checking the result.
Signed-off-by: Alexey Shumkin <Alex.Crezoff@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-cvsimport relies on version 2 of cvsps and does not work with the
new version 3. Since cvsps 3.x does not currently work as well as
version 2 for incremental import, document this fact.
Specifically, there is no way to make new git-cvsimport that supports
cvsps 3.x and have a seamless transition for existing users since cvsps
3.x needs a time from which to continue importing and git-cvsimport does
not save the time of the last import or import into a specific namespace
so there is no safe way to calculate the time of the last import.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since git-rev-parse already checks for the $GIT_DIR environment
variable and that it returns an actual git repository, there is no
need to repeat the checks again here.
This also fixes a problem where git-svn did not work in cases where
.git was a file with a gitdir: link.
[ew: squashed test case,
delay setting GIT_DIR until after `git rev-parse --cdup` to fix t9101,
(thanks to Junio)]
Signed-off-by: Barry Wardell <barry.wardell@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
We do not need to call uc() separately for sprintf("%x")
as sprintf("%X") is available.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Add an extra hook so that "git push" that is run without making
sure what is being pushed is sane can be checked and rejected (as
opposed to the user deciding not pushing).
* as/pre-push-hook:
Add sample pre-push hook script
push: Add support for pre-push hooks
hooks: Add function to check if a hook exists
Add a new command "git check-ignore" for debugging .gitignore
files.
The variable names may want to get cleaned up but that can be done
in-tree.
* as/check-ignore:
clean.c, ls-files.c: respect encapsulation of exclude_list_groups
t0008: avoid brace expansion
add git-check-ignore sub-command
setup.c: document get_pathspec()
add.c: extract new die_if_path_beyond_symlink() for reuse
add.c: extract check_path_for_gitlink() from treat_gitlinks() for reuse
pathspec.c: rename newly public functions for clarity
add.c: move pathspec matchers into new pathspec.c for reuse
add.c: remove unused argument from validate_pathspec()
dir.c: improve docs for match_pathspec() and match_pathspec_depth()
dir.c: provide clear_directory() for reclaiming dir_struct memory
dir.c: keep track of where patterns came from
dir.c: use a single struct exclude_list per source of excludes
Conflicts:
builtin/ls-files.c
dir.c
Regression fix to stop "git push" complaining "target ref already
exists", when it is not the real reason the command rejected the
request (e.g. non-fast-forward).
* cr/push-force-tag-update:
push: fix "refs/tags/ hierarchy cannot be updated without --force"
Remove a lot of unused code from "git imap-send".
* mh/imap-send-shrinkage:
imap-send.c: simplify logic in lf_to_crlf()
imap-send.c: fold struct store into struct imap_store
imap-send.c: remove unused field imap_store::uidvalidity
imap-send.c: use struct imap_store instead of struct store
imap-send.c: remove unused field imap_store::trashnc
imap-send.c: remove namespace fields from struct imap
imap-send.c: remove struct imap argument to parse_imap_list_l()
imap-send.c: inline parse_imap_list() in parse_list()
imap-send.c: remove some unused fields from struct store
imap-send.c: remove struct message
imap-send.c: remove struct store_conf
iamp-send.c: remove unused struct imap_store_conf
imap-send.c: remove struct msg_data
imap-send.c: remove msg_data::flags, which was always zero
Various git-cvsserver updates.
* mo/cvs-server-updates:
t9402: Use TABs for indentation
t9402: Rename check.cvsCount and check.list
t9402: Simplify git ls-tree
t9402: Add missing &&; Code style
t9402: No space after IO-redirection
t9402: Dont use test_must_fail cvs
t9402: improve check_end_tree() and check_end_full_tree()
t9402: sed -i is not portable
cvsserver Documentation: new cvs ... -r support
cvsserver: add t9402 to test branch and tag refs
cvsserver: support -r and sticky tags for most operations
cvsserver: Add version awareness to argsfromdir
cvsserver: generalize getmeta() to recognize commit refs
cvsserver: implement req_Sticky and related utilities
cvsserver: add misc commit lookup, file meta data, and file listing functions
cvsserver: define a tag name character escape mechanism
cvsserver: cleanup extra slashes in filename arguments
cvsserver: factor out git-log parsing logic
This doesn't save any lines, but does keep us from doing
error-prone pointer arithmetic with constants.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The resulting code ends up about the same length, but it is
a little more self-explanatory. It now explicitly documents
and checks the pre-condition that the incoming var starts
with "man.", and drops the magic offset "4".
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We keep a strbuf for the name of the submodule, even though
we only ever add one string to it. Let's just use xmemdupz
instead, which is slightly more efficient and makes it
easier to follow what is going on.
Unfortunately, we still end up having to deal with some
memory ownership issues in some code branches, as we have to
allocate the string in order to do a string list lookup, and
then only sometimes want to hand ownership of that string
over to the string_list. Still, making that explicit in the
code (as opposed to sometimes detaching the strbuf, and then
always releasing it) makes it a little more obvious what is
going on.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes the code a lot simpler to read by dropping a
whole bunch of constant offsets.
As a bonus, it means we also feed the whole config variable
name to our error functions:
[before]
$ git -c submodule.foo.fetchrecursesubmodules=bogus checkout
fatal: bad foo.fetchrecursesubmodules argument: bogus
[after]
$ git -c submodule.foo.fetchrecursesubmodules=bogus checkout
fatal: bad submodule.foo.fetchrecursesubmodules argument: bogus
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we parse userdiff config, we generally assume that
diff.name.key
will affect the "key" value of the "name" driver. However,
without confirming that the key is a valid userdiff key, we
may accidentally conflict with the ancient "diff.color.*"
namespace. The current code is careful not to even create a
driver struct if we do not see a key that is known by the
diff-driver code.
However, this carefulness is unnecessary; the default driver
with no keys set behaves exactly the same as having no
driver at all. We can simply set up the driver struct as
soon as we see we have a config key that looks like a
driver. This makes the code a bit more readable.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These callers can drop some inline pointer arithmetic and
magic offset constants, making them more readable and less
error-prone (those constants had to match the lengths of
strings, but there is no automatic verification of that
fact).
The "ep" pointer (presumably for "end pointer"), which
points to the final key segment of the config variable, is
given the more standard name "key" to describe its function
rather than its derivation.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is fewer lines of code, but more importantly, fixes a
bogus pointer offset. We are looking for "tar." in the
section, but later assume that the dot we found is at offset
9, not 3. This is a holdover from an earlier iteration of
767cf45 which called the section "tarfilter".
As a result, we could erroneously reject some filters with
dots in their name, as well as read uninitialized memory.
Reported by (and test by) René Scharfe.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The config callback functions get keys of the general form:
section.subsection.key
(where the subsection may be contain arbitrary data, or may
be missing). For matching keys without subsections, it is
simple enough to call "strcmp". Matching keys with
subsections is a little more complicated, and each callback
does it in an ad-hoc way, usually involving error-prone
pointer arithmetic.
Let's provide a helper that keeps the pointer arithmetic all
in one place.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* git://ozlabs.org/~paulus/gitk:
gitk: Display important heads even when there are many
gitk: Improve display of list of nearby tags and heads
gitk: Fix display of branch names on some commits
gitk: Update Swedish translation (296t)
gitk: When searching, only highlight files when in Patch mode
gitk: Fix error message when clicking on a connecting line
gitk: Fix crash when not using themed widgets
gitk: Use bindshiftfunctionkey to bind Shift-F5
gitk: Refactor code for binding modified function keys
gitk: Work around empty back and forward images when buttons are disabled
gitk: Highlight first search result immediately on incremental search
gitk: Highlight current search hit in orange
gitk: Synchronize highlighting in file view when scrolling diff
Commit fa2364ec ("Which merge_file() function do you mean?", 06-12-2012)
renamed the files merge-file.[ch] to merge-blobs.[ch], but forgot to
rename the header file in the definition of the LIB_H macro.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Various 'reset' optimizations and clean-ups, followed by a change
to allow "git reset" to work even on an unborn branch.
* mz/reset-misc:
reset: update documentation to require only tree-ish with paths
reset [--mixed]: use diff-based reset whether or not pathspec was given
reset: allow reset on unborn branch
reset $sha1 $pathspec: require $sha1 only to be treeish
reset.c: inline update_index_refresh()
reset.c: finish entire cmd_reset() whether or not pathspec is given
reset [--mixed]: only write index file once
reset.c: move lock, write and commit out of update_index_refresh()
reset.c: move update_index_refresh() call out of read_from_tree()
reset.c: replace switch by if-else
reset: avoid redundant error message
reset --keep: only write index file once
reset.c: share call to die_if_unmerged_cache()
reset.c: extract function for updating {ORIG_,}HEAD
reset.c: remove unnecessary variable 'i'
reset.c: extract function for parsing arguments
reset: don't allow "git reset -- $pathspec" in bare repo
reset.c: pass pathspec around instead of (prefix, argv) pair
reset $pathspec: exit with code 0 if successful
reset $pathspec: no need to discard index