Commit 1a83c24 (git_snpath(): retire and replace with
strbuf_git_path(), 2014-11-30) taught log_ref_setup and
log_ref_write_1 to take a strbuf parameter, rather than a
bare string. It then makes an alias to the strbuf's "buf"
field under the original name.
This made the original diff much shorter, but the resulting
code is more complicated that it needs to be. Since we've
aliased the pointer, we drop our reference to the strbuf to
ensure we don't accidentally change it. But if we simply
drop our alias and use "logfile.buf" directly, we do not
have to worry about this aliasing. It's a larger diff, but
the resulting code is simpler.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In iterating over the loose refs in "refs/foo/", we keep a
running strbuf with "refs/foo/one", "refs/foo/two", etc. But
we also need to access these files in the filesystem, as
".git/refs/foo/one", etc. For this latter purpose, we make a
series of independent calls to git_path(). These are safe
(we only use the result to call stat()), but assigning the
result of git_path is a suspicious pattern that we'd rather
avoid.
This patch keeps a running buffer with ".git/refs/foo/", and
we can just append/reset each directory element as we loop.
This matches how we handle the refnames. It should also be
more efficient, as we do not keep formatting the same
".git/refs/foo" prefix (which can be arbitrarily deep).
Technically we are dropping a call to strbuf_cleanup() on
each generated filename, but that's OK; it wasn't doing
anything, as we are putting in single-level names we read
from the filesystem (so it could not possibly be cleaning up
cruft like "./" in this instance).
A clever reader may also note that the running refname
buffer ("refs/foo/") is actually a subset of the filesystem
path buffer (".git/refs/foo/"). We could get by with one
buffer, indexing the length of $GIT_DIR when we want the
refname. However, having tried this, the resulting code
actually ends up a little more confusing, and the efficiency
improvement is tiny (and almost certainly dwarfed by the
system calls we are making).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with the previous commit to git_path, assigning the
result of mkpath is suspicious, since it is not clear
whether we will still depend on the value after it may have
been overwritten by subsequent calls. This patch converts
low-hanging fruit to use mkpathdup instead of mkpath (with
the downside that we must remember to free the result).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because git_path uses a static buffer that is shared with
calls to git_path, mkpath, etc, it can be dangerous to
assign the result to a variable or pass it to a non-trivial
function. The value may change unexpectedly due to other
calls.
None of the cases changed here has a known bug, but they're
worth converting away from git_path because:
1. It's easy to use git_pathdup in these cases.
2. They use constructs (like assignment) that make it
hard to tell whether they're safe or not.
The extra malloc overhead should be trivial, as an
allocation should be an order of magnitude cheaper than a
system call (which we are clearly about to make, since we
are constructing a filename). The real cost is that we must
remember to free the result.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow an asterisk as a substring (as opposed to the entirety) of
a path component for both side of a refspec, e.g.
"refs/heads/o*:refs/remotes/heads/i*".
* jk/refspec-parse-wildcard:
refs: loosen restriction on wildcard "*" refspecs
refs: cleanup comments regarding check_refname_component()
In preparation for allowing different "backends" to store the refs
in a way different from the traditional "one ref per file in $GIT_DIR
or in a $GIT_DIR/packed-refs file" filesystem storage, reduce
direct filesystem access to ref-like things like CHERRY_PICK_HEAD
from scripts and programs.
* dt/refs-backend-preamble:
git-stash: use update-ref --create-reflog instead of creating files
update-ref and tag: add --create-reflog arg
refs: add REF_FORCE_CREATE_REFLOG flag
git-reflog: add exists command
refs: new public ref function: safe_create_reflog
refs: break out check for reflog autocreation
refs.c: add err arguments to reflog functions
Teach "git log" and friends a new "--date=format:..." option to
format timestamps using system's strftime(3).
* jk/date-mode-format:
strbuf: make strbuf_addftime more robust
introduce "format" date-mode
convert "enum date_mode" into a struct
show-branch: use DATE_RELATIVE instead of magic number
Clean up refs API and make "git clone" less intimate with the
implementation detail.
* mh/init-delete-refs-api:
delete_ref(): use the usual convention for old_sha1
cmd_update_ref(): make logic more straightforward
update_ref(): don't read old reference value before delete
check_branch_commit(): make first parameter const
refs.h: add some parameter names to function declarations
refs: move the remaining ref module declarations to refs.h
initial_ref_transaction_commit(): check for ref D/F conflicts
initial_ref_transaction_commit(): check for duplicate refs
refs: remove some functions from the module's public interface
initial_ref_transaction_commit(): function for initial ref creation
repack_without_refs(): make function private
prune_refs(): use delete_refs()
prune_remote(): use delete_refs()
delete_refs(): bail early if the packed-refs file cannot be rewritten
delete_refs(): make error message more generic
delete_refs(): new function for the refs API
delete_ref(): handle special case more explicitly
remove_branches(): remove temporary
delete_ref(): move declaration to refs.h
Add an environment variable to tell Git to look into refs hierarchy
other than refs/replace/ for the object replacement data.
* mh/replace-refs:
Allow to control where the replace refs are looked for
Loosen restrictions on refspecs by allowing patterns that have a "*"
within a component instead of only as the whole component.
Remove the logic to accept a single "*" as a whole component from
check_refname_format(), and implement an extended form of that logic
in check_refname_component(). Pass the pointer to the flags argument
to the latter, as it has to clear REFNAME_REFSPEC_PATTERN bit when
it sees "*".
Teach check_refname_component() function to allow an asterisk "*"
only when REFNAME_REFSPEC_PATTERN is set in the flags, and drop the
bit after seeing a "*", to ensure that one side of a refspec
contains at most one asterisk.
This will allow us to accept refspecs such as `for/bar*:foo/baz*`.
Any refspec which functioned before shall continue functioning with
the new logic.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Correctly specify all characters which are rejected under the '4: a
bad character' disposition, which did not list all characters that
are treated as such.
Cleanup comment style for rejected refs by inserting a ", or" at the
end of each statement.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a flag to allow forcing the creation of a reflog even if the ref
name and core.logAllRefUpdates setting would not ordinarily cause ref
creation.
In a moment, we will use this to add options to git tag and git
update-ref to force reflog creation.
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The safe_create_reflog function creates a reflog, if it does not
already exist.
The log_ref_setup function becomes private and gains a force_create
parameter to force the creation of a reflog even if log_all_ref_updates
is false or the refname is not one of the special refnames.
The new parameter also reduces the need to store, modify, and restore
the log_all_ref_updates global before reflog creation.
In a moment, we will use this to add reflog creation commands to
git-reflog.
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an err argument to log_ref_setup that can explain the reason
for a failure. This then eliminates the need to manage errno through
this function since we can just add strerror(errno) to the err string
when meaningful. No callers relied on errno from this function for
anything else than the error message.
Also add err arguments to private functions write_ref_to_lockfile,
log_ref_write_1, commit_ref_update. This again eliminates the need to
manage errno in these functions.
Some error messages are slightly reordered.
Update of a patch by Ronnie Sahlberg.
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for adding date modes that may carry extra
information beyond the mode itself, this patch converts the
date_mode enum into a struct.
Most of the conversion is fairly straightforward; we pass
the struct as a pointer and dereference the type field where
necessary. Locations that declare a date_mode can use a "{}"
constructor. However, the tricky case is where we use the
enum labels as constants, like:
show_date(t, tz, DATE_NORMAL);
Ideally we could say:
show_date(t, tz, &{ DATE_NORMAL });
but of course C does not allow that. Likewise, we cannot
cast the constant to a struct, because we need to pass an
actual address. Our options are basically:
1. Manually add a "struct date_mode d = { DATE_NORMAL }"
definition to each caller, and pass "&d". This makes
the callers uglier, because they sometimes do not even
have their own scope (e.g., they are inside a switch
statement).
2. Provide a pre-made global "date_normal" struct that can
be passed by address. We'd also need "date_rfc2822",
"date_iso8601", and so forth. But at least the ugliness
is defined in one place.
3. Provide a wrapper that generates the correct struct on
the fly. The big downside is that we end up pointing to
a single global, which makes our wrapper non-reentrant.
But show_date is already not reentrant, so it does not
matter.
This patch implements 3, along with a minor macro to keep
the size of the callers sane.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git for-each-ref" reported "missing object" for 0{40} when it
encounters a broken ref. The lack of object whose name is 0{40} is
not the problem; the ref being broken is.
* mh/reporting-broken-refs-from-for-each-ref:
read_loose_refs(): treat NULL_SHA1 loose references as broken
read_loose_refs(): simplify function logic
for-each-ref: report broken references correctly
t6301: new tests of for-each-ref error handling
The ref_transaction_update() family of functions use the following
convention for their old_sha1 parameters:
* old_sha1 == NULL: Don't check the old value at all.
* is_null_sha1(old_sha1): Ensure that the reference didn't exist
before the transaction.
* otherwise: Ensure that the reference had the specified value before
the transaction.
delete_ref() had a different convention, namely treating
is_null_sha1(old_sha1) as "don't care". Change it to adhere to the
standard convention to reduce the scope for confusion.
Please note that it is now a bug to pass old_sha1=NULL_SHA1 to
delete_ref() (because it doesn't make sense to delete a reference that
you already know doesn't exist). This is consistent with the behavior
of ref_transaction_delete().
Most of the callers of delete_ref() never pass old_sha1=NULL_SHA1 to
delete_ref(), and are therefore unaffected by this change. The
two exceptions are:
* The call in cmd_update_ref(), which passed NULL_SHA1 if the old
value passed in on the command line was 0{40} or the empty string.
Change that caller to pass NULL in those cases.
Arguably, it should be an error to call "update-ref -d" with the old
value set to "does not exist", just as it is for the `--stdin`
command "delete". But since this usage was accepted until now,
continue to accept it.
* The call in delete_branches(), which could pass NULL_SHA1 if
deleting a broken or symbolic ref. Change it to pass NULL in these
cases.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some functions from the refs module were still declared in cache.h.
Move them to refs.h.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In initial_ref_transaction_commit(), check for D/F conflicts (i.e.,
the type of conflict that exists between "refs/foo" and
"refs/foo/bar") among the references being created and between the
references being created and any hypothetical existing references.
Ideally, there shouldn't *be* any existing references when this
function is called. But, at least in the case of the "testgit" remote
helper, "clone" can be called after the remote-tracking "HEAD" and
"master" branches have already been created. So let's just do the
full-blown check.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Error out if the ref_transaction includes more than one update for any
refname.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The following functions are no longer used from outside the refs
module:
* lock_packed_refs()
* add_packed_ref()
* commit_packed_refs()
* rollback_packed_refs()
So make these functions private.
This is an important step, because it means that nobody outside of the
refs module needs to know the difference between loose and packed
references.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git clone" uses shortcuts when creating the initial set of
references:
* It writes them directly to packed-refs.
* It doesn't lock the individual references (though it does lock the
packed-refs file).
* It doesn't check for refname conflicts between two new references or
between one new reference and any hypothetical old ones.
* It doesn't create reflog entries for the reference creations.
This functionality was implemented in builtin/clone.c. But really that
file shouldn't have such intimate knowledge of how references are
stored. So provide a new function in the refs API,
initial_ref_transaction_commit(), which can be used for initial
reference creation. The new function is based on the ref_transaction
interface.
This means that we can make some other functions private to the refs
module. That will be done in a followup commit.
It would seem to make sense to add a test here that there are no
existing references, because that is how the function *should* be
used. But in fact, the "testgit" remote helper appears to call it
*after* having set up refs/remotes/<name>/HEAD and
refs/remotes/<name>/master, so we can't be so strict. For now, the
function trusts its caller to only call it when it makes sense. Future
commits will add some more limited sanity checks.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is no longer called from outside of the refs module. Also move its
docstring and change it to imperative voice.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we fail to delete the doomed references from the packed-refs file,
then it is unsafe to delete their loose references, because doing so
might expose a value from the packed-refs file that is obsolete and
perhaps even points at an object that has been garbage collected.
So if repack_without_refs() fails, emit a more explicit error message
and bail.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the error message from
Could not remove branch %s
to
could not remove reference %s
First of all, the old error message referred to "branch
refs/remotes/origin/foo", which was awkward even for the existing
caller. Normally we would refer to a reference like that as either
"remote-tracking branch origin/foo" or "reference
refs/remotes/origin/foo". Here I take the lazier alternative.
Moreover, now that this function is part of the refs API, it might be
called for refs that are neither branches nor remote-tracking
branches.
While we're at it, convert the error message to lower case, as per our
usual convention.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the function remove_branches() from builtin/remote.c to refs.c,
rename it to delete_refs(), and make it public.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
delete_ref() uses a different convention for its old_sha1 parameter
than, say, ref_transaction_delete(): NULL_SHA1 means not to check the
old value. Make this fact a little bit clearer in the code by handling
it in explicit, commented code rather than burying it in a conditional
expression.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also
* Add a docstring
* Rename the second parameter to "old_sha1", to be consistent with the
convention used elsewhere in the refs module
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It can be useful to have grafts or replace refs for specific use-cases while
keeping the default "view" of the repository pristine (or with a different
set of grafts/replace refs).
It is possible to use a different graft file with GIT_GRAFT_FILE, but while
replace refs are more powerful, they don't have an equivalent override.
Add a GIT_REPLACE_REF_BASE environment variable to control where git is
going to look for replace refs.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bring consistency to error reporting mechanism used in "refs" API.
* mh/verify-lock-error-report:
ref_transaction_commit(): do not capitalize error messages
verify_lock(): do not capitalize error messages
verify_lock(): report errors via a strbuf
verify_lock(): on errors, let the caller unlock the lock
verify_lock(): return 0/-1 rather than struct ref_lock *
NULL_SHA1 is used to indicate an "invalid object name" throughout our
code (and the code of other git implementations), so it is vastly more
likely that an on-disk reference was set to this value due to a
software bug than that NULL_SHA1 is the legitimate SHA-1 of an actual
object. Therefore, if a loose reference has the value NULL_SHA1,
consider it to be broken.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Multi-ref transaction support we merged a few releases ago
unnecessarily kept many file descriptors open, risking to fail with
resource exhaustion. This is for 2.4.x track.
* mh/write-refs-sooner-2.4:
ref_transaction_commit(): fix atomicity and avoid fd exhaustion
ref_transaction_commit(): remove the local flags variable
ref_transaction_commit(): inline call to write_ref_sha1()
rename_ref(): inline calls to write_ref_sha1() from this function
commit_ref_update(): new function, extracted from write_ref_sha1()
write_ref_to_lockfile(): new function, extracted from write_ref_sha1()
t7004: rename ULIMIT test prerequisite to ULIMIT_STACK_SIZE
update-ref: test handling large transactions properly
ref_transaction_commit(): fix atomicity and avoid fd exhaustion
ref_transaction_commit(): remove the local flags variable
ref_transaction_commit(): inline call to write_ref_sha1()
rename_ref(): inline calls to write_ref_sha1() from this function
commit_ref_update(): new function, extracted from write_ref_sha1()
write_ref_to_lockfile(): new function, extracted from write_ref_sha1()
t7004: rename ULIMIT test prerequisite to ULIMIT_STACK_SIZE
update-ref: test handling large transactions properly
Make it clearer that there are two possible ways to read the
reference, but that we handle read errors uniformly regardless of
which way it was read.
This refactoring also makes the following change easier to implement.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our convention is for error messages to start with a lower-case
letter.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our convention is for error messages to start with a lower-case
letter.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of writing error messages directly to stderr, write them to
a "strbuf *err". The caller, lock_ref_sha1_basic(), uses this error
reporting convention with all the other callees, and reports its
error this way to its callers.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The caller already knows how to do it, so always do it in the same
place.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Its return value wasn't conveying any extra information, but it made
the reader wonder whether the ref_lock that it returned might be
different than the one that was passed to it. So change the function
to the traditional "return 0 on success or a negative value on error".
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the callers of the for_each_ref family of functions have now
been rewritten to work with object_ids, so this adapter is no longer
needed.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change typedef each_ref_fn to take a "const struct object_id *oid"
parameter instead of "const unsigned char *sha1".
To aid this transition, implement an adapter that can be used to wrap
old-style functions matching the old typedef, which is now called
"each_ref_sha1_fn"), and make such functions callable via the new
interface. This requires the old function and its cb_data to be
wrapped in a "struct each_ref_fn_sha1_adapter", and that object to be
used as the cb_data for an adapter function, each_ref_fn_adapter().
This is an enormous diff, but most of it consists of simple,
mechanical changes to the sites that call any of the "for_each_ref"
family of functions. Subsequent to this change, the call sites can be
rewritten one by one to use the new interface.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of dying immediately upon failing to obtain a lock, retry
after a short while with backoff.
* mh/lockfile-retry:
lock_packed_refs(): allow retries when acquiring the packed-refs lock
lockfile: allow file locking to be retried with a timeout
The ref API did not handle cases where 'refs/heads/xyzzy/frotz' is
removed at the same time as 'refs/heads/xyzzy' is added (or vice
versa) very well.
* mh/ref-directory-file:
reflog_expire(): integrate lock_ref_sha1_basic() errors into ours
ref_transaction_commit(): delete extra "the" from error message
ref_transaction_commit(): provide better error messages
rename_ref(): integrate lock_ref_sha1_basic() errors into ours
lock_ref_sha1_basic(): improve diagnostics for ref D/F conflicts
lock_ref_sha1_basic(): report errors via a "struct strbuf *err"
verify_refname_available(): report errors via a "struct strbuf *err"
verify_refname_available(): rename function
refs: check for D/F conflicts among refs created in a transaction
ref_transaction_commit(): use a string_list for detecting duplicates
is_refname_available(): use dirname in first loop
struct nonmatching_ref_data: store a refname instead of a ref_entry
report_refname_conflict(): inline function
entry_matches(): inline function
is_refname_available(): convert local variable "dirname" to strbuf
is_refname_available(): avoid shadowing "dir" variable
is_refname_available(): revamp the comments
t1404: new tests of ref D/F conflicts within transactions
Multi-ref transaction support we merged a few releases ago
unnecessarily kept many file descriptors open, risking to fail with
resource exhaustion. This is for 2.4.x track.
* mh/write-refs-sooner-2.4:
ref_transaction_commit(): fix atomicity and avoid fd exhaustion
ref_transaction_commit(): remove the local flags variable
ref_transaction_commit(): inline call to write_ref_sha1()
rename_ref(): inline calls to write_ref_sha1() from this function
commit_ref_update(): new function, extracted from write_ref_sha1()
write_ref_to_lockfile(): new function, extracted from write_ref_sha1()
t7004: rename ULIMIT test prerequisite to ULIMIT_STACK_SIZE
update-ref: test handling large transactions properly
ref_transaction_commit(): fix atomicity and avoid fd exhaustion
ref_transaction_commit(): remove the local flags variable
ref_transaction_commit(): inline call to write_ref_sha1()
rename_ref(): inline calls to write_ref_sha1() from this function
commit_ref_update(): new function, extracted from write_ref_sha1()
write_ref_to_lockfile(): new function, extracted from write_ref_sha1()
t7004: rename ULIMIT test prerequisite to ULIMIT_STACK_SIZE
update-ref: test handling large transactions properly
The refs API uses ref_lock struct which had its own "int fd", even
though the same file descriptor was in the lock struct it contains.
Clean-up the code to lose this redundant field.
* sb/ref-lock-lose-lock-fd:
refs.c: remove lock_fd from struct ref_lock
Currently, there is only one attempt to acquire any lockfile, and if
the lock is held by another process, the locking attempt fails
immediately.
This is not such a limitation for loose reference files. First, they
don't take long to rewrite. Second, most reference updates have a
known "old" value, so if another process is updating a reference at
the same moment that we are trying to lock it, then probably the
expected "old" value will not longer be valid, and the update will
fail anyway.
But these arguments do not hold for packed-refs:
* The packed-refs file can be large and take significant time to
rewrite.
* Many references are stored in a single packed-refs file, so it could
be that the other process was changing a different reference than
the one that we are interested in.
Therefore, it is much more likely for there to be spurious lock
conflicts in connection to the packed-refs file, resulting in
unnecessary command failures.
So, if the first attempt to lock the packed-refs file fails, continue
retrying for a configurable length of time before giving up. The
default timeout is 1 second.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The old code was roughly
for update in updates:
acquire locks and check old_sha
for update in updates:
if changing value:
write_ref_to_lockfile()
commit_ref_update()
for update in updates:
if deleting value:
unlink()
rewrite packed-refs file
for update in updates:
if reference still locked:
unlock_ref()
This has two problems.
Non-atomic updates
==================
The atomicity of the reference transaction depends on all pre-checks
being done in the first loop, before any changes have started being
committed in the second loop. The problem is that
write_ref_to_lockfile() (previously part of write_ref_sha1()), which
is called from the second loop, contains two more checks:
* It verifies that new_sha1 is a valid object
* If the reference being updated is a branch, it verifies that
new_sha1 points at a commit object (as opposed to a tag, tree, or
blob).
If either of these checks fails, the "transaction" is aborted during
the second loop. But this might happen after some reference updates
have already been permanently committed. In other words, the
all-or-nothing promise of "git update-ref --stdin" could be violated.
So these checks have to be moved to the first loop.
File descriptor exhaustion
==========================
The old code locked all of the references in the first loop, leaving
all of the lockfiles open until later loops. Since we might be
updating a lot of references, this could result in file descriptor
exhaustion.
The solution
============
After this patch, the code looks like
for update in updates:
acquire locks and check old_sha
if changing value:
write_ref_to_lockfile()
else:
close_ref()
for update in updates:
if changing value:
commit_ref_update()
for update in updates:
if deleting value:
unlink()
rewrite packed-refs file
for update in updates:
if reference still locked:
unlock_ref()
This fixes both problems:
1. The pre-checks in write_ref_to_lockfile() are now done in the first
loop, before any changes have been committed. If any of the checks
fails, the whole transaction can now be rolled back correctly.
2. All lockfiles are closed in the first loop immediately after they
are created (either by write_ref_to_lockfile() or by close_ref()).
This means that there is never more than one open lockfile at a
time, preventing file descriptor exhaustion.
To simplify the bookkeeping across loops, add a new REF_NEEDS_COMMIT
bit to update->flags, which keeps track of whether the corresponding
lockfile needs to be committed, as opposed to just unlocked. (Since
"struct ref_update" is internal to the refs module, this change is not
visible to external callers.)
This change fixes two tests in t1400.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>