/*
 * The backend-independent part of the reference module.
 */

#include "cache.h"
#include "alloc.h"
#include "config.h"
#include "hashmap.h"
#include "gettext.h"
#include "hex.h"
#include "lockfile.h"
#include "iterator.h"
#include "refs.h"
#include "refs/refs-internal.h"
refs: implement reference transaction hook

The low-level reference transactions used to update references are
currently completely opaque to the user. While certainly desirable in
most use cases, there are some which might want to hook into the
transaction to observe all queued reference updates as well as observing
the abortion or commit of a prepared transaction.

One such use case would be to have a set of replicas of a given Git
repository, where we perform Git operations on all of the repositories
at once and expect the outcome to be the same in all of them. While
there exist hooks already for a certain subset of Git commands that
could be used to implement a voting mechanism for this, many others
currently don't have any mechanism for this.

The above scenario is the motivation for the new "reference-transaction"
hook that reaches directly into Git's reference transaction mechanism.
The hook receives as parameter the current state the transaction was
moved to ("prepared", "committed" or "aborted") and gets via its
standard input all queued reference updates. While the exit code gets
ignored in the "committed" and "aborted" states, a non-zero exit code in
the "prepared" state will cause the transaction to be aborted
prematurely.

Given the use case described above, a voting mechanism can now be
implemented via this hook: as soon as it gets called, it will take all
of stdin and use it to cast a vote to a central service. When all
replicas of the repository agree, the hook will exit with zero,
otherwise it will abort the transaction by returning non-zero. The most
important upside is that this will catch _all_ commands writing
references at once, allowing strong consistency for reference updates
to be implemented via a single mechanism.

In order to test the impact on the case where we don't have any
"reference-transaction" hook installed in the repository, this commit
introduces two new performance tests for git-update-ref(1). Run against
an empty repository, they produce the following results:

  Test                         origin/master     HEAD
  --------------------------------------------------------------------
  1400.2: update-ref           2.70(2.10+0.71)   2.71(2.10+0.73) +0.4%
  1400.3: update-ref --stdin   0.21(0.09+0.11)   0.21(0.07+0.14) +0.0%

The performance test p1400.2 creates, updates and deletes a branch a
thousand times, thus averaging runtime of git-update-ref over 3000
invocations. p1400.3 instead calls `git-update-ref --stdin` three times
and queues a thousand creations, updates and deletes respectively.

As expected, p1400.3 consistently shows no noticeable impact, as for
each batch of updates there's a single call to access(3P) for the
negative hook lookup. On the other hand, for p1400.2, one can see an
impact caused by this patchset. But doing five runs of the performance
tests where each one was run with GIT_PERF_REPEAT_COUNT=10, the overhead
ranged from -1.5% to +1.1%. These inconsistent performance numbers can
be explained by the overhead of spawning 3000 processes. This shows that
the overhead of assembling the hook path and executing access(3P) once
to check if it's there is mostly outweighed by the operating system's
overhead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-19 08:56:14 +02:00
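The commit message above says the hook receives all queued reference updates on its standard input. As far as githooks(5) documents it, each update arrives as one line of the form `<old-value> SP <new-value> SP <ref-name> LF`; that line format is taken from the documentation, not from this file. The following is a minimal sketch of how a hook written in C might split one such line. The names `struct queued_update` and `parse_update_line` are hypothetical and do not exist in refs.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for one queued update; not a type from refs.c. */
struct queued_update {
	char old_oid[65];	/* room for a SHA-256 hex string plus NUL */
	char new_oid[65];
	char refname[256];
};

/*
 * Parse "<old> SP <new> SP <ref>"; an optional trailing LF is ignored.
 * Returns 0 on success, -1 on malformed or oversized input.
 */
static int parse_update_line(const char *line, struct queued_update *out)
{
	const char *sp1, *sp2;
	size_t old_len, new_len, ref_len;

	assert(line && out);

	sp1 = strchr(line, ' ');
	if (!sp1)
		return -1;
	sp2 = strchr(sp1 + 1, ' ');
	if (!sp2)
		return -1;

	old_len = (size_t)(sp1 - line);
	new_len = (size_t)(sp2 - sp1 - 1);
	ref_len = strcspn(sp2 + 1, "\n");

	/* Every field must be present and fit its buffer. */
	if (!old_len || !new_len || !ref_len)
		return -1;
	if (old_len >= sizeof(out->old_oid) ||
	    new_len >= sizeof(out->new_oid) ||
	    ref_len >= sizeof(out->refname))
		return -1;

	memcpy(out->old_oid, line, old_len);
	out->old_oid[old_len] = '\0';
	memcpy(out->new_oid, sp1 + 1, new_len);
	out->new_oid[new_len] = '\0';
	memcpy(out->refname, sp2 + 1, ref_len);
	out->refname[ref_len] = '\0';
	return 0;
}
```

A voting hook along the lines the commit message describes would presumably call this for every line read from stdin in the "prepared" state and exit non-zero if the replicas disagree.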
#include "run-command.h"
#include "hook.h"
#include "object-store.h"
#include "object.h"
#include "tag.h"
#include "submodule.h"
#include "worktree.h"
#include "strvec.h"
#include "repository.h"
#include "sigchain.h"
date API: create a date.h, split from cache.h

Move the declaration of the date.c functions from cache.h, and adjust
the relevant users to include the new date.h header.

The show_ident_date() function belonged in pretty.h (it's defined in
pretty.c). Its two users outside of pretty.c didn't strictly need to
include pretty.h, as they get it indirectly, but let's add it to them
anyway.

Similarly, the change to "builtin/{fast-import,show-branch,tag}.c"
isn't needed as far as the compiler is concerned, but since they all
use the "DATE_MODE()" macro we now define in date.h, let's have them
include it.

We could simply include this new header in "cache.h", but as this
change shows, these functions weren't common enough to warrant
inclusion in it in the first place. By moving them out of cache.h,
changes to this API will no longer cause a (mostly) full re-build of
the project when "make" is run.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16 09:14:02 +01:00
#include "date.h"
#include "commit.h"

/*
 * List of all available backends
 */
static struct ref_storage_be *refs_backends = &refs_be_files;

static struct ref_storage_be *find_ref_storage_backend(const char *name)
{
	struct ref_storage_be *be;
	for (be = refs_backends; be; be = be->next)
		if (!strcmp(be->name, name))
			return be;
	return NULL;
}
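
The lookup above walks a singly linked list of backends and compares names. A standalone sketch of the same pattern follows, assuming a reduced `struct backend` with only `next` and `name`; the type, the two list entries and `find_backend` are illustrative stand-ins rather than the real `struct ref_storage_be` or its registered backends.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative stand-in: only the fields the lookup needs. */
struct backend {
	struct backend *next;
	const char *name;
};

/* A two-element list, chained through `next` and terminated by NULL. */
static struct backend be_packed = { NULL, "packed" };
static struct backend be_files = { &be_packed, "files" };
static struct backend *backends = &be_files;

/* Return the first backend whose name matches, or NULL if none does. */
static struct backend *find_backend(const char *name)
{
	struct backend *be;

	assert(name);
	for (be = backends; be; be = be->next)
		if (!strcmp(be->name, name))
			return be;
	return NULL;
}
```

Because the list head is a plain pointer, adding another backend in this sketch only means linking one more static node in front of or behind the existing ones.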

/*
 * How to handle various characters in refnames:
 * 0: An acceptable character for refs
 * 1: End-of-component
 * 2: ., look for a preceding . to reject .. in refs
 * 3: {, look for a preceding @ to reject @{ in refs
 * 4: A bad character: ASCII control characters, and
 *    ":", "?", "[", "\", "^", "~", SP, or TAB
 * 5: *, reject unless REFNAME_REFSPEC_PATTERN is set
 */
static unsigned char refname_disposition[256] = {
	1, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
	4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
	4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 2, 1,
	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 4,
	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 4, 0, 4, 0,
	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 4, 4
};
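
The table is indexed by byte value, so reading it means counting columns. One way to sanity-check it is a rule-based function that derives the same class numbers from the comment's wording. The sketch below does that; `disposition_of` is a hypothetical helper, not part of refs.c, and bytes above 0x7f are treated as acceptable because the table leaves them zero-initialized.

```c
#include <assert.h>
#include <string.h>

/*
 * Return the disposition class (0..5) that the comment above assigns
 * to a byte, computed from rules instead of a lookup table.
 */
static int disposition_of(unsigned char ch)
{
	if (ch == '\0' || ch == '/')
		return 1;	/* end of component */
	if (ch == '.')
		return 2;	/* caller checks for a preceding '.' */
	if (ch == '{')
		return 3;	/* caller checks for a preceding '@' */
	if (ch == '*')
		return 5;	/* only valid in refspec patterns */
	if (ch < 0x20 || ch == 0x7f || strchr(" :?[\\^~", ch))
		return 4;	/* control or otherwise forbidden character */
	return 0;		/* acceptable */
}
```

Comparing `disposition_of(i)` with `refname_disposition[i]` for every `i` from 0 to 255 would be a cheap consistency test if the table is ever edited.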

struct ref_namespace_info ref_namespace[] = {
	[NAMESPACE_HEAD] = {
		.ref = "HEAD",
		.decoration = DECORATION_REF_HEAD,
		.exact = 1,
	},
	[NAMESPACE_BRANCHES] = {
		.ref = "refs/heads/",
		.decoration = DECORATION_REF_LOCAL,
	},
	[NAMESPACE_TAGS] = {
		.ref = "refs/tags/",
		.decoration = DECORATION_REF_TAG,
	},
	[NAMESPACE_REMOTE_REFS] = {
		/*
		 * The default refspec for new remotes copies refs from
		 * refs/heads/ on the remote into refs/remotes/<remote>/.
		 * As such, "refs/remotes/" has special handling.
		 */
		.ref = "refs/remotes/",
		.decoration = DECORATION_REF_REMOTE,
	},
	[NAMESPACE_STASH] = {
		/*
		 * The single ref "refs/stash" stores the latest stash.
		 * Older stashes can be found in the reflog.
		 */
		.ref = "refs/stash",
		.exact = 1,
		.decoration = DECORATION_REF_STASH,
	},
	[NAMESPACE_REPLACE] = {
		/*
		 * This namespace allows Git to act as if one object ID
		 * points to the content of another. Unlike the other
		 * ref namespaces, this one can be changed by the
		 * GIT_REPLACE_REF_BASE environment variable. This
		 * .namespace value will be overwritten in setup_git_env().
		 */
		.ref = "refs/replace/",
		.decoration = DECORATION_GRAFTED,
	},
	[NAMESPACE_NOTES] = {
		/*
		 * The refs/notes/commit ref points to the tip of a
		 * parallel commit history that adds metadata to commits
		 * in the normal history. This ref can be overwritten
		 * by the core.notesRef config variable or the
		 * GIT_NOTES_REF environment variable.
		 */
		.ref = "refs/notes/commit",
		.exact = 1,
	},
	[NAMESPACE_PREFETCH] = {
		/*
		 * Prefetch refs are written by the background 'fetch'
		 * maintenance task. It allows faster foreground fetches
		 * by advertising these previously-downloaded tips without
		 * updating refs/remotes/ without user intervention.
		 */
		.ref = "refs/prefetch/",
	},
	[NAMESPACE_REWRITTEN] = {
		/*
		 * Rewritten refs are used by the 'label' command in the
		 * sequencer. These are particularly useful during an
		 * interactive rebase that uses the 'merge' command.
		 */
		.ref = "refs/rewritten/",
	},
};

void update_ref_namespace(enum ref_namespace namespace, char *ref)
{
	struct ref_namespace_info *info = &ref_namespace[namespace];
	if (info->ref_updated)
		free(info->ref);
	info->ref = ref;
	info->ref_updated = 1;
}
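
update_ref_namespace() relies on the `ref_updated` flag to tell a compiled-in default string apart from a heap-allocated override, so that only the latter is ever passed to free(). A minimal standalone sketch of that ownership pattern is shown below; `struct ns_info`, `set_ns_ref` and `dup_string` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-in for the two fields the update touches. */
struct ns_info {
	char *ref;		/* a string literal until the first update */
	int ref_updated;	/* non-zero once `ref` is heap-allocated */
};

/* Take ownership of `ref`, releasing a previous override if there was one. */
static void set_ns_ref(struct ns_info *info, char *ref)
{
	assert(info && ref);
	if (info->ref_updated)
		free(info->ref);	/* never reached for the literal default */
	info->ref = ref;
	info->ref_updated = 1;
}

/* Portable strdup replacement so the sketch only needs ISO C. */
static char *dup_string(const char *s)
{
	size_t n = strlen(s) + 1;
	char *copy = malloc(n);

	if (copy)
		memcpy(copy, s, n);
	return copy;
}
```

The caller hands over a heap-allocated string and must not free it afterwards; the structure owns it until the next update replaces it.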

/*
 * Try to read one refname component from the front of refname.
 * Return the length of the component found, or -1 if the component is
 * not legal.  It is legal if it is something reasonable to have under
 * ".git/refs/"; We do not like it if:
 *
 * - it begins with ".", or
 * - it has double dots "..", or
 * - it has ASCII control characters, or
 * - it has ":", "?", "[", "\", "^", "~", SP, or TAB anywhere, or
 * - it has "*" anywhere unless REFNAME_REFSPEC_PATTERN is set, or
 * - it ends with a "/", or
 * - it ends with ".lock", or
 * - it contains a "@{" portion
 *
 * When sanitized is not NULL, instead of rejecting the input refname
 * as an error, try to come up with a usable replacement for the input
 * refname in it.
 */
static int check_refname_component(const char *refname, int *flags,
				   struct strbuf *sanitized)
{
	const char *cp;
	char last = '\0';
	size_t component_start = 0; /* garbage - not a reasonable initial value */

	if (sanitized)
		component_start = sanitized->len;

	for (cp = refname; ; cp++) {
		int ch = *cp & 255;
		unsigned char disp = refname_disposition[ch];

		if (sanitized && disp != 1)
			strbuf_addch(sanitized, ch);

		switch (disp) {
		case 1:
			goto out;
		case 2:
			if (last == '.') { /* Refname contains "..". */
				if (sanitized)
					/* collapse ".." to single "." */
					strbuf_setlen(sanitized, sanitized->len - 1);
				else
					return -1;
			}
			break;
		case 3:
			if (last == '@') { /* Refname contains "@{". */
				if (sanitized)
					sanitized->buf[sanitized->len-1] = '-';
				else
					return -1;
			}
			break;
		case 4:
			/* forbidden char */
			if (sanitized)
				sanitized->buf[sanitized->len-1] = '-';
			else
				return -1;
			break;
		case 5:
			if (!(*flags & REFNAME_REFSPEC_PATTERN)) {
				/* refspec can't be a pattern */
				if (sanitized)
					sanitized->buf[sanitized->len-1] = '-';
				else
					return -1;
			}

			/*
			 * Unset the pattern flag so that we only accept
			 * a single asterisk for one side of refspec.
			 */
			*flags &= ~ REFNAME_REFSPEC_PATTERN;
			break;
		}
		last = ch;
	}
out:
	if (cp == refname)
		return 0; /* Component has zero length. */

	if (refname[0] == '.') { /* Component starts with '.'. */
		if (sanitized)
			sanitized->buf[component_start] = '-';
		else
			return -1;
	}
	if (cp - refname >= LOCK_SUFFIX_LEN &&
	    !memcmp(cp - LOCK_SUFFIX_LEN, LOCK_SUFFIX, LOCK_SUFFIX_LEN)) {
		if (!sanitized)
			return -1;
		/* Refname ends with ".lock". */
		while (strbuf_strip_suffix(sanitized, LOCK_SUFFIX)) {
			/* try again in case we have .lock.lock */
		}
	}
	return cp - refname;
}
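
The rules listed in the comment before check_refname_component() can be exercised outside Git with a reduced, check-only version. The sketch below is an assumption-laden simplification: it has no sanitizing mode, always rejects "*" (there is no REFNAME_REFSPEC_PATTERN handling), and hard-codes the ".lock" suffix. `check_component` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Check-only sketch: returns the component length, 0 for an empty
 * component, or -1 if the component breaks one of the rules.
 */
static int check_component(const char *name)
{
	const char *cp;
	char last = '\0';
	size_t len;

	assert(name);
	/* A component ends at '/' or at the end of the string. */
	for (cp = name; *cp && *cp != '/'; cp++) {
		unsigned char ch = (unsigned char)*cp;

		if (ch < 0x20 || ch == 0x7f || strchr(" :?[\\^~*", ch))
			return -1;	/* control or forbidden character */
		if (ch == '.' && last == '.')
			return -1;	/* contains ".." */
		if (ch == '{' && last == '@')
			return -1;	/* contains "@{" */
		last = *cp;
	}
	len = (size_t)(cp - name);
	if (!len)
		return 0;		/* zero-length component */
	if (name[0] == '.')
		return -1;		/* begins with "." */
	if (len >= 5 && !memcmp(cp - 5, ".lock", 5))
		return -1;		/* ends with ".lock" */
	return (int)len;
}
```

As in the original, the return value is the number of bytes consumed, so a caller can step over the component and the following "/" to reach the next one.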

static int check_or_sanitize_refname(const char *refname, int flags,
				     struct strbuf *sanitized)
{
	int component_len, component_count = 0;

	if (!strcmp(refname, "@")) {
Add new @ shortcut for HEAD

Typing 'HEAD' is tedious, especially when we can use '@' instead.

The reason for choosing '@' is that it follows naturally from the
ref@op syntax (e.g. HEAD@{u}), except we have no ref, and no
operation, and when we don't have those, it makes sense to assume
'HEAD'.

So now we can use 'git show @~1', and all that goody goodness.

Until now '@' was a valid name, but it conflicts with this idea, so
let's make it invalid. Probably very few people, if any, used this name.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-02 08:34:30 +02:00
		/* Refname is a single character '@'. */
		if (sanitized)
			strbuf_addch(sanitized, '-');
		else
			return -1;
	}

	while (1) {
		if (sanitized && sanitized->len)
			strbuf_complete(sanitized, '/');

		/* We are at the start of a path component. */
		component_len = check_refname_component(refname, &flags,
							sanitized);
		if (sanitized && component_len == 0)
			; /* OK, omit empty component */
		else if (component_len <= 0)
			return -1;

		component_count++;
		if (refname[component_len] == '\0')
			break;
		/* Skip to next component. */
		refname += component_len + 1;
	}

	if (refname[component_len - 1] == '.') {
		/* Refname ends with '.'. */
		if (sanitized)
			; /* omit ending dot */
		else
			return -1;
	}
	if (!(flags & REFNAME_ALLOW_ONELEVEL) && component_count < 2)
		return -1; /* Refname has only one component. */
	return 0;
}
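
Beyond the per-character rules, check_or_sanitize_refname() enforces a few structural ones: the name must not be "@", must not contain empty components, must not end in ".", and needs at least two components unless one-level names are allowed. The following sketch isolates only those structural checks. `check_structure` and `ALLOW_ONELEVEL` are hypothetical stand-ins, and per-character validation is deliberately left out.

```c
#include <assert.h>
#include <string.h>

#define ALLOW_ONELEVEL 1	/* hypothetical stand-in for REFNAME_ALLOW_ONELEVEL */

/*
 * Structural checks only; returns 0 if the name is acceptable,
 * -1 otherwise.
 */
static int check_structure(const char *refname, int flags)
{
	int component_count = 0;
	size_t len;

	assert(refname);
	if (!strcmp(refname, "@"))
		return -1;		/* "@" alone is reserved for HEAD */
	len = strlen(refname);
	if (len && refname[len - 1] == '.')
		return -1;		/* refname ends with '.' */

	for (;;) {
		size_t component_len = strcspn(refname, "/");

		if (!component_len)
			return -1;	/* empty component: "", "a//b", "/a", "a/" */
		component_count++;
		if (refname[component_len] == '\0')
			break;
		refname += component_len + 1;	/* skip past the '/' */
	}
	if (!(flags & ALLOW_ONELEVEL) && component_count < 2)
		return -1;		/* only one component */
	return 0;
}
```

With these rules, "refs/heads/main" passes, while "main" passes only when the one-level flag is set.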

int check_refname_format(const char *refname, int flags)
{
	return check_or_sanitize_refname(refname, flags, NULL);
}

void sanitize_refname_component(const char *refname, struct strbuf *out)
{
	if (check_or_sanitize_refname(refname, REFNAME_ALLOW_ONELEVEL, out))
		BUG("sanitizing refname '%s' check returned error", refname);
}

int refname_is_safe(const char *refname)
refs.c: allow listing and deleting badly named refs

We currently do not handle badly named refs well:

  $ cp .git/refs/heads/master .git/refs/heads/master.....@\*@\\.
  $ git branch
    fatal: Reference has invalid format: 'refs/heads/master.....@*@\.'
  $ git branch -D master.....@\*@\\.
    error: branch 'master.....@*@\.' not found.

Users cannot recover from a badly named ref without manually finding
and deleting the loose ref file or appropriate line in packed-refs.
Making that easier will make it easier to tweak the ref naming rules
in the future, for example to forbid shell metacharacters like '`'
and '"', without putting people in a state that is hard to get out of.

So allow "branch --list" to show these refs and allow "branch -d/-D"
and "update-ref -d" to delete them. Other commands (for example to
rename refs) will continue to not handle these refs but can be changed
in later patches.

Details:

In resolving functions, refuse to resolve refs that don't pass the
git-check-ref-format(1) check unless the new RESOLVE_REF_ALLOW_BAD_NAME
flag is passed. Even with RESOLVE_REF_ALLOW_BAD_NAME, refuse to
resolve refs that escape the refs/ directory and do not match the
pattern [A-Z_]* (think "HEAD" and "MERGE_HEAD").

In locking functions, refuse to act on badly named refs unless they
are being deleted and either are in the refs/ directory or match [A-Z_]*.

Just like other invalid refs, flag resolved, badly named refs with the
REF_ISBROKEN flag, treat them as resolving to null_sha1, and skip them
in all iteration functions except for for_each_rawref.

Flag badly named refs (but not symrefs pointing to badly named refs)
with a REF_BAD_NAME flag to make it easier for future callers to
notice and handle them specially. For example, in a later patch
for-each-ref will use this flag to detect refs whose names can confuse
callers parsing for-each-ref output.

In the transaction API, refuse to create or update badly named refs,
but allow deleting them (unless they try to escape refs/ and don't match
[A-Z_]*).

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-03 20:45:43 +02:00
{
	const char *rest;

	if (skip_prefix(refname, "refs/", &rest)) {
		char *buf;
		int result;
		size_t restlen = strlen(rest);

		/* rest must not be empty, or start or end with "/" */
		if (!restlen || *rest == '/' || rest[restlen - 1] == '/')
			return 0;

		/*
		 * Does the refname try to escape refs/?
		 * For example: refs/foo/../bar is safe but refs/foo/../../bar
		 * is not.
		 */
		buf = xmallocz(restlen);
		result = !normalize_path_copy(buf, rest) && !strcmp(buf, rest);
		free(buf);
		return result;
	}

	do {
|
|
|
if (!isupper(*refname) && *refname != '_')
|
|
|
|
return 0;
|
|
|
|
refname++;
|
2016-04-27 12:42:27 +02:00
|
|
|
} while (*refname);
|
2014-09-03 20:45:43 +02:00
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
|
2017-06-23 09:01:37 +02:00
|
|
|
/*
|
|
|
|
* Return true if refname, which has the specified oid and flags, can
|
|
|
|
* be resolved to an object in the database. If the referred-to object
|
|
|
|
* does not exist, emit a warning and return false.
|
|
|
|
*/
|
|
|
|
int ref_resolves_to_object(const char *refname,
|
2021-10-08 23:08:15 +02:00
|
|
|
struct repository *repo,
|
2017-06-23 09:01:37 +02:00
|
|
|
const struct object_id *oid,
|
|
|
|
unsigned int flags)
|
|
|
|
{
|
|
|
|
if (flags & REF_ISBROKEN)
|
|
|
|
return 0;
|
2021-10-08 23:08:15 +02:00
|
|
|
if (!repo_has_object_file(repo, oid)) {
|
2018-07-21 09:49:35 +02:00
|
|
|
error(_("%s does not point to a valid object!"), refname);
|
2017-06-23 09:01:37 +02:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
char *refs_resolve_refdup(struct ref_store *refs,
|
|
|
|
const char *refname, int resolve_flags,
|
refs: convert resolve_refdup and refs_resolve_refdup to struct object_id
All of the callers already pass the hash member of struct object_id, so
update them to pass a pointer to the struct directly.
This transformation was done with an update to declaration and
definition and the following semantic patch:
@@
expression E1, E2, E3, E4;
@@
- resolve_refdup(E1, E2, E3.hash, E4)
+ resolve_refdup(E1, E2, &E3, E4)
@@
expression E1, E2, E3, E4;
@@
- resolve_refdup(E1, E2, E3->hash, E4)
+ resolve_refdup(E1, E2, E3, E4)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 00:06:55 +02:00
|
|
|
struct object_id *oid, int *flags)
|
2017-03-26 04:42:34 +02:00
|
|
|
{
|
|
|
|
const char *result;
|
|
|
|
|
|
|
|
result = refs_resolve_ref_unsafe(refs, refname, resolve_flags,
|
2022-01-26 15:37:01 +01:00
|
|
|
oid, flags);
|
2017-03-26 04:42:34 +02:00
|
|
|
return xstrdup_or_null(result);
|
|
|
|
}
|
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
char *resolve_refdup(const char *refname, int resolve_flags,
|
2017-10-16 00:06:55 +02:00
|
|
|
struct object_id *oid, int *flags)
|
Start handling references internally as a sorted in-memory list
This also adds some very rudimentary support for the notion of packed
refs. HOWEVER! At this point it isn't used to actually look up a ref
yet, only for listing them (ie "for_each_ref()" and friends see the
packed refs, but none of the other single-ref lookup routines).
Note how we keep two separate lists: one for the loose refs, and one for
the packed refs we read. That's so that we can easily keep the two apart,
and read only one set or the other (and still always make sure that the
loose refs take precedence).
[ From this, it's not actually obvious why we'd keep the two separate
lists, but it's important to have the packed refs on their own list
later on, when I add support for looking up a single loose one.
For that case, we will want to read _just_ the packed refs in case the
single-ref lookup fails, yet we may end up needing the other list at
some point in the future, so keeping them separated is important ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-12 01:37:32 +02:00
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_resolve_refdup(get_main_ref_store(the_repository),
|
2017-03-26 04:42:34 +02:00
|
|
|
refname, resolve_flags,
|
2017-10-16 00:06:55 +02:00
|
|
|
oid, flags);
|
2011-12-12 06:38:22 +01:00
|
|
|
}
|
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
/* The argument to filter_refs */
|
|
|
|
struct ref_filter {
|
|
|
|
const char *pattern;
|
2018-11-12 14:25:44 +01:00
|
|
|
const char *prefix;
|
2015-11-09 14:34:01 +01:00
|
|
|
each_ref_fn *fn;
|
|
|
|
void *cb_data;
|
|
|
|
};
|
2012-04-10 07:30:26 +02:00
|
|
|
|
2021-10-16 11:39:14 +02:00
|
|
|
int read_ref_full(const char *refname, int resolve_flags, struct object_id *oid, int *flags)
|
2012-04-10 07:30:21 +02:00
|
|
|
{
|
2021-10-16 11:39:14 +02:00
|
|
|
struct ref_store *refs = get_main_ref_store(the_repository);
|
|
|
|
|
2021-10-16 11:39:27 +02:00
|
|
|
if (refs_resolve_ref_unsafe(refs, refname, resolve_flags,
|
2022-01-26 15:37:01 +01:00
|
|
|
oid, flags))
|
2015-11-09 14:34:01 +01:00
|
|
|
return 0;
|
|
|
|
return -1;
|
2012-04-10 07:30:21 +02:00
|
|
|
}
|
|
|
|
|
2017-10-16 00:06:56 +02:00
|
|
|
int read_ref(const char *refname, struct object_id *oid)
|
2011-12-12 06:38:22 +01:00
|
|
|
{
|
2017-10-16 00:06:56 +02:00
|
|
|
return read_ref_full(refname, RESOLVE_REF_READING, oid, NULL);
|
2007-04-17 03:42:50 +02:00
|
|
|
}
|
|
|
|
|
2020-08-21 18:59:34 +02:00
|
|
|
int refs_ref_exists(struct ref_store *refs, const char *refname)
|
2019-04-06 13:34:24 +02:00
|
|
|
{
|
2021-10-16 11:39:27 +02:00
|
|
|
return !!refs_resolve_ref_unsafe(refs, refname, RESOLVE_REF_READING,
|
2022-01-26 15:37:01 +01:00
|
|
|
NULL, NULL);
|
2019-04-06 13:34:24 +02:00
|
|
|
}
|
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
int ref_exists(const char *refname)
|
2012-04-10 07:30:13 +02:00
|
|
|
{
|
2019-04-06 13:34:24 +02:00
|
|
|
return refs_ref_exists(get_main_ref_store(the_repository), refname);
|
2012-04-10 07:30:13 +02:00
|
|
|
}
|
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
static int filter_refs(const char *refname, const struct object_id *oid,
|
|
|
|
int flags, void *data)
|
2012-04-10 07:30:26 +02:00
|
|
|
{
|
2015-11-09 14:34:01 +01:00
|
|
|
struct ref_filter *filter = (struct ref_filter *)data;
|
|
|
|
|
2017-06-22 23:38:08 +02:00
|
|
|
if (wildmatch(filter->pattern, refname, 0))
|
2015-11-09 14:34:01 +01:00
|
|
|
return 0;
|
2018-11-12 14:25:44 +01:00
|
|
|
if (filter->prefix)
|
|
|
|
skip_prefix(refname, filter->prefix, &refname);
|
2015-11-09 14:34:01 +01:00
|
|
|
return filter->fn(refname, oid, flags, filter->cb_data);
|
2012-04-10 07:30:26 +02:00
|
|
|
}
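The filtering callback above can be illustrated outside of Git roughly as follows. This is only a sketch under stated assumptions: POSIX `fnmatch` stands in for Git's `wildmatch` (both return 0 on a match), and `name_fn`, `struct name_filter`, `filter_name` and `remember_name` are hypothetical names that do not exist in refs.c.

```c
#include <assert.h>
#include <fnmatch.h>
#include <string.h>

/* Hypothetical callback type, loosely modelled on each_ref_fn. */
typedef int (*name_fn)(const char *name, void *cb_data);

/* Hypothetical counterpart of struct ref_filter. */
struct name_filter {
	const char *pattern;
	const char *prefix;	/* may be NULL */
	name_fn fn;
	void *cb_data;
};

/* Skip names that do not match; strip the prefix; call the wrapped callback. */
static int filter_name(const char *name, void *data)
{
	struct name_filter *f = data;

	assert(f && f->fn);
	if (fnmatch(f->pattern, name, 0) != 0)
		return 0;	/* no match: keep iterating */
	if (f->prefix) {
		size_t n = strlen(f->prefix);
		if (!strncmp(name, f->prefix, n))
			name += n;
	}
	return f->fn(name, f->cb_data);
}

/* Example callback: copy the name it receives into a 64-byte buffer. */
static int remember_name(const char *name, void *cb_data)
{
	char *out = cb_data;

	strncpy(out, name, 63);
	out[63] = '\0';
	return 1;
}
```

Returning 0 for a non-matching name mirrors the original: a zero return tells the iteration to continue, so filtered-out refs are silently skipped rather than treated as errors.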
|
|
|
|
|
2017-10-16 00:07:10 +02:00
|
|
|
enum peel_status peel_object(const struct object_id *name, struct object_id *oid)
|
2007-04-17 03:42:50 +02:00
|
|
|
{
|
2021-04-13 09:16:36 +02:00
|
|
|
struct object *o = lookup_unknown_object(the_repository, name);
|
2007-04-17 03:42:50 +02:00
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
if (o->type == OBJ_NONE) {
|
2018-04-25 20:20:59 +02:00
|
|
|
int type = oid_object_info(the_repository, name, NULL);
|
2020-06-17 11:14:08 +02:00
|
|
|
if (type < 0 || !object_as_type(o, type, 0))
|
2015-11-09 14:34:01 +01:00
|
|
|
return PEEL_INVALID;
|
|
|
|
}
|
2012-04-10 07:30:13 +02:00
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
if (o->type != OBJ_TAG)
|
|
|
|
return PEEL_NON_TAG;
|
2012-05-22 23:03:29 +02:00
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
o = deref_tag_noverify(o);
|
|
|
|
if (!o)
|
|
|
|
return PEEL_INVALID;
|
|
|
|
|
2017-10-16 00:07:10 +02:00
|
|
|
oidcpy(oid, &o->oid);
|
2015-11-09 14:34:01 +01:00
|
|
|
return PEEL_PEELED;
|
2012-05-22 23:03:29 +02:00
|
|
|
}
|
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
struct warn_if_dangling_data {
|
|
|
|
FILE *fp;
|
|
|
|
const char *refname;
|
|
|
|
const struct string_list *refnames;
|
|
|
|
const char *msg_fmt;
|
|
|
|
};
|
2012-04-10 07:30:13 +02:00
|
|
|
|
2022-08-19 12:08:32 +02:00
|
|
|
static int warn_if_dangling_symref(const char *refname,
|
2022-08-25 19:09:48 +02:00
|
|
|
const struct object_id *oid UNUSED,
|
2015-11-09 14:34:01 +01:00
|
|
|
int flags, void *cb_data)
|
|
|
|
{
|
|
|
|
struct warn_if_dangling_data *d = cb_data;
|
|
|
|
const char *resolves_to;
|
2012-04-10 07:30:13 +02:00
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
if (!(flags & REF_ISSYMREF))
|
|
|
|
return 0;
|
2012-04-10 07:30:13 +02:00
|
|
|
|
2017-09-23 11:45:04 +02:00
|
|
|
resolves_to = resolve_ref_unsafe(refname, 0, NULL, NULL);
|
2015-11-09 14:34:01 +01:00
|
|
|
if (!resolves_to
|
|
|
|
|| (d->refname
|
|
|
|
? strcmp(resolves_to, d->refname)
|
|
|
|
: !string_list_has_string(d->refnames, resolves_to))) {
|
|
|
|
return 0;
|
|
|
|
}
|
2012-04-10 07:30:13 +02:00
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
fprintf(d->fp, d->msg_fmt, refname);
|
|
|
|
fputc('\n', d->fp);
|
|
|
|
return 0;
|
2012-04-10 07:30:13 +02:00
|
|
|
}
|
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
void warn_dangling_symref(FILE *fp, const char *msg_fmt, const char *refname)
|
2012-04-25 00:45:11 +02:00
|
|
|
{
|
2015-11-09 14:34:01 +01:00
|
|
|
struct warn_if_dangling_data data;
|
|
|
|
|
|
|
|
data.fp = fp;
|
|
|
|
data.refname = refname;
|
|
|
|
data.refnames = NULL;
|
|
|
|
data.msg_fmt = msg_fmt;
|
|
|
|
for_each_rawref(warn_if_dangling_symref, &data);
|
2012-04-25 00:45:11 +02:00
|
|
|
}
|
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
void warn_dangling_symrefs(FILE *fp, const char *msg_fmt, const struct string_list *refnames)
|
2012-04-10 07:30:26 +02:00
|
|
|
{
|
2015-11-09 14:34:01 +01:00
|
|
|
struct warn_if_dangling_data data;
|
2012-04-10 07:30:26 +02:00
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
data.fp = fp;
|
|
|
|
data.refname = NULL;
|
|
|
|
data.refnames = refnames;
|
|
|
|
data.msg_fmt = msg_fmt;
|
|
|
|
for_each_rawref(warn_if_dangling_symref, &data);
|
2012-04-10 07:30:26 +02:00
|
|
|
}
|
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
int refs_for_each_tag_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
|
|
|
|
{
|
|
|
|
return refs_for_each_ref_in(refs, "refs/tags/", fn, cb_data);
|
|
|
|
}
|
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
int for_each_tag_ref(each_ref_fn fn, void *cb_data)
|
2012-04-10 07:30:26 +02:00
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_for_each_tag_ref(get_main_ref_store(the_repository), fn, cb_data);
|
2012-04-10 07:30:26 +02:00
|
|
|
}
|
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
int refs_for_each_branch_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
|
|
|
|
{
|
|
|
|
return refs_for_each_ref_in(refs, "refs/heads/", fn, cb_data);
|
2012-04-10 07:30:26 +02:00
|
|
|
}
|
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
int for_each_branch_ref(each_ref_fn fn, void *cb_data)
|
2012-04-10 07:30:26 +02:00
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_for_each_branch_ref(get_main_ref_store(the_repository), fn, cb_data);
|
2012-04-10 07:30:26 +02:00
|
|
|
}
|
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
int refs_for_each_remote_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
|
|
|
|
{
|
|
|
|
return refs_for_each_ref_in(refs, "refs/remotes/", fn, cb_data);
|
2011-12-12 06:38:15 +01:00
|
|
|
}
|
2012-04-10 07:30:26 +02:00
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
int for_each_remote_ref(each_ref_fn fn, void *cb_data)
|
2011-09-30 00:11:42 +02:00
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_for_each_remote_ref(get_main_ref_store(the_repository), fn, cb_data);
|
2011-12-12 06:38:15 +01:00
|
|
|
}
|
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
int head_ref_namespaced(each_ref_fn fn, void *cb_data)
|
|
|
|
{
|
|
|
|
struct strbuf buf = STRBUF_INIT;
|
|
|
|
int ret = 0;
|
|
|
|
struct object_id oid;
|
|
|
|
int flag;
|
2007-04-17 03:42:50 +02:00
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
strbuf_addf(&buf, "%sHEAD", get_git_namespace());
|
2017-10-16 00:06:56 +02:00
|
|
|
if (!read_ref_full(buf.buf, RESOLVE_REF_READING, &oid, &flag))
|
2015-11-09 14:34:01 +01:00
|
|
|
ret = fn(buf.buf, &oid, flag, cb_data);
|
|
|
|
strbuf_release(&buf);
|
2007-04-17 03:42:50 +02:00
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
return ret;
|
2011-09-30 00:11:42 +02:00
|
|
|
}
|
2007-04-17 03:42:50 +02:00
|
|
|
|
log: add option to choose which refs to decorate
When `log --decorate` is used, git will decorate commits with all
available refs. While in most cases this may give the desired effect,
under some conditions it can lead to excessively verbose output.
Introduce two command line options, `--decorate-refs=<pattern>` and
`--decorate-refs-exclude=<pattern>` to allow the user to select which
refs are used in decoration.
When "--decorate-refs=<pattern>" is given, only the refs that match the
pattern are used in decoration. The refs that match the pattern when
"--decorate-refs-exclude=<pattern>" is given, are never used in
decoration.
These options follow the same convention for mixing negative and
positive patterns across the system, assuming that the inclusive default
is to match all refs available.
(1) if there is no positive pattern given, pretend as if an
inclusive default positive pattern was given;
(2) for each candidate, reject it if it matches no positive
pattern, or if it matches any one of the negative patterns.
The rules for what is considered a match are slightly different from the
rules used elsewhere.
Commands like `log --glob` assume a trailing '/*' when glob chars are
not present in the pattern. This makes it difficult to specify a single
ref. On the other hand, commands like `describe --match --all` allow
specifying exact refs, but do not have the convenience of allowing
"shorthand refs" like 'refs/heads' or 'heads' to refer to
'refs/heads/*'.
The options introduced in this patch consider a match if:
(a) the pattern contains glob chars,
and regular pattern matching returns a match.
(b) the pattern does not contain glob chars,
and ref '<pattern>' exists, or if ref exists under '<pattern>/'
This allows both behaviours (allowing single refs and shorthand refs)
while remaining compatible with existing commands.
Helped-by: Kevin Daudt <me@ikke.info>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rafael Ascensão <rafa.almas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-21 22:33:41 +01:00
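Rules (a) and (b) from the commit message above can be sketched as a small standalone predicate. This is an illustration only, assuming the pattern has already been normalized (for example "heads" turned into "refs/heads"), with POSIX `fnmatch` standing in for Git's own glob matcher; the function name `decoration_pattern_matches` is hypothetical.

```c
#include <assert.h>
#include <fnmatch.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if refname is selected by pattern.
 *  (a) pattern has glob chars: ordinary glob matching decides;
 *  (b) otherwise: refname equals pattern, or lives under "<pattern>/".
 */
static int decoration_pattern_matches(const char *pattern, const char *refname)
{
	size_t n;

	assert(pattern && refname);
	if (strpbrk(pattern, "?*["))
		return fnmatch(pattern, refname, 0) == 0;	/* rule (a) */

	n = strlen(pattern);					/* rule (b) */
	if (strncmp(refname, pattern, n) != 0)
		return 0;
	return refname[n] == '\0' || refname[n] == '/';
}
```

The check on the character following the prefix is what keeps "refs/heads/ma" from selecting "refs/heads/main" while still letting "refs/heads" select every branch.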
|
|
|
void normalize_glob_ref(struct string_list_item *item, const char *prefix,
|
|
|
|
const char *pattern)
|
|
|
|
{
|
|
|
|
struct strbuf normalized_pattern = STRBUF_INIT;
|
|
|
|
|
|
|
|
if (*pattern == '/')
|
|
|
|
BUG("pattern must not start with '/'");
|
|
|
|
|
2022-08-05 19:58:33 +02:00
|
|
|
if (prefix)
|
2017-11-21 22:33:41 +01:00
|
|
|
strbuf_addstr(&normalized_pattern, prefix);
|
2022-08-05 19:58:33 +02:00
|
|
|
else if (!starts_with(pattern, "refs/") &&
|
|
|
|
strcmp(pattern, "HEAD"))
|
2017-11-21 22:33:41 +01:00
|
|
|
strbuf_addstr(&normalized_pattern, "refs/");
|
2022-08-05 19:58:33 +02:00
|
|
|
/*
|
|
|
|
* NEEDSWORK: Special case other symrefs such as REBASE_HEAD,
|
|
|
|
* MERGE_HEAD, etc.
|
|
|
|
*/
|
|
|
|
|
2017-11-21 22:33:41 +01:00
|
|
|
strbuf_addstr(&normalized_pattern, pattern);
|
|
|
|
strbuf_strip_suffix(&normalized_pattern, "/");
|
|
|
|
|
|
|
|
item->string = strbuf_detach(&normalized_pattern, NULL);
|
|
|
|
item->util = has_glob_specials(pattern) ? NULL : item->string;
|
|
|
|
strbuf_release(&normalized_pattern);
|
|
|
|
}
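The normalization performed above can be mimicked with plain C strings. The following is a minimal sketch, not Git's implementation: `normalize_pattern` is a hypothetical name, it returns a malloc'd string instead of filling a string_list_item, and it asserts where the original calls BUG().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: prepend prefix (or "refs/" when there is no prefix
 * and the pattern is neither under refs/ nor exactly "HEAD"), then strip
 * one trailing '/'. The caller frees the result; NULL means out of memory.
 */
static char *normalize_pattern(const char *prefix, const char *pattern)
{
	const char *lead = "";
	size_t len;
	char *out;

	assert(pattern && *pattern != '/');
	if (prefix)
		lead = prefix;
	else if (strncmp(pattern, "refs/", 5) != 0 &&
		 strcmp(pattern, "HEAD") != 0)
		lead = "refs/";

	len = strlen(lead) + strlen(pattern);
	out = malloc(len + 1);
	if (!out)
		return NULL;
	strcpy(out, lead);
	strcat(out, pattern);
	if (len > 0 && out[len - 1] == '/')
		out[len - 1] = '\0';	/* drop a single trailing slash */
	return out;
}
```

Under these assumptions, "heads/" becomes "refs/heads", while "HEAD" and patterns that already start with "refs/" pass through unchanged.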
|
|
|
|
|
2015-11-09 14:34:01 +01:00
|
|
|
int for_each_glob_ref_in(each_ref_fn fn, const char *pattern,
|
|
|
|
const char *prefix, void *cb_data)
|
2013-04-22 21:52:18 +02:00
|
|
|
{
|
2015-11-09 14:34:01 +01:00
|
|
|
struct strbuf real_pattern = STRBUF_INIT;
|
|
|
|
struct ref_filter filter;
|
|
|
|
int ret;
|
2010-01-20 10:48:25 +01:00
|
|
|
|
2013-11-30 21:55:40 +01:00
|
|
|
if (!prefix && !starts_with(pattern, "refs/"))
|
2010-01-20 10:48:25 +01:00
|
|
|
strbuf_addstr(&real_pattern, "refs/");
|
2010-01-20 10:48:26 +01:00
|
|
|
else if (prefix)
|
|
|
|
strbuf_addstr(&real_pattern, prefix);
|
2010-01-20 10:48:25 +01:00
|
|
|
strbuf_addstr(&real_pattern, pattern);
|
|
|
|
|
2010-03-12 18:04:26 +01:00
|
|
|
if (!has_glob_specials(pattern)) {
|
2010-02-04 06:23:18 +01:00
|
|
|
/* Append implied '/' '*' if not present. */
|
use strbuf_complete to conditionally append slash
When working with paths in strbufs, we frequently want to
ensure that a directory contains a trailing slash before
appending to it. We can shorten this code (and make the
intent more obvious) by calling strbuf_complete.
Most of these cases are trivially identical conversions, but
there are two things to note:
- in a few cases we did not check that the strbuf is
non-empty (which would lead to an out-of-bounds memory
access). These were generally not triggerable in
practice, either from earlier assertions, or typically
because we would have just fed the strbuf to opendir(),
which would choke on an empty path.
- in a few cases we indexed the buffer with "original_len"
or similar, rather than the current sb->len, and it is
not immediately obvious from the diff that they are the
same. In all of these cases, I manually verified that
the strbuf does not change between the assignment and
the strbuf_complete call.
This does not convert cases which look like:
if (sb->len && !is_dir_sep(sb->buf[sb->len - 1]))
strbuf_addch(sb, '/');
as those are obviously semantically different. Some of these
cases arguably should be doing that, but that is out of
scope for this change, which aims purely for cleanup with no
behavior change (and at least it will make such sites easier
to find and examine in the future, as we can grep for
strbuf_complete).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-24 23:08:35 +02:00
|
|
|
strbuf_complete(&real_pattern, '/');
|
2010-01-20 10:48:25 +01:00
|
|
|
/* No need to check for '*', there is none. */
|
|
|
|
strbuf_addch(&real_pattern, '*');
|
|
|
|
}
|
|
|
|
|
|
|
|
filter.pattern = real_pattern.buf;
|
2018-11-12 14:25:44 +01:00
|
|
|
filter.prefix = prefix;
|
2010-01-20 10:48:25 +01:00
|
|
|
filter.fn = fn;
|
|
|
|
filter.cb_data = cb_data;
|
|
|
|
ret = for_each_ref(filter_refs, &filter);
|
|
|
|
|
|
|
|
strbuf_release(&real_pattern);
|
|
|
|
return ret;
|
|
|
|
}
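How the function above assembles the pattern it hands to the filter can be sketched with a fixed-size buffer. This is an illustration under assumptions: `build_glob_pattern` is a hypothetical name, and glob detection is approximated with `strpbrk(pattern, "?*[")`, which is assumed to correspond to what `has_glob_specials` checks.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "<prefix or refs/><pattern>" into out and,
 * when pattern has no glob characters, append the implied "/" and "*".
 * Returns 0 on success, -1 if out is too small.
 */
static int build_glob_pattern(char *out, size_t outsz,
			      const char *pattern, const char *prefix)
{
	const char *lead = "";
	const char *tail;
	int n;

	assert(out && pattern);
	if (prefix)
		lead = prefix;
	else if (strncmp(pattern, "refs/", 5) != 0)
		lead = "refs/";

	n = snprintf(out, outsz, "%s%s", lead, pattern);
	if (n < 0 || (size_t)n >= outsz)
		return -1;

	if (!strpbrk(pattern, "?*[")) {
		/* add '/' only if the string is non-empty and lacks one */
		tail = (n == 0 || out[n - 1] == '/') ? "*" : "/*";
		if ((size_t)n + strlen(tail) >= outsz)
			return -1;
		strcat(out, tail);
	}
	return 0;
}
```

With these rules a bare "heads" expands to "refs/heads/*", whereas a pattern that already contains a glob character is only prefixed and otherwise left alone.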
|
|
|
|
|
2010-01-20 10:48:26 +01:00
|
|
|
int for_each_glob_ref(each_ref_fn fn, const char *pattern, void *cb_data)
|
|
|
|
{
|
|
|
|
return for_each_glob_ref_in(fn, pattern, NULL, cb_data);
|
|
|
|
}
|
|
|
|
|
2009-05-13 23:22:04 +02:00
|
|
|
const char *prettify_refname(const char *name)
|
2009-03-09 02:06:05 +01:00
|
|
|
{
|
2017-03-23 16:50:12 +01:00
|
|
|
if (skip_prefix(name, "refs/heads/", &name) ||
|
|
|
|
skip_prefix(name, "refs/tags/", &name) ||
|
|
|
|
skip_prefix(name, "refs/remotes/", &name))
|
|
|
|
; /* nothing */
|
|
|
|
return name;
|
2009-03-09 02:06:05 +01:00
|
|
|
}
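The prefix stripping above is easy to reproduce without Git's `skip_prefix`. A minimal sketch, using a hypothetical function name `short_refname` and plain `strncmp`:

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: return a pointer past the first of "refs/heads/",
 * "refs/tags/" or "refs/remotes/" that name starts with, or name itself
 * if none applies. The result points into the caller's string.
 */
static const char *short_refname(const char *name)
{
	static const char *const prefixes[] = {
		"refs/heads/", "refs/tags/", "refs/remotes/"
	};
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++) {
		size_t n = strlen(prefixes[i]);
		if (!strncmp(name, prefixes[i], n))
			return name + n;
	}
	return name;
}
```

Because the return value aliases the input, no allocation is needed, but the caller must keep the original string alive for as long as the shortened name is used.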
|
|
|
|
|
2014-01-14 04:16:07 +01:00
|
|
|
static const char *ref_rev_parse_rules[] = {
|
add refname_match()
We use at least two rulesets for matching abbreviated refnames with
full refnames (starting with 'refs/'). git-rev-parse and git-fetch
use slightly different rules.
This commit introduces a new function refname_match
(const char *abbrev_name, const char *full_name, const char **rules).
abbrev_name is expanded using the rules and matched against full_name.
If a match is found the function returns true. rules is a NULL-terminated
list of format patterns with "%.*s", for example:
const char *ref_rev_parse_rules[] = {
"%.*s",
"refs/%.*s",
"refs/tags/%.*s",
"refs/heads/%.*s",
"refs/remotes/%.*s",
"refs/remotes/%.*s/HEAD",
NULL
};
Asterisks are included in the format strings because this is the form
required in sha1_name.c. Sharing the list with the functions there is
a good idea to avoid duplicating the rules. Hopefully this
facilitates unified matching rules in the future.
This commit makes the rules used by rev-parse for resolving refs to
sha1s available for string comparison. Before this change, the rules
were buried in get_sha1*() and dwim_ref().
A follow-up commit will refactor the rules used by fetch.
refname_match() will be used for matching refspecs in git-send-pack.
Thanks to Daniel Barkalow <barkalow@iabervon.org> for pointing
out that ref_matches_abbrev in remote.c solves a similar problem
and care should be taken to avoid confusion.
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-11 15:01:46 +01:00
|
|
|
"%.*s",
|
|
|
|
"refs/%.*s",
|
|
|
|
"refs/tags/%.*s",
|
|
|
|
"refs/heads/%.*s",
|
|
|
|
"refs/remotes/%.*s",
|
|
|
|
"refs/remotes/%.*s/HEAD",
|
|
|
|
NULL
|
|
|
|
};
|
|
|
|
|
remote: make refspec follow the same disambiguation rule as local refs
When matching a non-wildcard LHS of a refspec against a list of
refs, find_ref_by_name_abbrev() returns the first ref that matches
using any DWIM rules used by refname_match() in refs.c, even if a
better match occurs later in the list of refs.
This causes unexpected behavior when (for example) fetching using
the refspec "refs/heads/s:<something>" from a remote with both
"refs/heads/refs/heads/s" and "refs/heads/s"; even if the former was
inadvertently created, one would still expect the latter to be
fetched. Similarly, when both a tag T and a branch T exist,
fetching T should favor the tag, just like how local refname
disambiguation rule works. But because the code walks over
ls-remote output from the remote, which happens to be sorted in
alphabetical order and has refs/heads/T before refs/tags/T, a
request to fetch T is (mis)interpreted as fetching refs/heads/T.
Update refname_match(), all of whose current callers care only if it
returns non-zero (i.e. matches) to see if an abbreviated name can
mean the full name being tested, so that it returns a positive
integer whose magnitude can be used to tell the precedence, and fix
the find_ref_by_name_abbrev() function not to stop at the first
match but find the match with the highest precedence.
This is based on an earlier work, which special cased only the exact
matches, by Jonathan Tan.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-01 18:22:37 +02:00
|
|
|
#define NUM_REV_PARSE_RULES (ARRAY_SIZE(ref_rev_parse_rules) - 1)
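The commit message above describes returning a positive value whose magnitude tells the precedence of the rule that matched. One way this could look, as a standalone sketch: the names `rules`, `NUM_RULES` and `match_precedence` are hypothetical, a stack buffer with `snprintf` replaces Git's path helpers, and the concrete numbers returned are a property of this sketch only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same rule order as the table above; earlier rules win. */
static const char *const rules[] = {
	"%.*s",
	"refs/%.*s",
	"refs/tags/%.*s",
	"refs/heads/%.*s",
	"refs/remotes/%.*s",
	"refs/remotes/%.*s/HEAD",
	NULL
};
#define NUM_RULES (sizeof(rules) / sizeof(rules[0]) - 1)

/*
 * Hypothetical helper: expand abbrev with each rule and compare against
 * full. Returns 0 if no rule matches, otherwise a positive number that
 * is larger for rules appearing earlier in the table.
 */
static int match_precedence(const char *abbrev, const char *full)
{
	char buf[256];
	int len;
	size_t i;

	assert(abbrev && full);
	len = (int)strlen(abbrev);
	for (i = 0; rules[i]; i++) {
		int n = snprintf(buf, sizeof(buf), rules[i], len, abbrev);
		if (n < 0 || (size_t)n >= sizeof(buf))
			continue;	/* expansion did not fit; skip rule */
		if (!strcmp(full, buf))
			return (int)(NUM_RULES - i);
	}
	return 0;
}
```

A caller that scans a list of candidate refs can keep the candidate with the highest return value, which makes a tag named T win over a branch named T regardless of the order in which the refs are listed.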
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Is it possible that the caller meant full_name with abbrev_name?
|
|
|
|
* If so return a non-zero value to signal "yes"; the magnitude of
|
|
|
|
* the returned value gives the precedence used for disambiguation.
|
|
|
|
*
|
|
|
|
* If abbrev_name cannot mean full_name, return 0.
|
|
|
|
*/
|
2014-01-14 04:16:07 +01:00
|
|
|
int refname_match(const char *abbrev_name, const char *full_name)
|
add refname_match()
We use at least two rulesets for matching abbreviated refnames with
full refnames (starting with 'refs/'). git-rev-parse and git-fetch
use slightly different rules.
This commit introduces a new function refname_match
(const char *abbrev_name, const char *full_name, const char **rules).
abbrev_name is expanded using the rules and matched against full_name.
If a match is found the function returns true. rules is a NULL-terminated
list of format patterns with "%.*s", for example:
const char *ref_rev_parse_rules[] = {
"%.*s",
"refs/%.*s",
"refs/tags/%.*s",
"refs/heads/%.*s",
"refs/remotes/%.*s",
"refs/remotes/%.*s/HEAD",
NULL
};
Asterisks are included in the format strings because this is the form
required in sha1_name.c. Sharing the list with the functions there is
a good idea to avoid duplicating the rules. Hopefully this
facilitates unified matching rules in the future.
This commit makes the rules used by rev-parse for resolving refs to
sha1s available for string comparison. Before this change, the rules
were buried in get_sha1*() and dwim_ref().
A follow-up commit will refactor the rules used by fetch.
refname_match() will be used for matching refspecs in git-send-pack.
Thanks to Daniel Barkalow <barkalow@iabervon.org> for pointing
out that ref_matches_abbrev in remote.c solves a similar problem
and care should be taken to avoid confusion.
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-11 15:01:46 +01:00
|
|
|
{
|
|
|
|
const char **p;
|
|
|
|
const int abbrev_name_len = strlen(abbrev_name);
|
2018-08-01 18:22:37 +02:00
|
|
|
const int num_rules = NUM_REV_PARSE_RULES;
|
2007-11-11 15:01:46 +01:00
|
|
|
|
2018-08-01 18:22:37 +02:00
|
|
|
for (p = ref_rev_parse_rules; *p; p++)
|
|
|
|
if (!strcmp(full_name, mkpath(*p, abbrev_name_len, abbrev_name)))
|
|
|
|
return &ref_rev_parse_rules[num_rules] - p;
|
2007-11-11 15:01:46 +01:00
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
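The precedence scheme used by refname_match() can be sketched in isolation. This is only an illustration under stated assumptions: it uses plain snprintf() with a fixed buffer in place of Git's mkpath(), and `sketch_rules`, `SKETCH_NUM_RULES` and `sketch_refname_match()` are hypothetical names that do not exist in refs.c.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical copy of the rule table, for illustration only. */
static const char *sketch_rules[] = {
	"%.*s",
	"refs/%.*s",
	"refs/tags/%.*s",
	"refs/heads/%.*s",
	"refs/remotes/%.*s",
	"refs/remotes/%.*s/HEAD",
	NULL
};

#define SKETCH_NUM_RULES \
	((int)(sizeof(sketch_rules) / sizeof(sketch_rules[0])) - 1)

/*
 * Return 0 if abbrev cannot mean full; otherwise return a positive
 * value that is larger for rules listed earlier in the table.
 */
static int sketch_refname_match(const char *abbrev, const char *full)
{
	char buf[1024];
	int len = (int)strlen(abbrev);
	int i;

	for (i = 0; sketch_rules[i]; i++) {
		int n = snprintf(buf, sizeof(buf), sketch_rules[i], len, abbrev);

		if (n < 0 || (size_t)n >= sizeof(buf))
			continue; /* expansion did not fit; treat as no match */
		if (!strcmp(full, buf))
			return SKETCH_NUM_RULES - i;
	}
	return 0;
}
```

With this table, the abbreviation "T" scores higher against "refs/tags/T" than against "refs/heads/T", so a caller that scans every candidate and keeps the highest score prefers the tag regardless of the order in which the candidates are listed.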
|
|
|
|
|
2018-03-15 18:31:24 +01:00
|
|
|
/*
|
|
|
|
* Given a 'prefix' expand it by the rules in 'ref_rev_parse_rules' and add
|
|
|
|
* the results to 'prefixes'
|
|
|
|
*/
|
2020-07-28 22:25:12 +02:00
|
|
|
void expand_ref_prefix(struct strvec *prefixes, const char *prefix)
|
2018-03-15 18:31:24 +01:00
|
|
|
{
|
|
|
|
const char **p;
|
|
|
|
int len = strlen(prefix);
|
|
|
|
|
|
|
|
for (p = ref_rev_parse_rules; *p; p++)
|
2020-07-28 22:25:12 +02:00
|
|
|
strvec_pushf(prefixes, *p, len, prefix);
|
2018-03-15 18:31:24 +01:00
|
|
|
}
|
|
|
|
|
2020-12-11 12:36:57 +01:00
|
|
|
static const char default_branch_name_advice[] = N_(
|
|
|
|
"Using '%s' as the name for the initial branch. This default branch name\n"
|
|
|
|
"is subject to change. To configure the initial branch name to use in all\n"
|
|
|
|
"of your new repositories, which will suppress this warning, call:\n"
|
|
|
|
"\n"
|
|
|
|
"\tgit config --global init.defaultBranch <name>\n"
|
|
|
|
"\n"
|
|
|
|
"Names commonly chosen instead of 'master' are 'main', 'trunk' and\n"
|
|
|
|
"'development'. The just-created branch can be renamed via this command:\n"
|
|
|
|
"\n"
|
|
|
|
"\tgit branch -m <name>\n"
|
|
|
|
);
|
|
|
|
|
2020-12-11 12:36:56 +01:00
|
|
|
char *repo_default_branch_name(struct repository *r, int quiet)
|
2020-06-24 16:46:33 +02:00
|
|
|
{
|
|
|
|
const char *config_key = "init.defaultbranch";
|
|
|
|
const char *config_display_key = "init.defaultBranch";
|
|
|
|
char *ret = NULL, *full_ref;
|
2020-10-23 16:00:00 +02:00
|
|
|
const char *env = getenv("GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME");
|
2020-06-24 16:46:33 +02:00
|
|
|
|
2020-10-23 16:00:00 +02:00
|
|
|
if (env && *env)
|
|
|
|
ret = xstrdup(env);
|
|
|
|
else if (repo_config_get_string(r, config_key, &ret) < 0)
|
2020-06-24 16:46:33 +02:00
|
|
|
die(_("could not retrieve `%s`"), config_display_key);
|
|
|
|
|
2020-12-11 12:36:57 +01:00
|
|
|
if (!ret) {
|
2020-06-24 16:46:33 +02:00
|
|
|
ret = xstrdup("master");
|
2020-12-11 12:36:57 +01:00
|
|
|
if (!quiet)
|
|
|
|
advise(_(default_branch_name_advice), ret);
|
|
|
|
}
|
2020-06-24 16:46:33 +02:00
|
|
|
|
|
|
|
full_ref = xstrfmt("refs/heads/%s", ret);
|
|
|
|
if (check_refname_format(full_ref, 0))
|
|
|
|
die(_("invalid branch name: %s = %s"), config_display_key, ret);
|
|
|
|
free(full_ref);
|
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2020-12-11 12:36:56 +01:00
|
|
|
const char *git_default_branch_name(int quiet)
|
2020-06-24 16:46:33 +02:00
|
|
|
{
|
|
|
|
static char *ret;
|
|
|
|
|
|
|
|
if (!ret)
|
2020-12-11 12:36:56 +01:00
|
|
|
ret = repo_default_branch_name(the_repository, quiet);
|
2020-06-24 16:46:33 +02:00
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2011-10-12 19:35:38 +02:00
|
|
|
/*
|
|
|
|
* *string and *len will only be substituted, and *string returned (for
|
|
|
|
* later free()ing) if the string passed in is a magic short-hand form
|
|
|
|
* to name a branch.
|
|
|
|
*/
|
2019-04-06 13:34:26 +02:00
|
|
|
static char *substitute_branch_name(struct repository *r,
|
2020-09-02 00:28:09 +02:00
|
|
|
const char **string, int *len,
|
|
|
|
int nonfatal_dangling_mark)
|
2011-10-12 19:35:38 +02:00
|
|
|
{
|
|
|
|
struct strbuf buf = STRBUF_INIT;
|
2020-09-02 00:28:09 +02:00
|
|
|
struct interpret_branch_name_options options = {
|
|
|
|
.nonfatal_dangling_mark = nonfatal_dangling_mark
|
|
|
|
};
|
2020-09-02 00:28:07 +02:00
|
|
|
int ret = repo_interpret_branch_name(r, *string, *len, &buf, &options);
|
2011-10-12 19:35:38 +02:00
|
|
|
|
|
|
|
if (ret == *len) {
|
|
|
|
size_t size;
|
|
|
|
*string = strbuf_detach(&buf, &size);
|
|
|
|
*len = size;
|
|
|
|
return (char *)*string;
|
|
|
|
}
|
|
|
|
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
2019-04-06 13:34:28 +02:00
|
|
|
int repo_dwim_ref(struct repository *r, const char *str, int len,
|
2020-09-02 00:28:09 +02:00
|
|
|
struct object_id *oid, char **ref, int nonfatal_dangling_mark)
|
2011-10-12 19:35:38 +02:00
|
|
|
{
|
2020-09-02 00:28:09 +02:00
|
|
|
char *last_branch = substitute_branch_name(r, &str, &len,
|
|
|
|
nonfatal_dangling_mark);
|
2019-04-06 13:34:28 +02:00
|
|
|
int refs_found = expand_ref(r, str, len, oid, ref);
|
2016-06-12 12:54:02 +02:00
|
|
|
free(last_branch);
|
|
|
|
return refs_found;
|
|
|
|
}
|
|
|
|
|
2019-04-06 13:34:27 +02:00
|
|
|
int expand_ref(struct repository *repo, const char *str, int len,
|
|
|
|
struct object_id *oid, char **ref)
|
2016-06-12 12:54:02 +02:00
|
|
|
{
|
2011-10-12 19:35:38 +02:00
|
|
|
const char **p, *r;
|
|
|
|
int refs_found = 0;
|
2017-03-28 21:46:33 +02:00
|
|
|
struct strbuf fullref = STRBUF_INIT;
|
2011-10-12 19:35:38 +02:00
|
|
|
|
|
|
|
*ref = NULL;
|
|
|
|
for (p = ref_rev_parse_rules; *p; p++) {
|
2017-10-16 00:06:57 +02:00
|
|
|
struct object_id oid_from_ref;
|
|
|
|
struct object_id *this_result;
|
2011-10-12 19:35:38 +02:00
|
|
|
int flag;
|
2021-10-16 11:39:24 +02:00
|
|
|
struct ref_store *refs = get_main_ref_store(repo);
|
2011-10-12 19:35:38 +02:00
|
|
|
|
2017-10-16 00:06:57 +02:00
|
|
|
this_result = refs_found ? &oid_from_ref : oid;
|
2017-03-28 21:46:33 +02:00
|
|
|
strbuf_reset(&fullref);
|
|
|
|
strbuf_addf(&fullref, *p, len, str);
|
2021-10-16 11:39:27 +02:00
|
|
|
r = refs_resolve_ref_unsafe(refs, fullref.buf,
|
2021-10-16 11:39:24 +02:00
|
|
|
RESOLVE_REF_READING,
|
2022-01-26 15:37:01 +01:00
|
|
|
this_result, &flag);
|
2011-10-12 19:35:38 +02:00
|
|
|
if (r) {
|
|
|
|
if (!refs_found++)
|
|
|
|
*ref = xstrdup(r);
|
|
|
|
if (!warn_ambiguous_refs)
|
|
|
|
break;
|
2017-03-28 21:46:33 +02:00
|
|
|
} else if ((flag & REF_ISSYMREF) && strcmp(fullref.buf, "HEAD")) {
|
2018-07-21 09:49:35 +02:00
|
|
|
warning(_("ignoring dangling symref %s"), fullref.buf);
|
2017-03-28 21:46:33 +02:00
|
|
|
} else if ((flag & REF_ISBROKEN) && strchr(fullref.buf, '/')) {
|
2018-07-21 09:49:35 +02:00
|
|
|
warning(_("ignoring broken ref %s"), fullref.buf);
|
2011-10-19 22:55:49 +02:00
|
|
|
}
|
2011-10-12 19:35:38 +02:00
|
|
|
}
|
2017-03-28 21:46:33 +02:00
|
|
|
strbuf_release(&fullref);
|
2011-10-12 19:35:38 +02:00
|
|
|
return refs_found;
|
|
|
|
}
|
|
|
|
|
2019-04-06 13:34:29 +02:00
|
|
|
int repo_dwim_log(struct repository *r, const char *str, int len,
|
|
|
|
struct object_id *oid, char **log)
|
2011-10-12 19:35:38 +02:00
|
|
|
{
|
2019-04-06 13:34:29 +02:00
|
|
|
struct ref_store *refs = get_main_ref_store(r);
|
2020-09-02 00:28:09 +02:00
|
|
|
char *last_branch = substitute_branch_name(r, &str, &len, 0);
|
2011-10-12 19:35:38 +02:00
|
|
|
const char **p;
|
|
|
|
int logs_found = 0;
|
2017-03-28 21:46:33 +02:00
|
|
|
struct strbuf path = STRBUF_INIT;
|
2011-10-12 19:35:38 +02:00
|
|
|
|
|
|
|
*log = NULL;
|
|
|
|
for (p = ref_rev_parse_rules; *p; p++) {
|
2017-10-16 00:06:59 +02:00
|
|
|
struct object_id hash;
|
2011-10-12 19:35:38 +02:00
|
|
|
const char *ref, *it;
|
|
|
|
|
2017-03-28 21:46:33 +02:00
|
|
|
strbuf_reset(&path);
|
|
|
|
strbuf_addf(&path, *p, len, str);
|
2019-04-06 13:34:29 +02:00
|
|
|
ref = refs_resolve_ref_unsafe(refs, path.buf,
|
|
|
|
RESOLVE_REF_READING,
|
2022-01-26 15:37:01 +01:00
|
|
|
oid ? &hash : NULL, NULL);
|
2011-10-12 19:35:38 +02:00
|
|
|
if (!ref)
|
|
|
|
continue;
|
2019-04-06 13:34:29 +02:00
|
|
|
if (refs_reflog_exists(refs, path.buf))
|
2017-03-28 21:46:33 +02:00
|
|
|
it = path.buf;
|
2019-04-06 13:34:29 +02:00
|
|
|
else if (strcmp(ref, path.buf) &&
|
|
|
|
refs_reflog_exists(refs, ref))
|
2011-10-12 19:35:38 +02:00
|
|
|
it = ref;
|
|
|
|
else
|
|
|
|
continue;
|
|
|
|
if (!logs_found++) {
|
|
|
|
*log = xstrdup(it);
|
2021-08-23 13:36:08 +02:00
|
|
|
if (oid)
|
|
|
|
oidcpy(oid, &hash);
|
2011-10-12 19:35:38 +02:00
|
|
|
}
|
2015-11-09 14:34:01 +01:00
|
|
|
if (!warn_ambiguous_refs)
|
|
|
|
break;
|
2006-10-01 00:02:00 +02:00
|
|
|
}
|
2017-03-28 21:46:33 +02:00
|
|
|
strbuf_release(&path);
|
2015-11-09 14:34:01 +01:00
|
|
|
free(last_branch);
|
|
|
|
return logs_found;
|
2013-09-04 17:22:41 +02:00
|
|
|
}
|
|
|
|
|
2019-04-06 13:34:29 +02:00
|
|
|
int dwim_log(const char *str, int len, struct object_id *oid, char **log)
|
|
|
|
{
|
|
|
|
return repo_dwim_log(the_repository, str, len, oid, log);
|
|
|
|
}
|
|
|
|
|
2022-09-19 18:34:50 +02:00
|
|
|
int is_per_worktree_ref(const char *refname)
|
2015-07-31 08:06:18 +02:00
|
|
|
{
|
2020-07-27 18:25:47 +02:00
|
|
|
return starts_with(refname, "refs/worktree/") ||
|
|
|
|
starts_with(refname, "refs/bisect/") ||
|
|
|
|
starts_with(refname, "refs/rewritten/");
|
2015-07-31 08:06:18 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
static int is_pseudoref_syntax(const char *refname)
|
|
|
|
{
|
|
|
|
const char *c;
|
|
|
|
|
|
|
|
for (c = refname; *c; c++) {
|
|
|
|
if (!isupper(*c) && *c != '-' && *c != '_')
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2022-09-19 18:34:50 +02:00
|
|
|
/*
|
|
|
|
* HEAD is not a pseudoref, but it certainly uses the
|
|
|
|
* pseudoref syntax.
|
|
|
|
*/
|
2015-07-31 08:06:18 +02:00
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
|
2022-09-19 18:34:50 +02:00
|
|
|
static int is_current_worktree_ref(const char *ref) {
|
|
|
|
return is_pseudoref_syntax(ref) || is_per_worktree_ref(ref);
|
2018-10-21 10:08:54 +02:00
|
|
|
}
|
|
|
|
|
2022-09-19 18:34:50 +02:00
|
|
|
enum ref_worktree_type parse_worktree_ref(const char *maybe_worktree_ref,
|
|
|
|
const char **worktree_name, int *worktree_name_length,
|
|
|
|
const char **bare_refname)
|
2018-10-21 10:08:54 +02:00
|
|
|
{
|
2022-09-19 18:34:50 +02:00
|
|
|
const char *name_dummy;
|
|
|
|
int name_length_dummy;
|
|
|
|
const char *ref_dummy;
|
2018-10-21 10:08:54 +02:00
|
|
|
|
2022-09-19 18:34:50 +02:00
|
|
|
if (!worktree_name)
|
|
|
|
worktree_name = &name_dummy;
|
|
|
|
if (!worktree_name_length)
|
|
|
|
worktree_name_length = &name_length_dummy;
|
|
|
|
if (!bare_refname)
|
|
|
|
bare_refname = &ref_dummy;
|
|
|
|
|
|
|
|
if (skip_prefix(maybe_worktree_ref, "worktrees/", bare_refname)) {
|
|
|
|
const char *slash = strchr(*bare_refname, '/');
|
|
|
|
|
|
|
|
*worktree_name = *bare_refname;
|
|
|
|
if (!slash) {
|
|
|
|
*worktree_name_length = strlen(*worktree_name);
|
|
|
|
|
|
|
|
/* This is an error condition, and the caller can tell because the bare_refname is "" */
|
|
|
|
*bare_refname = *worktree_name + *worktree_name_length;
|
|
|
|
return REF_WORKTREE_OTHER;
|
|
|
|
}
|
|
|
|
|
|
|
|
*worktree_name_length = slash - *bare_refname;
|
|
|
|
*bare_refname = slash + 1;
|
|
|
|
|
|
|
|
if (is_current_worktree_ref(*bare_refname))
|
|
|
|
return REF_WORKTREE_OTHER;
|
|
|
|
}
|
|
|
|
|
|
|
|
*worktree_name = NULL;
|
|
|
|
*worktree_name_length = 0;
|
|
|
|
|
|
|
|
if (skip_prefix(maybe_worktree_ref, "main-worktree/", bare_refname)
|
|
|
|
&& is_current_worktree_ref(*bare_refname))
|
|
|
|
return REF_WORKTREE_MAIN;
|
|
|
|
|
|
|
|
*bare_refname = maybe_worktree_ref;
|
|
|
|
if (is_current_worktree_ref(maybe_worktree_ref))
|
|
|
|
return REF_WORKTREE_CURRENT;
|
|
|
|
|
|
|
|
return REF_WORKTREE_SHARED;
|
2015-07-31 08:06:18 +02:00
|
|
|
}
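The "worktrees/<name>/<ref>" split performed at the top of parse_worktree_ref() can be sketched on its own. This is a simplified illustration, not the function above: `sketch_split_worktree_ref()` is a hypothetical helper, it reports a missing slash by returning 0 rather than by setting the bare refname to "", and it does not classify the result.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split "worktrees/<name>/<ref>" into its parts.
 * Returns 1 on success and 0 if the input lacks the prefix or the
 * slash that ends the worktree name. The output pointers alias the
 * input string; nothing is allocated.
 */
static int sketch_split_worktree_ref(const char *ref,
				     const char **name, size_t *name_len,
				     const char **bare)
{
	static const char prefix[] = "worktrees/";
	size_t prefix_len = sizeof(prefix) - 1;
	const char *slash;

	if (strncmp(ref, prefix, prefix_len))
		return 0;
	ref += prefix_len;
	slash = strchr(ref, '/');
	if (!slash)
		return 0;
	*name = ref;
	*name_len = (size_t)(slash - ref);
	*bare = slash + 1;
	return 1;
}
```

Because the worktree name is returned as a pointer plus a length, it is not NUL-terminated; a caller would print it with "%.*s" or copy it before using it as a string.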
|
|
|
|
|
2017-08-21 13:51:34 +02:00
|
|
|
long get_files_ref_lock_timeout_ms(void)
|
|
|
|
{
|
|
|
|
static int configured = 0;
|
|
|
|
|
|
|
|
/* The default timeout is 100 ms: */
|
|
|
|
static int timeout_ms = 100;
|
|
|
|
|
|
|
|
if (!configured) {
|
|
|
|
git_config_get_int("core.filesreflocktimeout", &timeout_ms);
|
|
|
|
configured = 1;
|
|
|
|
}
|
|
|
|
|
|
|
|
return timeout_ms;
|
|
|
|
}
|
|
|
|
|
2017-03-26 04:42:35 +02:00
|
|
|
int refs_delete_ref(struct ref_store *refs, const char *msg,
|
|
|
|
const char *refname,
|
2017-10-16 00:06:50 +02:00
|
|
|
const struct object_id *old_oid,
|
2017-03-26 04:42:35 +02:00
|
|
|
unsigned int flags)
|
2007-01-26 23:26:09 +01:00
|
|
|
{
|
2014-04-30 18:22:45 +02:00
|
|
|
struct ref_transaction *transaction;
|
2015-07-21 23:04:50 +02:00
|
|
|
struct strbuf err = STRBUF_INIT;
|
2007-01-26 23:26:10 +01:00
|
|
|
|
2022-04-14 00:51:33 +02:00
|
|
|
transaction = ref_store_transaction_begin(refs, &err);
|
2014-04-30 18:22:45 +02:00
|
|
|
if (!transaction ||
|
2017-10-16 00:06:53 +02:00
|
|
|
ref_transaction_delete(transaction, refname, old_oid,
|
2017-02-21 02:10:32 +01:00
|
|
|
flags, msg, &err) ||
|
2014-04-30 21:22:42 +02:00
|
|
|
ref_transaction_commit(transaction, &err)) {
|
2014-04-30 18:22:45 +02:00
|
|
|
error("%s", err.buf);
|
|
|
|
ref_transaction_free(transaction);
|
|
|
|
strbuf_release(&err);
|
2006-10-01 00:02:00 +02:00
|
|
|
return 1;
|
2007-01-26 23:26:09 +01:00
|
|
|
}
|
2015-11-09 14:34:01 +01:00
|
|
|
ref_transaction_free(transaction);
|
|
|
|
strbuf_release(&err);
|
2008-01-16 20:14:30 +01:00
|
|
|
return 0;
|
|
|
|
}
|
2007-01-26 23:26:09 +01:00
|
|
|
|
2017-03-26 04:42:35 +02:00
|
|
|
int delete_ref(const char *msg, const char *refname,
|
2017-10-16 00:06:50 +02:00
|
|
|
const struct object_id *old_oid, unsigned int flags)
|
2017-03-26 04:42:35 +02:00
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_delete_ref(get_main_ref_store(the_repository), msg, refname,
|
2017-10-16 00:06:50 +02:00
|
|
|
old_oid, flags);
|
2017-03-26 04:42:35 +02:00
|
|
|
}
|
|
|
|
|
reflog: cleanse messages in the refs.c layer
Regarding reflog messages:
- We expect that a reflog message consists of a single line. The
file format used by the files backend may add a LF after the
message as a delimiter, and output by commands like "git log -g"
may complete such an incomplete line by adding a LF at the end,
but philosophically, the terminating LF is not a part of the
message.
- We however allow callers of refs API to supply a random sequence
of NUL terminated bytes. We cleanse caller-supplied message by
squashing a run of whitespaces into a SP, and by trimming trailing
whitespace, before storing the message. This is how we tolerate,
instead of erring out, a message with LF in it (be it at the end,
in the middle, or both).
Currently, the cleansing of the reflog message is done by the files
backend, before the log is written out. This is sufficient with the
current code, as that is the only backend that writes reflogs. But
new backends can be added that write reflogs, and we'd want the
resulting log message we would read out of "log -g" the same no
matter what backend is used, and moving the code to do so to the
generic layer is a way to do so.
An added benefit is that the "cleansing" function could be updated
later, independent from individual backends, to e.g. allow
multi-line log messages if we wanted to, and when that happens, it
would help a lot to ensure we covered all bases if the cleansing
function (which would be updated) is called from the generic layer.
Side note: I am not interested in supporting multi-line reflog
messages right at the moment (nobody is asking for it), but I
envision that instead of the "squash a run of whitespaces into a SP
and rtrim" cleansing, we can %urlencode problematic bytes in the
message *AND* append a SP at the end, when a new version of Git that
supports multi-line and/or verbatim reflog messages writes a reflog
record. The reading side can detect the presence of SP at the end
(which should have been rtrimmed out if it were written by existing
versions of Git) as a signal that decoding %urlencode recovers the
original reflog message.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-10 19:19:53 +02:00
|
|
|
static void copy_reflog_msg(struct strbuf *sb, const char *msg)
|
2007-07-29 02:17:17 +02:00
|
|
|
{
|
|
|
|
char c;
|
|
|
|
int wasspace = 1;
|
2007-01-26 23:26:10 +01:00
|
|
|
|
2007-07-29 02:17:17 +02:00
|
|
|
while ((c = *msg++)) {
|
|
|
|
if (wasspace && isspace(c))
|
|
|
|
continue;
|
|
|
|
wasspace = isspace(c);
|
|
|
|
if (wasspace)
|
|
|
|
c = ' ';
|
2018-07-10 23:08:22 +02:00
|
|
|
strbuf_addch(sb, c);
|
2015-07-21 23:04:50 +02:00
|
|
|
}
|
2018-07-10 23:08:22 +02:00
|
|
|
strbuf_rtrim(sb);
|
2007-07-29 02:17:17 +02:00
|
|
|
}
|
2007-01-26 23:26:10 +01:00
|
|
|
|
2020-07-10 19:19:53 +02:00
|
|
|
static char *normalize_reflog_message(const char *msg)
|
|
|
|
{
|
|
|
|
struct strbuf sb = STRBUF_INIT;
|
|
|
|
|
|
|
|
if (msg && *msg)
|
|
|
|
copy_reflog_msg(&sb, msg);
|
|
|
|
return strbuf_detach(&sb, NULL);
|
|
|
|
}
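The cleansing rule (squash each run of whitespace into one SP, then trim the end) can be tried outside Git with a fixed-size buffer. This is a sketch under assumptions: `sketch_cleanse_msg()` is a hypothetical name, it writes into a caller-supplied buffer instead of a strbuf, and it silently truncates output that does not fit.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy msg into out (of size out_size), squashing
 * each run of whitespace into one space and dropping leading and
 * trailing whitespace. Output is truncated if it does not fit.
 */
static void sketch_cleanse_msg(char *out, size_t out_size, const char *msg)
{
	size_t n = 0;
	int wasspace = 1; /* start as if after a space: skips leading runs */
	unsigned char c;

	if (!out_size)
		return;
	while ((c = (unsigned char)*msg++) && n + 1 < out_size) {
		if (wasspace && isspace(c))
			continue;
		wasspace = isspace(c);
		out[n++] = wasspace ? ' ' : (char)c;
	}
	/* at most one trailing space can remain; trim it */
	if (n && out[n - 1] == ' ')
		n--;
	out[n] = '\0';
}
```

A message with embedded tabs and newlines therefore comes out as a single line, which is what lets the files backend use LF as the record delimiter.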
|
|
|
|
|
2015-11-10 12:42:36 +01:00
|
|
|
int should_autocreate_reflog(const char *refname)
|
2015-07-21 23:04:51 +02:00
|
|
|
{
|
2017-01-27 11:09:47 +01:00
|
|
|
switch (log_all_ref_updates) {
|
|
|
|
case LOG_REFS_ALWAYS:
|
|
|
|
return 1;
|
|
|
|
case LOG_REFS_NORMAL:
|
|
|
|
return starts_with(refname, "refs/heads/") ||
|
|
|
|
starts_with(refname, "refs/remotes/") ||
|
|
|
|
starts_with(refname, "refs/notes/") ||
|
|
|
|
!strcmp(refname, "HEAD");
|
|
|
|
default:
|
2015-07-21 23:04:51 +02:00
|
|
|
return 0;
|
2017-01-27 11:09:47 +01:00
|
|
|
}
|
2015-07-21 23:04:51 +02:00
|
|
|
}
|
|
|
|
|
2014-07-16 01:02:38 +02:00
|
|
|
int is_branch(const char *refname)
|
2008-01-16 00:50:17 +01:00
|
|
|
{
|
2013-11-30 21:55:40 +01:00
|
|
|
return !strcmp(refname, "HEAD") || starts_with(refname, "refs/heads/");
|
2007-01-26 23:26:09 +01:00
|
|
|
}
|
|
|
|
|
2014-06-03 18:09:59 +02:00
|
|
|
struct read_ref_at_cb {
|
|
|
|
const char *refname;
|
2017-04-26 21:29:31 +02:00
|
|
|
timestamp_t at_time;
|
2014-06-03 18:09:59 +02:00
|
|
|
int cnt;
|
|
|
|
int reccnt;
|
2017-10-16 00:07:03 +02:00
|
|
|
struct object_id *oid;
|
2014-06-03 18:09:59 +02:00
|
|
|
int found_it;
|
|
|
|
|
2017-10-16 00:07:03 +02:00
|
|
|
struct object_id ooid;
|
|
|
|
struct object_id noid;
|
2014-06-03 18:09:59 +02:00
|
|
|
int tz;
|
2017-04-26 21:29:31 +02:00
|
|
|
timestamp_t date;
|
2014-06-03 18:09:59 +02:00
|
|
|
char **msg;
|
2017-04-26 21:29:31 +02:00
|
|
|
timestamp_t *cutoff_time;
|
2014-06-03 18:09:59 +02:00
|
|
|
int *cutoff_tz;
|
|
|
|
int *cutoff_cnt;
|
|
|
|
};
|
|
|
|
|
2021-01-06 10:01:53 +01:00
|
|
|
static void set_read_ref_cutoffs(struct read_ref_at_cb *cb,
|
|
|
|
timestamp_t timestamp, int tz, const char *message)
|
|
|
|
{
|
|
|
|
if (cb->msg)
|
|
|
|
*cb->msg = xstrdup(message);
|
|
|
|
if (cb->cutoff_time)
|
|
|
|
*cb->cutoff_time = timestamp;
|
|
|
|
if (cb->cutoff_tz)
|
|
|
|
*cb->cutoff_tz = tz;
|
|
|
|
if (cb->cutoff_cnt)
|
|
|
|
*cb->cutoff_cnt = cb->reccnt;
|
|
|
|
}
|
|
|
|
|
2017-02-22 00:47:32 +01:00
|
|
|
static int read_ref_at_ent(struct object_id *ooid, struct object_id *noid,
|
2022-08-25 19:09:48 +02:00
|
|
|
const char *email UNUSED,
|
2022-08-19 12:08:35 +02:00
|
|
|
timestamp_t timestamp, int tz,
|
|
|
|
const char *message, void *cb_data)
|
2014-06-03 18:09:59 +02:00
|
|
|
{
|
|
|
|
struct read_ref_at_cb *cb = cb_data;
|
2021-01-07 11:36:59 +01:00
|
|
|
int reached_count;
|
2014-06-03 18:09:59 +02:00
|
|
|
|
|
|
|
cb->tz = tz;
|
|
|
|
cb->date = timestamp;
|
|
|
|
|
2021-01-07 11:36:59 +01:00
|
|
|
/*
|
|
|
|
* It is not possible for cb->cnt == 0 on the first iteration because
|
|
|
|
* that special case is handled in read_ref_at().
|
|
|
|
*/
|
|
|
|
if (cb->cnt > 0)
|
|
|
|
cb->cnt--;
|
|
|
|
reached_count = cb->cnt == 0 && !is_null_oid(ooid);
|
|
|
|
if (timestamp <= cb->at_time || reached_count) {
|
2021-01-06 10:01:53 +01:00
|
|
|
set_read_ref_cutoffs(cb, timestamp, tz, message);
|
2014-06-03 18:09:59 +02:00
|
|
|
/*
|
2017-11-05 09:42:09 +01:00
|
|
|
* we have not yet updated cb->[n|o]oid so they still
|
2014-06-03 18:09:59 +02:00
|
|
|
* hold the values for the previous record.
|
|
|
|
*/
|
2021-01-07 11:36:59 +01:00
|
|
|
if (!is_null_oid(&cb->ooid) && !oideq(&cb->ooid, noid))
|
|
|
|
warning(_("log for ref %s has gap after %s"),
|
convert "enum date_mode" into a struct
In preparation for adding date modes that may carry extra
information beyond the mode itself, this patch converts the
date_mode enum into a struct.
Most of the conversion is fairly straightforward; we pass
the struct as a pointer and dereference the type field where
necessary. Locations that declare a date_mode can use a "{}"
constructor. However, the tricky case is where we use the
enum labels as constants, like:
show_date(t, tz, DATE_NORMAL);
Ideally we could say:
show_date(t, tz, &{ DATE_NORMAL });
but of course C does not allow that. Likewise, we cannot
cast the constant to a struct, because we need to pass an
actual address. Our options are basically:
1. Manually add a "struct date_mode d = { DATE_NORMAL }"
definition to each caller, and pass "&d". This makes
the callers uglier, because they sometimes do not even
have their own scope (e.g., they are inside a switch
statement).
2. Provide a pre-made global "date_normal" struct that can
be passed by address. We'd also need "date_rfc2822",
"date_iso8601", and so forth. But at least the ugliness
is defined in one place.
3. Provide a wrapper that generates the correct struct on
the fly. The big downside is that we end up pointing to
a single global, which makes our wrapper non-reentrant.
But show_date is already not reentrant, so it does not
matter.
This patch implements 3, along with a minor macro to keep
the size of the callers sane.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-25 18:55:02 +02:00
|
|
|
cb->refname, show_date(cb->date, cb->tz, DATE_MODE(RFC2822)));
|
2021-01-07 11:36:59 +01:00
|
|
|
if (reached_count)
|
|
|
|
oidcpy(cb->oid, ooid);
|
|
|
|
else if (!is_null_oid(&cb->ooid) || cb->date == cb->at_time)
|
2017-10-16 00:07:03 +02:00
|
|
|
oidcpy(cb->oid, noid);
|
2018-08-28 23:22:48 +02:00
|
|
|
else if (!oideq(noid, cb->oid))
|
2018-07-21 09:49:35 +02:00
|
|
|
warning(_("log for ref %s unexpectedly ended on %s"),
|
2014-06-03 18:09:59 +02:00
|
|
|
cb->refname, show_date(cb->date, cb->tz,
|
2015-06-25 18:55:02 +02:00
|
|
|
DATE_MODE(RFC2822)));
|
2014-06-03 18:09:59 +02:00
|
|
|
cb->found_it = 1;
|
|
|
|
}
|
2021-01-06 10:01:53 +01:00
|
|
|
cb->reccnt++;
|
2017-10-16 00:07:03 +02:00
|
|
|
oidcpy(&cb->ooid, ooid);
|
|
|
|
oidcpy(&cb->noid, noid);
|
2021-01-07 11:36:59 +01:00
|
|
|
return cb->found_it;
|
|
|
|
}
|
|
|
|
|
2022-08-25 19:09:48 +02:00
|
|
|
static int read_ref_at_ent_newest(struct object_id *ooid UNUSED,
|
2022-08-19 12:08:32 +02:00
|
|
|
struct object_id *noid,
|
2022-08-25 19:09:48 +02:00
|
|
|
const char *email UNUSED,
|
2022-08-19 12:08:32 +02:00
|
|
|
timestamp_t timestamp, int tz,
|
|
|
|
const char *message, void *cb_data)
|
2021-01-07 11:36:59 +01:00
|
|
|
{
|
|
|
|
struct read_ref_at_cb *cb = cb_data;
|
|
|
|
|
|
|
|
set_read_ref_cutoffs(cb, timestamp, tz, message);
|
|
|
|
oidcpy(cb->oid, noid);
|
|
|
|
/* We just want the first entry */
|
|
|
|
return 1;
|
2014-06-03 18:09:59 +02:00
|
|
|
}
|
|
|
|
|
2017-02-22 00:47:32 +01:00
|
|
|
static int read_ref_at_ent_oldest(struct object_id *ooid, struct object_id *noid,
|
2022-08-25 19:09:48 +02:00
|
|
|
const char *email UNUSED,
|
2022-08-19 12:08:35 +02:00
|
|
|
timestamp_t timestamp, int tz,
|
|
|
|
const char *message, void *cb_data)
|
2014-06-03 18:09:59 +02:00
|
|
|
{
|
|
|
|
struct read_ref_at_cb *cb = cb_data;
|
|
|
|
|
2021-01-06 10:01:53 +01:00
|
|
|
set_read_ref_cutoffs(cb, timestamp, tz, message);
|
2017-10-16 00:07:03 +02:00
|
|
|
oidcpy(cb->oid, ooid);
|
|
|
|
if (is_null_oid(cb->oid))
|
|
|
|
oidcpy(cb->oid, noid);
|
2014-06-03 18:09:59 +02:00
|
|
|
/* We just want the first entry */
|
|
|
|
return 1;
|
2007-01-19 10:19:05 +01:00
|
|
|
}
|
|
|
|
|
2019-04-06 13:34:30 +02:00
|
|
|
int read_ref_at(struct ref_store *refs, const char *refname,
|
|
|
|
unsigned int flags, timestamp_t at_time, int cnt,
|
2017-10-16 00:07:03 +02:00
|
|
|
struct object_id *oid, char **msg,
|
2017-04-26 21:29:31 +02:00
|
|
|
timestamp_t *cutoff_time, int *cutoff_tz, int *cutoff_cnt)
|
2006-05-17 11:56:09 +02:00
|
|
|
{
|
2014-06-03 18:09:59 +02:00
|
|
|
struct read_ref_at_cb cb;
|
2006-05-17 11:56:09 +02:00
|
|
|
|
2014-06-03 18:09:59 +02:00
|
|
|
memset(&cb, 0, sizeof(cb));
|
|
|
|
cb.refname = refname;
|
|
|
|
cb.at_time = at_time;
|
|
|
|
cb.cnt = cnt;
|
|
|
|
cb.msg = msg;
|
|
|
|
cb.cutoff_time = cutoff_time;
|
|
|
|
cb.cutoff_tz = cutoff_tz;
|
|
|
|
cb.cutoff_cnt = cutoff_cnt;
|
2017-10-16 00:07:03 +02:00
|
|
|
cb.oid = oid;
|
2014-06-03 18:09:59 +02:00
|
|
|
|
2021-01-07 11:36:59 +01:00
|
|
|
if (cb.cnt == 0) {
|
|
|
|
refs_for_each_reflog_ent_reverse(refs, refname, read_ref_at_ent_newest, &cb);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2019-04-06 13:34:30 +02:00
|
|
|
refs_for_each_reflog_ent_reverse(refs, refname, read_ref_at_ent, &cb);
|
2014-06-03 18:09:59 +02:00
|
|
|
|
2014-09-19 05:45:37 +02:00
|
|
|
if (!cb.reccnt) {
|
2017-07-14 01:49:29 +02:00
|
|
|
if (flags & GET_OID_QUIETLY)
|
2014-09-19 05:45:37 +02:00
|
|
|
exit(128);
|
|
|
|
else
|
2018-07-21 09:49:35 +02:00
|
|
|
die(_("log for %s is empty"), refname);
|
2014-09-19 05:45:37 +02:00
|
|
|
}
|
2014-06-03 18:09:59 +02:00
|
|
|
if (cb.found_it)
|
|
|
|
return 0;
|
|
|
|
|
2019-04-06 13:34:30 +02:00
|
|
|
refs_for_each_reflog_ent(refs, refname, read_ref_at_ent_oldest, &cb);
|
2006-05-17 11:56:09 +02:00
|
|
|
|
2007-01-19 10:19:05 +01:00
|
|
|
return 1;
|
2006-05-17 11:56:09 +02:00
|
|
|
}
|
2006-12-18 10:18:16 +01:00
|
|
|
|
2017-03-26 04:42:35 +02:00
|
|
|
struct ref_transaction *ref_store_transaction_begin(struct ref_store *refs,
|
|
|
|
struct strbuf *err)
|
2014-04-07 15:48:10 +02:00
|
|
|
{
|
2017-03-26 04:42:35 +02:00
|
|
|
struct ref_transaction *tr;
|
2014-08-29 01:42:37 +02:00
|
|
|
assert(err);
|
|
|
|
|
2021-03-13 17:17:22 +01:00
|
|
|
CALLOC_ARRAY(tr, 1);
|
2017-03-26 04:42:35 +02:00
|
|
|
tr->ref_store = refs;
|
|
|
|
return tr;
|
|
|
|
}
|
|
|
|
|
|
|
|
struct ref_transaction *ref_transaction_begin(struct strbuf *err)
|
|
|
|
{
|
2022-04-14 00:51:33 +02:00
|
|
|
return ref_store_transaction_begin(get_main_ref_store(the_repository), err);
|
2014-04-07 15:48:10 +02:00
|
|
|
}
|
|
|
|
|
2014-06-20 16:42:42 +02:00
|
|
|
void ref_transaction_free(struct ref_transaction *transaction)
|
2014-04-07 15:48:10 +02:00
|
|
|
{
|
2017-05-22 16:17:37 +02:00
|
|
|
size_t i;
|
2014-04-07 15:48:10 +02:00
|
|
|
|
2014-06-20 16:42:45 +02:00
|
|
|
if (!transaction)
|
|
|
|
return;
|
|
|
|
|
ref_transaction_prepare(): new optional step for reference updates
In the future, compound reference stores will sometimes need to modify
references in two different reference stores at the same time, meaning
that a single logical reference transaction might have to be
implemented as two internal sub-transactions. They won't want to call
`ref_transaction_commit()` for the two sub-transactions one after the
other, because that wouldn't be atomic (the first commit could succeed
and the second one fail). Instead, they will want to prepare both
sub-transactions (i.e., obtain any necessary locks and do any
pre-checks), and only if both prepare steps succeed, then commit both
sub-transactions.
Start preparing for that day by adding a new, optional
`ref_transaction_prepare()` step to the reference transaction
sequence, which obtains the locks and does any prechecks, reporting
any errors that occur. Also add a `ref_transaction_abort()` function
that can be used to abort a sub-transaction even if it has already
been prepared.
That is on the side of the public-facing API. On the side of the
`ref_store` VTABLE, get rid of `transaction_commit` and instead add
methods `transaction_prepare`, `transaction_finish`, and
`transaction_abort`. A `ref_transaction_commit()` now basically calls
methods `transaction_prepare` then `transaction_finish`.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-22 16:17:44 +02:00
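The prepare-then-finish sequence described above can be illustrated with a toy model. Everything in it (`struct toy_txn`, `toy_prepare()`, `toy_finish()`, `toy_abort()`, `commit_both()`) is hypothetical and only mimics the state transitions; it does not use the real refs API or any backend.

```c
#include <assert.h>

enum toy_state { TOY_OPEN, TOY_PREPARED, TOY_CLOSED };

struct toy_txn {
	enum toy_state state;
	int lock_available; /* stands in for "locks and pre-checks succeed" */
	int committed;
};

/* Take locks and run pre-checks; nothing is written yet. */
static int toy_prepare(struct toy_txn *t)
{
	assert(t->state == TOY_OPEN);
	if (!t->lock_available)
		return -1;
	t->state = TOY_PREPARED;
	return 0;
}

/* Write out a prepared transaction; only valid after a successful prepare. */
static void toy_finish(struct toy_txn *t)
{
	assert(t->state == TOY_PREPARED);
	t->committed = 1;
	t->state = TOY_CLOSED;
}

/* Drop a transaction, whether or not it has been prepared. */
static void toy_abort(struct toy_txn *t)
{
	t->state = TOY_CLOSED;
}

/* Commit both sub-transactions or neither. Returns 0 on success. */
static int commit_both(struct toy_txn *a, struct toy_txn *b)
{
	if (toy_prepare(a) || toy_prepare(b)) {
		toy_abort(a);
		toy_abort(b);
		return -1;
	}
	toy_finish(a);
	toy_finish(b);
	return 0;
}
```

The point of the sketch is the ordering: no sub-transaction is finished until every prepare step has succeeded, so a failure in the second store leaves the first one unwritten.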
|
|
|
switch (transaction->state) {
|
|
|
|
case REF_TRANSACTION_OPEN:
|
|
|
|
case REF_TRANSACTION_CLOSED:
|
|
|
|
/* OK */
|
|
|
|
break;
|
|
|
|
case REF_TRANSACTION_PREPARED:
|
2018-05-02 11:38:39 +02:00
|
|
|
BUG("free called on a prepared reference transaction");
|
2017-05-22 16:17:44 +02:00
|
|
|
break;
|
|
|
|
default:
|
2018-05-02 11:38:39 +02:00
|
|
|
BUG("unexpected reference transaction state");
|
2017-05-22 16:17:44 +02:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
2014-04-30 21:22:42 +02:00
|
|
|
for (i = 0; i < transaction->nr; i++) {
|
|
|
|
free(transaction->updates[i]->msg);
|
2014-04-07 15:48:14 +02:00
|
|
|
free(transaction->updates[i]);
|
2014-04-30 21:22:42 +02:00
|
|
|
}
|
2014-04-07 15:48:10 +02:00
|
|
|
free(transaction->updates);
|
|
|
|
free(transaction);
|
|
|
|
}
|
|
|
|
|
2016-04-25 11:39:54 +02:00
|
|
|
struct ref_update *ref_transaction_add_update(
|
|
|
|
struct ref_transaction *transaction,
|
|
|
|
const char *refname, unsigned int flags,
|
2017-10-16 00:06:53 +02:00
|
|
|
const struct object_id *new_oid,
|
|
|
|
const struct object_id *old_oid,
|
2016-04-25 11:39:54 +02:00
|
|
|
const char *msg)
|
2014-04-07 15:48:10 +02:00
|
|
|
{
|
2016-02-22 23:44:32 +01:00
|
|
|
struct ref_update *update;
|
2016-04-25 11:39:54 +02:00
|
|
|
|
|
|
|
if (transaction->state != REF_TRANSACTION_OPEN)
|
2018-05-02 11:38:39 +02:00
|
|
|
BUG("update called for transaction that is not open");
|
2016-04-25 11:39:54 +02:00
|
|
|
|
2016-02-22 23:44:32 +01:00
|
|
|
FLEX_ALLOC_STR(update, refname, refname);
|
2014-04-07 15:48:10 +02:00
|
|
|
ALLOC_GROW(transaction->updates, transaction->nr + 1, transaction->alloc);
|
|
|
|
transaction->updates[transaction->nr++] = update;
|
2016-04-25 11:39:54 +02:00
|
|
|
|
|
|
|
update->flags = flags;
|
|
|
|
|
|
|
|
if (flags & REF_HAVE_NEW)
|
2017-10-16 00:06:53 +02:00
|
|
|
oidcpy(&update->new_oid, new_oid);
|
2016-04-25 11:39:54 +02:00
|
|
|
if (flags & REF_HAVE_OLD)
|
2017-10-16 00:06:53 +02:00
|
|
|
oidcpy(&update->old_oid, old_oid);
|
reflog: cleanse messages in the refs.c layer
Regarding reflog messages:
- We expect that a reflog message consists of a single line. The
file format used by the files backend may add a LF after the
message as a delimiter, and output by commands like "git log -g"
may complete such an incomplete line by adding a LF at the end,
but philosophically, the terminating LF is not a part of the
message.
- We however allow callers of refs API to supply a random sequence
of NUL terminated bytes. We cleanse caller-supplied message by
squashing a run of whitespaces into a SP, and by trimming trailing
whitespace, before storing the message. This is how we tolerate,
instead of erring out, a message with LF in it (be it at the end,
in the middle, or both).
Currently, the cleansing of the reflog message is done by the files
backend, before the log is written out. This is sufficient with the
current code, as that is the only backend that writes reflogs. But
new backends can be added that write reflogs, and we'd want the
resulting log message we would read out of "log -g" the same no
matter what backend is used, and moving the code to do so to the
generic layer is a way to do so.
An added benefit is that the "cleansing" function could be updated
later, independent from individual backends, to e.g. allow
multi-line log messages if we wanted to, and when that happens, it
would help a lot to ensure we covered all bases if the cleansing
function (which would be updated) is called from the generic layer.
Side note: I am not interested in supporting multi-line reflog
messages right at the moment (nobody is asking for it), but I
envision that instead of the "squash a run of whitespaces into a SP
and rtrim" cleansing, we can %urlencode problematic bytes in the
message *AND* append a SP at the end, when a new version of Git that
supports multi-line and/or verbatim reflog messages writes a reflog
record. The reading side can detect the presence of SP at the end
(which should have been rtrimmed out if it were written by existing
versions of Git) as a signal that decoding %urlencode recovers the
original reflog message.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-10 19:19:53 +02:00
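The cleansing rule described above (squash each run of whitespace into one SP, then trim trailing whitespace) might look like the following standalone sketch. `cleanse_reflog_message()` is a hypothetical helper written for illustration; it uses plain `malloc()` rather than Git's own string buffers, and it also drops leading whitespace, which is an assumption of this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a newly allocated copy of msg with each
 * run of whitespace squashed into one space and trailing whitespace
 * removed. The caller frees the result; NULL on allocation failure.
 */
static char *cleanse_reflog_message(const char *msg)
{
	size_t len = 0;
	int wasspace = 1; /* treat the start of the string as following a space */
	char *out;

	assert(msg);
	out = malloc(strlen(msg) + 1);
	if (!out)
		return NULL;
	for (; *msg; msg++) {
		int space = isspace((unsigned char)*msg);
		if (space && wasspace)
			continue; /* inside a run: already emitted one SP */
		out[len++] = space ? ' ' : *msg;
		wasspace = space;
	}
	/* At most one trailing space can remain; trim it. */
	if (len && out[len - 1] == ' ')
		len--;
	out[len] = '\0';
	return out;
}
```

With this rule a message containing LF in the middle or at the end is tolerated rather than rejected: the embedded LF becomes a single space and the trailing one disappears.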
|
|
|
update->msg = normalize_reflog_message(msg);
|
2014-04-07 15:48:10 +02:00
|
|
|
return update;
|
|
|
|
}
|
|
|
|
|
2014-06-20 16:43:00 +02:00
|
|
|
int ref_transaction_update(struct ref_transaction *transaction,
|
|
|
|
const char *refname,
|
2017-10-16 00:06:53 +02:00
|
|
|
const struct object_id *new_oid,
|
|
|
|
const struct object_id *old_oid,
|
2015-02-17 18:00:15 +01:00
|
|
|
unsigned int flags, const char *msg,
|
2014-06-20 16:43:00 +02:00
|
|
|
struct strbuf *err)
|
2014-04-07 15:48:10 +02:00
|
|
|
{
|
2014-08-29 01:42:37 +02:00
|
|
|
assert(err);
|
|
|
|
|
2021-12-07 14:38:18 +01:00
|
|
|
if (!(flags & REF_SKIP_REFNAME_VERIFICATION) &&
|
|
|
|
((new_oid && !is_null_oid(new_oid)) ?
|
|
|
|
check_refname_format(refname, REFNAME_ALLOW_ONELEVEL) :
|
|
|
|
!refname_is_safe(refname))) {
|
2018-07-21 09:49:35 +02:00
|
|
|
strbuf_addf(err, _("refusing to update ref with bad name '%s'"),
|
refs.c: allow listing and deleting badly named refs
We currently do not handle badly named refs well:
$ cp .git/refs/heads/master .git/refs/heads/master.....@\*@\\.
$ git branch
fatal: Reference has invalid format: 'refs/heads/master.....@*@\.'
$ git branch -D master.....@\*@\\.
error: branch 'master.....@*@\.' not found.
Users cannot recover from a badly named ref without manually finding
and deleting the loose ref file or appropriate line in packed-refs.
Making that easier will make it easier to tweak the ref naming rules
in the future, for example to forbid shell metacharacters like '`'
and '"', without putting people in a state that is hard to get out of.
So allow "branch --list" to show these refs and allow "branch -d/-D"
and "update-ref -d" to delete them. Other commands (for example to
rename refs) will continue to not handle these refs but can be changed
in later patches.
Details:
In resolving functions, refuse to resolve refs that don't pass the
git-check-ref-format(1) check unless the new RESOLVE_REF_ALLOW_BAD_NAME
flag is passed. Even with RESOLVE_REF_ALLOW_BAD_NAME, refuse to
resolve refs that escape the refs/ directory and do not match the
pattern [A-Z_]* (think "HEAD" and "MERGE_HEAD").
In locking functions, refuse to act on badly named refs unless they
are being deleted and either are in the refs/ directory or match [A-Z_]*.
Just like other invalid refs, flag resolved, badly named refs with the
REF_ISBROKEN flag, treat them as resolving to null_sha1, and skip them
in all iteration functions except for for_each_rawref.
Flag badly named refs (but not symrefs pointing to badly named refs)
with a REF_BAD_NAME flag to make it easier for future callers to
notice and handle them specially. For example, in a later patch
for-each-ref will use this flag to detect refs whose names can confuse
callers parsing for-each-ref output.
In the transaction API, refuse to create or update badly named refs,
but allow deleting them (unless they try to escape refs/ and don't match
[A-Z_]*).
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-03 20:45:43 +02:00
|
|
|
refname);
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
2017-11-05 09:42:03 +01:00
|
|
|
if (flags & ~REF_TRANSACTION_UPDATE_ALLOWED_FLAGS)
|
|
|
|
BUG("illegal flags 0x%x passed to ref_transaction_update()", flags);
|
refs: strip out not allowed flags from ref_transaction_update
Callers are only allowed to pass certain flags into
ref_transaction_update, other flags are internal to it. To prevent
mistakes from the callers, strip the internal only flags out before
continuing.
This was noticed because of a compiler warning gcc 7.1.1 issued about
passing a NULL parameter as second parameter to memcpy (through
hashcpy):
In file included from refs.c:5:0:
refs.c: In function ‘ref_transaction_verify’:
cache.h:948:2: error: argument 2 null where non-null expected [-Werror=nonnull]
memcpy(sha_dst, sha_src, GIT_SHA1_RAWSZ);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from git-compat-util.h:165:0,
from cache.h:4,
from refs.c:5:
/usr/include/string.h:43:14: note: in a call to function ‘memcpy’ declared here
extern void *memcpy (void *__restrict __dest, const void *__restrict __src,
^~~~~~
The call to hashcpy in ref_transaction_add_update is protected by the
passed in flags, but as we only add flags there, gcc notices
REF_HAVE_NEW or REF_HAVE_OLD flags could be passed in from the outside,
which would potentially result in passing in NULL as second parameter to
memcpy.
Fix both the compiler warning, and make the interface safer for its
users by stripping the internal flags out.
Suggested-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-13 00:59:21 +02:00
|
|
|
|
refs: work around gcc-11 warning with REF_HAVE_NEW
Using gcc-11 (or 12) to compile refs.o with -O3 results in:
In file included from hashmap.h:4,
from cache.h:6,
from refs.c:5:
In function ‘oidcpy’,
inlined from ‘ref_transaction_add_update’ at refs.c:1065:3,
inlined from ‘ref_transaction_update’ at refs.c:1094:2,
inlined from ‘ref_transaction_verify’ at refs.c:1132:9:
hash.h:262:9: warning: argument 2 null where non-null expected [-Wnonnull]
262 | memcpy(dst->hash, src->hash, GIT_MAX_RAWSZ);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from git-compat-util.h:177,
from cache.h:4,
from refs.c:5:
refs.c: In function ‘ref_transaction_verify’:
/usr/include/string.h:43:14: note: in a call to function ‘memcpy’ declared ‘nonnull’
43 | extern void *memcpy (void *__restrict __dest, const void *__restrict __src,
| ^~~~~~
That call to memcpy() is in a conditional block that requires
REF_HAVE_NEW to be set. But in ref_transaction_update(), we make sure it
isn't set coming in:
if (flags & ~REF_TRANSACTION_UPDATE_ALLOWED_FLAGS)
BUG("illegal flags 0x%x passed to ref_transaction_update()", flags);
and then only set it if the variable isn't NULL:
flags |= (new_oid ? REF_HAVE_NEW : 0) | (old_oid ? REF_HAVE_OLD : 0);
So it should be impossible to reach that memcpy() with a NULL oid. But
for whatever reason, gcc doesn't accept that hitting the BUG() means we
won't go any further, even though it's marked with the noreturn
attribute. And the conditional is correct; ALLOWED_FLAGS doesn't contain
HAVE_NEW or HAVE_OLD, and you can even simplify it to check for those
flags explicitly and the compiler still complains.
We can work around this by just clearing the disallowed flags
explicitly. This should be a noop because of the BUG() check, but it
makes the compiler happy.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-19 22:28:30 +01:00
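The flag handling discussed above can be reduced to a small standalone sketch. The `TOY_*` constants and `effective_flags()` are hypothetical; the bit values are made up for illustration, and `assert()` stands in for the `BUG()` check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit values; the real constants live in Git's headers. */
#define TOY_NO_DEREF (1u << 0)
#define TOY_HAVE_NEW (1u << 2)
#define TOY_HAVE_OLD (1u << 3)
#define TOY_ALLOWED_FLAGS (TOY_NO_DEREF)

/*
 * Compute the flags stored with an update: keep only caller-visible
 * bits, then set the HAVE_* bits from whether a value was supplied.
 */
static unsigned int effective_flags(unsigned int flags,
				    const void *new_oid, const void *old_oid)
{
	/* A caller passing internal bits is a programming error. */
	assert(!(flags & ~TOY_ALLOWED_FLAGS));
	flags &= TOY_ALLOWED_FLAGS; /* a no-op after the check above */
	flags |= (new_oid ? TOY_HAVE_NEW : 0) | (old_oid ? TOY_HAVE_OLD : 0);
	return flags;
}
```

Because the HAVE_* bits are derived only from non-NULL pointers, code that later copies a value under `TOY_HAVE_NEW` or `TOY_HAVE_OLD` never sees a NULL source; the explicit mask just makes that visible to the compiler.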
|
|
|
/*
|
|
|
|
* Clear flags outside the allowed set; this should be a noop because
|
|
|
|
* of the BUG() check above, but it works around a -Wnonnull warning
|
|
|
|
* with some versions of "gcc -O3".
|
|
|
|
*/
|
|
|
|
flags &= REF_TRANSACTION_UPDATE_ALLOWED_FLAGS;
|
|
|
|
|
2017-10-16 00:06:53 +02:00
|
|
|
flags |= (new_oid ? REF_HAVE_NEW : 0) | (old_oid ? REF_HAVE_OLD : 0);
|
2016-04-25 11:39:54 +02:00
|
|
|
|
|
|
|
ref_transaction_add_update(transaction, refname, flags,
|
2017-10-16 00:06:53 +02:00
|
|
|
new_oid, old_oid, msg);
|
2014-06-20 16:43:00 +02:00
|
|
|
return 0;
|
2014-04-07 15:48:10 +02:00
|
|
|
}
|
|
|
|
|
2014-04-17 00:26:44 +02:00
|
|
|
int ref_transaction_create(struct ref_transaction *transaction,
|
|
|
|
const char *refname,
|
2017-10-16 00:06:53 +02:00
|
|
|
const struct object_id *new_oid,
|
2015-02-17 18:00:13 +01:00
|
|
|
unsigned int flags, const char *msg,
|
2014-04-17 00:26:44 +02:00
|
|
|
struct strbuf *err)
|
2014-04-07 15:48:10 +02:00
|
|
|
{
|
clone: die() instead of BUG() on bad refs
When cloning directly from a local repository, we load a list of refs
based on scanning the $GIT_DIR/refs/ directory of the "server"
repository. If files exist in that directory that do not parse as
hexadecimal hashes, then the ref array used by write_remote_refs()
ends up with some entries with null OIDs. This causes us to hit a BUG()
statement in ref_transaction_create():
BUG: create called without valid new_oid
This BUG() call used to be a die() until 033abf97f (Replace all
die("BUG: ...") calls by BUG() ones, 2018-05-02). Before that, the die()
was added by f04c5b552 (ref_transaction_create(): check that new_sha1 is
valid, 2015-02-17).
The original report for this bug [1] mentioned that this problem did not
exist in Git 2.27.0. The failure bisects unsurprisingly to 968f12fda
(refs: turn on GIT_REF_PARANOIA by default, 2021-09-24). When
GIT_REF_PARANOIA is enabled, this case always fails as far back as I am
able to successfully compile and test the Git codebase.
[1] https://github.com/git-for-windows/git/issues/3781
There are two approaches to consider here. One would be to remove this
BUG() statement in favor of returning with an error. There are only two
callers to ref_transaction_create(), so this would have a limited
impact.
The other approach would be to add special casing in 'git clone' to
avoid this faulty input to the method.
While I originally started with changing 'git clone', I decided that
modifying ref_transaction_create() was a more complete solution. This
prevents failing with a BUG() statement when we already have a good way
to report an error (including a reason for that error) within the
method. Both callers properly check the return value and die() with the
error message, so this is an appropriate direction.
The added test helps check against a regression, but does check that our
intended error message is handled correctly.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-25 15:47:30 +02:00
|
|
|
if (!new_oid || is_null_oid(new_oid)) {
|
|
|
|
strbuf_addf(err, "'%s' has a null OID", refname);
|
|
|
|
return 1;
|
|
|
|
}
|
2017-10-16 00:06:53 +02:00
|
|
|
return ref_transaction_update(transaction, refname, new_oid,
|
2021-04-26 03:02:56 +02:00
|
|
|
null_oid(), flags, msg, err);
|
2014-04-07 15:48:10 +02:00
|
|
|
}
|
|
|
|
|
2014-04-17 00:27:45 +02:00
|
|
|
int ref_transaction_delete(struct ref_transaction *transaction,
|
|
|
|
const char *refname,
|
2017-10-16 00:06:53 +02:00
|
|
|
const struct object_id *old_oid,
|
2015-02-17 18:00:16 +01:00
|
|
|
unsigned int flags, const char *msg,
|
2014-04-17 00:27:45 +02:00
|
|
|
struct strbuf *err)
|
2014-04-07 15:48:10 +02:00
|
|
|
{
|
2017-10-16 00:06:53 +02:00
|
|
|
if (old_oid && is_null_oid(old_oid))
|
2018-05-02 11:38:39 +02:00
|
|
|
BUG("delete called with old_oid set to zeros");
|
2015-02-17 18:00:15 +01:00
|
|
|
return ref_transaction_update(transaction, refname,
|
2021-04-26 03:02:56 +02:00
|
|
|
null_oid(), old_oid,
|
2015-02-17 18:00:15 +01:00
|
|
|
flags, msg, err);
|
2014-04-07 15:48:10 +02:00
|
|
|
}
|
|
|
|
|
2015-02-17 18:00:21 +01:00
|
|
|
int ref_transaction_verify(struct ref_transaction *transaction,
|
|
|
|
const char *refname,
|
2017-10-16 00:06:53 +02:00
|
|
|
const struct object_id *old_oid,
|
2015-02-17 18:00:21 +01:00
|
|
|
unsigned int flags,
|
|
|
|
struct strbuf *err)
|
|
|
|
{
|
2017-10-16 00:06:53 +02:00
|
|
|
if (!old_oid)
|
2018-05-02 11:38:39 +02:00
|
|
|
BUG("verify called with old_oid set to NULL");
|
2015-02-17 18:00:21 +01:00
|
|
|
return ref_transaction_update(transaction, refname,
|
2017-10-16 00:06:53 +02:00
|
|
|
NULL, old_oid,
|
2015-02-17 18:00:21 +01:00
|
|
|
flags, NULL, err);
|
|
|
|
}
|
|
|
|
|
2017-03-26 04:42:35 +02:00
|
|
|
int refs_update_ref(struct ref_store *refs, const char *msg,
|
2017-10-16 00:06:51 +02:00
|
|
|
const char *refname, const struct object_id *new_oid,
|
|
|
|
const struct object_id *old_oid, unsigned int flags,
|
2017-03-26 04:42:35 +02:00
|
|
|
enum action_on_err onerr)
|
2013-09-04 17:22:40 +02:00
|
|
|
{
|
2015-07-31 08:06:19 +02:00
|
|
|
struct ref_transaction *t = NULL;
|
2014-04-25 01:36:55 +02:00
|
|
|
struct strbuf err = STRBUF_INIT;
|
2015-07-31 08:06:19 +02:00
|
|
|
int ret = 0;
|
2014-04-25 01:36:55 +02:00
|
|
|
|
2022-04-14 00:51:33 +02:00
|
|
|
t = ref_store_transaction_begin(refs, &err);
|
2020-07-27 18:25:46 +02:00
|
|
|
if (!t ||
|
|
|
|
ref_transaction_update(t, refname, new_oid, old_oid, flags, msg,
|
|
|
|
&err) ||
|
|
|
|
ref_transaction_commit(t, &err)) {
|
|
|
|
ret = 1;
|
|
|
|
ref_transaction_free(t);
|
2015-07-31 08:06:19 +02:00
|
|
|
}
|
|
|
|
if (ret) {
|
2018-07-21 09:49:35 +02:00
|
|
|
const char *str = _("update_ref failed for ref '%s': %s");
|
2014-04-25 01:36:55 +02:00
|
|
|
|
|
|
|
switch (onerr) {
|
|
|
|
case UPDATE_REFS_MSG_ON_ERR:
|
|
|
|
error(str, refname, err.buf);
|
|
|
|
break;
|
|
|
|
case UPDATE_REFS_DIE_ON_ERR:
|
|
|
|
die(str, refname, err.buf);
|
|
|
|
break;
|
|
|
|
case UPDATE_REFS_QUIET_ON_ERR:
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
strbuf_release(&err);
|
2013-09-04 17:22:40 +02:00
|
|
|
return 1;
|
2014-04-25 01:36:55 +02:00
|
|
|
}
|
|
|
|
strbuf_release(&err);
|
2015-07-31 08:06:19 +02:00
|
|
|
if (t)
|
|
|
|
ref_transaction_free(t);
|
2014-04-25 01:36:55 +02:00
|
|
|
return 0;
|
2013-09-04 17:22:40 +02:00
|
|
|
}
|
|
|
|
|
2017-03-26 04:42:35 +02:00
|
|
|
int update_ref(const char *msg, const char *refname,
|
2017-10-16 00:06:51 +02:00
|
|
|
const struct object_id *new_oid,
|
|
|
|
const struct object_id *old_oid,
|
2017-03-26 04:42:35 +02:00
|
|
|
unsigned int flags, enum action_on_err onerr)
|
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_update_ref(get_main_ref_store(the_repository), msg, refname, new_oid,
|
2017-10-16 00:06:51 +02:00
|
|
|
old_oid, flags, onerr);
|
2017-03-26 04:42:35 +02:00
|
|
|
}
|
|
|
|
|
shorten_unambiguous_ref(): avoid sscanf()
To shorten a fully qualified ref (e.g., taking "refs/heads/foo" to just
"foo"), we munge the usual lookup rules ("refs/heads/%.*s", etc) to drop
the ".*" modifier (so "refs/heads/%s"), and then use sscanf() to match
that against the refname, pulling the "%s" content into a separate
buffer.
This has a few downsides:
- sscanf("%s") reportedly misbehaves on macOS with some input and
locale combinations, returning a partial or garbled string. See
this thread:
https://lore.kernel.org/git/CAGF3oAcCi+fG12j-1U0hcrWwkF5K_9WhOi6ZPHBzUUzfkrZDxA@mail.gmail.com/
- scanf's matching of "%s" is greedy. So the "refs/remotes/%s/HEAD"
rule would never pull "origin" out of "refs/remotes/origin/HEAD".
Instead it always produced "origin/HEAD", which is redundant with
the "refs/remotes/%s" rule.
- scanf in general is an error-prone interface. For example, scanning
for "%s" will copy bytes into a destination string, which must have
been correctly sized ahead of time to avoid a buffer overflow. In
this case, the code is OK (the buffer is pessimistically sized to
match the original string, which should give us a maximum). But in
general, we do not want to encourage people to use scanf at all.
So instead, let's note that our lookup rules are not arbitrary format
strings, but all contain exactly one "%.*s" placeholder. We already rely
on this, both for lookup (we feed the lookup format along with exactly
one int/ptr combo to snprintf, etc) and for shortening (we munge "%.*s"
to "%s", and then insist that sscanf() finds exactly one result).
We can parse this manually by just matching the bytes that occur before
and after the "%.*s" placeholder. While we have a few extra lines of
parsing code, the result is arguably simpler, as we can skip the
preprocessing step and its tricky memory management entirely.
The in-code comments should explain the parsing strategy, but there's
one subtle change here. The original code allocated a single buffer, and
then overwrote it in each loop iteration, since that's the only option
sscanf() gives us. But our parser can actually return a ptr/len combo
for the matched string, which is all we need (since we just feed it back
to the lookup rules with "%.*s"), and then copy it only when returning
to the caller.
There are a few new tests here, all using symbolic-ref (the code can be
triggered in many ways, but symrefs are convenient in that we don't need
to create a real ref, which avoids any complications from the filesystem
munging the name):
- the first covers the real-world case which misbehaved on macOS.
Setting LC_ALL is required to trigger the problem there (since
otherwise our tests use LC_ALL=C), and hopefully is at worst simply
ignored on other systems (and doesn't cause libc to complain, etc,
on systems without that locale).
- the second covers the "origin/HEAD" case as discussed above, which
is now fixed
- the remainder are for "weird" cases that work both before and after
this patch, but would be easy to get wrong with off-by-one problems
in the parsing (and came out of discussions and earlier iterations
of the patch that did get them wrong).
- absent here are tests of boring, expected-to-work cases like
"refs/heads/foo", etc. Those are covered all over the test suite
both explicitly (for-each-ref's refname:short) and implicitly (in
the output of git-status, etc).
Reported-by: 孟子易 <mengziyi540841@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 16:16:21 +01:00
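The manual parsing strategy described above (match the bytes before and after the single "%.*s" placeholder and return a ptr/len pair) could be sketched as below. `match_rule_sketch()` is a hypothetical standalone function written with only the C standard library; it is an illustration of the idea, not the implementation in this file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: if refname has the form "{prefix}X{suffix}" for
 * a rule "{prefix}%.*s{suffix}", return a pointer to X inside refname
 * and store its length in *len. Return NULL if the rule does not match.
 */
static const char *match_rule_sketch(const char *refname, const char *rule,
				     size_t *len)
{
	const char *placeholder = strstr(rule, "%.*s");
	const char *suffix;
	size_t prefix_len, suffix_len, ref_len;

	assert(placeholder); /* every rule is assumed to contain one */
	prefix_len = placeholder - rule;
	suffix = placeholder + strlen("%.*s");
	suffix_len = strlen(suffix);
	ref_len = strlen(refname);

	if (ref_len < prefix_len + suffix_len)
		return NULL;
	if (memcmp(refname, rule, prefix_len))
		return NULL;
	if (memcmp(refname + ref_len - suffix_len, suffix, suffix_len))
		return NULL;

	*len = ref_len - prefix_len - suffix_len;
	return refname + prefix_len;
}
```

Under these assumptions, matching "refs/remotes/origin/HEAD" against "refs/remotes/%.*s/HEAD" yields a pointer to "origin" with length 6, which is the case the greedy sscanf() approach could not produce. No copy is made; the caller copies the ptr/len result only when it needs an owned string.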
/*
 * Check that the string refname matches a rule of the form
 * "{prefix}%.*s{suffix}". So "foo/bar/baz" would match the rule
 * "foo/%.*s/baz", and return the string "bar".
 */
static const char *match_parse_rule(const char *refname, const char *rule,
				    size_t *len)
2009-04-07 09:14:20 +02:00
{
shorten_unambiguous_ref(): avoid sscanf()
To shorten a fully qualified ref (e.g., taking "refs/heads/foo" to just
"foo"), we munge the usual lookup rules ("refs/heads/%.*s", etc) to drop
the ".*" modifier (so "refs/heads/%s"), and then use sscanf() to match
that against the refname, pulling the "%s" content into a separate
buffer.
This has a few downsides:
- sscanf("%s") reportedly misbehaves on macOS with some input and
locale combinations, returning a partial or garbled string. See
this thread:
https://lore.kernel.org/git/CAGF3oAcCi+fG12j-1U0hcrWwkF5K_9WhOi6ZPHBzUUzfkrZDxA@mail.gmail.com/
- scanf's matching of "%s" is greedy. So the "refs/remotes/%s/HEAD"
rule would never pull "origin" out of "refs/remotes/origin/HEAD".
Instead it always produced "origin/HEAD", which is redundant with
the "refs/remotes/%s" rule.
- scanf in general is an error-prone interface. For example, scanning
for "%s" will copy bytes into a destination string, which must have
been correctly sized ahead of time to avoid a buffer overflow. In
this case, the code is OK (the buffer is pessimistically sized to
match the original string, which should give us a maximum). But in
general, we do not want to encourage people to use scanf at all.
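The greedy matching described in the second bullet can be reproduced in isolation. The following is a minimal sketch, assuming a conforming C library; the helper name scan_remote_head() and the caller-supplied buffer are illustrative and are not taken from git.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from git): match refname against the
 * scanf-style rule "refs/remotes/%s/HEAD" and return the number of
 * conversions that sscanf() reports. "out" must have room for a copy
 * of the whole refname, mirroring the pessimistic sizing described above.
 */
static int scan_remote_head(const char *refname, char *out, size_t outlen)
{
	assert(outlen > strlen(refname));
	out[0] = '\0';
	/* "%s" consumes every non-whitespace byte, including "/HEAD" */
	return sscanf(refname, "refs/remotes/%s/HEAD", out);
}
```

Called with "refs/remotes/origin/HEAD", this should report one conversion and leave "origin/HEAD" in the buffer, because the trailing "/HEAD" in the format has nothing left to match once "%s" has run to the end of the input.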
So instead, let's note that our lookup rules are not arbitrary format
strings, but all contain exactly one "%.*s" placeholder. We already rely
on this, both for lookup (we feed the lookup format along with exactly
one int/ptr combo to snprintf, etc) and for shortening (we munge "%.*s"
to "%s", and then insist that sscanf() finds exactly one result).
We can parse this manually by just matching the bytes that occur before
and after the "%.*s" placeholder. While we have a few extra lines of
parsing code, the result is arguably simpler, as we can skip the
preprocessing step and its tricky memory management entirely.
The in-code comments should explain the parsing strategy, but there's
one subtle change here. The original code allocated a single buffer, and
then overwrote it in each loop iteration, since that's the only option
sscanf() gives us. But our parser can actually return a ptr/len combo
for the matched string, which is all we need (since we just feed it back
to the lookup rules with "%.*s"), and then copy it only when returning
to the caller.
There are a few new tests here, all using symbolic-ref (the code can be
triggered in many ways, but symrefs are convenient in that we don't need
to create a real ref, which avoids any complications from the filesystem
munging the name):
- the first covers the real-world case which misbehaved on macOS.
Setting LC_ALL is required to trigger the problem there (since
otherwise our tests use LC_ALL=C), and hopefully is at worst simply
ignored on other systems (and doesn't cause libc to complain, etc,
on systems without that locale).
- the second covers the "origin/HEAD" case as discussed above, which
is now fixed
- the remainder are for "weird" cases that work both before and after
this patch, but would be easy to get wrong with off-by-one problems
in the parsing (and came out of discussions and earlier iterations
of the patch that did get them wrong).
- absent here are tests of boring, expected-to-work cases like
"refs/heads/foo", etc. Those are covered all over the test suite
both explicitly (for-each-ref's refname:short) and implicitly (in
the output of git-status, etc).
Reported-by: 孟子易 <mengziyi540841@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 16:16:21 +01:00
	/*
	 * Check that rule matches refname up to the first percent in the rule.
	 * We can bail immediately if not, but otherwise we leave "rule" at the
	 * %-placeholder, and "refname" at the start of the potential matched
	 * name.
	 */
	while (*rule != '%') {
		if (!*rule)
			BUG("rev-parse rule did not have percent");
		if (*refname++ != *rule++)
			return NULL;
	}
2009-04-07 09:14:20 +02:00
|
|
|
|
shorten_unambiguous_ref(): avoid sscanf()
2023-02-15 16:16:21 +01:00
	/*
	 * Check that our "%" is the expected placeholder. This assumes there
	 * are no other percents (placeholder or quoted) in the string, but
	 * that is sufficient for our rev-parse rules.
	 */
	if (!skip_prefix(rule, "%.*s", &rule))
		return NULL;
2009-04-07 09:14:20 +02:00
|
|
|
|
shorten_unambiguous_ref(): avoid sscanf()
2023-02-15 16:16:21 +01:00
	/*
	 * And now check that our suffix (if any) matches.
	 */
	if (!strip_suffix(refname, rule, len))
		return NULL;
2009-04-07 09:14:20 +02:00
|
|
|
|
shorten_unambiguous_ref(): avoid sscanf()
2023-02-15 16:16:21 +01:00
	return refname; /* len set by strip_suffix() */
}
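The parser above relies on the git-internal helpers skip_prefix() and strip_suffix(). As a rough illustration of the same prefix/placeholder/suffix strategy, here is a standalone sketch that uses only <string.h>. The name match_rule_sketch() is hypothetical, and unlike the original it simply reports "no match" for a rule without a placeholder rather than calling BUG().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical standalone approximation of the parser above. Returns a
 * pointer into refname and stores the length of the matched name in
 * *len, or returns NULL if refname does not fit the rule.
 */
static const char *match_rule_sketch(const char *refname, const char *rule,
				     size_t *len)
{
	const char *pct, *suffix;
	size_t prefix_len, suffix_len, ref_len;

	assert(refname && rule && len);

	/* assumption: a rule without "%.*s" is treated as a non-match */
	pct = strstr(rule, "%.*s");
	if (!pct)
		return NULL;

	prefix_len = (size_t)(pct - rule);
	suffix = pct + strlen("%.*s");
	suffix_len = strlen(suffix);

	/* the bytes before the placeholder must match exactly */
	if (strncmp(refname, rule, prefix_len))
		return NULL;
	refname += prefix_len;

	/* the bytes after the placeholder must match the tail of refname */
	ref_len = strlen(refname);
	if (ref_len < suffix_len ||
	    memcmp(refname + ref_len - suffix_len, suffix, suffix_len))
		return NULL;

	*len = ref_len - suffix_len;
	return refname;
}
```

Because the result is a pointer/length pair into the original string, no copy is needed until the caller decides to keep the name, which is the property the commit message highlights.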
2009-04-07 09:14:20 +02:00
|
|
|
|
2019-04-06 13:34:25 +02:00
char *refs_shorten_unambiguous_ref(struct ref_store *refs,
				   const char *refname, int strict)
2009-04-07 09:14:20 +02:00
{
	int i;
2017-03-28 21:46:33 +02:00
	struct strbuf resolved_buf = STRBUF_INIT;
2009-04-07 09:14:20 +02:00

	/* skip first rule, it will always match */
shorten_unambiguous_ref(): use NUM_REV_PARSE_RULES constant
The ref_rev_parse_rules[] array is terminated with a NULL entry, and we
count it and store the result in the local nr_rules variable. But we
don't need to do so; since the array is a constant, we can compute its
size directly. The original code probably didn't do that because it was
written as part of for-each-ref, and saw the array only as a pointer. It
was migrated in 7c2b3029df (make get_short_ref a public function,
2009-04-07) and could have been updated then, but that subtlety was not
noticed.
We even have a constant that represents this value already, courtesy of
60650a48c0 (remote: make refspec follow the same disambiguation rule as
local refs, 2018-08-01), though again, nobody noticed at the time that
it could be used here, too.
The current count-up isn't a big deal, as we need to preprocess that
array anyway. But it will become more cumbersome as we refactor the
shortening code. So let's get rid of it and just use the constant
everywhere.
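The difference between counting at run time and computing the size from the array itself can be sketched as follows. The table contents and the names COUNT_OF, demo_rules, DEMO_NUM_RULES and count_rules() are illustrative assumptions, not git's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for an array-size macro; COUNT_OF is a hypothetical name */
#define COUNT_OF(a) (sizeof(a) / sizeof((a)[0]))

/* Illustrative NULL-terminated rule table, not git's actual list */
static const char *const demo_rules[] = {
	"%.*s",
	"refs/%.*s",
	"refs/heads/%.*s",
	NULL
};

/* Number of real rules, excluding the NULL terminator; known at compile time */
#define DEMO_NUM_RULES (COUNT_OF(demo_rules) - 1)

/* Run-time count, walking to the terminator as the old code did */
static size_t count_rules(const char *const *rules)
{
	size_t n = 0;

	assert(rules != NULL);
	while (rules[n])
		n++;
	return n;
}
```

The macro only works where the array definition is visible; a function that receives the table as a pointer, like count_rules() here, has no choice but to walk it, which matches the history described above.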
Note that there are two things here that aren't just simple text
replacements:
1. We also use nr_rules to see if a previous call has initialized the
static pre-processing variables. We can just use the scanf_fmts
pointer to do the same thing, as it is non-NULL only after we've
done that initialization.
2. If nr_rules is zero after we've counted it up, we bail from the
function. This code is unreachable, though, as the set of rules is
hard-coded and non-empty. And that becomes even more apparent now
that we are using the constant. So we can drop this conditional
completely (and ironically, the code would have the same output if
it _did_ trigger, as we'd simply skip the loop entirely and return
the whole refname).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 16:16:18 +01:00
	for (i = NUM_REV_PARSE_RULES - 1; i > 0 ; --i) {
2009-04-07 09:14:20 +02:00
		int j;
2009-04-13 12:25:46 +02:00
		int rules_to_fail = i;
shorten_unambiguous_ref(): avoid sscanf()
2023-02-15 16:16:21 +01:00
		const char *short_name;
shorten_unambiguous_ref(): avoid integer truncation
We parse the shortened name "foo" out of the full refname
"refs/heads/foo", and then assign the result of strlen(short_name) to an
int, which may truncate or wrap to negative.
In practice, this should never happen, as it requires a 2GB refname. And
even somebody trying to do something malicious should at worst end up
with a confused answer (we use the size only to feed back as a
placeholder length to strbuf_addf() to see if there are any collisions
in the lookup rules).
And it may even be impossible to trigger this, as we parse the string
with sscanf(), and stdio formatting functions are not known for handling
large strings well. I didn't test, but I wouldn't be surprised if
sscanf() on many platforms simply reports no match here.
But even if it is not a problem in practice so far, it is worth fixing
for two reasons:
1. We'll shortly be replacing the sscanf() call with a real parser
which will handle arbitrary-sized strings.
2. Assigning strlen() to an int is an anti-pattern that requires
people to look twice when auditing for real overflow problems.
So we'll make this a size_t. Unfortunately we still have to cast to int
eventually for the strbuf_addf() call, but at least we can localize the
cast there, and check that it will be valid. I used our new cast helper
here, which will just bail completely. That should be OK, as anybody
with a 2GB refname is up to no good, but if we really wanted to, we
could detect it manually and just refuse to shorten the refname.
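The alternative mentioned in the last sentence, detecting the overflow and refusing instead of bailing, might look like the sketch below. Both function names are hypothetical, and the fixed "refs/heads/%.*s" format is used only to keep the example self-contained.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical checked cast: stores n in *out and returns 0 if it fits
 * in an int, otherwise returns -1 and leaves *out untouched.
 */
static int size_to_int_checked(size_t n, int *out)
{
	if (n > (size_t)INT_MAX)
		return -1;
	*out = (int)n;
	return 0;
}

/*
 * Format a pointer/length pair through "%.*s", which takes an int
 * precision. Returns snprintf()'s result, or -1 if the length cannot
 * be represented as an int.
 */
static int format_branch(char *dst, size_t dstlen,
			 const char *name, size_t name_len)
{
	int len;

	assert(dst && name);
	if (size_to_int_checked(name_len, &len) < 0)
		return -1; /* refuse rather than pass a wrapped length */
	return snprintf(dst, dstlen, "refs/heads/%.*s", len, name);
}
```

Keeping the length as a size_t everywhere else and converting only at the formatting call localizes the cast in the way the message describes.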
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 16:16:14 +01:00
		size_t short_name_len;
2009-04-07 09:14:20 +02:00
|
|
|
|
shorten_unambiguous_ref(): avoid sscanf()
2023-02-15 16:16:21 +01:00
		short_name = match_parse_rule(refname, ref_rev_parse_rules[i],
					      &short_name_len);
		if (!short_name)
2009-04-07 09:14:20 +02:00
			continue;

2009-04-13 12:25:46 +02:00
		/*
		 * in strict mode, all (except the matched one) rules
		 * must fail to resolve to a valid non-ambiguous ref
		 */
		if (strict)
shorten_unambiguous_ref(): use NUM_REV_PARSE_RULES constant
2023-02-15 16:16:18 +01:00
			rules_to_fail = NUM_REV_PARSE_RULES;
2009-04-13 12:25:46 +02:00
|
|
|
|
2009-04-07 09:14:20 +02:00
		/*
		 * check if the short name resolves to a valid ref,
		 * but use only rules prior to the matched one
		 */
2009-04-13 12:25:46 +02:00
		for (j = 0; j < rules_to_fail; j++) {
2009-04-07 09:14:20 +02:00
			const char *rule = ref_rev_parse_rules[j];

2009-04-13 12:25:46 +02:00
			/* skip matched rule */
			if (i == j)
				continue;

2009-04-07 09:14:20 +02:00
			/*
			 * the short name is ambiguous, if it resolves
			 * (with this previous rule) to a valid ref
			 * read_ref() returns 0 on success
			 */
2017-03-28 21:46:33 +02:00
			strbuf_reset(&resolved_buf);
			strbuf_addf(&resolved_buf, rule,
shorten_unambiguous_ref(): avoid integer truncation
2023-02-15 16:16:14 +01:00
				    cast_size_t_to_int(short_name_len),
				    short_name);
2019-04-06 13:34:25 +02:00
			if (refs_ref_exists(refs, resolved_buf.buf))
2009-04-07 09:14:20 +02:00
				break;
		}

		/*
		 * short name is non-ambiguous if all previous rules
		 * haven't resolved to a valid ref
		 */
2017-03-28 21:46:33 +02:00
		if (j == rules_to_fail) {
			strbuf_release(&resolved_buf);
shorten_unambiguous_ref(): avoid sscanf()
To shorten a fully qualified ref (e.g., taking "refs/heads/foo" to just
"foo"), we munge the usual lookup rules ("refs/heads/%.*s", etc) to drop
the ".*" modifier (so "refs/heads/%s"), and then use sscanf() to match
that against the refname, pulling the "%s" content into a separate
buffer.
This has a few downsides:
- sscanf("%s") reportedly misbehaves on macOS with some input and
locale combinations, returning a partial or garbled string. See
this thread:
https://lore.kernel.org/git/CAGF3oAcCi+fG12j-1U0hcrWwkF5K_9WhOi6ZPHBzUUzfkrZDxA@mail.gmail.com/
- scanf's matching of "%s" is greedy. So the "refs/remotes/%s/HEAD"
rule would never pull "origin" out of "refs/remotes/origin/HEAD".
Instead it always produced "origin/HEAD", which is redundant with
the "refs/remotes/%s" rule.
- scanf in general is an error-prone interface. For example, scanning
for "%s" will copy bytes into a destination string, which must have
been correctly sized ahead of time to avoid a buffer overflow. In
this case, the code is OK (the buffer is pessimistically sized to
match the original string, which should give us a maximum). But in
general, we do not want to encourage people to use scanf at all.
So instead, let's note that our lookup rules are not arbitrary format
strings, but all contain exactly one "%.*s" placeholder. We already rely
on this, both for lookup (we feed the lookup format along with exactly
one int/ptr combo to snprintf, etc) and for shortening (we munge "%.*s"
to "%s", and then insist that sscanf() finds exactly one result).
We can parse this manually by just matching the bytes that occur before
and after the "%.*s" placeholder. While we have a few extra lines of
parsing code, the result is arguably simpler, as we can skip the
preprocessing step and its tricky memory management entirely.
The in-code comments should explain the parsing strategy, but there's
one subtle change here. The original code allocated a single buffer, and
then overwrote it in each loop iteration, since that's the only option
sscanf() gives us. But our parser can actually return a ptr/len combo
for the matched string, which is all we need (since we just feed it back
to the lookup rules with "%.*s"), and then copy it only when returning
to the caller.
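The matching strategy described above might look roughly like the following. This is a simplified sketch, not the code from the patch: match_rule is a hypothetical name, and it assumes each rule contains exactly one "%.*s" placeholder and no other percent signs.

```c
#include <stddef.h>
#include <string.h>

/*
 * Match refname against a rule containing exactly one "%.*s" placeholder.
 * On success, return a pointer into refname where the placeholder content
 * starts and store its length in *len; return NULL if the rule does not
 * match. (match_rule is a hypothetical name for this sketch.)
 */
static const char *match_rule(const char *refname, const char *rule,
			      size_t *len)
{
	static const char placeholder[] = "%.*s";
	const char *percent = strstr(rule, placeholder);
	size_t prefix_len, suffix_len, ref_len;
	const char *suffix;

	if (!percent)
		return NULL; /* not a rule this sketch understands */

	/* The bytes before the placeholder must match exactly. */
	prefix_len = percent - rule;
	if (strncmp(refname, rule, prefix_len))
		return NULL;
	refname += prefix_len;

	/* The bytes after the placeholder must match the end of refname. */
	suffix = percent + strlen(placeholder);
	suffix_len = strlen(suffix);
	ref_len = strlen(refname);
	if (ref_len < suffix_len ||
	    memcmp(refname + ref_len - suffix_len, suffix, suffix_len))
		return NULL;

	*len = ref_len - suffix_len;
	return refname;
}
```

Because the result is a ptr/len pair into the original string, nothing needs to be allocated or copied until a final answer is returned to the caller.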
There are a few new tests here, all using symbolic-ref (the code can be
triggered in many ways, but symrefs are convenient in that we don't need
to create a real ref, which avoids any complications from the filesystem
munging the name):
- the first covers the real-world case which misbehaved on macOS.
Setting LC_ALL is required to trigger the problem there (since
otherwise our tests use LC_ALL=C), and hopefully is at worst simply
ignored on other systems (and doesn't cause libc to complain, etc,
on systems without that locale).
- the second covers the "origin/HEAD" case as discussed above, which
is now fixed.
- the remainder are for "weird" cases that work both before and after
this patch, but would be easy to get wrong with off-by-one problems
in the parsing (and came out of discussions and earlier iterations
of the patch that did get them wrong).
- absent here are tests of boring, expected-to-work cases like
"refs/heads/foo", etc. Those are covered all over the test suite
both explicitly (for-each-ref's refname:short) and implicitly (in
the output of git-status, etc).
Reported-by: 孟子易 <mengziyi540841@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 16:16:21 +01:00
|
|
|
return xmemdupz(short_name, short_name_len);
|
2017-03-28 21:46:33 +02:00
|
|
|
}
|
2009-04-07 09:14:20 +02:00
|
|
|
}
|
|
|
|
|
2017-03-28 21:46:33 +02:00
|
|
|
strbuf_release(&resolved_buf);
|
2011-12-12 06:38:09 +01:00
|
|
|
return xstrdup(refname);
|
2009-04-07 09:14:20 +02:00
|
|
|
}
|
upload/receive-pack: allow hiding ref hierarchies
A repository may have refs that are only used for its internal
bookkeeping purposes that should not be exposed to the others that
come over the network.
Teach upload-pack to omit some refs from its initial advertisement
by paying attention to the uploadpack.hiderefs multi-valued
configuration variable. Do the same to receive-pack via the
receive.hiderefs variable. As a convenient short-hand, allow using
transfer.hiderefs to set the value to both of these variables.
Any ref that is under the hierarchies listed in the value of these
variables is excluded from responses to requests made by "ls-remote",
"fetch", etc. (for upload-pack) and "push" (for receive-pack).
Because these hidden refs do not count as OUR_REF, an attempt to
fetch objects at the tip of them will be rejected, and because these
refs do not get advertised, "git push :" will not see local branches
that have the same name as them as "matching" ones to be sent.
An attempt to update/delete these hidden refs with an explicit
refspec, e.g. "git push origin :refs/hidden/22", is rejected. This
is not a new restriction. To the pusher, it would appear that there
is no such ref, so its push request will conclude with "Now that I
sent you all the data, it is time for you to update the refs. I saw
that the ref did not exist when I started pushing, and I want the
result to point at this commit". The receiving end will apply the
compare-and-swap rule to this request and reject the push with
"Well, your update request conflicts with somebody else; I see there
is such a ref.", which is the right thing to do. Otherwise a push to
a hidden ref will always be "the last one wins", which is not a good
default.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-19 01:08:30 +01:00
|
|
|
|
2019-04-06 13:34:25 +02:00
|
|
|
char *shorten_unambiguous_ref(const char *refname, int strict)
|
|
|
|
{
|
|
|
|
return refs_shorten_unambiguous_ref(get_main_ref_store(the_repository),
|
|
|
|
refname, strict);
|
|
|
|
}
|
|
|
|
|
2022-11-17 06:46:43 +01:00
|
|
|
int parse_hide_refs_config(const char *var, const char *value, const char *section,
|
|
|
|
struct string_list *hide_refs)
|
2013-01-19 01:08:30 +01:00
|
|
|
{
|
2017-02-24 22:08:16 +01:00
|
|
|
const char *key;
|
2013-01-19 01:08:30 +01:00
|
|
|
if (!strcmp("transfer.hiderefs", var) ||
|
2017-02-24 22:08:16 +01:00
|
|
|
(!parse_config_key(var, section, NULL, NULL, &key) &&
|
|
|
|
!strcmp(key, "hiderefs"))) {
|
2013-01-19 01:08:30 +01:00
|
|
|
char *ref;
|
|
|
|
int len;
|
|
|
|
|
|
|
|
if (!value)
|
|
|
|
return config_error_nonbool(var);
|
|
|
|
ref = xstrdup(value);
|
|
|
|
len = strlen(ref);
|
|
|
|
while (len && ref[len - 1] == '/')
|
|
|
|
ref[--len] = '\0';
|
2022-11-17 06:46:39 +01:00
|
|
|
string_list_append_nodup(hide_refs, ref);
|
2013-01-19 01:08:30 +01:00
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2022-11-17 06:46:43 +01:00
|
|
|
int ref_is_hidden(const char *refname, const char *refname_full,
|
|
|
|
const struct string_list *hide_refs)
|
2013-01-19 01:08:30 +01:00
|
|
|
{
|
refs: support negative transfer.hideRefs
If you hide a hierarchy of refs using the transfer.hideRefs
config, there is no way to later override that config to
"unhide" it. This patch implements a "negative" hide which
causes matches to immediately be marked as unhidden, even if
another match would hide it. We take care to apply the
matches in reverse-order from how they are fed to us by the
config machinery, as that lets our usual "last one wins"
config precedence work (and entries in .git/config, for
example, will override /etc/gitconfig).
So you can now do:
$ git config --system transfer.hideRefs refs/secret
$ git config transfer.hideRefs '!refs/secret/not-so-secret'
to hide refs/secret in all repos, except for one public bit
in one specific repo. Or you can even do:
$ git clone \
-u 'git -c transfer.hiderefs="!refs/foo" upload-pack' \
remote:repo.git
to clone remote:repo.git, overriding any hiding it has
configured.
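The reverse-order matching described above can be sketched as follows. This is a simplified illustration under stated assumptions: is_hidden is a hypothetical name, the patterns are passed as a plain array in config order, and only the "!" prefix is handled.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if refname is hidden by patterns[0..nr-1], 0 otherwise.
 * Patterns are scanned last to first, so the most recently configured
 * entry that matches decides; a leading '!' marks an "unhide" entry.
 * (is_hidden is a hypothetical name for this sketch.)
 */
static int is_hidden(const char *refname, const char *const *patterns,
		     size_t nr)
{
	size_t i;

	for (i = nr; i > 0; i--) {
		const char *match = patterns[i - 1];
		int neg = 0;
		size_t match_len;

		if (*match == '!') {
			neg = 1;
			match++;
		}

		/* Match whole path components only: "refs/foo" or "refs/foo/..." */
		match_len = strlen(match);
		if (!strncmp(refname, match, match_len) &&
		    (refname[match_len] == '\0' || refname[match_len] == '/'))
			return !neg;
	}
	return 0;
}
```

With the two entries from the example (refs/secret, then !refs/secret/not-so-secret), the later negative entry is consulted first, so refs/secret/not-so-secret stays visible while the rest of refs/secret is hidden.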
There are two alternatives that were considered and
rejected:
1. A generic config mechanism for removing an item from a
list (e.g., "[transfer] hideRefs -= refs/foo").
This is nice because it could apply to other
multi-valued config, as well. But it is not nearly as
flexible. There is no way to say:
[transfer]
hideRefs = refs/secret
hideRefs = refs/secret/not-so-secret
Having explicit negative specifications means we can
override previous entries, even if they are not the
same literal string.
2. Adding another variable to override some parts of
hideRefs (e.g., "exposeRefs").
This solves the problem from alternative (1), but it
cannot easily obey the normal config precedence,
because it would use two separate lists. For example:
[transfer]
hideRefs = refs/secret
exposeRefs = refs/secret/not-so-secret
hideRefs = refs/secret/not-so-secret/no-really-its-secret
With two lists, we have to apply the "expose" rules
first, and only then apply the "hide" rules. But that
does not match what the above config intends.
Of course we could internally parse that to a single
list, respecting the ordering, which saves us having to
invent the new "!" syntax. But using a single name
communicates to the user that the ordering _is_
important. And "!" is well-known for negation, and
should not appear at the beginning of a ref (it is
actually valid in a ref-name, but all entries here
should be fully-qualified, starting with "refs/").
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-28 22:23:26 +02:00
|
|
|
int i;
|
2013-01-19 01:08:30 +01:00
|
|
|
|
2015-07-28 22:23:26 +02:00
|
|
|
for (i = hide_refs->nr - 1; i >= 0; i--) {
|
|
|
|
const char *match = hide_refs->items[i].string;
|
2015-11-03 08:58:16 +01:00
|
|
|
const char *subject;
|
2015-07-28 22:23:26 +02:00
|
|
|
int neg = 0;
|
2017-07-22 06:39:12 +02:00
|
|
|
const char *p;
|
2015-07-28 22:23:26 +02:00
|
|
|
|
|
|
|
if (*match == '!') {
|
|
|
|
neg = 1;
|
|
|
|
match++;
|
|
|
|
}
|
|
|
|
|
2015-11-03 08:58:16 +01:00
|
|
|
if (*match == '^') {
|
|
|
|
subject = refname_full;
|
|
|
|
match++;
|
|
|
|
} else {
|
|
|
|
subject = refname;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* refname can be NULL when namespaces are used. */
|
2017-07-22 06:39:12 +02:00
|
|
|
if (subject &&
|
|
|
|
skip_prefix(subject, match, &p) &&
|
|
|
|
(!*p || *p == '/'))
|
2015-07-28 22:23:26 +02:00
|
|
|
return !neg;
|
upload/receive-pack: allow hiding ref hierarchies
A repository may have refs that are only used for its internal
bookkeeping purposes that should not be exposed to the others that
come over the network.
Teach upload-pack to omit some refs from its initial advertisement
by paying attention to the uploadpack.hiderefs multi-valued
configuration variable. Do the same to receive-pack via the
receive.hiderefs variable. As a convenient short-hand, allow using
transfer.hiderefs to set the value to both of these variables.
Any ref that is under the hierarchies listed on the value of these
variable is excluded from responses to requests made by "ls-remote",
"fetch", etc. (for upload-pack) and "push" (for receive-pack).
Because these hidden refs do not count as OUR_REF, an attempt to
fetch objects at the tip of them will be rejected, and because these
refs do not get advertised, "git push :" will not see local branches
that have the same name as them as "matching" ones to be sent.
An attempt to update/delete these hidden refs with an explicit
refspec, e.g. "git push origin :refs/hidden/22", is rejected. This
is not a new restriction. To the pusher, it would appear that there
is no such ref, so its push request will conclude with "Now that I
sent you all the data, it is time for you to update the refs. I saw
that the ref did not exist when I started pushing, and I want the
result to point at this commit". The receiving end will apply the
compare-and-swap rule to this request and reject the push with
"Well, your update request conflicts with somebody else; I see there
is such a ref.", which is the right thing to do. Otherwise a push to
a hidden ref will always be "the last one wins", which is not a good
default.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-19 01:08:30 +01:00
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
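The ordering rule from the two commit messages above (a single list, "!" negates an entry, entries name hierarchies) can be sketched in isolation. This is a minimal sketch and not the function blamed above: `is_hidden_sketch` and its plain array argument are hypothetical, it assumes that later entries override earlier ones, and it assumes entries are stored without a trailing slash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if refname is hidden by the ordered
 * rule list, 0 otherwise.  Rules are scanned from last to first, so
 * the most recently configured matching entry decides (assumption).
 */
static int is_hidden_sketch(const char **rules, size_t nr,
			    const char *refname)
{
	size_t i;

	assert(refname);
	for (i = nr; i > 0; i--) {
		const char *match = rules[i - 1];
		size_t len;
		int neg = 0;

		/* A leading "!" turns a "hide" entry into an "expose" entry. */
		if (*match == '!') {
			neg = 1;
			match++;
		}

		/* Match whole path components only: "refs/foo" must not hit "refs/foobar". */
		len = strlen(match);
		if (!strncmp(refname, match, len) &&
		    (refname[len] == '\0' || refname[len] == '/'))
			return !neg;
	}
	return 0;
}
```

With the rules "refs/hidden" followed by "!refs/hidden/public", everything under refs/hidden/ is hidden except the refs/hidden/public hierarchy, which is the kind of configuration two separate lists could not express.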
|
2014-12-12 09:56:59 +01:00
|
|
|
|
2015-11-10 12:42:40 +01:00
|
|
|
const char *find_descendant_ref(const char *dirname,
|
|
|
|
const struct string_list *extras,
|
|
|
|
const struct string_list *skip)
|
2014-12-12 09:56:59 +01:00
|
|
|
{
|
2015-11-10 12:42:40 +01:00
|
|
|
int pos;
|
2014-12-12 09:56:59 +01:00
|
|
|
|
2015-11-10 12:42:40 +01:00
|
|
|
if (!extras)
|
|
|
|
return NULL;
|
2014-12-12 09:56:59 +01:00
|
|
|
|
|
|
|
/*
|
2015-11-10 12:42:40 +01:00
|
|
|
* Look at the place where dirname would be inserted into
|
|
|
|
* extras. If there is an entry at that position that starts
|
|
|
|
* with dirname (remember, dirname includes the trailing
|
|
|
|
* slash) and is not in skip, then we have a conflict.
|
2014-12-12 09:56:59 +01:00
|
|
|
*/
|
2015-11-10 12:42:40 +01:00
|
|
|
for (pos = string_list_find_insert_index(extras, dirname, 0);
|
|
|
|
pos < extras->nr; pos++) {
|
|
|
|
const char *extra_refname = extras->items[pos].string;
|
2014-12-12 09:56:59 +01:00
|
|
|
|
2015-11-10 12:42:40 +01:00
|
|
|
if (!starts_with(extra_refname, dirname))
|
|
|
|
break;
|
|
|
|
|
|
|
|
if (!skip || !string_list_has_string(skip, extra_refname))
|
|
|
|
return extra_refname;
|
2014-12-12 09:56:59 +01:00
|
|
|
}
|
2015-11-10 12:42:40 +01:00
|
|
|
return NULL;
|
|
|
|
}
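The lookup described in the comment of find_descendant_ref() relies on extras being sorted: every name that starts with dirname sits in one contiguous run beginning at dirname's insertion point. A minimal standalone sketch of that idea follows. Plain sorted arrays stand in for struct string_list, and `insert_index_sketch`, `in_list_sketch`, and `find_descendant_sketch` are hypothetical names, not Git functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Lower bound: index of the first entry that does not sort before key. */
static size_t insert_index_sketch(const char **sorted, size_t nr,
				  const char *key)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (strcmp(sorted[mid], key) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Linear membership test; the real code can use a sorted lookup instead. */
static int in_list_sketch(const char **list, size_t nr, const char *s)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(list[i], s))
			return 1;
	return 0;
}

/*
 * Return the first entry of the sorted array "extras" that starts with
 * dirname (which must end in '/') and is not listed in skip, or NULL.
 */
static const char *find_descendant_sketch(const char *dirname,
					  const char **extras, size_t extras_nr,
					  const char **skip, size_t skip_nr)
{
	size_t pos, dirlen;

	assert(dirname);
	dirlen = strlen(dirname);
	for (pos = insert_index_sketch(extras, extras_nr, dirname);
	     pos < extras_nr; pos++) {
		const char *name = extras[pos];

		/* Past the run of names sharing the prefix: nothing more to find. */
		if (strncmp(name, dirname, dirlen))
			break;
		if (!in_list_sketch(skip, skip_nr, name))
			return name;
	}
	return NULL;
}
```

The scan stops at the first name that no longer starts with dirname, so the cost is one binary search plus the length of the matching run rather than a pass over the whole list.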
|
2014-12-12 09:56:59 +01:00
|
|
|
|
2017-08-23 14:36:55 +02:00
|
|
|
int refs_head_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
|
2016-04-07 21:02:48 +02:00
|
|
|
{
|
|
|
|
struct object_id oid;
|
|
|
|
int flag;
|
|
|
|
|
2021-10-16 11:39:27 +02:00
|
|
|
if (refs_resolve_ref_unsafe(refs, "HEAD", RESOLVE_REF_READING,
|
2022-01-26 15:37:01 +01:00
|
|
|
&oid, &flag))
|
2016-04-07 21:02:48 +02:00
|
|
|
return fn("HEAD", &oid, flag, cb_data);
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
int head_ref(each_ref_fn fn, void *cb_data)
|
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_head_ref(get_main_ref_store(the_repository), fn, cb_data);
|
2016-04-07 21:02:48 +02:00
|
|
|
}
|
2016-04-07 21:02:49 +02:00
|
|
|
|
2017-03-20 17:33:08 +01:00
|
|
|
struct ref_iterator *refs_ref_iterator_begin(
|
|
|
|
struct ref_store *refs,
|
2021-09-24 20:39:44 +02:00
|
|
|
const char *prefix, int trim,
|
|
|
|
enum do_for_each_ref_flags flags)
|
2017-03-20 17:33:08 +01:00
|
|
|
{
|
|
|
|
struct ref_iterator *iter;
|
|
|
|
|
2021-09-24 20:42:38 +02:00
|
|
|
if (!(flags & DO_FOR_EACH_INCLUDE_BROKEN)) {
|
2021-09-24 20:46:37 +02:00
|
|
|
static int ref_paranoia = -1;
|
|
|
|
|
2021-09-24 20:42:38 +02:00
|
|
|
if (ref_paranoia < 0)
|
refs: turn on GIT_REF_PARANOIA by default
The original point of the GIT_REF_PARANOIA flag was to include broken
refs in iterations, so that possibly-destructive operations would not
silently ignore them (and would generally instead try to operate on the
oids and fail when the objects could not be accessed).
We already turned this on by default for some dangerous operations, like
"repack -ad" (where missing a reachability tip would mean dropping the
associated history). But it was not on for general use, even though it
could easily result in the spreading of corruption (e.g., imagine
cloning a repository which simply omits some of its refs because
their objects are missing; the result quietly succeeds even though you
did not clone everything!).
This patch turns on GIT_REF_PARANOIA by default. So a clone as mentioned
above would actually fail (upload-pack tells us about the broken ref,
and when we ask for the objects, pack-objects fails to deliver them).
This may be inconvenient when working with a corrupted repository, but:
- we are better off to err on the side of complaining about
corruption, and then provide mechanisms for explicitly loosening
safety.
- this is only one type of corruption anyway. If we are missing any
other objects in the history that _aren't_ ref tips, then we'd
behave similarly (happily show the ref, but then barf when we
started traversing).
We retain the GIT_REF_PARANOIA variable, but simply default it to "1"
instead of "0". That gives the user an escape hatch for loosening this
when working with a corrupt repository. It won't work across a remote
connection to upload-pack (because we can't necessarily set environment
variables on the remote), but there the client has other options (e.g.,
choosing which refs to fetch).
As a bonus, this also makes ref iteration faster in general (because we
don't have to call has_object_file() for each ref), though probably not
noticeably so in the general case. In a repo with a million refs, it
shaved a few hundred milliseconds off of upload-pack's advertisement;
that's noticeable, but most repos are not nearly that large.
The possible downside here is that any operation which iterates refs but
doesn't ever open their objects may now quietly claim to have X when the
object is corrupted (e.g., "git rev-list new-branch --not --all" will
treat a broken ref as uninteresting). But again, that's not really any
different than corruption below the ref level. We might have
refs/heads/old-branch as non-corrupt, but we are not actively checking
that we have the entire reachable history. Or the pointed-to object
could even be corrupted on-disk (but our "do we have it" check would
still succeed). In that sense, this is merely bringing ref-corruption in
line with general object corruption.
One alternative implementation would be to actually check for broken
refs, and then _immediately die_ if we see any. That would cause the
"rev-list --not --all" case above to abort immediately. But in many ways
that's the worst of all worlds:
- it still spends time looking up the objects an extra time
- it still doesn't catch corruption below the ref level
- it's even more inconvenient; with the current implementation of
GIT_REF_PARANOIA for something like upload-pack, we can make
the advertisement and let the client choose a non-broken piece of
history. If we bail as soon as we see a broken ref, they cannot even
see the advertisement.
The test changes here show some of the fallout. A non-destructive "git
repack -adk" now fails by default (but we can override it). Deleting a
broken ref now actually tells the hooks the correct "before" state,
rather than a confusing null oid.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-24 20:46:13 +02:00
|
|
|
ref_paranoia = git_env_bool("GIT_REF_PARANOIA", 1);
|
2021-09-24 20:42:38 +02:00
|
|
|
if (ref_paranoia) {
|
|
|
|
flags |= DO_FOR_EACH_INCLUDE_BROKEN;
|
|
|
|
flags |= DO_FOR_EACH_OMIT_DANGLING_SYMREFS;
|
|
|
|
}
|
|
|
|
}
|
2017-05-22 16:17:52 +02:00
|
|
|
|
2017-03-20 17:33:08 +01:00
|
|
|
iter = refs->be->iterator_begin(refs, prefix, flags);
|
2017-05-22 16:17:36 +02:00
|
|
|
|
|
|
|
/*
|
|
|
|
* `iterator_begin()` already takes care of prefix, but we
|
|
|
|
* might need to do some trimming:
|
|
|
|
*/
|
|
|
|
if (trim)
|
|
|
|
iter = prefix_ref_iterator_begin(iter, "", trim);
|
2017-03-20 17:33:08 +01:00
|
|
|
|
2017-09-13 19:15:55 +02:00
|
|
|
/* Sanity check for subclasses: */
|
|
|
|
if (!iter->ordered)
|
|
|
|
BUG("reference iterator is not ordered");
|
|
|
|
|
2017-03-20 17:33:08 +01:00
|
|
|
return iter;
|
|
|
|
}
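The flag handling at the top of refs_ref_iterator_begin() reads the GIT_REF_PARANOIA environment variable once, defaults it to "on", and only then widens the iteration flags. A minimal sketch of that pattern follows, assuming a much simpler boolean parser than Git's (case-sensitive, and unrecognized values count as true). The `SKETCH_*` constants and all three function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_INCLUDE_BROKEN        (1u << 0)
#define SKETCH_OMIT_DANGLING_SYMREFS (1u << 1)

/* Simplified boolean parsing: an unset variable yields the default. */
static int parse_bool_sketch(const char *value, int def)
{
	if (!value)
		return def;
	if (!*value || !strcmp(value, "0") || !strcmp(value, "false") ||
	    !strcmp(value, "no") || !strcmp(value, "off"))
		return 0;
	return 1;
}

/*
 * Callers that already asked for broken refs are left alone.  Everyone
 * else gets broken refs included and dangling symrefs omitted when
 * paranoia is enabled.
 */
static unsigned apply_paranoia_sketch(unsigned flags, int paranoia)
{
	if (!(flags & SKETCH_INCLUDE_BROKEN) && paranoia)
		flags |= SKETCH_INCLUDE_BROKEN | SKETCH_OMIT_DANGLING_SYMREFS;
	return flags;
}

/* Look the variable up lazily and remember the answer for later calls. */
unsigned effective_flags_sketch(unsigned flags)
{
	static int paranoia = -1;

	if (paranoia < 0)
		paranoia = parse_bool_sketch(getenv("GIT_REF_PARANOIA"), 1);
	return apply_paranoia_sketch(flags, paranoia);
}
```

Setting the variable to 0 is the escape hatch mentioned in the commit message: the flags then pass through unchanged.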
|
|
|
|
|
do_for_each_ref(): reimplement using reference iteration
Use the reference iterator interface to implement do_for_each_ref().
Delete a bunch of code supporting the old for_each_ref() implementation.
And now that do_for_each_ref() is generic code (it is no longer tied to
the files backend), move it to refs.c.
The implementation is via a new function, do_for_each_ref_iterator(),
which takes a reference iterator as argument and calls a callback
function for each of the references in the iterator.
This change requires the current_ref performance hack for peel_ref() to
be implemented via ref_iterator_peel() rather than peel_entry() because
we don't have a ref_entry handy (it is hidden under three layers:
file_ref_iterator, merge_ref_iterator, and cache_ref_iterator). So:
* do_for_each_ref_iterator() records the active iterator in
current_ref_iter while it is running.
* peel_ref() checks whether current_ref_iter is pointing at the
requested reference. If so, it asks the iterator to peel the
reference (which it can do efficiently via its "peel" virtual
function). For extra safety, we do the optimization only if the
refname *addresses* are the same, not only if the refname *strings*
are the same, to forestall possible mixups between refnames that come
from different ref_iterators.
Please note that this optimization of peel_ref() is only available when
iterating via do_for_each_ref_iterator() (including all of the
for_each_ref() functions, which call it indirectly). It would be
complicated to implement a similar optimization when iterating directly
using a reference iterator, because multiple reference iterators can be
in use at the same time, with interleaved calls to
ref_iterator_advance(). (In fact we do exactly that in
merge_ref_iterator.)
But that is not necessary. peel_ref() is only called while iterating
over references. Callers who iterate using the for_each_ref() functions
benefit from the optimization described above. Callers who iterate using
reference iterators directly have access to the ref_iterator, so they
can call ref_iterator_peel() themselves to get an analogous optimization
in a more straightforward manner.
If we rewrite all callers to use the reference iteration API, then we
can remove the current_ref_iter hack permanently.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-18 06:15:16 +02:00
|
|
|
/*
|
|
|
|
* Call fn for each reference in the specified submodule for which the
|
|
|
|
* refname begins with prefix. If trim is non-zero, then trim that
|
|
|
|
* many characters off the beginning of each refname before passing
|
|
|
|
* the refname to fn. flags can be DO_FOR_EACH_INCLUDE_BROKEN to
|
|
|
|
* include broken references in the iteration. If fn ever returns a
|
|
|
|
* non-zero value, stop the iteration and return that value;
|
|
|
|
* otherwise, return 0.
|
|
|
|
*/
|
2018-08-20 20:24:16 +02:00
|
|
|
static int do_for_each_repo_ref(struct repository *r, const char *prefix,
|
|
|
|
each_repo_ref_fn fn, int trim, int flags,
|
|
|
|
void *cb_data)
|
|
|
|
{
|
|
|
|
struct ref_iterator *iter;
|
|
|
|
struct ref_store *refs = get_main_ref_store(r);
|
|
|
|
|
|
|
|
if (!refs)
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
iter = refs_ref_iterator_begin(refs, prefix, trim, flags);
|
|
|
|
|
|
|
|
return do_for_each_repo_ref_iterator(r, iter, fn, cb_data);
|
|
|
|
}
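The contract stated in the comment above (filter by prefix, trim leading characters, stop at the first non-zero callback result) can be shown without the iterator machinery. This is a minimal sketch over a plain array of names. `each_name_fn`, `for_each_name_sketch`, and the counting callback are hypothetical, and the requirement that trim not exceed the prefix length is a simplification made here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*each_name_fn)(const char *refname, void *cb_data);

/*
 * Call fn for every name that begins with prefix, after dropping the
 * first "trim" characters.  A non-zero return from fn ends the walk
 * and becomes the result; otherwise the result is 0.
 */
int for_each_name_sketch(const char **refnames, size_t nr,
			 const char *prefix, size_t trim,
			 each_name_fn fn, void *cb_data)
{
	size_t i, prefix_len = strlen(prefix);

	/* Simplification: never trim more than the prefix guarantees exists. */
	assert(trim <= prefix_len);
	for (i = 0; i < nr; i++) {
		int ret;

		if (strncmp(refnames[i], prefix, prefix_len))
			continue;
		ret = fn(refnames[i] + trim, cb_data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example callback state: count calls, optionally stop at one name. */
struct count_cb_sketch {
	size_t seen;
	const char *stop_at;
};

int count_cb_fn_sketch(const char *refname, void *cb_data)
{
	struct count_cb_sketch *cb = cb_data;

	cb->seen++;
	if (cb->stop_at && !strcmp(refname, cb->stop_at))
		return 42;
	return 0;
}
```

Passing strlen(prefix) as trim gives the for_each_ref_in() flavor (callbacks see short names), while passing 0 gives the "fullref" flavor.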
|
|
|
|
|
|
|
|
struct do_for_each_ref_help {
|
|
|
|
each_ref_fn *fn;
|
|
|
|
void *cb_data;
|
|
|
|
};
|
|
|
|
|
|
|
|
static int do_for_each_ref_helper(struct repository *r,
|
|
|
|
const char *refname,
|
|
|
|
const struct object_id *oid,
|
|
|
|
int flags,
|
|
|
|
void *cb_data)
|
|
|
|
{
|
|
|
|
struct do_for_each_ref_help *hp = cb_data;
|
|
|
|
|
|
|
|
return hp->fn(refname, oid, flags, hp->cb_data);
|
|
|
|
}
|
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
static int do_for_each_ref(struct ref_store *refs, const char *prefix,
|
2021-09-24 20:39:44 +02:00
|
|
|
each_ref_fn fn, int trim,
|
|
|
|
enum do_for_each_ref_flags flags, void *cb_data)
|
2016-06-18 06:15:16 +02:00
|
|
|
{
|
|
|
|
struct ref_iterator *iter;
|
2018-08-20 20:24:16 +02:00
|
|
|
struct do_for_each_ref_help hp = { fn, cb_data };
|
2016-06-18 06:15:16 +02:00
|
|
|
|
2016-09-04 18:08:11 +02:00
|
|
|
if (!refs)
|
|
|
|
return 0;
|
|
|
|
|
2017-03-20 17:33:08 +01:00
|
|
|
iter = refs_ref_iterator_begin(refs, prefix, trim, flags);
|
2016-06-18 06:15:16 +02:00
|
|
|
|
2018-08-20 20:24:16 +02:00
|
|
|
return do_for_each_repo_ref_iterator(the_repository, iter,
|
|
|
|
do_for_each_ref_helper, &hp);
|
2016-06-18 06:15:16 +02:00
|
|
|
}
|
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
int refs_for_each_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
|
|
|
|
{
|
|
|
|
return do_for_each_ref(refs, "", fn, 0, 0, cb_data);
|
|
|
|
}
|
|
|
|
|
2016-04-07 21:02:49 +02:00
|
|
|
int for_each_ref(each_ref_fn fn, void *cb_data)
|
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_for_each_ref(get_main_ref_store(the_repository), fn, cb_data);
|
2016-04-07 21:02:49 +02:00
|
|
|
}
|
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
int refs_for_each_ref_in(struct ref_store *refs, const char *prefix,
|
|
|
|
each_ref_fn fn, void *cb_data)
|
|
|
|
{
|
|
|
|
return do_for_each_ref(refs, prefix, fn, strlen(prefix), 0, cb_data);
|
2016-04-07 21:02:49 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
int for_each_ref_in(const char *prefix, each_ref_fn fn, void *cb_data)
|
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_for_each_ref_in(get_main_ref_store(the_repository), prefix, fn, cb_data);
|
2016-04-07 21:02:49 +02:00
|
|
|
}
|
|
|
|
|
2021-09-24 20:48:48 +02:00
|
|
|
int for_each_fullref_in(const char *prefix, each_ref_fn fn, void *cb_data)
|
2016-04-07 21:02:49 +02:00
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return do_for_each_ref(get_main_ref_store(the_repository),
|
2021-09-24 20:48:48 +02:00
|
|
|
prefix, fn, 0, 0, cb_data);
|
2016-04-07 21:02:49 +02:00
|
|
|
}
|
|
|
|
|
2017-08-23 14:36:56 +02:00
|
|
|
int refs_for_each_fullref_in(struct ref_store *refs, const char *prefix,
|
2021-09-24 20:48:48 +02:00
|
|
|
each_ref_fn fn, void *cb_data)
|
2017-06-18 15:39:41 +02:00
|
|
|
{
|
2021-09-24 20:48:48 +02:00
|
|
|
return do_for_each_ref(refs, prefix, fn, 0, 0, cb_data);
|
2017-06-18 15:39:41 +02:00
|
|
|
}
|
|
|
|
|
2018-08-20 20:24:19 +02:00
|
|
|
int for_each_replace_ref(struct repository *r, each_repo_ref_fn fn, void *cb_data)
|
2016-04-07 21:02:49 +02:00
|
|
|
{
|
2022-08-05 19:58:37 +02:00
|
|
|
const char *git_replace_ref_base = ref_namespace[NAMESPACE_REPLACE].ref;
|
2018-08-20 20:24:19 +02:00
|
|
|
return do_for_each_repo_ref(r, git_replace_ref_base, fn,
|
|
|
|
strlen(git_replace_ref_base),
|
|
|
|
DO_FOR_EACH_INCLUDE_BROKEN, cb_data);
|
2016-04-07 21:02:49 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
int for_each_namespaced_ref(each_ref_fn fn, void *cb_data)
|
|
|
|
{
|
|
|
|
struct strbuf buf = STRBUF_INIT;
|
|
|
|
int ret;
|
|
|
|
strbuf_addf(&buf, "%srefs/", get_git_namespace());
|
2018-04-12 02:21:09 +02:00
|
|
|
ret = do_for_each_ref(get_main_ref_store(the_repository),
|
2017-03-26 04:42:34 +02:00
|
|
|
buf.buf, fn, 0, 0, cb_data);
|
2016-04-07 21:02:49 +02:00
|
|
|
strbuf_release(&buf);
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
int refs_for_each_rawref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
|
2016-04-07 21:02:49 +02:00
|
|
|
{
|
2017-03-26 04:42:34 +02:00
|
|
|
return do_for_each_ref(refs, "", fn, 0,
|
2016-04-07 21:02:49 +02:00
|
|
|
DO_FOR_EACH_INCLUDE_BROKEN, cb_data);
|
|
|
|
}
|
2016-04-07 21:03:10 +02:00
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
int for_each_rawref(each_ref_fn fn, void *cb_data)
|
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_for_each_rawref(get_main_ref_store(the_repository), fn, cb_data);
|
2017-03-26 04:42:34 +02:00
|
|
|
}
|
|
|
|
|
2021-01-20 17:04:21 +01:00
|
|
|
static int qsort_strcmp(const void *va, const void *vb)
|
|
|
|
{
|
|
|
|
const char *a = *(const char **)va;
|
|
|
|
const char *b = *(const char **)vb;
|
|
|
|
|
|
|
|
return strcmp(a, b);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void find_longest_prefixes_1(struct string_list *out,
|
|
|
|
struct strbuf *prefix,
|
|
|
|
const char **patterns, size_t nr)
|
|
|
|
{
|
|
|
|
size_t i;
|
|
|
|
|
|
|
|
for (i = 0; i < nr; i++) {
|
|
|
|
char c = patterns[i][prefix->len];
|
|
|
|
if (!c || is_glob_special(c)) {
|
|
|
|
string_list_append(out, prefix->buf);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
i = 0;
|
|
|
|
while (i < nr) {
|
|
|
|
size_t end;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Set "end" to the index of the element _after_ the last one
|
|
|
|
* in our group.
|
|
|
|
*/
|
|
|
|
for (end = i + 1; end < nr; end++) {
|
|
|
|
if (patterns[i][prefix->len] != patterns[end][prefix->len])
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
strbuf_addch(prefix, patterns[i][prefix->len]);
|
|
|
|
find_longest_prefixes_1(out, prefix, patterns + i, end - i);
|
|
|
|
strbuf_setlen(prefix, prefix->len - 1);
|
|
|
|
|
|
|
|
i = end;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static void find_longest_prefixes(struct string_list *out,
|
|
|
|
const char **patterns)
|
|
|
|
{
|
|
|
|
struct strvec sorted = STRVEC_INIT;
|
|
|
|
struct strbuf prefix = STRBUF_INIT;
|
|
|
|
|
|
|
|
strvec_pushv(&sorted, patterns);
|
|
|
|
QSORT(sorted.v, sorted.nr, qsort_strcmp);
|
|
|
|
|
|
|
|
find_longest_prefixes_1(out, &prefix, sorted.v, sorted.nr);
|
|
|
|
|
|
|
|
strvec_clear(&sorted);
|
|
|
|
strbuf_release(&prefix);
|
|
|
|
}
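The recursion in find_longest_prefixes_1() is easier to follow on concrete input. The following is a minimal standalone sketch of the same grouping idea, assuming fixed-size buffers in place of strbuf and string_list and assuming that the glob-special characters are `*`, `?`, `[` and `\`. All names ending in `_sketch`, the `prefix_out_sketch` struct, and the two size limits are hypothetical. Unlike the function above, it sorts the caller's array in place rather than a copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_PREFIXES 16
#define SKETCH_MAX_LEN 256

struct prefix_out_sketch {
	char items[SKETCH_MAX_PREFIXES][SKETCH_MAX_LEN];
	size_t nr;
};

static int is_glob_special_sketch(char c)
{
	return c == '*' || c == '?' || c == '[' || c == '\\';
}

static int cmp_str_sketch(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/*
 * All "nr" patterns share their first "len" characters, held in prefix.
 * If any of them ends or turns into a glob right here, the shared
 * prefix is the longest literal one and is emitted.  Otherwise split
 * the sorted patterns into runs by their next character and recurse.
 */
static void longest_prefixes_1_sketch(struct prefix_out_sketch *out,
				      char *prefix, size_t len,
				      const char **patterns, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		char c = patterns[i][len];

		if (!c || is_glob_special_sketch(c)) {
			assert(out->nr < SKETCH_MAX_PREFIXES);
			memcpy(out->items[out->nr], prefix, len);
			out->items[out->nr][len] = '\0';
			out->nr++;
			return;
		}
	}

	i = 0;
	while (i < nr) {
		size_t end;

		/* "end" is one past the last pattern with the same next character. */
		for (end = i + 1; end < nr; end++)
			if (patterns[i][len] != patterns[end][len])
				break;

		assert(len + 1 < SKETCH_MAX_LEN);
		prefix[len] = patterns[i][len];
		longest_prefixes_1_sketch(out, prefix, len + 1,
					  patterns + i, end - i);
		i = end;
	}
}

void longest_prefixes_sketch(struct prefix_out_sketch *out,
			     const char **patterns, size_t nr)
{
	char prefix[SKETCH_MAX_LEN];

	out->nr = 0;
	qsort(patterns, nr, sizeof(*patterns), cmp_str_sketch);
	longest_prefixes_1_sketch(out, prefix, 0, patterns, nr);
}
```

For the patterns "refs/heads/foo/*", "refs/heads/bar" and "refs/tags/v*", the sketch yields three disjoint prefixes, so each can be iterated once without visiting a ref twice. When one pattern's literal part is a prefix of another's, as with "refs/heads/*" and "refs/heads/main", only the shorter prefix is emitted.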
|
|
|
|
|
2022-12-13 12:11:10 +01:00
|
|
|
int refs_for_each_fullref_in_prefixes(struct ref_store *ref_store,
|
|
|
|
const char *namespace,
|
|
|
|
const char **patterns,
|
|
|
|
each_ref_fn fn, void *cb_data)
|
2021-01-20 17:04:21 +01:00
|
|
|
{
|
|
|
|
struct string_list prefixes = STRING_LIST_INIT_DUP;
|
|
|
|
struct string_list_item *prefix;
|
|
|
|
struct strbuf buf = STRBUF_INIT;
|
|
|
|
int ret = 0, namespace_len;
|
|
|
|
|
|
|
|
find_longest_prefixes(&prefixes, patterns);
|
|
|
|
|
|
|
|
if (namespace)
|
|
|
|
strbuf_addstr(&buf, namespace);
|
|
|
|
namespace_len = buf.len;
|
|
|
|
|
|
|
|
for_each_string_list_item(prefix, &prefixes) {
|
|
|
|
strbuf_addstr(&buf, prefix->string);
|
2022-12-13 12:11:10 +01:00
|
|
|
ret = refs_for_each_fullref_in(ref_store, buf.buf, fn, cb_data);
|
2021-01-20 17:04:21 +01:00
|
|
|
if (ret)
|
|
|
|
break;
|
|
|
|
strbuf_setlen(&buf, namespace_len);
|
|
|
|
}
|
|
|
|
|
|
|
|
string_list_clear(&prefixes, 0);
|
|
|
|
strbuf_release(&buf);
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2020-08-19 16:27:58 +02:00
|
|
|
static int refs_read_special_head(struct ref_store *ref_store,
|
|
|
|
const char *refname, struct object_id *oid,
|
2021-10-16 11:39:10 +02:00
|
|
|
struct strbuf *referent, unsigned int *type,
|
|
|
|
int *failure_errno)
|
2020-08-19 16:27:58 +02:00
|
|
|
{
|
|
|
|
struct strbuf full_path = STRBUF_INIT;
|
|
|
|
struct strbuf content = STRBUF_INIT;
|
|
|
|
int result = -1;
|
|
|
|
strbuf_addf(&full_path, "%s/%s", ref_store->gitdir, refname);
|
|
|
|
|
|
|
|
if (strbuf_read_file(&content, full_path.buf, 0) < 0)
|
|
|
|
goto done;
|
|
|
|
|
2021-10-16 11:39:10 +02:00
|
|
|
result = parse_loose_ref_contents(content.buf, oid, referent, type,
|
|
|
|
failure_errno);
|
2020-08-19 16:27:58 +02:00
|
|
|
|
|
|
|
done:
|
|
|
|
strbuf_release(&full_path);
|
|
|
|
strbuf_release(&content);
|
|
|
|
return result;
|
|
|
|
}
|
|
|
|
|
2021-10-16 11:39:09 +02:00
|
|
|
int refs_read_raw_ref(struct ref_store *ref_store, const char *refname,
|
|
|
|
struct object_id *oid, struct strbuf *referent,
|
|
|
|
unsigned int *type, int *failure_errno)
|
2017-03-20 17:33:07 +01:00
|
|
|
{
|
2021-10-16 11:39:09 +02:00
|
|
|
assert(failure_errno);
|
2020-08-19 16:27:58 +02:00
|
|
|
if (!strcmp(refname, "FETCH_HEAD") || !strcmp(refname, "MERGE_HEAD")) {
|
|
|
|
return refs_read_special_head(ref_store, refname, oid, referent,
|
2021-10-16 11:39:10 +02:00
|
|
|
type, failure_errno);
|
2020-08-19 16:27:58 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
return ref_store->be->read_raw_ref(ref_store, refname, oid, referent,
|
2021-10-16 11:39:09 +02:00
|
|
|
type, failure_errno);
|
2017-03-20 17:33:07 +01:00
|
|
|
}
|
|
|
|
|
refs: add ability for backends to special-case reading of symbolic refs
Reading of symbolic and non-symbolic references is currently treated the
same in reference backends: we always call `refs_read_raw_ref()` and
then decide based on the returned flags what type it is. This has one
downside though: a backend may treat symbolic references differently
from normal references. The packed-refs
backend for example doesn't even know about symbolic references, and as
a result it is pointless to even ask it for one.
There are cases where we really only care about whether a reference is
symbolic or not, but don't care about whether it exists at all or may be
a non-symbolic reference. But it is not possible to optimize for this
case right now, and as a consequence we will always first check for a
loose reference to exist, and if it doesn't, we'll query the packed-refs
backend for a known-to-not-be-symbolic reference. This is inefficient
and requires us to search all packed references even though we know we
do not care about the result at all.
Introduce a new function `refs_read_symbolic_ref()` which allows us to
fix this case. This function will only ever return symbolic references
and can thus optimize for the scenario laid out above. By default, if
the backend doesn't provide an implementation for it, we just use the
old code path and fall back to `read_raw_ref()`. But in case the backend
provides its own, more efficient implementation, we will use that one
instead.
Note that this function is explicitly designed to not distinguish
between missing references and non-symbolic references. If it did, we'd
be forced to always search the packed-refs backend to see whether the
symbolic reference the user asked for really doesn't exist, or if it
exists as a non-symbolic reference.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-01 10:33:46 +01:00
|
|
|
int refs_read_symbolic_ref(struct ref_store *ref_store, const char *refname,
|
|
|
|
struct strbuf *referent)
|
|
|
|
{
|
2022-03-17 18:27:19 +01:00
|
|
|
return ref_store->be->read_symbolic_ref(ref_store, refname, referent);
|
2022-03-01 10:33:46 +01:00
|
|
|
}
|
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
const char *refs_resolve_ref_unsafe(struct ref_store *refs,
|
2017-02-09 21:53:52 +01:00
|
|
|
const char *refname,
|
|
|
|
int resolve_flags,
|
2021-10-16 11:39:08 +02:00
|
|
|
struct object_id *oid,
|
2022-01-26 15:37:01 +01:00
|
|
|
int *flags)
|
2016-04-07 21:03:10 +02:00
|
|
|
{
|
|
|
|
static struct strbuf sb_refname = STRBUF_INIT;
|
2017-09-23 11:41:45 +02:00
|
|
|
struct object_id unused_oid;
|
2016-04-07 21:03:10 +02:00
|
|
|
int unused_flags;
|
|
|
|
int symref_count;
|
|
|
|
|
refs: convert resolve_ref_unsafe to struct object_id
Convert resolve_ref_unsafe to take a pointer to struct object_id by
converting one remaining caller to use struct object_id, removing the
temporary NULL pointer check in expand_ref, converting the declaration
and definition, and applying the following semantic patch:
@@
expression E1, E2, E3, E4;
@@
- resolve_ref_unsafe(E1, E2, E3.hash, E4)
+ resolve_ref_unsafe(E1, E2, &E3, E4)
@@
expression E1, E2, E3, E4;
@@
- resolve_ref_unsafe(E1, E2, E3->hash, E4)
+ resolve_ref_unsafe(E1, E2, E3, E4)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
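
As a rough illustration of what the semantic patch above does at a call site, the sketch below contrasts a function that takes raw hash bytes with one that takes the whole struct. The `demo_` names, the fixed 20-byte size, and the fill value are assumptions for the example and do not come from Git.

```c
#include <string.h>

#define DEMO_RAWSZ 20

/* Hypothetical stand-in for struct object_id. */
struct demo_object_id {
	unsigned char hash[DEMO_RAWSZ];
};

/* Old style: the caller hands over the raw hash bytes (E3.hash). */
void demo_resolve_old(unsigned char *hash)
{
	memset(hash, 0xab, DEMO_RAWSZ);
}

/* New style: the caller hands over the struct itself (&E3). */
void demo_resolve_new(struct demo_object_id *oid)
{
	memset(oid->hash, 0xab, DEMO_RAWSZ);
}
```

A caller that used to write `demo_resolve_old(id.hash)` would write `demo_resolve_new(&id)` after the conversion, and a caller holding a pointer would pass `idp` instead of `idp->hash`.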
2017-10-16 00:07:09 +02:00
|
|
|
if (!oid)
|
|
|
|
oid = &unused_oid;
|
2016-04-07 21:03:10 +02:00
|
|
|
if (!flags)
|
|
|
|
flags = &unused_flags;
|
|
|
|
|
|
|
|
*flags = 0;
|
|
|
|
|
|
|
|
if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL)) {
|
|
|
|
if (!(resolve_flags & RESOLVE_REF_ALLOW_BAD_NAME) ||
|
2022-01-26 15:37:01 +01:00
|
|
|
!refname_is_safe(refname))
|
2016-04-07 21:03:10 +02:00
|
|
|
return NULL;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* dwim_ref() uses REF_ISBROKEN to distinguish between
|
|
|
|
* missing refs and refs that were present but invalid,
|
|
|
|
* to complain about the latter to stderr.
|
|
|
|
*
|
|
|
|
* We don't know whether the ref exists, so don't set
|
|
|
|
* REF_ISBROKEN yet.
|
|
|
|
*/
|
|
|
|
*flags |= REF_BAD_NAME;
|
|
|
|
}
|
|
|
|
|
|
|
|
for (symref_count = 0; symref_count < SYMREF_MAXDEPTH; symref_count++) {
|
|
|
|
unsigned int read_flags = 0;
|
2022-01-26 15:37:01 +01:00
|
|
|
int failure_errno;
|
2016-04-07 21:03:10 +02:00
|
|
|
|
2021-10-16 11:39:09 +02:00
|
|
|
if (refs_read_raw_ref(refs, refname, oid, &sb_refname,
|
2022-01-26 15:37:01 +01:00
|
|
|
&read_flags, &failure_errno)) {
|
2016-04-07 21:03:10 +02:00
|
|
|
*flags |= read_flags;
|
refs_resolve_ref_unsafe: handle d/f conflicts for writes
If our call to refs_read_raw_ref() fails, we check errno to
see if the ref is simply missing, or if we encountered a
more serious error. If it's just missing, then in "write"
mode (i.e., when RESOLVE_REF_READING is not set), this is
perfectly fine.
However, checking for ENOENT isn't sufficient to catch all
missing-ref cases. In the filesystem backend, we may also
see EISDIR when we try to resolve "a" and "a/b" exists.
Likewise, we may see ENOTDIR if we try to resolve "a/b" and
"a" exists. In both of those cases, we know that our
resolved ref doesn't exist, but we return an error (rather
than reporting the refname and returning a null sha1).
This has been broken for a long time, but nobody really
noticed because the next step after resolving without the
READING flag is usually to lock the ref and write it. But in
both of those cases, the write will fail with the same
errno due to the directory/file conflict.
There are two cases where we can notice this, though:
1. If we try to write "a" and there's a leftover directory
already at "a", even though there is no ref "a/b". The
actual write is smart enough to move the empty "a" out
of the way.
This is reasonably rare, if only because the writing
code has to do an independent resolution before trying
its write (because the actual update_ref() code handles
this case fine). The notes-merge code does this, and
before the fix in the prior commit t3308 erroneously
expected this case to fail.
2. When resolving symbolic refs, we typically do not use
the READING flag because we want to resolve even
symrefs that point to unborn refs, even if those unborn
refs could not actually be written because of d/f
conflicts with existing refs.
You can see this by asking "git symbolic-ref" to report
the target of a symref pointing past a d/f conflict.
We can fix the problem by recognizing the other "missing"
errnos and treating them like ENOENT. This should be safe to
do even for callers who are then going to actually write the
ref, because the actual writing process will fail if the d/f
conflict is a real one (and t1404 checks these cases).
Arguably this should be the responsibility of the
files-backend to normalize all "missing ref" errors into
ENOENT (since something like EISDIR may not be meaningful at
all to a database backend). However other callers of
refs_read_raw_ref() may actually care about the distinction;
putting this into resolve_ref() is the minimal fix for now.
The new tests in t1401 use git-symbolic-ref, which is the
most direct way to check the resolution by itself.
Interestingly we actually had a test that set up this case
already, but we only used it to verify that the funny state
could be overwritten, not that it could be resolved.
We also add a new test in t3200, as "branch -m" was the
original motivation for looking into this. What happens is
this:
0. HEAD is pointing to branch "a"
1. The user asks to rename "a" to "a/b".
2. We create "a/b" and delete "a".
3. We then try to update any worktree HEADs that point to
the renamed ref (including the main repo HEAD). To do
that, we have to resolve each HEAD. But now our HEAD is
pointing at "a", and we get EISDIR due to the loose
"a/b". As a result, we think there is no HEAD, and we
do not update it. It now points to the bogus "a".
Interestingly this case used to work, but only accidentally.
Before 31824d180d (branch: fix branch renaming not updating
HEADs correctly, 2017-08-24), we'd update any HEAD which we
couldn't resolve. That was wrong, but it papered over the
fact that we were incorrectly failing to resolve HEAD.
So while the bug demonstrated by git-symbolic-ref is
quite old, the regression to "branch -m" is recent.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
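
The fix described above amounts to widening the set of errno values that count as "ref is missing". A minimal sketch of that classification, using a hypothetical helper name that does not exist in Git:

```c
#include <errno.h>

/*
 * Return 1 if err is one of the errno values that, per the message above,
 * mean "this ref does not exist" in the files backend; 0 otherwise.
 */
int demo_errno_means_missing_ref(int err)
{
	switch (err) {
	case ENOENT:   /* nothing at that path */
	case EISDIR:   /* resolving "a" while "a/b" exists */
	case ENOTDIR:  /* resolving "a/b" while "a" exists */
		return 1;
	default:
		return 0;
	}
}
```

In write mode a caller would treat a result of 1 as "unborn ref, keep going" and anything else as a real error.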
2017-10-06 16:42:17 +02:00
|
|
|
|
|
|
|
/* In reading mode, refs must eventually resolve */
|
|
|
|
if (resolve_flags & RESOLVE_REF_READING)
|
|
|
|
return NULL;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Otherwise a missing ref is OK. But the files backend
|
|
|
|
* may show errors besides ENOENT if there are
|
|
|
|
* similarly-named refs.
|
|
|
|
*/
|
2022-01-26 15:37:01 +01:00
|
|
|
if (failure_errno != ENOENT &&
|
|
|
|
failure_errno != EISDIR &&
|
|
|
|
failure_errno != ENOTDIR)
|
2016-04-07 21:03:10 +02:00
|
|
|
return NULL;
|
2017-10-06 16:42:17 +02:00
|
|
|
|
2017-10-16 00:07:09 +02:00
|
|
|
oidclr(oid);
|
2016-04-07 21:03:10 +02:00
|
|
|
if (*flags & REF_BAD_NAME)
|
|
|
|
*flags |= REF_ISBROKEN;
|
|
|
|
return refname;
|
|
|
|
}
|
|
|
|
|
|
|
|
*flags |= read_flags;
|
|
|
|
|
|
|
|
if (!(read_flags & REF_ISSYMREF)) {
|
|
|
|
if (*flags & REF_BAD_NAME) {
|
2017-10-16 00:07:09 +02:00
|
|
|
oidclr(oid);
|
2016-04-07 21:03:10 +02:00
|
|
|
*flags |= REF_ISBROKEN;
|
|
|
|
}
|
|
|
|
return refname;
|
|
|
|
}
|
|
|
|
|
|
|
|
refname = sb_refname.buf;
|
|
|
|
if (resolve_flags & RESOLVE_REF_NO_RECURSE) {
|
2017-10-16 00:07:09 +02:00
|
|
|
oidclr(oid);
|
2016-04-07 21:03:10 +02:00
|
|
|
return refname;
|
|
|
|
}
|
|
|
|
if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL)) {
|
|
|
|
if (!(resolve_flags & RESOLVE_REF_ALLOW_BAD_NAME) ||
|
2022-01-26 15:37:01 +01:00
|
|
|
!refname_is_safe(refname))
|
2016-04-07 21:03:10 +02:00
|
|
|
return NULL;
|
|
|
|
|
|
|
|
*flags |= REF_ISBROKEN | REF_BAD_NAME;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return NULL;
|
|
|
|
}
|
2016-09-04 18:08:11 +02:00
|
|
|
|
2016-09-04 18:08:41 +02:00
|
|
|
/* backend functions */
|
|
|
|
int refs_init_db(struct strbuf *err)
|
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
struct ref_store *refs = get_main_ref_store(the_repository);
|
2016-09-04 18:08:41 +02:00
|
|
|
|
|
|
|
return refs->be->init_db(refs, err);
|
|
|
|
}
|
|
|
|
|
2016-09-04 18:08:21 +02:00
|
|
|
const char *resolve_ref_unsafe(const char *refname, int resolve_flags,
|
2017-10-16 00:07:09 +02:00
|
|
|
struct object_id *oid, int *flags)
|
2016-09-04 18:08:21 +02:00
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_resolve_ref_unsafe(get_main_ref_store(the_repository), refname,
|
2022-01-26 15:37:01 +01:00
|
|
|
resolve_flags, oid, flags);
|
2016-09-04 18:08:21 +02:00
|
|
|
}
|
|
|
|
|
2016-09-04 18:08:24 +02:00
|
|
|
int resolve_gitlink_ref(const char *submodule, const char *refname,
|
refs: convert resolve_gitlink_ref to struct object_id
Convert the declaration and definition of resolve_gitlink_ref to use
struct object_id and apply the following semantic patch:
@@
expression E1, E2, E3;
@@
- resolve_gitlink_ref(E1, E2, E3.hash)
+ resolve_gitlink_ref(E1, E2, &E3)
@@
expression E1, E2, E3;
@@
- resolve_gitlink_ref(E1, E2, E3->hash)
+ resolve_gitlink_ref(E1, E2, E3)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 00:07:07 +02:00
|
|
|
struct object_id *oid)
|
2016-09-04 18:08:22 +02:00
|
|
|
{
|
|
|
|
struct ref_store *refs;
|
|
|
|
int flags;
|
|
|
|
|
2017-08-23 14:36:54 +02:00
|
|
|
refs = get_submodule_ref_store(submodule);
|
2016-09-04 18:08:23 +02:00
|
|
|
|
2016-09-04 18:08:22 +02:00
|
|
|
if (!refs)
|
|
|
|
return -1;
|
|
|
|
|
2022-01-26 15:37:01 +01:00
|
|
|
if (!refs_resolve_ref_unsafe(refs, refname, 0, oid, &flags) ||
|
|
|
|
is_null_oid(oid))
|
2016-09-04 18:08:22 +02:00
|
|
|
return -1;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2017-04-04 12:21:20 +02:00
|
|
|
struct ref_store_hash_entry
|
2017-02-10 12:16:15 +01:00
|
|
|
{
|
2019-10-07 01:30:43 +02:00
|
|
|
struct hashmap_entry ent;
|
2017-02-10 12:16:15 +01:00
|
|
|
|
|
|
|
struct ref_store *refs;
|
|
|
|
|
2017-04-04 12:21:20 +02:00
|
|
|
/* NUL-terminated identifier of the ref store: */
|
|
|
|
char name[FLEX_ARRAY];
|
2017-02-10 12:16:15 +01:00
|
|
|
};
|
|
|
|
|
2022-08-25 19:09:48 +02:00
|
|
|
static int ref_store_hash_cmp(const void *cmp_data UNUSED,
|
2019-10-07 01:30:37 +02:00
|
|
|
const struct hashmap_entry *eptr,
|
|
|
|
const struct hashmap_entry *entry_or_key,
|
2017-02-10 12:16:15 +01:00
|
|
|
const void *keydata)
|
|
|
|
{
|
2019-10-07 01:30:37 +02:00
|
|
|
const struct ref_store_hash_entry *e1, *e2;
|
|
|
|
const char *name;
|
|
|
|
|
|
|
|
e1 = container_of(eptr, const struct ref_store_hash_entry, ent);
|
|
|
|
e2 = container_of(entry_or_key, const struct ref_store_hash_entry, ent);
|
|
|
|
name = keydata ? keydata : e2->name;
|
2017-02-10 12:16:15 +01:00
|
|
|
|
2017-04-04 12:21:20 +02:00
|
|
|
return strcmp(e1->name, name);
|
2017-02-10 12:16:15 +01:00
|
|
|
}
|
|
|
|
|
2017-04-04 12:21:20 +02:00
|
|
|
static struct ref_store_hash_entry *alloc_ref_store_hash_entry(
|
|
|
|
const char *name, struct ref_store *refs)
|
2017-02-10 12:16:15 +01:00
|
|
|
{
|
2017-04-04 12:21:20 +02:00
|
|
|
struct ref_store_hash_entry *entry;
|
2017-02-10 12:16:15 +01:00
|
|
|
|
2017-04-04 12:21:20 +02:00
|
|
|
FLEX_ALLOC_STR(entry, name, name);
|
2019-10-07 01:30:27 +02:00
|
|
|
hashmap_entry_init(&entry->ent, strhash(name));
|
2017-02-10 12:16:15 +01:00
|
|
|
entry->refs = refs;
|
|
|
|
return entry;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* A hashmap of ref_stores, stored by submodule name: */
|
|
|
|
static struct hashmap submodule_ref_stores;
|
2016-09-04 18:08:11 +02:00
|
|
|
|
2017-04-24 12:01:22 +02:00
|
|
|
/* A hashmap of ref_stores, stored by worktree id: */
|
|
|
|
static struct hashmap worktree_ref_stores;
|
|
|
|
|
2017-02-10 12:16:12 +01:00
|
|
|
/*
|
2017-04-04 12:21:20 +02:00
|
|
|
* Look up a ref store by name. If that ref_store hasn't been
|
|
|
|
* registered yet, return NULL.
|
2017-02-10 12:16:12 +01:00
|
|
|
*/
|
2017-04-04 12:21:20 +02:00
|
|
|
static struct ref_store *lookup_ref_store_map(struct hashmap *map,
|
|
|
|
const char *name)
|
2016-09-04 18:08:11 +02:00
|
|
|
{
|
2017-04-04 12:21:20 +02:00
|
|
|
struct ref_store_hash_entry *entry;
|
2019-10-07 01:30:36 +02:00
|
|
|
unsigned int hash;
|
2016-09-04 18:08:11 +02:00
|
|
|
|
2017-04-04 12:21:20 +02:00
|
|
|
if (!map->tablesize)
|
2017-02-10 12:16:15 +01:00
|
|
|
/* It's initialized on demand in register_ref_store(). */
|
|
|
|
return NULL;
|
2017-02-10 12:16:11 +01:00
|
|
|
|
2019-10-07 01:30:36 +02:00
|
|
|
hash = strhash(name);
|
|
|
|
entry = hashmap_get_entry_from_hash(map, hash, name,
|
|
|
|
struct ref_store_hash_entry, ent);
|
2017-02-10 12:16:15 +01:00
|
|
|
return entry ? entry->refs : NULL;
|
2016-09-04 18:08:11 +02:00
|
|
|
}
|
|
|
|
|
2017-02-10 12:16:12 +01:00
|
|
|
/*
|
|
|
|
* Create, record, and return a ref_store instance for the specified
|
2017-03-26 04:42:31 +02:00
|
|
|
* gitdir.
|
2017-02-10 12:16:12 +01:00
|
|
|
*/
|
2021-10-08 23:08:14 +02:00
|
|
|
static struct ref_store *ref_store_init(struct repository *repo,
|
|
|
|
const char *gitdir,
|
2017-03-26 04:42:32 +02:00
|
|
|
unsigned int flags)
|
2016-09-04 18:08:11 +02:00
|
|
|
{
|
|
|
|
const char *be_name = "files";
|
|
|
|
struct ref_storage_be *be = find_ref_storage_backend(be_name);
|
2017-02-10 12:16:14 +01:00
|
|
|
struct ref_store *refs;
|
2016-09-04 18:08:11 +02:00
|
|
|
|
|
|
|
if (!be)
|
2018-05-02 11:38:39 +02:00
|
|
|
BUG("reference backend %s is unknown", be_name);
|
2016-09-04 18:08:11 +02:00
|
|
|
|
2021-10-08 23:08:14 +02:00
|
|
|
refs = be->init(repo, gitdir, flags);
|
2017-02-10 12:16:14 +01:00
|
|
|
return refs;
|
2016-09-04 18:08:11 +02:00
|
|
|
}
|
|
|
|
|
2018-04-12 02:21:14 +02:00
|
|
|
struct ref_store *get_main_ref_store(struct repository *r)
|
2017-03-26 04:42:25 +02:00
|
|
|
{
|
repository: mark the "refs" pointer as private
The "refs" pointer in a struct repository starts life as NULL, but then
is lazily initialized when it is accessed via get_main_ref_store().
However, it's easy for calling code to forget this and access it
directly, leading to code which works _some_ of the time, but fails if
it is called before anybody else accesses the refs.
This was the cause of the bug fixed by 5ff4b920eb (sha1-name: do not
assume that the ref store is initialized, 2020-04-09). In order to
prevent similar bugs, let's more clearly mark the "refs" field as
private.
In addition to helping future code, the name change will help us audit
any existing direct uses. Besides get_main_ref_store() itself, it turns
out there is only one. But we know it's OK as it is on the line directly
after the fix from 5ff4b920eb, which will have initialized the pointer.
However it's still a good idea for it to model the proper use of the
accessing function, so we'll convert it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
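
The lazy-initialization pattern the message above describes can be sketched like this. The types and names are simplified, hypothetical stand-ins (only the `refs_private` field name is taken from the source), and the allocation is a placeholder for whatever the real initialization does.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for struct repository and ref_store. */
struct demo_ref_store {
	int initialized;
};

struct demo_repository {
	/* Only demo_get_main_ref_store() should touch this; starts as NULL. */
	struct demo_ref_store *refs_private;
};

/* Create the store on first use and hand back the cached pointer after. */
struct demo_ref_store *demo_get_main_ref_store(struct demo_repository *r)
{
	assert(r);
	if (r->refs_private)
		return r->refs_private;

	r->refs_private = calloc(1, sizeof(*r->refs_private));
	if (!r->refs_private)
		abort();
	r->refs_private->initialized = 1;
	return r->refs_private;
}
```

Code that reads `refs_private` directly would see NULL until some other caller happened to go through the accessor first, which is the class of bug the rename is meant to make harder to write.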
2020-04-10 05:04:11 +02:00
|
|
|
if (r->refs_private)
|
|
|
|
return r->refs_private;
|
2017-03-26 04:42:25 +02:00
|
|
|
|
2018-05-19 00:25:53 +02:00
|
|
|
if (!r->gitdir)
|
|
|
|
BUG("attempting to get main_ref_store outside of repository");
|
|
|
|
|
2021-10-08 23:08:14 +02:00
|
|
|
r->refs_private = ref_store_init(r, r->gitdir, REF_STORE_ALL_CAPS);
|
2020-09-09 12:15:08 +02:00
|
|
|
r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private);
|
2020-04-10 05:04:11 +02:00
|
|
|
return r->refs_private;
|
2017-03-26 04:42:28 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2017-04-04 12:21:20 +02:00
|
|
|
* Associate a ref store with a name. It is a fatal error to call this
|
|
|
|
* function twice for the same name.
|
2017-03-26 04:42:28 +02:00
|
|
|
*/
|
2017-04-04 12:21:20 +02:00
|
|
|
static void register_ref_store_map(struct hashmap *map,
|
|
|
|
const char *type,
|
|
|
|
struct ref_store *refs,
|
|
|
|
const char *name)
|
2017-03-26 04:42:28 +02:00
|
|
|
{
|
2019-10-07 01:30:32 +02:00
|
|
|
struct ref_store_hash_entry *entry;
|
|
|
|
|
2017-04-04 12:21:20 +02:00
|
|
|
if (!map->tablesize)
|
2017-06-30 21:14:05 +02:00
|
|
|
hashmap_init(map, ref_store_hash_cmp, NULL, 0);
|
2017-03-26 04:42:28 +02:00
|
|
|
|
2019-10-07 01:30:32 +02:00
|
|
|
entry = alloc_ref_store_hash_entry(name, refs);
|
|
|
|
if (hashmap_put(map, &entry->ent))
|
2018-05-02 11:38:39 +02:00
|
|
|
BUG("%s ref_store '%s' initialized twice", type, name);
|
2017-03-26 04:42:25 +02:00
|
|
|
}
|
|
|
|
|
2017-03-26 04:42:33 +02:00
|
|
|
struct ref_store *get_submodule_ref_store(const char *submodule)
|
2016-09-04 18:08:11 +02:00
|
|
|
{
|
2017-03-26 04:42:27 +02:00
|
|
|
struct strbuf submodule_sb = STRBUF_INIT;
|
2016-09-04 18:08:11 +02:00
|
|
|
struct ref_store *refs;
|
2017-08-23 14:36:54 +02:00
|
|
|
char *to_free = NULL;
|
|
|
|
size_t len;
|
2021-10-08 23:08:14 +02:00
|
|
|
struct repository *subrepo;
|
2016-09-04 18:08:11 +02:00
|
|
|
|
2017-08-23 14:37:03 +02:00
|
|
|
if (!submodule)
|
|
|
|
return NULL;
|
|
|
|
|
2017-08-23 14:37:04 +02:00
|
|
|
len = strlen(submodule);
|
|
|
|
while (len && is_dir_sep(submodule[len - 1]))
|
|
|
|
len--;
|
|
|
|
if (!len)
|
|
|
|
return NULL;
|
2016-09-04 18:08:11 +02:00
|
|
|
|
2017-08-23 14:36:54 +02:00
|
|
|
if (submodule[len])
|
|
|
|
/* We need to strip off one or more trailing slashes */
|
|
|
|
submodule = to_free = xmemdupz(submodule, len);
|
2016-09-04 18:08:11 +02:00
|
|
|
|
2017-04-04 12:21:20 +02:00
|
|
|
refs = lookup_ref_store_map(&submodule_ref_stores, submodule);
|
2017-03-26 04:42:27 +02:00
|
|
|
if (refs)
|
2017-08-23 14:36:53 +02:00
|
|
|
goto done;
|
2016-09-04 18:08:11 +02:00
|
|
|
|
2017-03-26 04:42:27 +02:00
|
|
|
strbuf_addstr(&submodule_sb, submodule);
|
2017-08-23 14:36:53 +02:00
|
|
|
if (!is_nonbare_repository_dir(&submodule_sb))
|
|
|
|
goto done;
|
2016-09-04 18:08:11 +02:00
|
|
|
|
2017-08-23 14:36:53 +02:00
|
|
|
if (submodule_to_gitdir(&submodule_sb, submodule))
|
|
|
|
goto done;
|
2016-09-04 18:08:11 +02:00
|
|
|
|
2021-10-08 23:08:14 +02:00
|
|
|
subrepo = xmalloc(sizeof(*subrepo));
|
|
|
|
/*
|
|
|
|
* NEEDSWORK: Make get_submodule_ref_store() work with arbitrary
|
|
|
|
* superprojects other than the_repository. This probably should be
|
|
|
|
* done by making it take a struct repository * parameter instead of a
|
|
|
|
* submodule path.
|
|
|
|
*/
|
|
|
|
if (repo_submodule_init(subrepo, the_repository, submodule,
|
|
|
|
null_oid())) {
|
|
|
|
free(subrepo);
|
|
|
|
goto done;
|
|
|
|
}
|
|
|
|
refs = ref_store_init(subrepo, submodule_sb.buf,
|
2017-03-26 04:42:32 +02:00
|
|
|
REF_STORE_READ | REF_STORE_ODB);
|
2017-04-04 12:21:20 +02:00
|
|
|
register_ref_store_map(&submodule_ref_stores, "submodule",
|
|
|
|
refs, submodule);
|
2017-03-26 04:42:31 +02:00
|
|
|
|
2017-08-23 14:36:53 +02:00
|
|
|
done:
|
2017-03-26 04:42:31 +02:00
|
|
|
strbuf_release(&submodule_sb);
|
2017-08-23 14:36:54 +02:00
|
|
|
free(to_free);
|
|
|
|
|
2016-09-04 18:08:11 +02:00
|
|
|
return refs;
|
|
|
|
}
|
|
|
|
|
2017-04-24 12:01:22 +02:00
|
|
|
struct ref_store *get_worktree_ref_store(const struct worktree *wt)
|
|
|
|
{
|
|
|
|
struct ref_store *refs;
|
|
|
|
const char *id;
|
|
|
|
|
|
|
|
if (wt->is_current)
|
2018-04-12 02:21:09 +02:00
|
|
|
return get_main_ref_store(the_repository);
|
2017-04-24 12:01:22 +02:00
|
|
|
|
|
|
|
id = wt->id ? wt->id : "/";
|
|
|
|
refs = lookup_ref_store_map(&worktree_ref_stores, id);
|
|
|
|
if (refs)
|
|
|
|
return refs;
|
|
|
|
|
|
|
|
if (wt->id)
|
2021-10-08 23:08:14 +02:00
|
|
|
refs = ref_store_init(the_repository,
|
|
|
|
git_common_path("worktrees/%s", wt->id),
|
2017-04-24 12:01:22 +02:00
|
|
|
REF_STORE_ALL_CAPS);
|
|
|
|
else
|
2021-10-08 23:08:14 +02:00
|
|
|
refs = ref_store_init(the_repository,
|
|
|
|
get_git_common_dir(),
|
2017-04-24 12:01:22 +02:00
|
|
|
REF_STORE_ALL_CAPS);
|
|
|
|
|
|
|
|
if (refs)
|
|
|
|
register_ref_store_map(&worktree_ref_stores, "worktree",
|
|
|
|
refs, id);
|
|
|
|
return refs;
|
|
|
|
}
|
|
|
|
|
2021-12-22 19:11:54 +01:00
|
|
|
void base_ref_store_init(struct ref_store *refs, struct repository *repo,
|
|
|
|
const char *path, const struct ref_storage_be *be)
|
2016-09-04 18:08:11 +02:00
|
|
|
{
|
2017-02-10 12:16:11 +01:00
|
|
|
refs->be = be;
|
2021-12-22 19:11:54 +01:00
|
|
|
refs->repo = repo;
|
|
|
|
refs->gitdir = xstrdup(path);
|
2016-09-04 18:08:11 +02:00
|
|
|
}
|
2016-09-04 18:08:16 +02:00
|
|
|
|
|
|
|
/* backend functions */
|
2017-03-26 04:42:34 +02:00
|
|
|
int refs_pack_refs(struct ref_store *refs, unsigned int flags)
|
2016-09-04 18:08:27 +02:00
|
|
|
{
|
|
|
|
return refs->be->pack_refs(refs, flags);
|
|
|
|
}
|
|
|
|
|
refs: switch peel_ref() to peel_iterated_oid()
The peel_ref() interface is confusing and error-prone:
- it's typically used by ref iteration callbacks that have both a
refname and oid. But since they pass only the refname, we may load
the ref value from the filesystem again. This is inefficient, but
also means we are open to a race if somebody simultaneously updates
the ref. E.g., this:
int some_ref_cb(const char *refname, const struct object_id *oid, ...)
{
if (!peel_ref(refname, &peeled))
printf("%s peels to %s",
oid_to_hex(oid), oid_to_hex(&peeled);
}
could print nonsense. It is correct to say "refname peels to..."
(you may see the "before" value or the "after" value, either of
which is consistent), but mentioning both oids may be mixing
before/after values.
Worse, whether this is possible depends on whether the optimization
to read from the current iterator value kicks in. So it is actually
not possible with:
for_each_ref(some_ref_cb);
but it _is_ possible with:
head_ref(some_ref_cb);
which does not use the iterator mechanism (though in practice, HEAD
should never peel to anything, so this may not be triggerable).
- it must take a fully-qualified refname for the read_ref_full() code
path to work. Yet we routinely pass it partial refnames from
callbacks to for_each_tag_ref(), etc. This happens to work when
iterating because there we do not call read_ref_full() at all, and
only use the passed refname to check if it is the same as the
iterator. But the requirements for the function parameters are quite
unclear.
Instead of taking a refname, let's instead take an oid. That fixes both
problems. It's a little funny for a "ref" function not to involve refs
at all. The key thing is that it's optimizing under the hood based on
having access to the ref iterator. So let's change the name to make it
clear why you'd want this function versus just peel_object().
There are two other directions I considered but rejected:
- we could pass the peel information into the each_ref_fn callback.
However, we don't know if the caller actually wants it or not. For
packed-refs, providing it is essentially free. But for loose refs,
we actually have to peel the object, which would be wasteful in most
cases. We could likewise pass in a flag to the callback indicating
whether the peeled information is known, but that complicates those
callbacks, as they then have to decide whether to manually peel
themselves. Plus it requires changing the interface of every
callback, whether they care about peeling or not, and there are many
of them.
- we could make a function to return the peeled value of the current
iterated ref (computing it if necessary), and BUG() otherwise. I.e.:
int peel_current_iterated_ref(struct object_id *out);
Each of the current callers is an each_ref_fn callback, so they'd
mostly be happy. But:
- we use those callbacks with functions like head_ref(), which do
not use the iteration code. So we'd need to handle the fallback
case there, anyway.
- it's possible that a caller would want to call into generic code
that sometimes is used during iteration and sometimes not. This
encapsulates the logic to do the fast thing when possible, and
fallback when necessary.
The implementation is mostly obvious, but I want to call out a few
things in the patch:
- the test-tool coverage for peel_ref() is now meaningless, as it all
collapses to a single peel_object() call (arguably they were pretty
uninteresting before; the tricky part of that function is the
fast-path we see during iteration, but these calls didn't trigger
that). I've just dropped it entirely, though note that some other
tests relied on the tags we created; I've moved that creation to the
tests where it matters.
- we no longer need to take a ref_store parameter, since we'd never
look up a ref now. We do still rely on a global "current iterator"
variable which _could_ be kept per-ref-store. But in practice this
is only useful if there are multiple recursive iterations, at which
point the more appropriate solution is probably a stack of
iterators. No caller used the actual ref-store parameter anyway
(they all call the wrapper that passes the_repository).
- the original only kicked in the optimization when the "refname"
pointer matched (i.e., not string comparison). We do likewise with
the "oid" parameter here, but fall back to doing an actual oideq()
call. This in theory lets us kick in the optimization more often,
though in practice no current caller cares. It should never be
wrong, though (peeling is a property of an object, so two refs
pointing to the same object would peel identically).
- the original took care not to touch the peeled out-parameter unless
we found something to put in it. But no caller cares about this, and
anyway, it is enforced by peel_object() itself (and even in the
optimized iterator case, that's where we eventually end up). We can
shorten the code and avoid an extra copy by just passing the
out-parameter through the stack.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 20:44:43 +01:00
|
|
|
int peel_iterated_oid(const struct object_id *base, struct object_id *peeled)
|
2017-03-26 04:42:34 +02:00
|
|
|
{
|
2021-01-20 20:44:43 +01:00
|
|
|
if (current_ref_iter &&
|
|
|
|
(current_ref_iter->oid == base ||
|
|
|
|
oideq(current_ref_iter->oid, base)))
|
|
|
|
return ref_iterator_peel(current_ref_iter, peeled);
|
2017-09-25 10:00:14 +02:00
|
|
|
|
2021-05-19 17:31:28 +02:00
|
|
|
return peel_object(base, peeled) ? -1 : 0;
|
2017-03-26 04:42:34 +02:00
|
|
|
}
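The fast-path/fallback split in peel_iterated_oid() can be illustrated in isolation. What follows is only a minimal sketch under stated assumptions: every demo_* name is hypothetical, the "peeled" value is a canned field rather than a real object lookup, and none of this is Git's actual iterator API.

```c
#include <string.h>

/* Hypothetical stand-ins; these are not Git's real types. */
struct demo_oid { unsigned char hash[20]; };

struct demo_iter {
	const struct demo_oid *oid; /* value of the ref currently being iterated */
	struct demo_oid peeled;     /* peeled value the iterator already knows */
};

static struct demo_iter *demo_current_iter; /* NULL when not iterating */
static int demo_slow_calls;                 /* counts slow-path lookups */

/* Slow path: a real implementation would open the object; here we copy it. */
static int demo_peel_object(const struct demo_oid *base, struct demo_oid *out)
{
	demo_slow_calls++;
	*out = *base;
	return 0;
}

/*
 * Use the iterator's answer when "base" is the iterated value, matched
 * first by pointer (cheap) and then by content; otherwise fall back.
 */
static int demo_peel_iterated_oid(const struct demo_oid *base,
				  struct demo_oid *out)
{
	if (demo_current_iter &&
	    (demo_current_iter->oid == base ||
	     !memcmp(demo_current_iter->oid->hash, base->hash,
		     sizeof(base->hash)))) {
		*out = demo_current_iter->peeled;
		return 0;
	}
	return demo_peel_object(base, out);
}
```

Comparing by content after the pointer check is what lets a copy of the iterated oid still take the fast path, which is the behavior the commit message describes for the oideq() fallback.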
|
2016-09-04 18:08:29 +02:00
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
int refs_create_symref(struct ref_store *refs,
|
|
|
|
const char *ref_target,
|
|
|
|
const char *refs_heads_master,
|
|
|
|
const char *logmsg)
|
|
|
|
{
|
reflog: cleanse messages in the refs.c layer
Regarding reflog messages:
- We expect that a reflog message consists of a single line. The
file format used by the files backend may add a LF after the
message as a delimiter, and output by commands like "git log -g"
may complete such an incomplete line by adding a LF at the end,
but philosophically, the terminating LF is not a part of the
message.
- We however allow callers of refs API to supply a random sequence
of NUL terminated bytes. We cleanse caller-supplied message by
squashing a run of whitespaces into a SP, and by trimming trailing
whitespace, before storing the message. This is how we tolerate,
instead of erring out, a message with LF in it (be it at the end,
in the middle, or both).
Currently, the cleansing of the reflog message is done by the files
backend, before the log is written out. This is sufficient with the
current code, as that is the only backend that writes reflogs. But
new backends can be added that write reflogs, and we'd want the
resulting log message we would read out of "log -g" the same no
matter what backend is used, and moving the code to do so to the
generic layer is a way to do so.
An added benefit is that the "cleansing" function could be updated
later, independent from individual backends, to e.g. allow
multi-line log messages if we wanted to, and when that happens, it
would help a lot to ensure we covered all bases if the cleansing
function (which would be updated) is called from the generic layer.
Side note: I am not interested in supporting multi-line reflog
messages right at the moment (nobody is asking for it), but I
envision that instead of the "squash a run of whitespaces into a SP
and rtrim" cleansing, we can %urlencode problematic bytes in the
message *AND* append a SP at the end, when a new version of Git that
supports multi-line and/or verbatim reflog messages writes a reflog
record. The reading side can detect the presence of SP at the end
(which should have been rtrimmed out if it were written by existing
versions of Git) as a signal that decoding %urlencode recovers the
original reflog message.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-10 19:19:53 +02:00
|
|
|
char *msg;
|
|
|
|
int retval;
|
|
|
|
|
|
|
|
msg = normalize_reflog_message(logmsg);
|
|
|
|
retval = refs->be->create_symref(refs, ref_target, refs_heads_master,
|
|
|
|
msg);
|
|
|
|
free(msg);
|
|
|
|
return retval;
|
2016-09-04 18:08:29 +02:00
|
|
|
}
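The cleansing rule described in the commit message for this function (squash each run of whitespace into a single SP, then trim trailing whitespace) might look roughly like the sketch below. demo_cleanse_message() is a hypothetical helper, not Git's normalize_reflog_message(); it uses plain malloc() instead of strbuf, and it also drops leading whitespace, which is an assumption of this sketch.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a newly allocated, single-line copy of
 * "msg". The caller frees the result. Returns NULL on allocation failure.
 */
static char *demo_cleanse_message(const char *msg)
{
	size_t len = 0;
	int wasspace = 1; /* starting in "space" state drops leading whitespace */
	char *out = malloc(strlen(msg) + 1);

	if (!out)
		return NULL;
	for (; *msg; msg++) {
		int space = isspace((unsigned char)*msg);

		if (space && wasspace)
			continue; /* inside a run: already emitted one SP */
		out[len++] = space ? ' ' : *msg;
		wasspace = space;
	}
	/* After squashing, at most one trailing SP can remain. */
	if (len && out[len - 1] == ' ')
		len--;
	out[len] = '\0';
	return out;
}
```

Because the output is never longer than the input, a single allocation of strlen(msg) + 1 bytes is enough.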
|
|
|
|
|
2016-09-04 18:08:28 +02:00
|
|
|
int create_symref(const char *ref_target, const char *refs_heads_master,
|
|
|
|
const char *logmsg)
|
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_create_symref(get_main_ref_store(the_repository), ref_target,
|
2017-03-26 04:42:34 +02:00
|
|
|
refs_heads_master, logmsg);
|
2016-09-04 18:08:28 +02:00
|
|
|
}
|
|
|
|
|
2017-05-22 16:17:45 +02:00
|
|
|
int ref_update_reject_duplicates(struct string_list *refnames,
|
|
|
|
struct strbuf *err)
|
|
|
|
{
|
2017-05-22 16:17:46 +02:00
|
|
|
size_t i, n = refnames->nr;
|
2017-05-22 16:17:45 +02:00
|
|
|
|
|
|
|
assert(err);
|
|
|
|
|
2017-05-22 16:17:47 +02:00
|
|
|
for (i = 1; i < n; i++) {
|
|
|
|
int cmp = strcmp(refnames->items[i - 1].string,
|
|
|
|
refnames->items[i].string);
|
|
|
|
|
|
|
|
if (!cmp) {
|
2017-05-22 16:17:45 +02:00
|
|
|
strbuf_addf(err,
|
2018-07-21 09:49:35 +02:00
|
|
|
_("multiple updates for ref '%s' not allowed"),
|
2017-05-22 16:17:45 +02:00
|
|
|
refnames->items[i].string);
|
|
|
|
return 1;
|
2017-05-22 16:17:47 +02:00
|
|
|
} else if (cmp > 0) {
|
2018-05-02 11:38:39 +02:00
|
|
|
BUG("ref_update_reject_duplicates() received unsorted list");
|
2017-05-22 16:17:45 +02:00
|
|
|
}
|
2017-05-22 16:17:47 +02:00
|
|
|
}
|
2017-05-22 16:17:45 +02:00
|
|
|
return 0;
|
|
|
|
}
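Because the list is sorted, duplicates can only be adjacent, so one pass over neighboring pairs is enough. A minimal standalone sketch of that idea follows; demo_find_duplicate() and its plain string array are hypothetical, and it returns -1 for unsorted input where the function above calls BUG().

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: scan a sorted array of names.
 * Returns 1 and sets *dup when two neighbors are equal, 0 when all names
 * are distinct, and -1 when the input turns out not to be sorted.
 */
static int demo_find_duplicate(const char **names, size_t n, const char **dup)
{
	size_t i;

	for (i = 1; i < n; i++) {
		int cmp = strcmp(names[i - 1], names[i]);

		if (!cmp) {
			*dup = names[i];
			return 1;
		}
		if (cmp > 0)
			return -1; /* precondition violated: not sorted */
	}
	return 0;
}
```

Starting the loop at index 1 also makes empty and single-element lists trivially free of duplicates.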
|
|
|
|
|
refs: implement reference transaction hook
The low-level reference transactions used to update references are
currently completely opaque to the user. While certainly desirable in
most usecases, there are some which might want to hook into the
transaction to observe all queued reference updates as well as observing
the abortion or commit of a prepared transaction.
One such usecase would be to have a set of replicas of a given Git
repository, where we perform Git operations on all of the repositories
at once and expect the outcome to be the same in all of them. While
there exist hooks already for a certain subset of Git commands that
could be used to implement a voting mechanism for this, many others
currently don't have any mechanism for this.
The above scenario is the motivation for the new "reference-transaction"
hook that reaches directly into Git's reference transaction mechanism.
The hook receives as parameter the current state the transaction was
moved to ("prepared", "committed" or "aborted") and gets via its
standard input all queued reference updates. While the exit code gets
ignored in the "committed" and "aborted" states, a non-zero exit code in
the "prepared" state will cause the transaction to be aborted
prematurely.
Given the usecase described above, a voting mechanism can now be
implemented via this hook: as soon as it gets called, it will take all
of stdin and use it to cast a vote to a central service. When all
replicas of the repository agree, the hook will exit with zero,
otherwise it will abort the transaction by returning non-zero. The most
important upside is that this will catch _all_ commands writing
references at once, allowing to implement strong consistency for
reference updates via a single mechanism.
In order to test the impact on the case where we don't have any
"reference-transaction" hook installed in the repository, this commit
introduces two new performance tests for git-update-ref(1). Run against
an empty repository, they produce the following results:
Test                         origin/master     HEAD
--------------------------------------------------------------------
1400.2: update-ref           2.70(2.10+0.71)   2.71(2.10+0.73) +0.4%
1400.3: update-ref --stdin   0.21(0.09+0.11)   0.21(0.07+0.14) +0.0%
The performance test p1400.2 creates, updates and deletes a branch a
thousand times, thus averaging runtime of git-update-ref over 3000
invocations. p1400.3 instead calls `git update-ref --stdin` three times
and queues a thousand creations, updates and deletes respectively.
As expected, p1400.3 consistently shows no noticeable impact, as for
each batch of updates there's a single call to access(3P) for the
negative hook lookup. On the other hand, for p1400.2, one can see an
impact caused by this patchset. But doing five runs of the performance
tests where each one was run with GIT_PERF_REPEAT_COUNT=10, the overhead
ranged from -1.5% to +1.1%. These inconsistent performance numbers can
be explained by the overhead of spawning 3000 processes. This shows that
the overhead of assembling the hook path and executing access(3P) once
to check if it's there is mostly outweighed by the operating system's
overhead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
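On the hook side, each queued update arrives on standard input as one line of the form "<old-oid> SP <new-oid> SP <refname> LF", matching the "%s %s %s\n" format written by run_transaction_hook(). As a rough sketch only, a hook written in C could split such a line as shown below. demo_parse_update() and struct demo_update are hypothetical names with arbitrary buffer sizes; the sketch relies on refnames never containing spaces.

```c
#include <string.h>

/* Hypothetical record for one queued update; sizes are arbitrary. */
struct demo_update {
	char old_hex[65];  /* room for a SHA-256 hex string plus NUL */
	char new_hex[65];
	char refname[256];
};

/*
 * Parse one "<old> <new> <refname>\n" line.
 * Returns 0 on success, -1 if the line is malformed or a field is too long.
 */
static int demo_parse_update(const char *line, struct demo_update *u)
{
	const char *sp1 = strchr(line, ' ');
	const char *sp2 = sp1 ? strchr(sp1 + 1, ' ') : NULL;
	size_t old_len, new_len, ref_len;

	if (!sp2)
		return -1;
	old_len = (size_t)(sp1 - line);
	new_len = (size_t)(sp2 - (sp1 + 1));
	ref_len = strcspn(sp2 + 1, "\n"); /* refname runs to end of line */
	if (!old_len || !new_len || !ref_len ||
	    old_len >= sizeof(u->old_hex) ||
	    new_len >= sizeof(u->new_hex) ||
	    ref_len >= sizeof(u->refname))
		return -1;
	memcpy(u->old_hex, line, old_len);
	u->old_hex[old_len] = '\0';
	memcpy(u->new_hex, sp1 + 1, new_len);
	u->new_hex[new_len] = '\0';
	memcpy(u->refname, sp2 + 1, ref_len);
	u->refname[ref_len] = '\0';
	return 0;
}
```

A voting hook of the kind described above would feed every parsed line into its vote and, in the "prepared" state, exit non-zero to abort the transaction.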
2020-06-19 08:56:14 +02:00
|
|
|
static int run_transaction_hook(struct ref_transaction *transaction,
|
|
|
|
const char *state)
|
|
|
|
{
|
|
|
|
struct child_process proc = CHILD_PROCESS_INIT;
|
|
|
|
struct strbuf buf = STRBUF_INIT;
|
refs: remove lookup cache for reference-transaction hook
When adding the reference-transaction hook, there were concerns about
the performance impact it may have on setups which do not make use of
the new hook at all. After all, it gets executed every time a reftx is
prepared, committed or aborted, which linearly scales with the number of
reference-transactions created per session. And as there are code paths
like `git push` which create a new transaction for each reference to be
updated, this may translate to calling `find_hook()` quite a lot.
To address this concern, a cache was added with the intention to not
repeatedly do negative hook lookups. Turns out this cache caused a
regression, which was fixed via e5256c82e5 (refs: fix interleaving hook
calls with reference-transaction hook, 2020-08-07). In the process of
discussing the fix, we realized that the cache doesn't really help even
in the negative-lookup case. While performance tests added to benchmark
this did show a slight improvement in the 1% range, this really doesn't
warrant having a cache. Furthermore, it's quite flaky, too. E.g. running
it twice in succession produces the following results:
Test                         master            pks-reftx-hook-remove-cache
--------------------------------------------------------------------------
1400.2: update-ref           2.79(2.16+0.74)   2.73(2.12+0.71) -2.2%
1400.3: update-ref --stdin   0.22(0.08+0.14)   0.21(0.08+0.12) -4.5%
Test                         master            pks-reftx-hook-remove-cache
--------------------------------------------------------------------------
1400.2: update-ref           2.70(2.09+0.72)   2.74(2.13+0.71) +1.5%
1400.3: update-ref --stdin   0.21(0.10+0.10)   0.21(0.08+0.13) +0.0%
One case notably absent from those benchmarks is a single executable
searching for the hook hundreds of times, which is exactly the case for
which the negative cache was added. p1400.2 will spawn a new update-ref
for each transaction and p1400.3 only has a single reference-transaction
for all reference updates. So this commit adds a third benchmark, which
performs a non-atomic push of a thousand references. This will create a
new reference transaction per reference. But even for this case, the
negative cache doesn't consistently improve performance:
Test                     master            pks-reftx-hook-remove-cache
--------------------------------------------------------------------------
1400.4: nonatomic push   6.63(6.50+0.13)   6.81(6.67+0.14) +2.7%
1400.4: nonatomic push   6.35(6.21+0.14)   6.39(6.23+0.16) +0.6%
1400.4: nonatomic push   6.43(6.31+0.13)   6.42(6.28+0.15) -0.2%
So let's just remove the cache altogether to simplify the code.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-25 12:35:24 +02:00
|
|
|
const char *hook;
|
2020-06-19 08:56:14 +02:00
|
|
|
int ret = 0, i;
|
|
|
|
|
2020-08-25 12:35:24 +02:00
|
|
|
hook = find_hook("reference-transaction");
|
2020-06-19 08:56:14 +02:00
|
|
|
if (!hook)
|
|
|
|
return ret;
|
|
|
|
|
2020-07-28 22:25:12 +02:00
|
|
|
strvec_pushl(&proc.args, hook, state, NULL);
|
2020-06-19 08:56:14 +02:00
|
|
|
proc.in = -1;
|
|
|
|
proc.stdout_to_stderr = 1;
|
|
|
|
proc.trace2_hook_name = "reference-transaction";
|
|
|
|
|
|
|
|
ret = start_command(&proc);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
|
|
|
|
sigchain_push(SIGPIPE, SIG_IGN);
|
|
|
|
|
|
|
|
for (i = 0; i < transaction->nr; i++) {
|
|
|
|
struct ref_update *update = transaction->updates[i];
|
|
|
|
|
|
|
|
strbuf_reset(&buf);
|
|
|
|
strbuf_addf(&buf, "%s %s %s\n",
|
|
|
|
oid_to_hex(&update->old_oid),
|
|
|
|
oid_to_hex(&update->new_oid),
|
|
|
|
update->refname);
|
|
|
|
|
|
|
|
if (write_in_full(proc.in, buf.buf, buf.len) < 0) {
|
2021-10-16 11:39:25 +02:00
|
|
|
if (errno != EPIPE) {
|
|
|
|
/* Don't leak errno outside this API */
|
|
|
|
errno = 0;
|
2020-06-19 08:56:14 +02:00
|
|
|
ret = -1;
|
2021-10-16 11:39:25 +02:00
|
|
|
}
|
2020-06-19 08:56:14 +02:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
close(proc.in);
|
|
|
|
sigchain_pop(SIGPIPE);
|
|
|
|
strbuf_release(&buf);
|
|
|
|
|
|
|
|
ret |= finish_command(&proc);
|
|
|
|
return ret;
|
|
|
|
}
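The loop above writes one line per queued update in the form `<old-oid> <new-oid> <refname>`. As a minimal sketch of the consuming side, assuming a hook written in C, a line could be split as shown below. The names `struct hook_update` and `parse_hook_line` and the buffer sizes are hypothetical and are not part of Git; the sketch relies on the fact that reference names cannot contain spaces.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one queued update as seen by a hook. */
struct hook_update {
	char old_oid[65];  /* hex object ID; 64 chars fits SHA-256 */
	char new_oid[65];
	char refname[256]; /* assumed upper bound for this sketch */
};

/*
 * Parse one "<old-oid> <new-oid> <refname>\n" line.
 * Returns 0 on success, -1 if fewer than three fields are present.
 */
static int parse_hook_line(const char *line, struct hook_update *out)
{
	/* The field widths keep each value within its buffer. */
	if (sscanf(line, "%64s %64s %255s",
		   out->old_oid, out->new_oid, out->refname) != 3)
		return -1;
	return 0;
}
```

A hook could call `parse_hook_line()` on each line read from standard input and exit non-zero in the "prepared" state to veto the transaction.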
|
|
|
|
|
ref_transaction_prepare(): new optional step for reference updates
In the future, compound reference stores will sometimes need to modify
references in two different reference stores at the same time, meaning
that a single logical reference transaction might have to be
implemented as two internal sub-transactions. They won't want to call
`ref_transaction_commit()` for the two sub-transactions one after the
other, because that wouldn't be atomic (the first commit could succeed
and the second one fail). Instead, they will want to prepare both
sub-transactions (i.e., obtain any necessary locks and do any
pre-checks), and only if both prepare steps succeed, then commit both
sub-transactions.
Start preparing for that day by adding a new, optional
`ref_transaction_prepare()` step to the reference transaction
sequence, which obtains the locks and does any prechecks, reporting
any errors that occur. Also add a `ref_transaction_abort()` function
that can be used to abort a sub-transaction even if it has already
been prepared.
That is on the side of the public-facing API. On the side of the
`ref_store` VTABLE, get rid of `transaction_commit` and instead add
methods `transaction_prepare`, `transaction_finish`, and
`transaction_abort`. A `ref_transaction_commit()` now basically calls
methods `transaction_prepare` then `transaction_finish`.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-22 16:17:44 +02:00
|
|
|
int ref_transaction_prepare(struct ref_transaction *transaction,
|
|
|
|
struct strbuf *err)
|
2016-09-04 18:08:16 +02:00
|
|
|
{
|
2017-03-26 04:42:35 +02:00
|
|
|
struct ref_store *refs = transaction->ref_store;
|
2020-06-19 08:56:14 +02:00
|
|
|
int ret;
|
2016-09-04 18:08:16 +02:00
|
|
|
|
2017-05-22 16:17:43 +02:00
|
|
|
switch (transaction->state) {
|
|
|
|
case REF_TRANSACTION_OPEN:
|
|
|
|
/* Good. */
|
|
|
|
break;
|
2017-05-22 16:17:44 +02:00
|
|
|
case REF_TRANSACTION_PREPARED:
|
2018-05-02 11:38:39 +02:00
|
|
|
BUG("prepare called twice on reference transaction");
|
2017-05-22 16:17:44 +02:00
|
|
|
break;
|
2017-05-22 16:17:43 +02:00
|
|
|
case REF_TRANSACTION_CLOSED:
|
2018-05-02 11:38:39 +02:00
|
|
|
BUG("prepare called on a closed reference transaction");
|
2017-05-22 16:17:43 +02:00
|
|
|
break;
|
|
|
|
default:
|
2018-05-02 11:38:39 +02:00
|
|
|
BUG("unexpected reference transaction state");
|
2017-05-22 16:17:43 +02:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
2021-12-06 23:05:05 +01:00
|
|
|
if (refs->repo->objects->odb->disable_ref_updates) {
|
2017-04-11 00:14:12 +02:00
|
|
|
strbuf_addstr(err,
|
|
|
|
_("ref updates forbidden inside quarantine environment"));
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
2020-06-19 08:56:14 +02:00
|
|
|
ret = refs->be->transaction_prepare(refs, transaction, err);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
|
|
|
|
ret = run_transaction_hook(transaction, "prepared");
|
|
|
|
if (ret) {
|
|
|
|
ref_transaction_abort(transaction, err);
|
|
|
|
die(_("ref updates aborted by hook"));
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
2017-05-22 16:17:44 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
int ref_transaction_abort(struct ref_transaction *transaction,
|
|
|
|
struct strbuf *err)
|
|
|
|
{
|
|
|
|
struct ref_store *refs = transaction->ref_store;
|
|
|
|
int ret = 0;
|
|
|
|
|
|
|
|
switch (transaction->state) {
|
|
|
|
case REF_TRANSACTION_OPEN:
|
|
|
|
/* No need to abort explicitly. */
|
|
|
|
break;
|
|
|
|
case REF_TRANSACTION_PREPARED:
|
|
|
|
ret = refs->be->transaction_abort(refs, transaction, err);
|
|
|
|
break;
|
|
|
|
case REF_TRANSACTION_CLOSED:
|
2018-05-02 11:38:39 +02:00
|
|
|
BUG("abort called on a closed reference transaction");
|
2017-05-22 16:17:44 +02:00
|
|
|
break;
|
|
|
|
default:
|
2018-05-02 11:38:39 +02:00
|
|
|
BUG("unexpected reference transaction state");
|
2017-05-22 16:17:44 +02:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
2020-06-19 08:56:14 +02:00
|
|
|
run_transaction_hook(transaction, "aborted");
|
|
|
|
|
2017-05-22 16:17:44 +02:00
|
|
|
ref_transaction_free(transaction);
|
|
|
|
return ret;
|
|
|
|
}
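The separate prepare and abort steps are what allow a caller to update two reference stores as one logical transaction: prepare both, and finish only if both prepares succeed. The following is a toy model of that pattern, not Git's implementation. Every name in it (`toy_txn`, `toy_prepare`, `toy_finish`, `toy_abort`, `commit_both`, and the `fail_prepare` switch) is hypothetical, and it assumes that finishing a prepared transaction cannot fail.

```c
#include <stdio.h>

enum toy_state { TOY_OPEN, TOY_PREPARED, TOY_CLOSED };

struct toy_txn {
	enum toy_state state;
	int fail_prepare; /* set to simulate a lock or pre-check failure */
	int committed;    /* set once the updates have been applied */
};

/* Take locks and run pre-checks; valid only on an open transaction. */
static int toy_prepare(struct toy_txn *t)
{
	if (t->state != TOY_OPEN || t->fail_prepare)
		return -1;
	t->state = TOY_PREPARED;
	return 0;
}

/* Apply the updates; valid only after a successful prepare. */
static int toy_finish(struct toy_txn *t)
{
	if (t->state != TOY_PREPARED)
		return -1;
	t->committed = 1;
	t->state = TOY_CLOSED;
	return 0;
}

/* Give up; works whether or not the transaction was prepared. */
static void toy_abort(struct toy_txn *t)
{
	t->state = TOY_CLOSED;
}

/* Commit both sub-transactions, or neither of them. */
static int commit_both(struct toy_txn *a, struct toy_txn *b)
{
	if (toy_prepare(a) || toy_prepare(b)) {
		toy_abort(a);
		toy_abort(b);
		return -1;
	}
	toy_finish(a);
	toy_finish(b);
	return 0;
}
```

Calling the commit step for each sub-transaction in turn would not give this guarantee, because the first commit could succeed and the second fail.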
|
|
|
|
|
|
|
|
int ref_transaction_commit(struct ref_transaction *transaction,
|
|
|
|
struct strbuf *err)
|
|
|
|
{
|
|
|
|
struct ref_store *refs = transaction->ref_store;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
switch (transaction->state) {
|
|
|
|
case REF_TRANSACTION_OPEN:
|
|
|
|
/* Need to prepare first. */
|
|
|
|
ret = ref_transaction_prepare(transaction, err);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
break;
|
|
|
|
case REF_TRANSACTION_PREPARED:
|
|
|
|
/* Fall through to finish. */
|
|
|
|
break;
|
|
|
|
case REF_TRANSACTION_CLOSED:
|
2018-05-02 11:38:39 +02:00
|
|
|
BUG("commit called on a closed reference transaction");
|
2017-05-22 16:17:44 +02:00
|
|
|
break;
|
|
|
|
default:
|
2018-05-02 11:38:39 +02:00
|
|
|
BUG("unexpected reference transaction state");
|
2017-05-22 16:17:44 +02:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
refs: implement reference transaction hook
The low-level reference transactions used to update references are
currently completely opaque to the user. While that is certainly
desirable in most use cases, some might want to hook into the
transaction to observe all queued reference updates as well as the
abort or commit of a prepared transaction.
One such usecase would be to have a set of replicas of a given Git
repository, where we perform Git operations on all of the repositories
at once and expect the outcome to be the same in all of them. While
there exist hooks already for a certain subset of Git commands that
could be used to implement a voting mechanism for this, many others
currently don't have any mechanism for this.
The above scenario is the motivation for the new "reference-transaction"
hook that reaches directly into Git's reference transaction mechanism.
The hook receives as parameter the current state the transaction was
moved to ("prepared", "committed" or "aborted") and gets via its
standard input all queued reference updates. While the exit code gets
ignored in the "committed" and "aborted" states, a non-zero exit code in
the "prepared" state will cause the transaction to be aborted
prematurely.
Given the usecase described above, a voting mechanism can now be
implemented via this hook: as soon as it gets called, it will take all
of stdin and use it to cast a vote to a central service. When all
replicas of the repository agree, the hook will exit with zero,
otherwise it will abort the transaction by returning non-zero. The most
important upside is that this will catch _all_ commands writing
references at once, making it possible to implement strong consistency
for reference updates via a single mechanism.
In order to test the impact on the case where we don't have any
"reference-transaction" hook installed in the repository, this commit
introduces two new performance tests for git-update-ref(1). Run against
an empty repository, they produce the following results:
Test                         origin/master     HEAD
--------------------------------------------------------------------
1400.2: update-ref           2.70(2.10+0.71)   2.71(2.10+0.73) +0.4%
1400.3: update-ref --stdin   0.21(0.09+0.11)   0.21(0.07+0.14) +0.0%
The performance test p1400.2 creates, updates and deletes a branch a
thousand times, thus averaging the runtime of git-update-ref over 3000
invocations. p1400.3 instead calls `git update-ref --stdin` three times
and queues a thousand creations, updates and deletes respectively.
As expected, p1400.3 consistently shows no noticeable impact, as for
each batch of updates there's a single call to access(3P) for the
negative hook lookup. On the other hand, for p1400.2, one can see an
impact caused by this patchset. But doing five runs of the performance
tests where each one was run with GIT_PERF_REPEAT_COUNT=10, the overhead
ranged from -1.5% to +1.1%. These inconsistent performance numbers can
be explained by the overhead of spawning 3000 processes. This shows that
the overhead of assembling the hook path and executing access(3P) once
to check if it's there is mostly outweighed by the operating system's
overhead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-19 08:56:14 +02:00
|
|
|
ret = refs->be->transaction_finish(refs, transaction, err);
|
|
|
|
if (!ret)
|
|
|
|
run_transaction_hook(transaction, "committed");
|
|
|
|
return ret;
|
2016-09-04 18:08:16 +02:00
|
|
|
}
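The prepare/finish/abort split described in the commit message can be illustrated with a minimal sketch. Everything below is hypothetical: `struct sub_transaction`, `sub_prepare()`, `sub_finish()`, `sub_abort()` and `compound_commit()` are stand-ins invented for illustration and are not part of Git's refs API. The sketch also assumes that finishing a prepared sub-transaction cannot fail, which a real backend does not guarantee.

```c
#include <assert.h>

/* Hypothetical stand-in for one sub-transaction; not Git's struct ref_transaction. */
enum sub_state { SUB_OPEN, SUB_PREPARED, SUB_COMMITTED, SUB_ABORTED };

struct sub_transaction {
	enum sub_state state;
	int fail_prepare;	/* illustration knob: make the prepare step fail */
};

/* Prepare: take locks and run pre-checks; returns 0 on success, -1 on error. */
static int sub_prepare(struct sub_transaction *t)
{
	assert(t && t->state == SUB_OPEN);
	if (t->fail_prepare)
		return -1;
	t->state = SUB_PREPARED;
	return 0;
}

/* Finish: make the prepared changes visible (assumed infallible here). */
static void sub_finish(struct sub_transaction *t)
{
	assert(t && t->state == SUB_PREPARED);
	t->state = SUB_COMMITTED;
}

/* Abort: give up a sub-transaction, whether or not it was prepared. */
static void sub_abort(struct sub_transaction *t)
{
	assert(t);
	if (t->state == SUB_OPEN || t->state == SUB_PREPARED)
		t->state = SUB_ABORTED;
}

/* Commit both or neither: finish only after both prepare steps succeeded. */
static int compound_commit(struct sub_transaction *a, struct sub_transaction *b)
{
	if (sub_prepare(a) || sub_prepare(b)) {
		sub_abort(a);
		sub_abort(b);
		return -1;
	}
	sub_finish(a);
	sub_finish(b);
	return 0;
}
```

Calling the two prepare steps before either finish step is what avoids the "first commit succeeded, second failed" state that the commit message warns about.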
|
2016-09-04 18:08:26 +02:00
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
int refs_verify_refname_available(struct ref_store *refs,
|
|
|
|
const char *refname,
|
2017-04-16 08:41:26 +02:00
|
|
|
const struct string_list *extras,
|
2017-03-26 04:42:34 +02:00
|
|
|
const struct string_list *skip,
|
|
|
|
struct strbuf *err)
|
2016-09-04 18:08:26 +02:00
|
|
|
{
|
2017-04-16 08:41:26 +02:00
|
|
|
const char *slash;
|
|
|
|
const char *extra_refname;
|
|
|
|
struct strbuf dirname = STRBUF_INIT;
|
|
|
|
struct strbuf referent = STRBUF_INIT;
|
|
|
|
struct object_id oid;
|
|
|
|
unsigned int type;
|
|
|
|
struct ref_iterator *iter;
|
|
|
|
int ok;
|
|
|
|
int ret = -1;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* For the sake of comments in this function, suppose that
|
|
|
|
* refname is "refs/foo/bar".
|
|
|
|
*/
|
|
|
|
|
|
|
|
assert(err);
|
|
|
|
|
|
|
|
strbuf_grow(&dirname, strlen(refname) + 1);
|
|
|
|
for (slash = strchr(refname, '/'); slash; slash = strchr(slash + 1, '/')) {
|
2021-10-16 11:39:09 +02:00
|
|
|
/*
|
|
|
|
* Just saying "Is a directory" when we e.g. can't
|
|
|
|
* lock some multi-level ref isn't very informative,
|
|
|
|
* the user won't be told *what* is a directory, so
|
|
|
|
* let's not use strerror() below.
|
|
|
|
*/
|
|
|
|
int ignore_errno;
|
2017-04-16 08:41:26 +02:00
|
|
|
/* Expand dirname to the new prefix, not including the trailing slash: */
|
|
|
|
strbuf_add(&dirname, refname + dirname.len, slash - refname - dirname.len);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We are still at a leading dir of the refname (e.g.,
|
|
|
|
* "refs/foo"; if there is a reference with that name,
|
|
|
|
* it is a conflict, *unless* it is in skip.
|
|
|
|
*/
|
|
|
|
if (skip && string_list_has_string(skip, dirname.buf))
|
|
|
|
continue;
|
|
|
|
|
2021-10-16 11:39:09 +02:00
|
|
|
if (!refs_read_raw_ref(refs, dirname.buf, &oid, &referent,
|
|
|
|
&type, &ignore_errno)) {
|
2018-07-21 09:49:35 +02:00
|
|
|
strbuf_addf(err, _("'%s' exists; cannot create '%s'"),
|
2017-04-16 08:41:26 +02:00
|
|
|
dirname.buf, refname);
|
|
|
|
goto cleanup;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (extras && string_list_has_string(extras, dirname.buf)) {
|
2018-07-21 09:49:35 +02:00
|
|
|
strbuf_addf(err, _("cannot process '%s' and '%s' at the same time"),
|
2017-04-16 08:41:26 +02:00
|
|
|
refname, dirname.buf);
|
|
|
|
goto cleanup;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We are at the leaf of our refname (e.g., "refs/foo/bar").
|
|
|
|
* There is no point in searching for a reference with that
|
|
|
|
* name, because a refname isn't considered to conflict with
|
|
|
|
* itself. But we still need to check for references whose
|
|
|
|
* names are in the "refs/foo/bar/" namespace, because they
|
|
|
|
* *do* conflict.
|
|
|
|
*/
|
|
|
|
strbuf_addstr(&dirname, refname + dirname.len);
|
|
|
|
strbuf_addch(&dirname, '/');
|
|
|
|
|
|
|
|
iter = refs_ref_iterator_begin(refs, dirname.buf, 0,
|
|
|
|
DO_FOR_EACH_INCLUDE_BROKEN);
|
|
|
|
while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
|
|
|
|
if (skip &&
|
|
|
|
string_list_has_string(skip, iter->refname))
|
|
|
|
continue;
|
|
|
|
|
2018-07-21 09:49:35 +02:00
|
|
|
strbuf_addf(err, _("'%s' exists; cannot create '%s'"),
|
2017-04-16 08:41:26 +02:00
|
|
|
iter->refname, refname);
|
|
|
|
ref_iterator_abort(iter);
|
|
|
|
goto cleanup;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (ok != ITER_DONE)
|
2018-05-02 11:38:39 +02:00
|
|
|
BUG("error while iterating over references");
|
2017-04-16 08:41:26 +02:00
|
|
|
|
|
|
|
extra_refname = find_descendant_ref(dirname.buf, extras, skip);
|
|
|
|
if (extra_refname)
|
2018-07-21 09:49:35 +02:00
|
|
|
strbuf_addf(err, _("cannot process '%s' and '%s' at the same time"),
|
2017-04-16 08:41:26 +02:00
|
|
|
refname, extra_refname);
|
|
|
|
else
|
|
|
|
ret = 0;
|
|
|
|
|
|
|
|
cleanup:
|
|
|
|
strbuf_release(&referent);
|
|
|
|
strbuf_release(&dirname);
|
|
|
|
return ret;
|
2016-09-04 18:08:26 +02:00
|
|
|
}
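The two checks made by the function above (no existing reference may be a leading directory of the new name, and none may live inside the new name's namespace) can be sketched on a plain array of names. This is a simplified illustration under stated assumptions: `refname_conflicts()` is a hypothetical helper, it ignores the `skip` and `extras` lists, and it scans an array instead of querying a ref store.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if creating `refname` would conflict with one of the `nr`
 * names in `existing`, else 0. Hypothetical helper working on a plain
 * array; the real function consults the ref store instead.
 */
static int refname_conflicts(const char *refname,
			     const char *const *existing, size_t nr)
{
	size_t len, i;
	const char *slash;

	assert(refname && (existing || !nr));
	len = strlen(refname);

	/* Leading directories: "refs" and "refs/foo" for "refs/foo/bar". */
	for (slash = strchr(refname, '/'); slash; slash = strchr(slash + 1, '/')) {
		size_t prefix_len = (size_t)(slash - refname);

		for (i = 0; i < nr; i++)
			if (strlen(existing[i]) == prefix_len &&
			    !strncmp(existing[i], refname, prefix_len))
				return 1;
	}

	/* Descendants: anything inside the "refs/foo/bar/" namespace. */
	for (i = 0; i < nr; i++)
		if (!strncmp(existing[i], refname, len) && existing[i][len] == '/')
			return 1;

	return 0;
}
```

With `refs/foo` already present, `refs/foo/bar` conflicts because `refs/foo` would have to be both a reference and a directory. A name never conflicts with itself, which matches the comment in the function above.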
|
2016-09-04 18:08:38 +02:00
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
int refs_for_each_reflog(struct ref_store *refs, each_ref_fn fn, void *cb_data)
|
2016-09-04 18:08:38 +02:00
|
|
|
{
|
|
|
|
struct ref_iterator *iter;
|
2018-08-20 20:24:16 +02:00
|
|
|
struct do_for_each_ref_help hp = { fn, cb_data };
|
2016-09-04 18:08:38 +02:00
|
|
|
|
|
|
|
iter = refs->be->reflog_iterator_begin(refs);
|
|
|
|
|
2018-08-20 20:24:16 +02:00
|
|
|
return do_for_each_repo_ref_iterator(the_repository, iter,
|
|
|
|
do_for_each_ref_helper, &hp);
|
2016-09-04 18:08:38 +02:00
|
|
|
}
|
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
int for_each_reflog(each_ref_fn fn, void *cb_data)
|
2016-09-04 18:08:38 +02:00
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_for_each_reflog(get_main_ref_store(the_repository), fn, cb_data);
|
2017-03-26 04:42:34 +02:00
|
|
|
}
|
2016-09-04 18:08:38 +02:00
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
int refs_for_each_reflog_ent_reverse(struct ref_store *refs,
|
|
|
|
const char *refname,
|
|
|
|
each_reflog_ent_fn fn,
|
|
|
|
void *cb_data)
|
|
|
|
{
|
2016-09-04 18:08:38 +02:00
|
|
|
return refs->be->for_each_reflog_ent_reverse(refs, refname,
|
|
|
|
fn, cb_data);
|
|
|
|
}
|
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
int for_each_reflog_ent_reverse(const char *refname, each_reflog_ent_fn fn,
|
|
|
|
void *cb_data)
|
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_for_each_reflog_ent_reverse(get_main_ref_store(the_repository),
|
2017-03-26 04:42:34 +02:00
|
|
|
refname, fn, cb_data);
|
|
|
|
}
|
|
|
|
|
|
|
|
int refs_for_each_reflog_ent(struct ref_store *refs, const char *refname,
|
|
|
|
each_reflog_ent_fn fn, void *cb_data)
|
|
|
|
{
|
|
|
|
return refs->be->for_each_reflog_ent(refs, refname, fn, cb_data);
|
|
|
|
}
|
|
|
|
|
2016-09-04 18:08:38 +02:00
|
|
|
int for_each_reflog_ent(const char *refname, each_reflog_ent_fn fn,
|
|
|
|
void *cb_data)
|
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_for_each_reflog_ent(get_main_ref_store(the_repository), refname,
|
2017-03-26 04:42:34 +02:00
|
|
|
fn, cb_data);
|
|
|
|
}
|
2016-09-04 18:08:38 +02:00
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
int refs_reflog_exists(struct ref_store *refs, const char *refname)
|
|
|
|
{
|
|
|
|
return refs->be->reflog_exists(refs, refname);
|
2016-09-04 18:08:38 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
int reflog_exists(const char *refname)
|
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_reflog_exists(get_main_ref_store(the_repository), refname);
|
2017-03-26 04:42:34 +02:00
|
|
|
}
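The wrappers in this part of the file share one shape: a `refs_*()` function forwards to a method of the store's backend through `refs->be`, and a shorter convenience function applies it to the main ref store. A minimal sketch of that dispatch pattern follows. All names here (`toy_store`, `toy_backend`, `memory_backend`, `toy_reflog_exists()`) are hypothetical and only loosely modeled on the code above.

```c
#include <assert.h>
#include <string.h>

struct toy_store;

/* Hypothetical backend vtable with a single method. */
struct toy_backend {
	const char *name;
	int (*reflog_exists)(struct toy_store *store, const char *refname);
};

struct toy_store {
	const struct toy_backend *be;
	const char *only_reflog;	/* the one reflog this toy store holds */
};

/* One concrete backend: knows about exactly one reflog. */
static int memory_reflog_exists(struct toy_store *store, const char *refname)
{
	return store->only_reflog && !strcmp(store->only_reflog, refname);
}

static const struct toy_backend memory_backend = {
	"memory", memory_reflog_exists
};

/* Generic entry point: forwards to whichever backend the store uses. */
static int toy_reflog_exists(struct toy_store *store, const char *refname)
{
	assert(store && store->be && refname);
	return store->be->reflog_exists(store, refname);
}
```

Because callers only go through the generic entry point, a second backend could be added by defining another `struct toy_backend` without touching any caller.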
|
2016-09-04 18:08:38 +02:00
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
int refs_create_reflog(struct ref_store *refs, const char *refname,
|
2021-11-22 15:19:08 +01:00
|
|
|
struct strbuf *err)
|
2017-03-26 04:42:34 +02:00
|
|
|
{
|
2021-11-22 15:19:08 +01:00
|
|
|
return refs->be->create_reflog(refs, refname, err);
|
2016-09-04 18:08:38 +02:00
|
|
|
}
|
|
|
|
|
2021-11-22 15:19:08 +01:00
|
|
|
int safe_create_reflog(const char *refname, struct strbuf *err)
|
2016-09-04 18:08:38 +02:00
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_create_reflog(get_main_ref_store(the_repository), refname,
|
2021-11-22 15:19:08 +01:00
|
|
|
err);
|
2017-03-26 04:42:34 +02:00
|
|
|
}
|
2016-09-04 18:08:38 +02:00
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
int refs_delete_reflog(struct ref_store *refs, const char *refname)
|
|
|
|
{
|
|
|
|
return refs->be->delete_reflog(refs, refname);
|
2016-09-04 18:08:38 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
int delete_reflog(const char *refname)
|
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_delete_reflog(get_main_ref_store(the_repository), refname);
|
2017-03-26 04:42:34 +02:00
|
|
|
}
|
2016-09-04 18:08:38 +02:00
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
int refs_reflog_expire(struct ref_store *refs,
|
2021-08-23 13:36:11 +02:00
|
|
|
const char *refname,
|
2017-03-26 04:42:34 +02:00
|
|
|
unsigned int flags,
|
|
|
|
reflog_expiry_prepare_fn prepare_fn,
|
|
|
|
reflog_expiry_should_prune_fn should_prune_fn,
|
|
|
|
reflog_expiry_cleanup_fn cleanup_fn,
|
|
|
|
void *policy_cb_data)
|
|
|
|
{
|
2021-08-23 13:36:11 +02:00
|
|
|
return refs->be->reflog_expire(refs, refname, flags,
|
2017-03-26 04:42:34 +02:00
|
|
|
prepare_fn, should_prune_fn,
|
|
|
|
cleanup_fn, policy_cb_data);
|
2016-09-04 18:08:38 +02:00
|
|
|
}
|
|
|
|
|
2021-08-23 13:36:11 +02:00
|
|
|
int reflog_expire(const char *refname,
|
2016-09-04 18:08:38 +02:00
|
|
|
unsigned int flags,
|
|
|
|
reflog_expiry_prepare_fn prepare_fn,
|
|
|
|
reflog_expiry_should_prune_fn should_prune_fn,
|
|
|
|
reflog_expiry_cleanup_fn cleanup_fn,
|
|
|
|
void *policy_cb_data)
|
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_reflog_expire(get_main_ref_store(the_repository),
|
2021-08-23 13:36:11 +02:00
|
|
|
refname, flags,
|
2017-03-26 04:42:34 +02:00
|
|
|
prepare_fn, should_prune_fn,
|
|
|
|
cleanup_fn, policy_cb_data);
|
2016-09-04 18:08:38 +02:00
|
|
|
}
|
2016-09-04 18:08:39 +02:00
|
|
|
|
|
|
|
int initial_ref_transaction_commit(struct ref_transaction *transaction,
|
|
|
|
struct strbuf *err)
|
|
|
|
{
|
2017-03-26 04:42:35 +02:00
|
|
|
struct ref_store *refs = transaction->ref_store;
|
2016-09-04 18:08:39 +02:00
|
|
|
|
|
|
|
return refs->be->initial_transaction_commit(refs, transaction, err);
|
|
|
|
}
|
2016-09-04 18:08:40 +02:00
|
|
|
|
2022-02-17 14:04:32 +01:00
|
|
|
void ref_transaction_for_each_queued_update(struct ref_transaction *transaction,
|
|
|
|
ref_transaction_for_each_queued_update_fn cb,
|
|
|
|
void *cb_data)
|
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
|
|
|
for (i = 0; i < transaction->nr; i++) {
|
|
|
|
struct ref_update *update = transaction->updates[i];
|
|
|
|
|
|
|
|
cb(update->refname,
|
|
|
|
(update->flags & REF_HAVE_OLD) ? &update->old_oid : NULL,
|
|
|
|
(update->flags & REF_HAVE_NEW) ? &update->new_oid : NULL,
|
|
|
|
cb_data);
|
|
|
|
}
|
|
|
|
}
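The iteration above hands the callback a NULL pointer for any value the update does not carry, so a callback has to check both pointers before dereferencing them. A self-contained sketch of the same pattern, using hypothetical `toy_*` names and plain `int` values in place of object IDs (the flag values are also invented and do not match Git's `REF_HAVE_*` constants):

```c
#include <assert.h>
#include <stddef.h>

#define TOY_HAVE_NEW (1u << 0)
#define TOY_HAVE_OLD (1u << 1)

struct toy_update {
	const char *refname;
	unsigned int flags;
	int old_value, new_value;	/* stand-ins for object IDs */
};

typedef void toy_update_fn(const char *refname, const int *old_value,
			   const int *new_value, void *cb_data);

/* Call cb once per queued update; absent values are passed as NULL. */
static void toy_for_each_update(const struct toy_update *updates, size_t nr,
				toy_update_fn *cb, void *cb_data)
{
	size_t i;

	assert(cb && (updates || !nr));
	for (i = 0; i < nr; i++) {
		const struct toy_update *u = &updates[i];

		cb(u->refname,
		   (u->flags & TOY_HAVE_OLD) ? &u->old_value : NULL,
		   (u->flags & TOY_HAVE_NEW) ? &u->new_value : NULL,
		   cb_data);
	}
}

/* Example callback: count updates that carry no old value. */
static void count_missing_old(const char *refname, const int *old_value,
			      const int *new_value, void *cb_data)
{
	(void)refname;
	(void)new_value;
	if (!old_value)
		++*(int *)cb_data;
}
```

Passing NULL rather than a zeroed value lets the callback tell "no old value was given" apart from "the old value is the null ID".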
|
|
|
|
|
reflog: cleanse messages in the refs.c layer
Regarding reflog messages:
- We expect that a reflog message consists of a single line. The
file format used by the files backend may add a LF after the
message as a delimiter, and output by commands like "git log -g"
may complete such an incomplete line by adding a LF at the end,
but philosophically, the terminating LF is not a part of the
message.
- We however allow callers of the refs API to supply an arbitrary
NUL-terminated sequence of bytes. We cleanse the caller-supplied
message by squashing each run of whitespace into a SP, and by
trimming trailing whitespace, before storing the message. This is
how we tolerate, instead of erroring out, a message with LF in it
(be it at the end, in the middle, or both).
Currently, the cleansing of the reflog message is done by the files
backend, before the log is written out. This is sufficient with the
current code, as that is the only backend that writes reflogs. But
new backends that write reflogs can be added, and we'd want the
log message read back by "log -g" to be the same no matter which
backend is used. Moving the cleansing code to the generic layer is
one way to ensure that.
An added benefit is that the "cleansing" function could be updated
later, independent from individual backends, to e.g. allow
multi-line log messages if we wanted to, and when that happens, it
would help a lot to ensure we covered all bases if the cleansing
function (which would be updated) is called from the generic layer.
Side note: I am not interested in supporting multi-line reflog
messages right at the moment (nobody is asking for it), but I
envision that instead of the "squash a run of whitespace into a SP
and rtrim" cleansing, we can %urlencode problematic bytes in the
message *AND* append a SP at the end, when a new version of Git that
supports multi-line and/or verbatim reflog messages writes a reflog
record. The reading side can detect the presence of SP at the end
(which would have been rtrimmed out had it been written by existing
versions of Git) as a signal that decoding %urlencode recovers the
original reflog message.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-10 19:19:53 +02:00
|
|
|
int refs_delete_refs(struct ref_store *refs, const char *logmsg,
|
2017-05-22 16:17:38 +02:00
|
|
|
struct string_list *refnames, unsigned int flags)
|
2016-09-04 18:08:40 +02:00
|
|
|
{
|
2020-07-10 19:19:53 +02:00
|
|
|
char *msg;
|
|
|
|
int retval;
|
|
|
|
|
|
|
|
msg = normalize_reflog_message(logmsg);
|
|
|
|
retval = refs->be->delete_refs(refs, msg, refnames, flags);
|
|
|
|
free(msg);
|
|
|
|
return retval;
|
2016-09-04 18:08:40 +02:00
|
|
|
}
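The cleansing rule that the commit message describes (squash each run of whitespace into one space, then trim the end) can be sketched as a standalone function. `squash_whitespace()` is a hypothetical name and this is not the body of `normalize_reflog_message()`. As a choice of this sketch, leading whitespace is dropped as well.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of msg with each run of whitespace
 * squashed into one space and leading/trailing whitespace dropped.
 * The caller frees the result. Returns NULL if allocation fails.
 */
static char *squash_whitespace(const char *msg)
{
	char *out, *dst;
	int wasspace = 1;	/* start "after a space" to skip leading whitespace */

	assert(msg);
	out = dst = malloc(strlen(msg) + 1);
	if (!out)
		return NULL;

	for (; *msg; msg++) {
		int space = isspace((unsigned char)*msg);

		if (space && wasspace)
			continue;
		*dst++ = space ? ' ' : *msg;
		wasspace = space;
	}
	/* At most one trailing space can remain; drop it. */
	if (dst > out && dst[-1] == ' ')
		dst--;
	*dst = '\0';
	return out;
}
```

Because the result is never longer than the input, a single allocation of `strlen(msg) + 1` bytes is enough. The caller owns the result and frees it, mirroring the `free(msg)` calls in the wrappers above.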
|
2016-09-04 18:08:42 +02:00
|
|
|
|
2017-05-22 16:17:38 +02:00
|
|
|
int delete_refs(const char *msg, struct string_list *refnames,
|
|
|
|
unsigned int flags)
|
2016-09-04 18:08:42 +02:00
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_delete_refs(get_main_ref_store(the_repository), msg, refnames, flags);
|
2017-03-26 04:42:34 +02:00
|
|
|
}
|
2016-09-04 18:08:42 +02:00
|
|
|
|
2017-03-26 04:42:34 +02:00
|
|
|
int refs_rename_ref(struct ref_store *refs, const char *oldref,
|
|
|
|
const char *newref, const char *logmsg)
|
|
|
|
{
|
2020-07-10 19:19:53 +02:00
|
|
|
char *msg;
|
|
|
|
int retval;
|
|
|
|
|
|
|
|
msg = normalize_reflog_message(logmsg);
|
|
|
|
retval = refs->be->rename_ref(refs, oldref, newref, msg);
|
|
|
|
free(msg);
|
|
|
|
return retval;
|
2016-09-04 18:08:42 +02:00
|
|
|
}
|
2017-03-26 04:42:34 +02:00
|
|
|
|
|
|
|
int rename_ref(const char *oldref, const char *newref, const char *logmsg)
|
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_rename_ref(get_main_ref_store(the_repository), oldref, newref, logmsg);
|
2017-03-26 04:42:34 +02:00
|
|
|
}
|
branch: add a --copy (-c) option to go with --move (-m)
Add the ability to --copy a branch along with its reflog and
configuration. This uses the same underlying machinery as the --move
(-m) option, except that the reflog and configuration are copied
instead of being moved.
This is useful for e.g. copying a topic branch to a new version,
e.g. work to work-2 after submitting the work topic to the list, while
preserving all the tracking info and other configuration that goes
with the branch, and unlike --move keeping the other already-submitted
branch around for reference.
Like --move, when the source branch is the currently checked out
branch the HEAD is moved to the destination branch. In the case of
--move we don't really have a choice (other than remaining on a
detached HEAD) and in order to keep the functionality consistent, we
are doing it in similar way for --copy too.
The most common use of this feature is expected to be moving to a
new topic branch that is a copy of the current one. In that case
moving to the target branch is what the user wants, and --copy does
not unexpectedly behave differently from --move.
One outstanding caveat of this implementation is that:
git checkout maint &&
git checkout master &&
git branch -c topic &&
git checkout -
Will check out 'maint' instead of 'master'. This is because the @{-N}
feature (or its -1 shorthand "-") relies on HEAD reflogs created by
the checkout command, so in this case we'll check out maint rather
than master, which is what the user might expect. What to do about
that is left to a future change.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Sahil Dua <sahildua2305@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-18 23:19:16 +02:00
|
|
|
|
|
|
|
int refs_copy_existing_ref(struct ref_store *refs, const char *oldref,
|
|
|
|
const char *newref, const char *logmsg)
|
|
|
|
{
|
2020-07-10 19:19:53 +02:00
|
|
|
char *msg;
|
|
|
|
int retval;
|
|
|
|
|
|
|
|
msg = normalize_reflog_message(logmsg);
|
|
|
|
retval = refs->be->copy_ref(refs, oldref, newref, msg);
|
|
|
|
free(msg);
|
|
|
|
return retval;
|
2017-06-18 23:19:16 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
int copy_existing_ref(const char *oldref, const char *newref, const char *logmsg)
|
|
|
|
{
|
2018-04-12 02:21:09 +02:00
|
|
|
return refs_copy_existing_ref(get_main_ref_store(the_repository), oldref, newref, logmsg);
|
2017-06-18 23:19:16 +02:00
|
|
|
}
|