2007-09-11 05:03:15 +02:00
|
|
|
#include "cache.h"
|
2014-10-01 12:28:42 +02:00
|
|
|
#include "lockfile.h"
|
2007-09-11 05:03:15 +02:00
|
|
|
#include "bundle.h"
|
2023-02-24 01:09:27 +01:00
|
|
|
#include "hex.h"
|
2018-05-16 01:42:15 +02:00
|
|
|
#include "object-store.h"
|
2018-06-29 03:21:51 +02:00
|
|
|
#include "repository.h"
|
2007-09-11 05:03:15 +02:00
|
|
|
#include "object.h"
|
|
|
|
#include "commit.h"
|
|
|
|
#include "diff.h"
|
|
|
|
#include "revision.h"
|
|
|
|
#include "list-objects.h"
|
|
|
|
#include "run-command.h"
|
2007-11-23 01:51:18 +01:00
|
|
|
#include "refs.h"
|
2020-07-28 22:23:39 +02:00
|
|
|
#include "strvec.h"
|
2022-03-09 17:01:39 +01:00
|
|
|
#include "list-objects-filter-options.h"
|
bundle: verify using check_connected()
When Git verifies a bundle to see if it is safe for unbundling, it first
looks to see if the prerequisite commits are in the object store. This
is an easy way to "fail fast" but it is not a sufficient check for
updating refs that guarantee closure under reachability. There could
still be issues if those commits are not reachable from the repository's
references. The repository only has guarantees that its object store is
closed under reachability for the objects that are reachable from
references.
Thus, the code in verify_bundle() has previously had the additional
check that all prerequisite commits are reachable from repository
references. This is done via a revision walk from all references,
stopping only if all prerequisite commits are discovered or all commits
are walked. This walk is custom to verify_bundle().
This check is more strict than what Git applies to fetched pack-files.
In the fetch case, Git guarantees that the new references are closed
under reachability by walking from the new references until walking
commits that are reachable from repository refs. This is done through
the well-used check_connected() method.
To better align with the restrictions required by 'git fetch',
reimplement this check in verify_bundle() to use check_connected(). This
also simplifies the code significantly.
The previous change added a test that verified the behavior of 'git
bundle verify' and 'git bundle unbundle' in this case, and the error
messages looked like this:
error: Could not read <missing-commit>
fatal: Failed to traverse parents of commit <extant-commit>
However, by changing the revision walk slightly within check_connected()
and using its quiet mode, we can omit those messages. Instead, we get
only this message, tailored to describing the current state of the
repository:
error: some prerequisite commits exist in the object store,
but are not connected to the repository's history
(Line break added here for the commit message formatting, only.)
While this message does not include any object IDs, there is no
guarantee that those object IDs would help the user diagnose what is
going on, as they could be separated from the prerequisite commits by
some distance. At minimum, this message describes the situation in a
more informative way than the previous error messages.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-31 14:29:10 +01:00
|
|
|
#include "connected.h"
|
2020-07-30 01:14:20 +02:00
|
|
|
|
|
|
|
static const char v2_bundle_signature[] = "# v2 git bundle\n";
|
|
|
|
static const char v3_bundle_signature[] = "# v3 git bundle\n";
|
|
|
|
static struct {
|
|
|
|
int version;
|
|
|
|
const char *signature;
|
|
|
|
} bundle_sigs[] = {
|
|
|
|
{ 2, v2_bundle_signature },
|
|
|
|
{ 3, v3_bundle_signature },
|
|
|
|
};
|
2007-09-11 05:03:15 +02:00
|
|
|
|
bundle: remove "ref_list" in favor of string-list.c API
Move away from the "struct ref_list" in bundle.c in favor of the
almost identical string-list.c API.
That API fits this use-case perfectly, but did not exist in its
current form when this code was added in 2e0afafebd (Add git-bundle:
move objects and references by archive, 2007-02-22). With hindsight we
could have used the path-list API, which later got renamed to
string-list. See 8fd2cb4069 (Extract helper bits from
c-merge-recursive work, 2006-07-25).
We need to change "name" to "string" and "oid" to "util" to make this
conversion, but other than that the APIs are pretty much identical for
what bundle.c made use of.
Let's also replace the memset(..,0,...) pattern with a more idiomatic
"INIT" macro, and finally add a *_release() function so as to free the
allocated memory.
Before this, add_to_ref_list() would leak memory; now, e.g., "bundle
list-heads" reports no memory leaks at all under valgrind.
In the bundle_header_init() function we're using a clever trick to
memcpy() what we'd get from the corresponding
BUNDLE_HEADER_INIT. There is a concurrent series to make use of that
pattern more generally, see [1].
1. https://lore.kernel.org/git/cover-0.5-00000000000-20210701T104855Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
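The "INIT macro plus memcpy()" pattern mentioned in the commit message above can be sketched in isolation. This is only an illustration under assumed names: `struct demo_header`, `DEMO_HEADER_INIT` and `demo_header_init()` are hypothetical stand-ins and are not part of Git's API.

```c
#include <string.h>

/* Hypothetical struct standing in for a type that has a static initializer. */
struct demo_header {
	int version;
	size_t nr_refs;
	const char *label;
};

/* Hypothetical initializer macro, usable at declaration time. */
#define DEMO_HEADER_INIT { .version = 2, .nr_refs = 0, .label = "" }

/*
 * Runtime initializer that copies from a blank instance built with the
 * macro, so the macro stays the single source of truth for defaults.
 */
static void demo_header_init(struct demo_header *header)
{
	struct demo_header blank = DEMO_HEADER_INIT;

	memcpy(header, &blank, sizeof(*header));
}
```

A caller holding an already-declared (or heap-allocated) struct can call `demo_header_init(&h)` and get the same defaults as `struct demo_header h = DEMO_HEADER_INIT;`.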
2021-07-02 11:57:32 +02:00
|
|
|
void bundle_header_init(struct bundle_header *header)
|
2007-09-11 05:03:15 +02:00
|
|
|
{
|
2021-07-02 11:57:32 +02:00
|
|
|
struct bundle_header blank = BUNDLE_HEADER_INIT;
|
|
|
|
memcpy(header, &blank, sizeof(*header));
|
|
|
|
}
|
|
|
|
|
|
|
|
void bundle_header_release(struct bundle_header *header)
|
|
|
|
{
|
|
|
|
string_list_clear(&header->prerequisites, 1);
|
|
|
|
string_list_clear(&header->references, 1);
|
2022-03-09 17:01:39 +01:00
|
|
|
list_objects_filter_release(&header->filter);
|
2007-09-11 05:03:15 +02:00
|
|
|
}
|
|
|
|
|
2020-07-30 01:14:20 +02:00
|
|
|
static int parse_capability(struct bundle_header *header, const char *capability)
|
2020-06-19 19:56:00 +02:00
|
|
|
{
|
2020-07-30 01:14:20 +02:00
|
|
|
const char *arg;
|
|
|
|
if (skip_prefix(capability, "object-format=", &arg)) {
|
|
|
|
int algo = hash_algo_by_name(arg);
|
|
|
|
if (algo == GIT_HASH_UNKNOWN)
|
|
|
|
return error(_("unrecognized bundle hash algorithm: %s"), arg);
|
|
|
|
header->hash_algo = &hash_algos[algo];
|
|
|
|
return 0;
|
|
|
|
}
|
2022-03-09 17:01:39 +01:00
|
|
|
if (skip_prefix(capability, "filter=", &arg)) {
|
|
|
|
parse_list_objects_filter(&header->filter, arg);
|
|
|
|
return 0;
|
|
|
|
}
|
2020-07-30 01:14:20 +02:00
|
|
|
return error(_("unknown capability '%s'"), capability);
|
|
|
|
}
|
2020-06-19 19:56:00 +02:00
|
|
|
|
2020-07-30 01:14:20 +02:00
|
|
|
static int parse_bundle_signature(struct bundle_header *header, const char *line)
|
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
|
|
|
for (i = 0; i < ARRAY_SIZE(bundle_sigs); i++) {
|
|
|
|
if (!strcmp(line, bundle_sigs[i].signature)) {
|
|
|
|
header->version = bundle_sigs[i].version;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return -1;
|
2020-06-19 19:56:00 +02:00
|
|
|
}
|
|
|
|
|
2022-05-16 22:11:05 +02:00
|
|
|
int read_bundle_header_fd(int fd, struct bundle_header *header,
|
|
|
|
const char *report_path)
|
2007-11-09 00:35:32 +01:00
|
|
|
{
|
2011-10-14 00:12:02 +02:00
|
|
|
struct strbuf buf = STRBUF_INIT;
|
|
|
|
int status = 0;
|
2007-09-11 05:03:15 +02:00
|
|
|
|
2011-10-14 00:12:02 +02:00
|
|
|
/* The bundle header begins with the signature */
|
2012-02-22 20:34:22 +01:00
|
|
|
if (strbuf_getwholeline_fd(&buf, fd, '\n') ||
|
2020-07-30 01:14:20 +02:00
|
|
|
parse_bundle_signature(header, buf.buf)) {
|
2011-10-14 00:19:31 +02:00
|
|
|
if (report_path)
|
2020-07-30 01:14:20 +02:00
|
|
|
error(_("'%s' does not look like a v2 or v3 bundle file"),
|
2011-10-14 00:19:31 +02:00
|
|
|
report_path);
|
2011-10-14 00:12:02 +02:00
|
|
|
status = -1;
|
|
|
|
goto abort;
|
2007-09-11 05:03:15 +02:00
|
|
|
}
|
2011-10-14 00:12:02 +02:00
|
|
|
|
2020-07-30 01:14:20 +02:00
|
|
|
header->hash_algo = the_hash_algo;
|
|
|
|
|
2011-10-14 00:12:02 +02:00
|
|
|
/* The bundle header ends with an empty line */
|
2012-02-22 20:34:22 +01:00
|
|
|
while (!strbuf_getwholeline_fd(&buf, fd, '\n') &&
|
2011-10-14 00:12:02 +02:00
|
|
|
buf.len && buf.buf[0] != '\n') {
|
2017-05-01 04:28:59 +02:00
|
|
|
struct object_id oid;
|
2011-10-14 00:12:02 +02:00
|
|
|
int is_prereq = 0;
|
2017-05-01 04:28:59 +02:00
|
|
|
const char *p;
|
2011-10-14 00:12:02 +02:00
|
|
|
|
|
|
|
strbuf_rtrim(&buf);
|
2007-09-11 05:03:15 +02:00
|
|
|
|
2020-07-30 01:14:20 +02:00
|
|
|
if (header->version == 3 && *buf.buf == '@') {
|
|
|
|
if (parse_capability(header, buf.buf + 1)) {
|
2020-06-19 19:56:00 +02:00
|
|
|
status = -1;
|
|
|
|
break;
|
|
|
|
}
|
2020-07-30 01:14:20 +02:00
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (*buf.buf == '-') {
|
|
|
|
is_prereq = 1;
|
|
|
|
strbuf_remove(&buf, 0, 1);
|
2020-06-19 19:56:00 +02:00
|
|
|
}
|
|
|
|
|
2011-10-14 00:12:02 +02:00
|
|
|
/*
|
|
|
|
* Tip lines have object name, SP, and refname.
|
|
|
|
* Prerequisites have object name that is optionally
|
|
|
|
* followed by SP and subject line.
|
|
|
|
*/
|
2020-06-19 19:56:00 +02:00
|
|
|
if (parse_oid_hex_algop(buf.buf, &oid, &p, header->hash_algo) ||
|
2017-05-01 04:28:59 +02:00
|
|
|
(*p && !isspace(*p)) ||
|
|
|
|
(!is_prereq && !*p)) {
|
2011-10-14 00:19:31 +02:00
|
|
|
if (report_path)
|
2012-04-23 14:30:30 +02:00
|
|
|
error(_("unrecognized header: %s%s (%d)"),
|
2011-10-14 00:19:31 +02:00
|
|
|
(is_prereq ? "-" : ""), buf.buf, (int)buf.len);
|
2011-10-14 00:12:02 +02:00
|
|
|
status = -1;
|
|
|
|
break;
|
|
|
|
} else {
|
2021-07-02 11:57:32 +02:00
|
|
|
struct object_id *dup = oiddup(&oid);
|
2011-10-14 00:12:02 +02:00
|
|
|
if (is_prereq)
|
2021-07-02 11:57:32 +02:00
|
|
|
string_list_append(&header->prerequisites, "")->util = dup;
|
2011-10-14 00:12:02 +02:00
|
|
|
else
|
2021-07-02 11:57:32 +02:00
|
|
|
string_list_append(&header->references, p + 1)->util = dup;
|
2007-09-11 05:03:15 +02:00
|
|
|
}
|
|
|
|
}
|
2011-10-14 00:12:02 +02:00
|
|
|
|
|
|
|
abort:
|
|
|
|
if (status) {
|
|
|
|
close(fd);
|
|
|
|
fd = -1;
|
2007-09-11 05:03:15 +02:00
|
|
|
}
|
2011-10-14 00:12:02 +02:00
|
|
|
strbuf_release(&buf);
|
2007-09-11 05:03:15 +02:00
|
|
|
return fd;
|
|
|
|
}
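The header-line rules applied by read_bundle_header_fd() above (capability lines starting with '@' in version 3, prerequisite lines starting with '-', everything else treated as a tip line, and an empty line ending the header) can be summarized in a small sketch. The enum and function names below are hypothetical and exist only for illustration; unlike the real parser, this sketch does not validate the object ID or the refname.

```c
/* Hypothetical classification of a single bundle header line. */
enum demo_line_kind {
	DEMO_LINE_END,          /* empty line: the header is finished */
	DEMO_LINE_CAPABILITY,   /* "@key=value", only recognized in version 3 */
	DEMO_LINE_PREREQUISITE, /* "-<oid>", optionally followed by SP and a subject */
	DEMO_LINE_REFERENCE     /* "<oid> <refname>" */
};

/* Decide what kind of header line this is from its first character. */
static enum demo_line_kind demo_classify_line(const char *line, int version)
{
	if (!*line || *line == '\n')
		return DEMO_LINE_END;
	if (version == 3 && *line == '@')
		return DEMO_LINE_CAPABILITY;
	if (*line == '-')
		return DEMO_LINE_PREREQUISITE;
	return DEMO_LINE_REFERENCE;
}
```

Note that a line starting with '@' in a version 2 bundle is not treated as a capability here; in the real code such a line would go on to fail object ID parsing.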
|
|
|
|
|
2011-10-14 00:19:31 +02:00
|
|
|
int read_bundle_header(const char *path, struct bundle_header *header)
|
|
|
|
{
|
|
|
|
int fd = open(path, O_RDONLY);
|
|
|
|
|
2007-09-11 05:03:15 +02:00
|
|
|
if (fd < 0)
|
2012-04-23 14:30:30 +02:00
|
|
|
return error(_("could not open '%s'"), path);
|
2022-05-16 22:11:05 +02:00
|
|
|
return read_bundle_header_fd(fd, header, path);
|
2011-10-14 00:19:31 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
int is_bundle(const char *path, int quiet)
|
|
|
|
{
|
2021-07-02 11:57:32 +02:00
|
|
|
struct bundle_header header = BUNDLE_HEADER_INIT;
|
2011-10-14 00:19:31 +02:00
|
|
|
int fd = open(path, O_RDONLY);
|
|
|
|
|
|
|
|
if (fd < 0)
|
|
|
|
return 0;
|
2022-05-16 22:11:05 +02:00
|
|
|
fd = read_bundle_header_fd(fd, &header, quiet ? NULL : path);
|
2011-10-14 00:19:31 +02:00
|
|
|
if (fd >= 0)
|
|
|
|
close(fd);
|
2021-07-02 11:57:32 +02:00
|
|
|
bundle_header_release(&header);
|
2011-10-14 00:19:31 +02:00
|
|
|
return (fd >= 0);
|
2007-09-11 05:03:15 +02:00
|
|
|
}
|
|
|
|
|
2021-07-02 11:57:32 +02:00
|
|
|
static int list_refs(struct string_list *r, int argc, const char **argv)
|
2007-09-11 05:03:15 +02:00
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
|
|
|
for (i = 0; i < r->nr; i++) {
|
2021-07-02 11:57:31 +02:00
|
|
|
struct object_id *oid;
|
|
|
|
const char *name;
|
|
|
|
|
2007-09-11 05:03:15 +02:00
|
|
|
if (argc > 1) {
|
|
|
|
int j;
|
|
|
|
for (j = 1; j < argc; j++)
|
2021-07-02 11:57:32 +02:00
|
|
|
if (!strcmp(r->items[i].string, argv[j]))
|
2007-09-11 05:03:15 +02:00
|
|
|
break;
|
|
|
|
if (j == argc)
|
|
|
|
continue;
|
|
|
|
}
|
2021-07-02 11:57:31 +02:00
|
|
|
|
2021-07-02 11:57:32 +02:00
|
|
|
oid = r->items[i].util;
|
|
|
|
name = r->items[i].string;
|
2021-07-02 11:57:31 +02:00
|
|
|
printf("%s %s\n", oid_to_hex(oid), name);
|
2007-09-11 05:03:15 +02:00
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2014-03-25 14:23:26 +01:00
|
|
|
/* Remember to update object flag allocation in object.h */
|
2007-09-11 05:03:15 +02:00
|
|
|
#define PREREQ_MARK (1u<<16)
|
|
|
|
|
2023-01-31 14:29:10 +01:00
|
|
|
struct string_list_iterator {
|
|
|
|
struct string_list *list;
|
|
|
|
size_t cur;
|
|
|
|
};
|
|
|
|
|
|
|
|
static const struct object_id *iterate_ref_map(void *cb_data)
|
|
|
|
{
|
|
|
|
struct string_list_iterator *iter = cb_data;
|
|
|
|
|
|
|
|
if (iter->cur >= iter->list->nr)
|
|
|
|
return NULL;
|
|
|
|
|
|
|
|
return iter->list->items[iter->cur++].util;
|
|
|
|
}
|
|
|
|
|
2018-11-10 06:49:01 +01:00
|
|
|
int verify_bundle(struct repository *r,
|
|
|
|
struct bundle_header *header,
|
2022-10-12 14:52:37 +02:00
|
|
|
enum verify_bundle_flags flags)
|
2007-09-11 05:03:15 +02:00
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Do fast check, then if any prereqs are missing then go line by line
|
|
|
|
* to be verbose about the errors
|
|
|
|
*/
|
2021-07-02 11:57:32 +02:00
|
|
|
struct string_list *p = &header->prerequisites;
|
2023-01-31 14:29:10 +01:00
|
|
|
int i, ret = 0;
|
2012-04-23 14:30:30 +02:00
|
|
|
const char *message = _("Repository lacks these prerequisite commits:");
|
2023-01-31 14:29:10 +01:00
|
|
|
struct string_list_iterator iter = {
|
|
|
|
.list = p,
|
|
|
|
};
|
|
|
|
struct check_connected_options opts = {
|
|
|
|
.quiet = 1,
|
|
|
|
};
|
2007-09-11 05:03:15 +02:00
|
|
|
|
bundle: properly clear all revision flags
The verify_bundle() method checks two things for a bundle's
prerequisites:
1. Are these objects in the object store?
2. Are these objects reachable from our references?
For this second question, multiple calls to verify_bundle() in the same
process can report a valid bundle as invalid. The reason is that not all
of the commit marks on the previously walked commits are cleared.
The revision walk machinery was first introduced in-process by
fb9a54150d3 (git-bundle: avoid fork() in verify_bundle(), 2007-02-22).
This implementation used "-1" as the set of flags to clear. The next
meaningful change came in 2b064697a5b (revision traversal: retire
BOUNDARY_SHOW, 2007-03-05), which introduced the PREREQ_MARK flag
instead of a flag normally controlled by the revision-walk machinery.
In 86a0a408b90 (commit: factor out
clear_commit_marks_for_object_array, 2011-10-01), the loop over the
array of commits was replaced with a new
clear_commit_marks_for_object_array(), but simultaneously the "-1" value
was replaced with "ALL_REV_FLAGS", which stopped un-setting the
PREREQ_MARK flag. This means that if multiple commits were marked by the
PREREQ_MARK in a previous run of verify_bundle(), then this loop could
terminate early due to 'i' going to zero:
while (i && (commit = get_revision(&revs)))
if (commit->object.flags & PREREQ_MARK)
i--;
The flag clearing work was changed again in 63647391e6c (bundle: avoid
using the rev_info flag leak_pending, 2017-12-25), but that was only
cosmetic and did not change the behavior.
It may seem that it would be sufficient to add the PREREQ_MARK flag to
the clear_commit_marks() call in its current location. However, we
actually need to do it in the "cleanup:" step, since the first loop
checking "Are these objects in the object store?" might add the
PREREQ_MARK flag to some objects and then terminate without performing a
walk due to one missing object. By clearing the flags in all cases, we
avoid this issue when running verify_bundle() multiple times in the same
process.
Moving this loop to the cleanup step alone would cause a segfault when
running 'git bundle verify' outside of a repository, but only because
that error condition uses "goto cleanup" where returning directly is
perfectly safe. Nothing has been initialized at that point, so we can
return immediately without causing any leaks.
This behavior is verified carefully by a test that will be added soon
when Git learns to download bundle lists in a 'git clone --bundle-uri'
command.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12 14:52:35 +02:00
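The stale-flag problem described in the message above can be reproduced in miniature. The sketch below is an assumption-laden model, not Git's code: `TOY_SHOWN`, `TOY_PREREQ`, `TOY_WALK_FLAGS` (a stand-in for a mask that omits the prerequisite bit) and `toy_verify()` are hypothetical, and the "walk" is just a fixed array of commits visited in order.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SHOWN      (1u << 0)   /* set by the walk on each visited commit */
#define TOY_PREREQ     (1u << 1)   /* set by the verifier on each prerequisite */
#define TOY_WALK_FLAGS TOY_SHOWN   /* clearing mask that forgets TOY_PREREQ */

struct toy_commit {
	unsigned flags;
};

/*
 * Mark the prerequisites, walk until as many marked commits have been seen
 * as there are prerequisites, then count prerequisites the walk never
 * visited. Flags are cleared with clear_mask before returning.
 */
static int toy_verify(struct toy_commit **prereqs, size_t nr_prereqs,
		      struct toy_commit **walk, size_t nr_walk,
		      unsigned clear_mask)
{
	size_t remaining = nr_prereqs;
	size_t i;
	int missing = 0;

	assert(prereqs && walk);

	for (i = 0; i < nr_prereqs; i++)
		prereqs[i]->flags |= TOY_PREREQ;

	/* Stops early once the counter hits zero, like the quoted loop. */
	for (i = 0; remaining && i < nr_walk; i++) {
		walk[i]->flags |= TOY_SHOWN;
		if (walk[i]->flags & TOY_PREREQ)
			remaining--;
	}

	for (i = 0; i < nr_prereqs; i++)
		if (!(prereqs[i]->flags & TOY_SHOWN))
			missing++;

	for (i = 0; i < nr_walk; i++)
		walk[i]->flags &= ~clear_mask;
	for (i = 0; i < nr_prereqs; i++)
		prereqs[i]->flags &= ~clear_mask;

	return missing;
}
```

With the incomplete mask, a commit marked in an earlier call still carries `TOY_PREREQ`, so the counter in a later call can reach zero before a genuine prerequisite is visited. Clearing with `TOY_SHOWN | TOY_PREREQ` avoids that.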
|
|
|
if (!r || !r->objects || !r->objects->odb)
|
|
|
|
return error(_("need a repository to verify a bundle"));
|
2019-05-27 21:59:14 +02:00
|
|
|
|
2007-09-11 05:03:15 +02:00
|
|
|
for (i = 0; i < p->nr; i++) {
|
bundle: remove "ref_list" in favor of string-list.c API
Move away from the "struct ref_list" in bundle.c in favor of the
almost identical string-list.c API.
That API fits this use-case perfectly, but did not exist in its
current form when this code was added in 2e0afafebd (Add git-bundle:
move objects and references by archive, 2007-02-22). With hindsight we
could have used the path-list API, which later got renamed to
string-list. See 8fd2cb4069 (Extract helper bits from
c-merge-recursive work, 2006-07-25).
We need to change "name" to "string" and "oid" to "util" to make this
conversion, but other than that the APIs are pretty much identical for
what bundle.c made use of.
Let's also replace the memset(..,0,...) pattern with a more idiomatic
"INIT" macro, and finally add a *_release() function to free the
allocated memory.
Before this, add_to_ref_list() would leak memory; now e.g. "bundle
list-heads" reports no memory leaks at all under valgrind.
In the bundle_header_init() function we're using a clever trick to
memcpy() what we'd get from the corresponding
BUNDLE_HEADER_INIT. There is a concurrent series to make use of that
pattern more generally, see [1].
1. https://lore.kernel.org/git/cover-0.5-00000000000-20210701T104855Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-02 11:57:32 +02:00
|
|
|
struct string_list_item *e = p->items + i;
|
|
|
|
const char *name = e->string;
|
|
|
|
struct object_id *oid = e->util;
|
2021-07-02 11:57:31 +02:00
|
|
|
struct object *o = parse_object(r, oid);
|
2023-01-31 14:29:10 +01:00
|
|
|
if (o)
|
2007-09-11 05:03:15 +02:00
|
|
|
continue;
|
2022-10-12 14:52:38 +02:00
|
|
|
ret++;
|
|
|
|
if (flags & VERIFY_BUNDLE_QUIET)
|
|
|
|
continue;
|
|
|
|
if (ret == 1)
|
2008-11-10 22:07:52 +01:00
|
|
|
error("%s", message);
|
2021-07-02 11:57:31 +02:00
|
|
|
error("%s %s", oid_to_hex(oid), name);
|
2007-09-11 05:03:15 +02:00
|
|
|
}
|
2023-01-31 14:29:10 +01:00
|
|
|
if (ret)
|
2022-04-13 22:01:38 +02:00
|
|
|
goto cleanup;
|
2022-03-09 17:01:39 +01:00
|
|
|
|
2023-01-31 14:29:10 +01:00
|
|
|
if ((ret = check_connected(iterate_ref_map, &iter, &opts)))
|
|
|
|
error(_("some prerequisite commits exist in the object store, "
|
|
|
|
"but are not connected to the repository's history"));
|
2007-09-11 05:03:15 +02:00
|
|
|
|
2023-01-31 14:29:10 +01:00
|
|
|
/* TODO: preserve this verbose language. */
|
2022-10-12 14:52:37 +02:00
|
|
|
if (flags & VERIFY_BUNDLE_VERBOSE) {
|
2021-07-02 11:57:32 +02:00
|
|
|
struct string_list *r;
|
2007-09-11 05:03:15 +02:00
|
|
|
|
|
|
|
r = &header->references;
|
2013-03-08 19:01:26 +01:00
|
|
|
printf_ln(Q_("The bundle contains this ref:",
|
2022-03-07 16:27:08 +01:00
|
|
|
"The bundle contains these %"PRIuMAX" refs:",
|
2012-04-23 14:30:30 +02:00
|
|
|
r->nr),
|
2022-03-07 16:27:08 +01:00
|
|
|
(uintmax_t)r->nr);
|
2007-09-11 05:03:15 +02:00
|
|
|
list_refs(r, 0, NULL);
|
2022-03-09 17:01:39 +01:00
|
|
|
|
2013-03-07 01:56:35 +01:00
|
|
|
r = &header->prerequisites;
|
2012-06-04 20:51:13 +02:00
|
|
|
if (!r->nr) {
|
|
|
|
printf_ln(_("The bundle records a complete history."));
|
|
|
|
} else {
|
2013-03-08 19:01:26 +01:00
|
|
|
printf_ln(Q_("The bundle requires this ref:",
|
2022-03-07 16:27:08 +01:00
|
|
|
"The bundle requires these %"PRIuMAX" refs:",
|
2012-06-04 20:51:13 +02:00
|
|
|
r->nr),
|
2022-03-07 16:27:08 +01:00
|
|
|
(uintmax_t)r->nr);
|
2012-06-04 20:51:13 +02:00
|
|
|
list_refs(r, 0, NULL);
|
|
|
|
}
|
2022-03-22 18:28:38 +01:00
|
|
|
|
2022-03-22 18:28:39 +01:00
|
|
|
printf_ln("The bundle uses this hash algorithm: %s",
|
|
|
|
header->hash_algo->name);
|
2022-03-22 18:28:38 +01:00
|
|
|
if (header->filter.choice)
|
|
|
|
printf_ln("The bundle uses this filter: %s",
|
|
|
|
list_objects_filter_spec(&header->filter));
|
2007-09-11 05:03:15 +02:00
|
|
|
}
|
2022-04-13 22:01:38 +02:00
|
|
|
cleanup:
|
2007-09-11 05:03:15 +02:00
|
|
|
return ret;
|
|
|
|
}
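The first loop of the function above counts missing prerequisites and prints a header only before the first one, unless the quiet flag is set. A minimal sketch of that reporting pattern follows. It assumes a hypothetical `toy_prereq` record with a precomputed `present` field in place of the parse_object() lookup, a hypothetical `TOY_QUIET` flag, and plain fprintf() in place of Git's error().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_QUIET 1   /* hypothetical stand-in for a "quiet" verify flag */

struct toy_prereq {
	const char *hex;    /* object ID in hex, as it would be printed */
	const char *name;   /* human-readable name recorded with it */
	int present;        /* 1 if the object was found, 0 otherwise */
};

/*
 * Count missing prerequisites. Unless TOY_QUIET is set, print a header
 * before the first missing entry and one line per missing entry.
 */
static int toy_report_missing(const struct toy_prereq *p, size_t nr,
			      int flags, FILE *err)
{
	int missing = 0;
	size_t i;

	assert(err);
	for (i = 0; i < nr; i++) {
		if (p[i].present)
			continue;
		missing++;
		if (flags & TOY_QUIET)
			continue;
		if (missing == 1)
			fprintf(err, "error: Repository lacks these prerequisite commits:\n");
		fprintf(err, "error: %s %s\n", p[i].hex, p[i].name);
	}
	return missing;
}
```

Counting before the quiet check means the return value is the same in both modes; only the output differs.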
|
|
|
|
|
|
|
|
int list_bundle_refs(struct bundle_header *header, int argc, const char **argv)
|
|
|
|
{
|
|
|
|
return list_refs(&header->references, argc, argv);
|
|
|
|
}
|
|
|
|
|
2009-01-02 19:08:46 +01:00
|
|
|
static int is_tag_in_date_range(struct object *tag, struct rev_info *revs)
|
|
|
|
{
|
|
|
|
unsigned long size;
|
|
|
|
enum object_type type;
|
2014-10-04 00:40:24 +02:00
|
|
|
char *buf = NULL, *line, *lineend;
|
2017-04-26 21:29:31 +02:00
|
|
|
timestamp_t date;
|
2014-10-04 00:40:24 +02:00
|
|
|
int result = 1;
|
2009-01-02 19:08:46 +01:00
|
|
|
|
|
|
|
if (revs->max_age == -1 && revs->min_age == -1)
|
2014-10-04 00:40:24 +02:00
|
|
|
goto out;
|
2009-01-02 19:08:46 +01:00
|
|
|
|
sha1_file: convert read_sha1_file to struct object_id
Convert read_sha1_file to take a pointer to struct object_id and rename
it read_object_file. Do the same for read_sha1_file_extended.
Convert one use in grep.c to use the new function without any other code
change, since the pointer being passed is a void pointer that is already
initialized with a pointer to struct object_id. Update the declaration
and definitions of the modified functions, and apply the following
semantic patch to convert the remaining callers:
@@
expression E1, E2, E3;
@@
- read_sha1_file(E1.hash, E2, E3)
+ read_object_file(&E1, E2, E3)
@@
expression E1, E2, E3;
@@
- read_sha1_file(E1->hash, E2, E3)
+ read_object_file(E1, E2, E3)
@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1.hash, E2, E3, E4)
+ read_object_file_extended(&E1, E2, E3, E4)
@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1->hash, E2, E3, E4)
+ read_object_file_extended(E1, E2, E3, E4)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-12 03:27:53 +01:00
|
|
|
buf = read_object_file(&tag->oid, &type, &size);
|
2009-01-02 19:08:46 +01:00
|
|
|
if (!buf)
|
2014-10-04 00:40:24 +02:00
|
|
|
goto out;
|
2009-01-02 19:08:46 +01:00
|
|
|
line = memmem(buf, size, "\ntagger ", 8);
|
|
|
|
if (!line++)
|
2014-10-04 00:40:24 +02:00
|
|
|
goto out;
|
2014-08-02 10:39:06 +02:00
|
|
|
lineend = memchr(line, '\n', buf + size - line);
|
|
|
|
line = memchr(line, '>', lineend ? lineend - line : buf + size - line);
|
2009-01-02 19:08:46 +01:00
|
|
|
if (!line++)
|
2014-10-04 00:40:24 +02:00
|
|
|
goto out;
|
2017-04-21 12:45:44 +02:00
|
|
|
date = parse_timestamp(line, NULL, 10);
|
2014-10-04 00:40:24 +02:00
|
|
|
result = (revs->max_age == -1 || revs->max_age < date) &&
|
2009-01-02 19:08:46 +01:00
|
|
|
(revs->min_age == -1 || revs->min_age > date);
|
2014-10-04 00:40:24 +02:00
|
|
|
out:
|
|
|
|
free(buf);
|
|
|
|
return result;
|
2009-01-02 19:08:46 +01:00
|
|
|
}
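The parsing steps in is_tag_in_date_range() above (find the "tagger" header, stay within that line, skip past the closing '>' of the e-mail address, read the decimal timestamp) can be tried on a literal buffer. This is a sketch with assumptions: `toy_find()` is a portable stand-in for memmem(), `long long` replaces timestamp_t, and the buffer is assumed to be NUL-terminated so that strtoll() stops safely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Portable stand-in for memmem(): find a C string in a sized buffer. */
static const char *toy_find(const char *buf, size_t len, const char *needle)
{
	size_t n = strlen(needle);
	size_t i;

	for (i = 0; i + n <= len; i++)
		if (!memcmp(buf + i, needle, n))
			return buf + i;
	return NULL;
}

/* Store the tagger timestamp in *date; return 0 on success, -1 otherwise. */
static int toy_tagger_date(const char *buf, size_t size, long long *date)
{
	const char *line = toy_find(buf, size, "\ntagger ");
	const char *lineend, *gt;
	size_t rest;

	assert(date);
	if (!line)
		return -1;
	line++;                                   /* skip the leading newline */
	rest = (size_t)(buf + size - line);
	lineend = memchr(line, '\n', rest);
	/* Only look for '>' inside the tagger line itself. */
	gt = memchr(line, '>', lineend ? (size_t)(lineend - line) : rest);
	if (!gt)
		return -1;
	*date = strtoll(gt + 1, NULL, 10);
	return 0;
}

/* Same comparison as the function above; -1 means "no limit". */
static int toy_in_range(long long max_age, long long min_age, long long date)
{
	return (max_age == -1 || max_age < date) &&
	       (min_age == -1 || min_age > date);
}
```

Note that the original keeps `result` at 1 when no tagger line is found, so such tags are treated as in range; the sketch reports that case as -1 and leaves the policy to the caller.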
|
|
|
|
|
2015-08-10 11:47:37 +02:00
|
|
|
|
bundle: dup() output descriptor closer to point-of-use
When writing a bundle to a file, the bundle code actually creates
"your.bundle.lock" using our lockfile interface. We feed that output
descriptor to a child git-pack-objects via run-command, which has the
quirk that it closes the output descriptor in the parent.
To avoid confusing the lockfile code (which still thinks the descriptor
is valid), we dup() it, and operate on the duplicate.
However, this has a confusing side effect: after the dup() but before we
call pack-objects, we have _two_ descriptors open to the lockfile. If we
call die() during that time, the lockfile code will try to clean up the
partially-written file. It knows to close() the file before unlinking,
since on some platforms (e.g., Windows) the open file would block the
deletion. But it doesn't know about the duplicate descriptor. On
Windows, triggering an error at the right part of the code will result
in the cleanup failing and the lockfile being left in the filesystem.
We can solve this by moving the dup() much closer to start_command(),
shrinking the window in which we have the second descriptor open. It's
easy to place this in such a way that no die() is possible. We could
still die due to a signal in the exact wrong moment, but we already
tolerate races there (e.g., a signal could come before we manage to put
the file on the cleanup list in the first place).
As a bonus, this shields create_bundle() itself from the duplicate-fd
trick, and we can simplify its error handling (note that the lock
rollback now happens unconditionally, but that's OK; it's a noop if we
didn't open the lock in the first place).
The included test uses an empty bundle to cause a failure at the right
spot in the code, because that's easy to trigger (the other likely
errors are write() problems like ENOSPC). Note that it would already
pass on non-Windows systems (because they are happy to unlink an
already-open file).
Based-on-a-patch-by: Gaël Lhez <gael.lhez@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Tested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-16 10:43:59 +01:00
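The duplicate-descriptor technique described in the message above can be sketched with plain POSIX calls. Everything named `toy_*` is hypothetical: `toy_consume_and_close()` stands in for an API that closes the descriptor it is handed, which is the behavior the message attributes to run-command.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical consumer that closes the descriptor it is given. */
static void toy_consume_and_close(int fd)
{
	if (write(fd, "x", 1) != 1) {
		/* A real caller would report the error; the sketch ignores it. */
	}
	close(fd);
}

/*
 * Hand a duplicate to the consumer so the caller's descriptor stays valid.
 * The dup() happens immediately before the hand-off, keeping the window
 * with two open descriptors as short as possible.
 * Returns 0 on success, -1 if dup() fails.
 */
static int toy_run_with_dup(int fd)
{
	int copy;

	assert(fd >= 0);
	copy = dup(fd);
	if (copy < 0)
		return -1;
	toy_consume_and_close(copy);
	return 0;
}

/* Return nonzero if fd still refers to an open descriptor. */
static int toy_fd_is_open(int fd)
{
	return fcntl(fd, F_GETFD) != -1;
}
```

After `toy_run_with_dup()` returns, the original descriptor is still open, so code that tracks it (a lockfile layer, in the message's scenario) can close it without surprises.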
|
|
|
/* Write the pack data to bundle_fd */
|
2020-07-28 22:24:53 +02:00
|
|
|
static int write_pack_data(int bundle_fd, struct rev_info *revs, struct strvec *pack_options)
|
2014-10-30 18:45:41 +01:00
|
|
|
{
|
|
|
|
struct child_process pack_objects = CHILD_PROCESS_INIT;
|
|
|
|
int i;
|
|
|
|
|
2020-07-28 22:24:53 +02:00
|
|
|
strvec_pushl(&pack_objects.args,
|
strvec: fix indentation in renamed calls
Code which split an argv_array call across multiple lines, like:
argv_array_pushl(&args, "one argument",
"another argument", "and more",
NULL);
was recently mechanically renamed to use strvec, which results in
mis-matched indentation like:
strvec_pushl(&args, "one argument",
"another argument", "and more",
NULL);
Let's fix these up to align the arguments with the opening paren. I did
this manually by sifting through the results of:
git jump grep 'strvec_.*,$'
and liberally applying my editor's auto-format. Most of the changes are
of the form shown above, though I also normalized a few that had
originally used a single-tab indentation (rather than our usual style of
aligning with the open paren). I also rewrapped a couple of obvious
cases (e.g., where previously too-long lines became short enough to fit
on one), but I wasn't aggressive about it. In cases broken to three or
more lines, the grouping of arguments is sometimes meaningful, and it
wasn't worth my time or reviewer time to ponder each case individually.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-28 22:26:31 +02:00
|
|
|
"pack-objects",
|
|
|
|
"--stdout", "--thin", "--delta-base-offset",
|
|
|
|
NULL);
|
2020-07-29 02:37:20 +02:00
|
|
|
strvec_pushv(&pack_objects.args, pack_options->v);
|
2022-03-09 17:01:41 +01:00
|
|
|
if (revs->filter.choice)
|
|
|
|
strvec_pushf(&pack_objects.args, "--filter=%s",
|
|
|
|
list_objects_filter_spec(&revs->filter));
|
2014-10-30 18:45:41 +01:00
|
|
|
pack_objects.in = -1;
|
|
|
|
pack_objects.out = bundle_fd;
|
|
|
|
pack_objects.git_cmd = 1;
|
2018-11-16 10:43:59 +01:00
|
|
|
|
|
|
|
/*
|
|
|
|
* start_command() will close our descriptor if it's >1. Duplicate it
|
|
|
|
* to avoid surprising the caller.
|
|
|
|
*/
|
|
|
|
if (pack_objects.out > 1) {
|
|
|
|
pack_objects.out = dup(pack_objects.out);
|
|
|
|
if (pack_objects.out < 0) {
|
|
|
|
error_errno(_("unable to dup bundle descriptor"));
|
|
|
|
child_process_clear(&pack_objects);
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2014-10-30 18:45:41 +01:00
|
|
|
if (start_command(&pack_objects))
|
|
|
|
return error(_("Could not spawn pack-objects"));
|
|
|
|
|
|
|
|
for (i = 0; i < revs->pending.nr; i++) {
|
|
|
|
struct object *object = revs->pending.objects[i].item;
|
|
|
|
if (object->flags & UNINTERESTING)
|
|
|
|
write_or_die(pack_objects.in, "^", 1);
|
2019-08-18 22:04:11 +02:00
|
|
|
write_or_die(pack_objects.in, oid_to_hex(&object->oid), the_hash_algo->hexsz);
|
2014-10-30 18:45:41 +01:00
|
|
|
write_or_die(pack_objects.in, "\n", 1);
|
|
|
|
}
|
|
|
|
close(pack_objects.in);
|
|
|
|
if (finish_command(&pack_objects))
|
|
|
|
return error(_("pack-objects died"));
|
|
|
|
return 0;
|
|
|
|
}
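The loop in write_pack_data() above sends one line per pending object to the child's standard input, prefixing excluded objects with "^". The sketch below formats the same kind of input into a buffer instead of a pipe. `toy_tip` and `toy_format_tips()` are hypothetical, and the short hex strings are placeholders rather than real object IDs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct toy_tip {
	const char *hex;      /* object ID in hex (placeholder values here) */
	int uninteresting;    /* nonzero if the object marks an excluded boundary */
};

/*
 * Write one line per tip into out: "^<hex>\n" for excluded tips and
 * "<hex>\n" otherwise. Returns the number of bytes written (excluding
 * the trailing NUL), or -1 if the buffer is too small.
 */
static int toy_format_tips(const struct toy_tip *tips, size_t nr,
			   char *out, size_t outsz)
{
	size_t used = 0;
	size_t i;

	assert(out && outsz > 0);
	out[0] = '\0';
	for (i = 0; i < nr; i++) {
		int n = snprintf(out + used, outsz - used, "%s%s\n",
				 tips[i].uninteresting ? "^" : "",
				 tips[i].hex);
		if (n < 0 || (size_t)n >= outsz - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

Building the text in a buffer keeps the sketch testable without spawning a process; the original writes each piece directly to the child's input descriptor.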
|
|
|
|
|
2014-10-30 22:35:24 +01:00
/*
 * Write out bundle refs based on the tips already
 * parsed into revs.pending. As a side effect, may
 * manipulate revs.pending to include additional
 * necessary objects (like tags).
 *
 * Returns the number of refs written, or negative
 * on error.
 */
static int write_bundle_refs(int bundle_fd, struct rev_info *revs)
2014-10-30 19:01:37 +01:00
{
2014-10-30 22:35:24 +01:00
	int i;
	int ref_count = 0;
2009-01-18 07:27:08 +01:00

2014-10-30 22:35:24 +01:00
	for (i = 0; i < revs->pending.nr; i++) {
		struct object_array_entry *e = revs->pending.objects + i;
2015-11-10 03:22:28 +01:00
		struct object_id oid;
2007-09-11 05:03:15 +02:00
		char *ref;
2007-11-23 01:51:18 +01:00
		const char *display_ref;
		int flag;
2007-09-11 05:03:15 +02:00

		if (e->item->flags & UNINTERESTING)
			continue;
2020-09-02 00:28:09 +02:00
		if (dwim_ref(e->name, strlen(e->name), &oid, &ref, 0) != 1)
2015-03-11 00:51:48 +01:00
			goto skip_write_ref;
2017-10-16 00:06:56 +02:00
		if (read_ref_full(e->name, RESOLVE_REF_READING, &oid, &flag))
2007-11-23 01:51:18 +01:00
			flag = 0;
		display_ref = (flag & REF_ISSYMREF) ? e->name : ref;

2009-01-02 19:08:46 +01:00
		if (e->item->type == OBJ_TAG &&
2014-10-30 22:35:24 +01:00
		    !is_tag_in_date_range(e->item, revs)) {
2009-01-02 19:08:46 +01:00
			e->item->flags |= UNINTERESTING;
2015-03-11 00:51:48 +01:00
			goto skip_write_ref;
2009-01-02 19:08:46 +01:00
		}

2007-09-11 05:03:15 +02:00
		/*
		 * Make sure the refs we wrote out is correct; --max-count and
		 * other limiting options could have prevented all the tips
		 * from getting output.
		 *
		 * Non commit objects such as tags and blobs do not have
		 * this issue as they are not affected by those extra
		 * constraints.
		 */
		if (!(e->item->flags & SHOWN) && e->item->type == OBJ_COMMIT) {
2012-04-23 14:30:30 +02:00
			warning(_("ref '%s' is excluded by the rev-list options"),
2007-09-11 05:03:15 +02:00
				e->name);
2015-03-11 00:51:48 +01:00
			goto skip_write_ref;
2007-09-11 05:03:15 +02:00
		}
		/*
		 * If you run "git bundle create bndl v1.0..v2.0", the
		 * name of the positive ref is "v2.0" but that is the
		 * commit that is referenced by the tag, and not the tag
		 * itself.
		 */
2018-08-28 23:22:48 +02:00
		if (!oideq(&oid, &e->item->oid)) {
2007-09-11 05:03:15 +02:00
			/*
			 * Is this the positive end of a range expressed
			 * in terms of a tag (e.g. v2.0 from the range
			 * "v1.0..v2.0")?
			 */
2018-11-10 06:49:01 +01:00
			struct commit *one = lookup_commit_reference(revs->repo, &oid);
2007-09-11 05:03:15 +02:00
			struct object *obj;

			if (e->item == &(one->object)) {
				/*
				 * Need to include e->name as an
				 * independent ref to the pack-objects
				 * input, so that the tag is included
				 * in the output; otherwise we would
				 * end up triggering "empty bundle"
				 * error.
				 */
object: convert parse_object* to take struct object_id
Make parse_object, parse_object_or_die, and parse_object_buffer take a
pointer to struct object_id. Remove the temporary variables inserted
earlier, since they are no longer necessary. Transform all of the
callers using the following semantic patch:
@@
expression E1;
@@
- parse_object(E1.hash)
+ parse_object(&E1)
@@
expression E1;
@@
- parse_object(E1->hash)
+ parse_object(E1)
@@
expression E1, E2;
@@
- parse_object_or_die(E1.hash, E2)
+ parse_object_or_die(&E1, E2)
@@
expression E1, E2;
@@
- parse_object_or_die(E1->hash, E2)
+ parse_object_or_die(E1, E2)
@@
expression E1, E2, E3, E4, E5;
@@
- parse_object_buffer(E1.hash, E2, E3, E4, E5)
+ parse_object_buffer(&E1, E2, E3, E4, E5)
@@
expression E1, E2, E3, E4, E5;
@@
- parse_object_buffer(E1->hash, E2, E3, E4, E5)
+ parse_object_buffer(E1, E2, E3, E4, E5)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-07 00:10:38 +02:00
				obj = parse_object_or_die(&oid, e->name);
2007-09-11 05:03:15 +02:00
				obj->flags |= SHOWN;
2014-10-30 22:35:24 +01:00
				add_pending_object(revs, obj, e->name);
2007-09-11 05:03:15 +02:00
			}
2015-03-11 00:51:48 +01:00
			goto skip_write_ref;
2007-09-11 05:03:15 +02:00
		}

		ref_count++;
2019-08-18 22:04:11 +02:00
		write_or_die(bundle_fd, oid_to_hex(&e->item->oid), the_hash_algo->hexsz);
2007-09-11 05:03:15 +02:00
		write_or_die(bundle_fd, " ", 1);
2007-11-23 01:51:18 +01:00
		write_or_die(bundle_fd, display_ref, strlen(display_ref));
2007-09-11 05:03:15 +02:00
		write_or_die(bundle_fd, "\n", 1);
2015-03-11 00:51:48 +01:00
 skip_write_ref:
2007-09-11 05:03:15 +02:00
		free(ref);
	}

	/* end header */
	write_or_die(bundle_fd, "\n", 1);
2014-10-30 22:35:24 +01:00
	return ref_count;
}

bundle: arguments can be read from stdin
In order to create an incremental bundle, we need to pass many arguments
to let git-bundle ignore some already packed commits. It will be more
convenient to pass args via stdin. But the current implementation does
not allow us to do this.
This is because args are parsed twice when creating bundle. The first
time for parsing args is in `compute_and_write_prerequisites()` by
running `git-rev-list` command to write prerequisites in bundle file,
and stdin is consumed in this step if "--stdin" option is provided for
`git-bundle`. Later nothing can be read from stdin when running
`setup_revisions()` in `create_bundle()`.
The solution is to parse args once by removing the entire function
`compute_and_write_prerequisites()` and then calling function
`setup_revisions()`. In order to write prerequisites for bundle, will
call `prepare_revision_walk()` and `traverse_commit_list()`. But after
calling `prepare_revision_walk()`, the object array `revs.pending` is
left empty, and the following steps could not work properly with the
empty object array (`revs.pending`). Therefore, make a copy of `revs`
to `revs_copy` for later use right after calling `setup_revisions()`.
The copy of `revs_copy` is not a deep copy, it shares the same objects
with `revs`. The object array of `revs` has been cleared, but objects
themselves are still kept. Flags of objects may change after calling
`prepare_revision_walk()`, we can use these changed flags without
calling the `git rev-list` command and parsing its output like the
former implementation.
Also add testcases for git bundle in t6020, which read args from stdin.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 03:27:03 +01:00
struct bundle_prerequisites_info {
	struct object_array *pending;
	int fd;
};

static void write_bundle_prerequisites(struct commit *commit, void *data)
{
	struct bundle_prerequisites_info *bpi = data;
	struct object *object;
	struct pretty_print_context ctx = { 0 };
	struct strbuf buf = STRBUF_INIT;

	if (!(commit->object.flags & BOUNDARY))
		return;
	strbuf_addf(&buf, "-%s ", oid_to_hex(&commit->object.oid));
	write_or_die(bpi->fd, buf.buf, buf.len);

	ctx.fmt = CMIT_FMT_ONELINE;
	ctx.output_encoding = get_log_output_encoding();
	strbuf_reset(&buf);
	pretty_print_commit(&ctx, commit, &buf);
	strbuf_trim(&buf);

	object = (struct object *)commit;
	object->flags |= UNINTERESTING;
	add_object_array_with_path(object, buf.buf, bpi->pending, S_IFINVALID,
				   NULL);
	strbuf_addch(&buf, '\n');
	write_or_die(bpi->fd, buf.buf, buf.len);
	strbuf_release(&buf);
}

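Taken together, write_bundle_prerequisites() and write_bundle_refs() above emit two kinds of header lines: "-<hex-oid> <subject>" for each boundary commit and "<hex-oid> <refname>" for each ref, with a single empty line closing the header. The sketch below only illustrates those two line shapes with plain snprintf(). The function names are hypothetical, and the real code writes through strbuf and write_or_die() rather than into a fixed buffer.

```c
#include <stdio.h>

/*
 * Shared snprintf() result check: returns the length written,
 * or -1 on error or truncation.
 */
static int checked_len(int n, size_t outsz)
{
	return (n < 0 || (size_t)n >= outsz) ? -1 : n;
}

/* Prerequisite line: "-<hex-oid> <one-line subject>\n". */
static int format_prerequisite_line(char *out, size_t outsz,
				    const char *hex_oid, const char *subject)
{
	return checked_len(snprintf(out, outsz, "-%s %s\n", hex_oid, subject),
			   outsz);
}

/* Ref line: "<hex-oid> <refname>\n". */
static int format_ref_line(char *out, size_t outsz,
			   const char *hex_oid, const char *refname)
{
	return checked_len(snprintf(out, outsz, "%s %s\n", hex_oid, refname),
			   outsz);
}
```

The object IDs passed in would be full hexadecimal names in practice; the sketch treats them as opaque strings.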
2019-01-24 14:11:51 +01:00
int create_bundle(struct repository *r, const char *path,
2020-08-12 03:04:11 +02:00
		  int argc, const char **argv, struct strvec *pack_options, int version)
2014-10-30 22:35:24 +01:00
{
2018-05-09 22:55:38 +02:00
	struct lock_file lock = LOCK_INIT;
2014-10-30 22:35:24 +01:00
	int bundle_fd = -1;
	int bundle_to_stdout;
	int ref_count = 0;
2021-01-12 03:27:03 +01:00
	struct rev_info revs, revs_copy;
2022-03-09 17:01:41 +01:00
	int min_version = 2;
2021-01-12 03:27:03 +01:00
	struct bundle_prerequisites_info bpi;
	int i;
2014-10-30 22:35:24 +01:00

2022-03-09 17:01:41 +01:00
	/* init revs to list objects for pack-objects later */
	save_commit_buffer = 0;
	repo_init_revisions(r, &revs, NULL);

	/*
	 * Pre-initialize the '--objects' flag so we can parse a
	 * --filter option successfully.
	 */
	revs.tree_objects = revs.blob_objects = 1;

	argc = setup_revisions(argc, argv, &revs, NULL);

	/*
	 * Reasons to require version 3:
	 *
	 * 1. @object-format is required because our hash algorithm is not
	 *    SHA1.
	 * 2. @filter is required because we parsed an object filter.
	 */
	if (the_hash_algo != &hash_algos[GIT_HASH_SHA1] || revs.filter.choice)
		min_version = 3;

	if (argc > 1) {
		error(_("unrecognized argument: %s"), argv[1]);
		goto err;
	}

2014-10-30 22:35:24 +01:00
	bundle_to_stdout = !strcmp(path, "-");
	if (bundle_to_stdout)
		bundle_fd = 1;
bundle: dup() output descriptor closer to point-of-use
When writing a bundle to a file, the bundle code actually creates
"your.bundle.lock" using our lockfile interface. We feed that output
descriptor to a child git-pack-objects via run-command, which has the
quirk that it closes the output descriptor in the parent.
To avoid confusing the lockfile code (which still thinks the descriptor
is valid), we dup() it, and operate on the duplicate.
However, this has a confusing side effect: after the dup() but before we
call pack-objects, we have _two_ descriptors open to the lockfile. If we
call die() during that time, the lockfile code will try to clean up the
partially-written file. It knows to close() the file before unlinking,
since on some platforms (i.e., Windows) the open file would block the
deletion. But it doesn't know about the duplicate descriptor. On
Windows, triggering an error at the right part of the code will result
in the cleanup failing and the lockfile being left in the filesystem.
We can solve this by moving the dup() much closer to start_command(),
shrinking the window in which we have the second descriptor open. It's
easy to place this in such a way that no die() is possible. We could
still die due to a signal in the exact wrong moment, but we already
tolerate races there (e.g., a signal could come before we manage to put
the file on the cleanup list in the first place).
As a bonus, this shields create_bundle() itself from the duplicate-fd
trick, and we can simplify its error handling (note that the lock
rollback now happens unconditionally, but that's OK; it's a noop if we
didn't open the lock in the first place).
The included test uses an empty bundle to cause a failure at the right
spot in the code, because that's easy to trigger (the other likely
errors are write() problems like ENOSPC). Note that it would already
pass on non-Windows systems (because they are happy to unlink an
already-open file).
Based-on-a-patch-by: Gaël Lhez <gael.lhez@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Tested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-16 10:43:59 +01:00
	else
2014-10-30 22:35:24 +01:00
		bundle_fd = hold_lock_file_for_update(&lock, path,
						      LOCK_DIE_ON_ERROR);

2020-07-30 01:14:20 +02:00
	if (version == -1)
		version = min_version;

	if (version < 2 || version > 3) {
		die(_("unsupported bundle version %d"), version);
	} else if (version < min_version) {
		die(_("cannot write bundle version %d with algorithm %s"), version, the_hash_algo->name);
	} else if (version == 2) {
		write_or_die(bundle_fd, v2_bundle_signature, strlen(v2_bundle_signature));
	} else {
		const char *capability = "@object-format=";
		write_or_die(bundle_fd, v3_bundle_signature, strlen(v3_bundle_signature));
		write_or_die(bundle_fd, capability, strlen(capability));
		write_or_die(bundle_fd, the_hash_algo->name, strlen(the_hash_algo->name));
		write_or_die(bundle_fd, "\n", 1);
2014-10-30 22:35:24 +01:00

2022-03-09 17:01:41 +01:00
		if (revs.filter.choice) {
			const char *value = expand_list_objects_filter_spec(&revs.filter);
			capability = "@filter=";
			write_or_die(bundle_fd, capability, strlen(capability));
			write_or_die(bundle_fd, value, strlen(value));
			write_or_die(bundle_fd, "\n", 1);
		}
2016-04-01 02:35:45 +02:00
	}
2014-10-30 22:35:24 +01:00

2021-01-12 03:27:03 +01:00
	/* save revs.pending in revs_copy for later use */
	memcpy(&revs_copy, &revs, sizeof(revs));
	revs_copy.pending.nr = 0;
	revs_copy.pending.alloc = 0;
	revs_copy.pending.objects = NULL;
	for (i = 0; i < revs.pending.nr; i++) {
		struct object_array_entry *e = revs.pending.objects + i;
		if (e)
			add_object_array_with_path(e->item, e->name,
						   &revs_copy.pending,
						   e->mode, e->path);
	}
2014-10-30 22:35:24 +01:00

2021-01-12 03:27:03 +01:00
	/* write prerequisites */
	revs.boundary = 1;
	if (prepare_revision_walk(&revs))
		die("revision walk setup failed");
	bpi.fd = bundle_fd;
	bpi.pending = &revs_copy.pending;
2022-03-09 17:01:38 +01:00

2022-03-09 17:01:41 +01:00
	/*
	 * Remove any object walking here. We only care about commits and
	 * tags here. The revs_copy has the right instances of these values.
	 */
2022-03-09 17:01:38 +01:00
	revs.blob_objects = revs.tree_objects = 0;
2021-01-12 03:27:03 +01:00
	traverse_commit_list(&revs, write_bundle_prerequisites, NULL, &bpi);
	object_array_remove_duplicates(&revs_copy.pending);

	/* write bundle refs */
	ref_count = write_bundle_refs(bundle_fd, &revs_copy);
2014-10-30 22:35:24 +01:00
	if (!ref_count)
		die(_("Refusing to create empty bundle."));
	else if (ref_count < 0)
2016-04-01 02:35:45 +02:00
		goto err;
2007-09-11 05:03:15 +02:00

	/* write pack */
2021-01-12 03:27:03 +01:00
	if (write_pack_data(bundle_fd, &revs_copy, pack_options))
2016-04-01 02:35:45 +02:00
		goto err;
2008-01-16 20:12:46 +01:00

2010-08-27 22:31:47 +02:00
	if (!bundle_to_stdout) {
		if (commit_lock_file(&lock))
2012-04-23 14:30:30 +02:00
			die_errno(_("cannot create '%s'"), path);
2010-08-27 22:31:47 +02:00
	}
2008-02-21 23:42:56 +01:00
	return 0;
2016-04-01 02:35:45 +02:00
err:
2018-11-16 10:43:59 +01:00
	rollback_lock_file(&lock);
2016-04-01 02:35:45 +02:00
	return -1;
2007-09-11 05:03:15 +02:00
}

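The version checks in create_bundle() above reduce to a small decision table: a request of -1 means "use the minimum that works", anything outside 2..3 is rejected, and version 2 is refused when a non-SHA-1 hash or an object filter requires version 3. The sketch below restates that table as a pure function so it can be read on its own. pick_bundle_version() and its return codes are hypothetical, and where the real code calls die() the sketch returns a negative value instead.

```c
/* Outcomes of the hypothetical pick_bundle_version() helper below. */
enum { VERSION_UNSUPPORTED = -1, VERSION_TOO_OLD = -2 };

/*
 * "requested" is the caller's version, or -1 to mean "pick for me".
 * "non_sha1" and "has_filter" are the two conditions that require
 * version 3. Returns 2 or 3 on success, or a negative code where
 * create_bundle() would call die().
 */
static int pick_bundle_version(int requested, int non_sha1, int has_filter)
{
	int min_version = (non_sha1 || has_filter) ? 3 : 2;
	int version = requested == -1 ? min_version : requested;

	if (version < 2 || version > 3)
		return VERSION_UNSUPPORTED;
	if (version < min_version)
		return VERSION_TOO_OLD;
	return version;
}
```

For example, pick_bundle_version(2, 0, 1) yields VERSION_TOO_OLD, which corresponds to asking for a version 2 bundle while passing an object filter.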
2018-11-10 06:49:01 +01:00
int unbundle(struct repository *r, struct bundle_header *header,
2022-10-12 14:52:37 +02:00
	     int bundle_fd, struct strvec *extra_index_pack_args,
	     enum verify_bundle_flags flags)
2007-09-11 05:03:15 +02:00
{
2014-08-19 21:09:35 +02:00
	struct child_process ip = CHILD_PROCESS_INIT;
2023-02-07 00:07:37 +01:00

	if (verify_bundle(r, header, flags))
		return -1;

2021-09-05 09:34:43 +02:00
	strvec_pushl(&ip.args, "index-pack", "--fix-thin", "--stdin", NULL);
2007-09-11 05:03:15 +02:00

2022-03-09 17:01:42 +01:00
	/* If there is a filter, then we need to create the promisor pack. */
	if (header->filter.choice)
		strvec_push(&ip.args, "--promisor=from-bundle");

2021-09-05 09:34:43 +02:00
	if (extra_index_pack_args) {
		strvec_pushv(&ip.args, extra_index_pack_args->v);
		strvec_clear(extra_index_pack_args);
	}
2011-09-19 01:52:32 +02:00

2007-09-11 05:03:15 +02:00
	ip.in = bundle_fd;
	ip.no_stdout = 1;
	ip.git_cmd = 1;
	if (run_command(&ip))
2012-04-23 14:30:30 +02:00
		return error(_("index-pack died"));
2007-09-11 05:03:15 +02:00
	return 0;
}
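The order in which unbundle() assembles the index-pack command line matters: fixed arguments first, then the promisor flag for filtered bundles, then whatever the caller supplied. A rough standalone sketch of that ordering follows. build_index_pack_args() and MAX_ARGS are hypothetical, and a fixed-size array stands in for the strvec that the real code uses.

```c
#include <stddef.h>

#define MAX_ARGS 16

/*
 * Fill "out" (which must have room for MAX_ARGS entries) with the
 * argument list in the order unbundle() pushes it: fixed arguments,
 * the promisor flag when the bundle was filtered, then the
 * NULL-terminated "extra" list (which may itself be NULL).
 * Returns the argument count, or -1 if the list does not fit.
 * On success out[count] is set to NULL.
 */
static int build_index_pack_args(const char **out, int has_filter,
				 const char **extra)
{
	size_t n = 0;

	out[n++] = "index-pack";
	out[n++] = "--fix-thin";
	out[n++] = "--stdin";
	if (has_filter)
		out[n++] = "--promisor=from-bundle";
	for (; extra && *extra; extra++) {
		if (n + 1 >= MAX_ARGS)
			return -1; /* keep one slot for the terminator */
		out[n++] = *extra;
	}
	out[n] = NULL;
	return (int)n;
}
```

Because the extras come last, a caller-supplied option is appended after the defaults instead of replacing them.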