The previous patch demonstrates a scenario where the list of packs
written by `pack-objects` (and stored in the `names` string_list) is
out-of-order, and can thus cause us to delete packs we shouldn't.
This patch resolves that bug by ensuring that `names` is sorted in all
cases, not just when
delete_redundant && pack_everything & ALL_INTO_ONE
is true.
Because we did sort `names` in that case (which, prior to `--geometric`
repacks, was the only time we would actually delete packs, this is only
a bug for `--geometric` repacks.
It would be sufficient to only sort `names` when `delete_redundant` is
set to a non-zero value. But sorting a small list of strings is cheap,
and it is defensive against future calls to `string_list_has_string()`
on this list.
Co-discovered-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When doing a `--geometric=<d>` repack, `git repack` determines a
splitting point among packs ordered by their object count such that:
- each pack above the split has at least `<d>` times as many objects
as the next-largest pack by object count, and
- the first pack above the split has at least `<d>` times as many
object as the sum of all packs below the split line combined
`git repack` then creates a pack containing all of the objects contained
in packs below the split line by running `git pack-objects
--stdin-packs` underneath. Once packs are moved into place, then any
packs below the split line are removed, since their objects were just
combined into a new pack.
But `git repack` tries to be careful to avoid removing a pack that it
just wrote, by checking:
struct packed_git *p = geometry->pack[i];
if (string_list_has_string(&names, hash_to_hex(p->hash)))
continue;
in the `delete_redundant` and `geometric` conditional towards the end of
`cmd_repack`.
But it's possible to trick `git repack` into not recognizing a pack that
it just wrote when `names` is out-of-order (which violates
`string_list_has_string()`'s assumption that the list is sorted and thus
binary search-able).
When this happens in just the right circumstances, it is possible to
remove a pack that we just wrote, leading to object corruption.
Luckily, this is quite difficult to provoke in practice (for a couple of
reasons):
- we ordinarily write just one pack, so `names` usually contains just
one entry, and is thus sorted
- when we do write more than one pack (e.g., due to `--max-pack-size`)
we have to: (a) write a pack identical to one that already
exists, (b) have that pack be below the split line, and (c) have
the set of packs written by `pack-objects` occur in an order which
tricks `string_list_has_string()`.
Demonstrate the above scenario in a failing test, which causes `git
repack --geometric` to write a pack which occurs below the split line,
_and_ fail to recognize that it wrote that pack.
The following patch will fix this bug.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update 'repack' to ignore packs named on the command line with the
'--keep-pack' option. Specifically, modify 'init_pack_geometry()' to treat
command line-kept packs the same way it treats packs with an on-disk '.keep'
file (that is, skip the pack and do not include it in the 'geometry'
structure).
Without this handling, a '--keep-pack' pack would be included in the
'geometry' structure. If the pack is *before* the geometry split line (with
at least one other pack and/or loose objects present), 'repack' assumes the
pack's contents are "rolled up" into another pack via 'pack-objects'.
However, because the internally-invoked 'pack-objects' properly excludes
'--keep-pack' objects, any new pack it creates will not contain the kept
objects. Finally, 'repack' deletes the '--keep-pack' as "redundant" (since
it assumes 'pack-objects' created a new pack with its contents), resulting
in possible object loss and repository corruption.
Add a test ensuring that '--keep-pack' packs are now appropriately handled.
Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Correct choices of C compilers used in various CI jobs.
source: <patch-v3-1.1-8b3444ecc87-20220422T092015Z-avarab@gmail.com>
* ab/cc-package-fixes:
CI: select CC based on CC_PACKAGE (again)
Get rid of a bogus and over-eager coccinelle rule.
source: <xmqq1qxd6e4x.fsf@gitster.g>
* jc/cocci-xstrdup-or-null-fix:
cocci: drop bogus xstrdup_or_null() rule
"git format-patch <args> -- <pathspec>" lost the pathspec when
showing the second and subsequent commits, which has been
corrected.
source: <c36896a1-6247-123b-4fa3-b7eb24af1897@web.de>
* rs/format-patch-pathspec-fix:
2.36 format-patch regression fix
"git fast-export -- <pathspec>" lost the pathspec when showing the
second and subsequent commits, which has been corrected.
source: <2c988c7b-0efe-4222-4a43-8124fe1a9da6@web.de>
* rs/fast-export-pathspec-fix:
2.36 fast-export regression fix
"git show <commit1> <commit2>... -- <pathspec>" lost the pathspec
when showing the second and subsequent commits, which has been
corrected.
source: <xmqqo80j87g0.fsf_-_@gitster.g>
* jc/show-pathspec-fix:
2.36 show regression fix
Regression fix for 2.36 where "git name-rev" started to sometimes
reference strings after they are freed.
This fixes a regression in 2.36 and is slate to go to 2.36.1
source: <340c8810-d912-7b18-d46e-a9d43f20216a@web.de>
* rs/name-rev-fix-free-after-use:
Revert "name-rev: release unused name strings"
"diff-tree --stdin" has been broken for about a year, but 2.36
release broke it even worse by breaking running the command with
<pathspec>, which in turn broke "gitk" and got noticed. This has
been corrected by aligning its behaviour to that of "log".
This fixes a regression in 2.36 and is slate to go to 2.36.1
source: <xmqq7d7bsu2n.fsf@gitster.g>
* jc/diff-tree-stdin-fix:
2.36 gitk/diff-tree --stdin regression fix
"git submodule update" without pathspec should silently skip an
uninitialized submodule, but it started to become noisy by mistake.
This fixes a regression in 2.36 and is slate to go to 2.36.1
source: <pull.1258.v2.git.git.1650890741430.gitgitgadget@gmail.com>
* gc/submodule-update-part2:
submodule--helper: fix initialization of warn_if_uninitialized
13092a91 (cocci: refactor common patterns to use xstrdup_or_null(),
2016-10-12) introduced a rule to rewrite this conditional call to
xstrdup(E) and an assignment to variable V:
- if (E)
- V = xstrdup(E);
into an unconditional call to xstrdup_or_null(E) and an assignment
to variable V:
+ V = xstrdup_or_null(E);
which is utterly bogus. The original code may already have an
acceptable value in V and the conditional assignment may be to
improve the value already in V with a copy of a better value E when
(and only when) E is not NULL.
The rewritten construct unconditionally discards the existing value
of V and replaces it with a copy of E, even when E is NULL, which
changes the meaning of the program.
By the way, if it were
-if (E && !V)
- V = xstrdup(E);
+V = xstrdup_or_null(E);
it would probably have been correct. But there is no existing code
that would have been improved by such a rule, so let's just remove
the bogus one without replacing with the more specific one.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
e900d494dc (diff: add an API for deferred freeing, 2021-02-11) added a
way to allow reusing diffopts: the no_free bit. 244c27242f (diff.[ch]:
have diff_free() call clear_pathspec(opts.pathspec), 2022-02-16) made
that mechanism mandatory.
git fast-export doesn't set no_free, so path limiting stopped working
after the first commit. Set the flag and add a basic test to make sure
only changes to the specified files are exported.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
e900d494dc (diff: add an API for deferred freeing, 2021-02-11) added a
way to allow reusing diffopts: the no_free bit. 244c27242f (diff.[ch]:
have diff_free() call clear_pathspec(opts.pathspec), 2022-02-16) made
that mechanism mandatory.
git format-patch only sets no_free when --output is given, causing it to
forget pathspecs after the first commit. Set no_free unconditionally
instead.
The existing test was unable to detect this breakage because it checks
stderr for the absence of a certain string, but format-patch writes to
stdout. Also the test was not checking the case of one commit modifying
multiple files and a pathspec limiting the diff. Replace it with a more
thorough one.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This only surfaced as a regression after 2.36 release, but the
breakage was already there with us for at least a year.
e900d494 (diff: add an API for deferred freeing, 2021-02-11)
introduced a mechanism to delay freeing resources held in
diff_options struct that need to be kept as long as the struct will
be reused to compute diff. "git log -p" was taught to utilize the
mechanism but it was done with an incorrect assumption that the
underlying helper function, cmd_log_walk(), is called only once,
and it is OK to do the freeing at the end of it.
Alas, for "git show A B", the function is called once for each
commit given, so it is not OK to free the resources until we finish
calling it for all the commits given from the command line.
During 2.36 release cycle, we started clearing the <pathspec> as
part of this freeing, which made the bug a lot more visible.
Fix this breakage by tweaking how cmd_log_walk() frees the resources
at the end and using a variant of it that does not immediately free
the resources to show each commit object from the command line in
"git show".
Protect the fix with a few new tests.
Reported-by: Daniel Li <dan@danielyli.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The .warn_if_uninitialized member was introduced by 48308681
(git submodule update: have a dedicated helper for cloning,
2016-02-29) to submodule_update_clone struct and initialized to
false. When c9911c93 (submodule--helper: teach update_data more
options, 2022-03-15) moved it to update_data struct, it started
to initialize it to true but this change was not explained in
its log message.
The member is set to true only when pathspec was given, and is
used when a submodule that matched the pathspec is found
uninitialized to give diagnostic message. "submodule update"
without pathspec is supposed to iterate over all submodules
(i.e. without pathspec limitation) and update only the
initialized submodules, and finding uninitialized submodules
during the iteration is a totally expected and normal thing that
should not be warned.
[jc: added tests]
Signed-off-by: Orgad Shaneh <orgads@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This only surfaced as a regression after 2.36 release, but the
breakage was already there with us for at least a year.
The diff_free() call is to be used after we completely finished with
a diffopt structure. After "git diff A B" finishes producing
output, calling it before process exit is fine. But there are
commands that prepares diff_options struct once, compares two sets
of paths, releases resources that were used to do the comparison,
then reuses the same diff_option struct to go on to compare the next
two sets of paths, like "git log -p".
After "git log -p" finishes showing a single commit, calling it
before it goes on to the next commit is NOT fine. There is a
mechanism, the .no_free member in diff_options struct, to help "git
log" to avoid calling diff_free() after showing each commit and
instead call it just one. When the mechanism was introduced in
e900d494 (diff: add an API for deferred freeing, 2021-02-11),
however, we forgot to do the same to "diff-tree --stdin", which *is*
a moral equivalent to "git log".
During 2.36 release cycle, we started clearing the pathspec in
diff_free(), so programs like gitk that runs
git diff-tree --stdin -- <pathspec>
downstream of a pipe, processing one commit after another, started
showing irrelevant comparison outside the given <pathspec> from the
second commit. The same commit, by forgetting to teach the .no_free
mechanism, broke "diff-tree --stdin -I<regexp>" and nobody noticed
it for over a year, presumably because it is so seldom used an
option.
But <pathspec> is a different story. The breakage was very
prominently visible and was reported immediately after 2.36 was
released.
Fix this breakage by mimicking how "git log" utilizes the .no_free
member so that "diff-tree --stdin" behaves more similarly to "log".
Protect the fix with a few new tests.
Reported-by: Matthias Aßhauer <mha1993@live.de>
Helped-by: René Scharfe <l.s.r@web.de>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 2d53975488.
3656f84278 (name-rev: prefer shorter names over following merges,
2021-12-04) broke the assumption of 2d53975488 (name-rev: release unused
name strings, 2020-02-04) that a better name for a child is a better
name for all of its ancestors as well, because it added a penalty for
generation > 0. This leads to strings being free(3)'d that are still
needed.
079f970971 (name-rev: sort tip names before applying, 2020-02-05)
already reduced the number of free(3) calls for the use case that
motivated the original patch (name-rev --all in the Chromium repository)
from ca. 44000 to 5, and 3656f84278 eliminated even those few. So this
revert won't affect name-rev's performance on that particular repo.
Reported-by: Thomas Hurst <tom@hur.st>
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression in 707d2f2fe8 (CI: use "$runs_on_pool", not
"$jobname" to select packages & config, 2021-11-23).
In that commit I changed CC=gcc from CC=gcc-9, but on OSX the "gcc" in
$PATH points to clang, we need to use gcc-9 instead. Likewise for the
linux-gcc job CC=gcc-8 was changed to the implicit CC=gcc, which would
select GCC 9.4.0 instead of GCC 8.4.0.
Furthermore in 25715419bf (CI: don't run "make test" twice in one
job, 2021-11-23) when the "linux-TEST-vars" job was split off from
"linux-gcc" the "cc_package: gcc-8" line was copied along with
it, so its "cc_package" line wasn't working as intended either.
As a table, this is what's changed by this commit, i.e. it only
affects the linux-gcc, linux-TEST-vars and osx-gcc jobs:
|-------------------+-----------+-------------------+-------+-------|
| jobname | vector.cc | vector.cc_package | old | new |
|-------------------+-----------+-------------------+-------+-------|
| linux-clang | clang | - | clang | clang |
| linux-sha256 | clang | - | clang | clang |
| linux-gcc | gcc | gcc-8 | gcc | gcc-8 |
| osx-clang | clang | - | clang | clang |
| osx-gcc | gcc | gcc-9 | clang | gcc-9 |
| linux-gcc-default | gcc | - | gcc | gcc |
| linux-TEST-vars | gcc | gcc-8 | gcc | gcc-8 |
|-------------------+-----------+-------------------+-------+-------|
Reported-by: Carlo Arenas <carenas@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A couple of work around for CI breaking warnings from gcc 12.
* cb/buggy-gcc-12-workaround:
config.mak.dev: alternative workaround to gcc 12 warning in http.c
config.mak.dev: workaround gcc 12 bug affecting "pedantic" CI job
This provides a "no code change needed" option to the "fix" currently
queued as part of ab/http-gcc-12-workaround and therefore should be
reverted once that gets merged.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally noticed by Peff[1], but yet to be corrected[2] and planned to
be released with Fedora 36 (scheduled for Apr 19).
dir.c: In function ‘git_url_basename’:
dir.c:3085:13: error: ‘memchr’ specified bound [9223372036854775808, 0] exceeds maximum object size 9223372036854775807 [-Werror=stringop-overread]
3085 | if (memchr(start, '/', end - start) == NULL
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fedora is used as part of the CI, and therefore that release will trigger
failures, unless the version of the image used is locked to an older
release, as an alternative.
Restricting the flag to the affected source file, as well as implementing
an independent facility to track these workarounds was specifically punted
to minimize the risk of introducing problems so close to a release.
This change should be reverted once the underlying gcc bug is solved and
which should be visible by NOT triggering a warning, otherwise.
[1] https://lore.kernel.org/git/YZQhLh2BU5Hquhpo@coredump.intra.peff.net/
[2] https://bugzilla.redhat.com/show_bug.cgi?id=2075786
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Revert the "deletion of a ref should not trigger transaction events
for loose and packed ref backends separately" that regresses the
behaviour when a ref is not modified since it was packed.
* jc/revert-ref-transaction-hook-changes:
RelNotes: revert the description on the reverted topics
Revert "fetch: increase test coverage of fetches"
Revert "Merge branch 'ps/avoid-unnecessary-hook-invocation-with-packed-refs'"
* update the following words translations: commit, untracked, stage,
cache, stash, work..., index, reset, label, check..., tags, graft,
alternate object, amend, ancestor, cherry-pick, bisect, blame, chain,
cache, bug, chunk, branch, bundle, clean, clone, commit-graph, commit
object, commit-ish, committer, cover letter, conflict, dangling,
detach, dir, dumb, fast-forward, file system, fixup, fork, fetch, Git
archive, gitdir, graft, replace ref
* correct some mispellings
* git-po-helper update
* remove some obsolete lines
* unfuzzy entries
* random translation updates
* update contact in pt_PT.po
* add the following words to the translation table: override, recurse,
print, offset, unbundle, mirror repository, multi-pack, bad,
whitespace, batch
* remove the following words of the translation table: core Git
* change the following words on the translation table: dry-run, apply,
patch, replay, blame, chain, gitdir, file system, fork, unset, handle
* some translation to the first person
* update copyright text
* word 'utilização:' to 'uso:'
* word 'pai' to 'parente'
Signed-off-by: Daniel Santos <dacs.git@brilhante.top>
We do not have to guess how common the mistake the change targets is
when describing it. Such an argument may be good while proposing a
change, but does not quite belong in the record of what has already
happened, i.e. a release note.
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 2a0cafd464,
as it expects a working "a ref deletion must produce a single
transaction, not one for loose and another for packed" topic,
which we do not have.