Commit Graph

64266 Commits

Author SHA1 Message Date
Junio C Hamano
c8f491668e Merge branch 'dd/diff-files-unmerged-fix'
"git diff --relative" segfaulted and/or produced incorrect result
when there are unmerged paths.

* dd/diff-files-unmerged-fix:
  diff-lib: ignore paths that are outside $cwd if --relative asked
2021-09-08 13:30:31 -07:00
Junio C Hamano
85246a7054 Merge branch 'dd/t6300-wo-gpg-fix'
Test fix.

* dd/t6300-wo-gpg-fix:
  t6300: check for cat-file exit status code
  t6300: don't run cat-file on non-existent object
2021-09-08 13:30:31 -07:00
Junio C Hamano
efae5c2e22 Merge branch 'mh/credential-leakfix'
Leak fix.

* mh/credential-leakfix:
  credential: fix leak in credential_apply_config()
2021-09-08 13:30:30 -07:00
Junio C Hamano
a20a40e3b6 Merge branch 'jk/t5323-no-pack-test-fix'
Test fix.

* jk/t5323-no-pack-test-fix:
  t5323: drop mentions of "master"
2021-09-08 13:30:30 -07:00
Junio C Hamano
4293c057dc Merge branch 'js/maintenance-launchctl-fix'
"git maintenance" scheduler fix for macOS.

* js/maintenance-launchctl-fix:
  maintenance: skip bootout/bootstrap when plist is registered
  maintenance: create `launchctl` configuration using a lock file
2021-09-08 13:30:29 -07:00
Junio C Hamano
31e4a0db03 Merge branch 'ab/rebase-fatal-fatal-fix'
Error message fix.

* ab/rebase-fatal-fatal-fix:
  rebase: emit one "fatal" in "fatal: fatal: <error>"
2021-09-08 13:30:29 -07:00
Junio C Hamano
63ddde68cd Merge branch 'ab/ls-remote-packet-trace'
Debugging aid fix.

* ab/ls-remote-packet-trace:
  ls-remote: set packet_trace_identity(<name>)
2021-09-08 13:30:28 -07:00
Junio C Hamano
ce7ae09bd4 Merge branch 'rs/git-mmap-uses-malloc'
mmap() imitation used to call xmalloc() that dies upon malloc()
failure, which has been corrected to just return an error to the
caller to be handled.

* rs/git-mmap-uses-malloc:
  compat: let git_mmap use malloc(3) directly
2021-09-08 13:30:27 -07:00
Junio C Hamano
e18f4de927 Merge branch 'ga/send-email-sendmail-cmd'
Test fix.

* ga/send-email-sendmail-cmd:
  t9001: PATH must not use Windows-style paths
2021-09-08 13:30:27 -07:00
Junio C Hamano
03137a4804 Merge branch 'me/t5582-cleanup'
Test fix.

* me/t5582-cleanup:
  t5582: remove spurious 'cd "$D"' line
2021-09-08 13:30:26 -07:00
Johannes Schindelin
7e44ff7a39 pull: release packs before fetching
On Windows, files cannot be removed nor renamed if there are still
handles held by a process. To remedy that, we try to release all open
handles to any `.pack` file before e.g. repacking (which would want to
remove the original `.pack` file(s) after it is done).

Since the `read_cache_unmerged()` and/or the `get_oid()` call in `git
pull` can cause `.pack` files to be opened, we need to release the open
handles before calling `git fetch`: the latter process might want to
spawn an auto-gc, which in turn might want to repack the objects.

This commit is similar in spirit to 5bdece0d70 (gc/repack: release
packs when needed, 2018-12-15).

This fixes https://github.com/git-for-windows/git/issues/3336.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 12:17:15 -07:00
Johannes Schindelin
957ba814bf commit-graph: when closing the graph, also release the slab
The slab has information about the commit graph. That means that it is
meaningless (and even misleading) when the commit graph was closed.

This seems not to matter currently, but we're about to fix a
Windows-specific bug where `git pull` does not close the object store
before fetching (risking that an implicit auto-gc fails to remove the
now-obsolete pack file(s)), and once we have that bug fix in place, it
does matter: after that bug fix, we will open the object store, do some
stuff with it, then close it, fetch, and then open it again, and do more
stuff. If we close the commit graph without releasing the corresponding
slab, we're hit by a symptom like this in t5520.19:

	BUG: commit-reach.c:85: bad generation skip 9223372036854775807
	> 3 at 5cd378271655d43a3b4477520014f02213ad1546

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 12:17:14 -07:00
Jonathan Tan
18a2f66d8a t7814: show lack of alternate ODB-adding
The previous patches have made "git grep" no longer need to add
submodule ODBs as alternates, at least for the code paths tested in
t7814. Demonstrate this by making adding a submodule ODB as an alternate
fatal in this test.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 11:48:13 -07:00
Jonathan Tan
e3e8bf046e submodule-config: pass repo upon blob config read
When reading the config of a submodule, if reading from a blob, read
using an explicitly specified repository instead of by adding the
submodule's ODB as an alternate and then reading an object from
the_repository.

This makes the "grep --recurse-submodules with submodules without
.gitmodules in the working tree" test in t7814 work when
GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB is true.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 11:48:09 -07:00
Jonathan Tan
0693806bf8 grep: add repository to OID grep sources
Record the repository whenever an OID grep source is created, and teach
the worker threads to explicitly provide the repository when accessing
objects.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 11:48:05 -07:00
Jonathan Tan
dd45471a37 grep: allocate subrepos on heap
Currently, struct repository objects corresponding to submodules are
allocated on the stack in grep_submodule(). This currently works because
they will not be used once grep_submodule() exits, but a subsequent
patch will require these structs to be accessible for longer (perhaps
even in another thread). Allocate them on the heap and clear them only
at the very end.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 11:48:02 -07:00
Jonathan Tan
78ca584f1c grep: read submodule entry with explicit repo
Replace an existing parse_object_or_die() call (which implicitly works
on the_repository) with a function call that allows a repository to be
passed in. There is no such direct equivalent to parse_object_or_die(),
but we only need the type of the object, so replace with
oid_object_info().

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 11:47:59 -07:00
Jonathan Tan
50d92b5f03 grep: typesafe versions of grep_source_init
grep_source_init() can create "struct grep_source" objects and,
depending on the value of the type passed, some void-pointer parameters have
different meanings. Because one of these types (GREP_SOURCE_OID) will
require an additional parameter in a subsequent patch, take the
opportunity to increase clarity and type safety by replacing this
function with individual functions for each type.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 11:47:55 -07:00
Jonathan Tan
8d33c3af0b grep: use submodule-ODB-as-alternate lazy-addition
In the parent commit, Git was taught to add submodule ODBs as alternates
lazily, but grep does not use this because it computes the path to add
directly, not going through add_submodule_odb(). Add an equivalent to
add_submodule_odb() that takes the exact ODB path and teach grep to use
it.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 11:47:49 -07:00
Jonathan Tan
a35e03dee0 submodule: lazily add submodule ODBs as alternates
Teach Git to add submodule ODBs as alternates to the object store of
the_repository only upon the first access of an object not in
the_repository, and not when add_submodule_odb() is called.

This provides a means of gradually migrating from accessing a
submodule's object through alternates to accessing a submodule's object
by explicitly passing its repository object. Any Git command can declare
that it might access submodule objects by calling add_submodule_odb()
(as they do now), but the submodule ODBs themselves will not be added
until needed, so individual commands and/or combinations of arguments
can be migrated one by one.

[The advantage of explicit repository-object passing is code clarity (it
is clear which repository an object read is from), performance (there is
no need to linearly search through all submodule ODBs whenever an object
is accessed from any repository, whether superproject or submodule), and
the possibility of future features like partial clone submodules (which
right now is not possible because if an object is missing, we do not
know which repository to lazy-fetch into).]

This commit also introduces an environment variable that a test may set
to make the actual registration of alternates fatal, in order to
demonstrate that its codepaths do not need this registration.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08 11:47:36 -07:00
Derrick Stolee
55dfcf9591 sparse-checkout: clear tracked sparse dirs
When changing the scope of a sparse-checkout using cone mode, we might
have some tracked directories go out of scope. The current logic removes
the tracked files from within those directories, but leaves the ignored
files within those directories. This is a bit unexpected to users who
have given input to Git saying they don't need those directories
anymore.

This is something that is new to the cone mode pattern type: the user
has explicitly said "I want these directories and _not_ those
directories." The typical sparse-checkout patterns more generally apply
to "I want files with with these patterns" so it is natural to leave
ignored files as they are. This focus on directories in cone mode
provides us an opportunity to change the behavior.

Leaving these ignored files in the sparse directories makes it
impossible to gain performance benefits in the sparse index. When we
track into these directories, we need to know if the files are ignored
or not, which might depend on the _tracked_ .gitignore file(s) within
the sparse directory. This depends on the indexed version of the file,
so the sparse directory must be expanded.

We must take special care to look for untracked, non-ignored files in
these directories before deleting them. We do not want to delete any
meaningful work that the users were doing in those directories and
perhaps forgot to add and commit before switching sparse-checkout
definitions. Since those untracked files might be code files that
generated ignored build output, also do not delete any ignored files
from these directories in that case. The users can recover their state
by resetting their sparse-checkout definition to include that directory
and continue. Alternatively, they can see the warning that is presented
and delete the directory themselves to regain the performance they
expect.

By deleting the sparse directories when changing scope (or running 'git
sparse-checkout reapply') we regain these performance benefits as if the
repository was in a clean state.

Since these ignored files are frequently build output or helper files
from IDEs, the users should not need the files now that the tracked
files are removed. If the tracked files reappear, then they will have
newer timestamps than the build artifacts, so the artifacts will need to
be regenerated anyway.

Use the sparse-index as a data structure in order to find the sparse
directories that can be safely deleted. Re-expand the index to a full
one if it was full before.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:41:10 -07:00
Derrick Stolee
ce7a9f0141 sparse-index: add SPARSE_INDEX_MEMORY_ONLY flag
The convert_to_sparse() method checks for the GIT_TEST_SPARSE_INDEX
environment variable or the "index.sparse" config setting before
converting the index to a sparse one. This is for ease of use since all
current consumers are preparing to compress the index before writing it
to disk. If these settings are not enabled, then convert_to_sparse()
silently returns without doing anything.

We will add a consumer in the next change that wants to use the sparse
index as an in-memory data structure, regardless of whether the on-disk
format should be sparse.

To that end, create the SPARSE_INDEX_MEMORY_ONLY flag that will skip
these config checks when enabled. All current consumers are modified to
pass '0' in the new 'flags' parameter.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:41:10 -07:00
Derrick Stolee
77efbb366a attr: be careful about sparse directories
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:41:10 -07:00
Derrick Stolee
02155c8c00 sparse-checkout: create helper methods
As we integrate the sparse index into more builtins, we occasionally
need to check the sparse-checkout patterns to see if a path is within
the sparse-checkout cone. Create some helper methods that help
initialize the patterns and check for pattern matching to make this
easier.

The existing callers of commands like get_sparse_checkout_patterns() use
a custom 'struct pattern_list' that is not necessarily the one in the
'struct index_state', so there are not many previous uses that could
adopt these helpers. There are just two in builtin/add.c and
sparse-index.c that can use path_in_sparse_checkout().

We add a path_in_cone_mode_sparse_checkout() as well that will only
return false if the path is outside of the sparse-checkout definition
_and_ the sparse-checkout patterns are in cone mode.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:41:10 -07:00
Derrick Stolee
8a96b9d0a7 sparse-index: use WRITE_TREE_MISSING_OK
When updating the cache tree in convert_to_sparse(), the
WRITE_TREE_MISSING_OK flag indicates that trees might be computed that
do not already exist within the object database. This happens in cases
such as 'git add' creating new trees that it wants to store in
anticipation of a following 'git commit'. If this flag is not specified,
then it might trigger a promisor fetch or a failure due to the object
not existing locally.

Use WRITE_TREE_MISSING_OK during convert_to_sparse() to avoid these
possible reasons for the cache_tree_update() to fail.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:41:09 -07:00
Derrick Stolee
5dc16756b2 sparse-index: silently return when cache tree fails
If cache_tree_update() returns a non-zero value, then it could not
create the cache tree. This is likely due to a path having a merge
conflict. Since we are already returning early, let's return silently to
avoid making it seem like we failed to write the index at all.

If we remove our dependence on the cache tree within
convert_to_sparse(), then we could still recover from this scenario and
have a sparse index.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:41:09 -07:00
Derrick Stolee
72d84ea347 unpack-trees: fix nested sparse-dir search
The iterated search in find_cache_entry() was recently modified to
include a loop that searches backwards for a sparse directory entry that
matches the given traverse_info and name_entry. However, the string
comparison failed to actually concatenate those two strings, so this
failed to find a sparse directory when it was not a top-level directory.

This caused some errors in rare cases where a 'git checkout' spanned a
diff that modified files within the sparse directory entry, but we could
not correctly find the entry.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:41:09 -07:00
Derrick Stolee
e27eab45c7 sparse-index: silently return when not using cone-mode patterns
While the sparse-index is only enabled when core.sparseCheckoutCone is
also enabled, it is possible for the user to modify the sparse-checkout
file manually in a way that does not match cone-mode patterns. In this
case, we should refuse to convert an index into a sparse index, since
the sparse_checkout_patterns will not be initialized with recursive and
parent path hashsets.

Also silently return if there are no cache entries, which is a simple
case: there are no paths to make sparse!

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:41:09 -07:00
Derrick Stolee
522d3cec00 t7519: rewrite sparse index test
The sparse index is tested with the FS Monitor hook and extension since
f8fe49e (fsmonitor: integrate with sparse index, 2021-07-14). This test
was very fragile because it shared an index across sparse and non-sparse
behavior. Since that expansion and contraction could cause the index to
lose its FS Monitor bitmap and token, behavior is fragile to changes in
'git sparse-checkout set'.

Rewrite the test to use two clones of the original repo: full and
sparse. This allows us to also keep the test files (actual, expect,
trace2.txt) out of the repos we are testing with 'git status'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:41:09 -07:00
Ævar Arnfjörð Bjarmason
8fe8bae9d2 pack-write: skip *.rev work when not writing *.rev
Fix a performance regression introduced in a587b5a786 (pack-write.c:
extract 'write_rev_file_order', 2021-03-30) and stop needlessly
allocating the "pack_order" array and sorting it with
"pack_order_cmp()", only to throw that work away when we discover that
we're not writing *.rev files after all.

This redundant work was not present in the original version of this
code added in 8ef50d9958 (pack-write.c: prepare to write 'pack-*.rev'
files, 2021-01-25). There we'd call write_rev_file() from
e.g. finish_tmp_packfile(), but we'd "return NULL" early in
write_rev_file() if not doing a "WRITE_REV" or "WRITE_REV_VERIFY".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 22:04:03 -07:00
Junio C Hamano
92a5d1c9b4 hash-object: prefix_filename() returns allocated memory these days
Back when a1be47e4 (hash-object: fix buffer reuse with --path in a
subdirectory, 2017-03-20) was written, the prefix_filename() helper
used a static piece of memory to the caller, making the caller
responsible for copying it, if it wants to keep it across another
call to the same function.  Two callers of the prefix_filename() in
hash-object were made to xstrdup() the value obtained from it.

But in the same series, when e4da43b1 (prefix_filename: return newly
allocated string, 2017-03-20) changed the rule to gave the caller
possession of the memory, we forgot to revert one of the xstrdup()
changes, allowing the returned value to leak.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 15:18:59 -07:00
Bagas Sanjaya
ca0cc98e03 Documentation: fix default directory of git bugreport -o
git bugreport writes bug report to the current directory by default,
instead of repository root.

Fix the documentation.

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 14:25:13 -07:00
Ævar Arnfjörð Bjarmason
72b113e562 Makefile: remove the check_bindir script
This script was added in f28ac70f48 (Move all dashed-form commands to
libexecdir, 2007-11-28) when commands such as "git-add" lived in the
bin directory, instead of the git exec directory.

This notice helped someone incorrectly installing version v1.6.0 and
later into a directory built for a pre-v1.6.0 git version.

We're now long past the point where anyone who'd be helped by this
warning is likely to be doing that, so let's just remove this check
and warning to simplify the Makefile.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 13:27:40 -07:00
Ævar Arnfjörð Bjarmason
b996f84989 send-email: fix a "first config key wins" regression in v2.33.0
Fix a regression in my c95e3a3f0b (send-email: move trivial config
handling to Perl, 2021-05-28) where we'd pick the first config key out
of multiple defined ones, instead of using the normal "last key wins"
semantics of "git config --get".

This broke e.g. cases where a .git/config would have a different
sendemail.smtpServer than ~/.gitconfig. We'd pick the ~/.gitconfig
over .git/config, instead of preferring the repository-local
version. The same would go for /etc/gitconfig etc.

The full list of impacted config keys (the %config_settings values
which are references to scalars, not arrays) is:

    sendemail.smtpencryption
    sendemail.smtpserver
    sendemail.smtpserverport
    sendemail.smtpuser
    sendemail.smtppass
    sendemail.smtpdomain
    sendemail.smtpauth
    sendemail.smtpbatchsize
    sendemail.smtprelogindelay
    sendemail.tocmd
    sendemail.cccmd
    sendemail.aliasfiletype
    sendemail.envelopesender
    sendemail.confirm
    sendemail.from
    sendemail.assume8bitencoding
    sendemail.composeencoding
    sendemail.transferencoding
    sendemail.sendmailcmd

I.e. having any of these set in say ~/.gitconfig and in-repo
.git/config regressed in v2.33.0 to prefer the --global one over the
--local.

To test this add a test of config priority to one of these config
variables, most don't have tests at all, but there was an existing one
for sendemail.8bitEncoding.

The "git config" (instead of "test_config") is somewhat of an
anti-pattern, but follows established conventions in
t9001-send-email.sh, likewise with any other pattern or idiom in this
test.

The populating of home/.gitconfig and setting of HOME= is copied from
a test in t0017-env-helper.sh added in 1ff750b128 (tests: make
GIT_TEST_GETTEXT_POISON a boolean, 2019-06-21). This test fails
without this bugfix, but now it works.

Reported-by: Eli Schwartz <eschwartz@archlinux.org>
Tested-by: Eli Schwartz <eschwartz@archlinux.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 13:18:55 -07:00
René Scharfe
709b3f32d3 range-diff: avoid segfault with -I
output() reuses the same struct diff_options for multiple calls of
diff_flush().  Set the option no_free to instruct it to keep the
ignore regexes between calls and release them explicitly at the end.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 13:03:13 -07:00
Sergey Organov
5acffd3473 diff-index: restore -c/--cc options handling
This fixes 19b2517f (diff-merges: move specific diff-index "-m"
handling to diff-index, 2021-05-21).

That commit disabled handling of all diff for merges options in
diff-index on an assumption that they are unused. However, it later
appeared that -c and --cc, even though undocumented and not being
covered by tests, happen to have had particular effect on diff-index
output.

Restore original -c/--cc options handling by diff-index.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 11:11:35 -07:00
Ævar Arnfjörð Bjarmason
2d3491b117 tr2: log N parent process names on Linux
In 2f732bf15e (tr2: log parent process name, 2021-07-21) we started
logging parent process names, but only logged all parents on Windows.
on Linux only the name of the immediate parent process was logged.

Extend the functionality added there to also log full parent chain on
Linux.

This requires us to lookup "/proc/<getppid()>/stat" instead of
"/proc/<getppid()>/comm". The "comm" file just contains the name of the
process, but the "stat" file has both that information, and the parent
PID of that process, see procfs(5). We parse out the parent PID of our
own parent, and recursively walk the chain of "/proc/*/stat" files all
the way up the chain. A parent PID of 0 indicates the end of the
chain.

It's possible given the semantics of Linux's PID files that we end up
getting an entirely nonsensical chain of processes. It could happen if
e.g. we have a chain of processes like:

    1 (init) => 321 (bash) => 123 (git)

Let's assume that "bash" was started a while ago, and that as shown
the OS has already cycled back to using a lower PID for us than our
parent process. In the time it takes us to start up and get to
trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP) our parent
process might exit, and be replaced by an entirely different process!

We'd racily look up our own getppid(), but in the meantime our parent
would exit, and Linux would have cycled all the way back to starting
an entirely unrelated process as PID 321.

If that happens we'll just silently log incorrect data in our ancestry
chain. Luckily we don't need to worry about this except in this
specific cycling scenario, as Linux does not have PID
randomization. It appears it once did through a third-party feature,
but that it was removed around 2006[1]. For anyone worried about this
edge case raising PID_MAX via "/proc/sys/kernel/pid_max" will mitigate
it, but not eliminate it.

One thing we don't need to worry about is getting into an infinite
loop when walking "/proc/*/stat". See 353d3d77f4 (trace2: collect
Windows-specific process information, 2019-02-22) for the related
Windows code that needs to deal with that, and [2] for an explanation
of that edge case.

Aside from potential race conditions it's also a bit painful to
correctly parse the process name out of "/proc/*/stat". A simpler
approach is to use fscanf(), see [3] for an implementation of that,
but as noted in the comment being added here it would fail in the face
of some weird process names, so we need our own parse_proc_stat() to
parse it out.

With this patch the "ancestry" chain for a trace2 event might look
like this:

    $ GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version | grep ancestry | jq -r .ancestry
    [
      "bash",
      "screen",
      "systemd"
    ]

And in the case of naughty process names like the following. This uses
perl's ability to use prctl(PR_SET_NAME, ...). See
Perl/perl5@7636ea95c5 (Set the legacy process name with prctl() on
assignment to $0 on Linux, 2010-04-15)[4]:

    $ perl -e '$0 = "(naughty\nname)"; system "GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version"' | grep ancestry | jq -r .ancestry
    [
      "sh",
      "(naughty\nname)",
      "bash",
      "screen",
      "systemd"
    ]

1. https://grsecurity.net/news#grsec2110
2. https://lore.kernel.org/git/48a62d5e-28e2-7103-a5bb-5db7e197a4b9@jeffhostetler.com/
3. https://lore.kernel.org/git/87o8agp29o.fsf@evledraar.gmail.com/
4. 7636ea95c5

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 11:08:00 -07:00
Ævar Arnfjörð Bjarmason
326460a870 tr2: do compiler enum check in trace2_collect_process_info()
Change code added in 2f732bf15e (tr2: log parent process name,
2021-07-21) to use a switch statement without a "default" branch to
have the compiler error if this code ever drifts out of sync with the
members of the "enum trace2_process_info_reason".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 11:07:59 -07:00
Ævar Arnfjörð Bjarmason
6eccfc3adf tr2: leave the parent list empty upon failure & don't leak memory
In a subsequent commit I'll be replacing most of this code to log N
parents, but let's first fix bugs introduced in the recent
2f732bf15e (tr2: log parent process name, 2021-07-21).

It was using the strbuf_read_file() in the wrong way, its return value
is either a length or a negative value on error. If we didn't have a
procfs, or otherwise couldn't access it we'd end up pushing an empty
string to the trace2 ancestry array.

It was also using the strvec_push() API the wrong way. That API always
does an xstrdup(), so by detaching the strbuf here we'd leak
memory. Let's instead pass in our pointer for strvec_push() to
xstrdup(), and then free our own strbuf. I do have some WIP changes to
make strvec_push_nodup() non-static, which makes this and some other
callsites nicer, but let's just follow the prevailing pattern of using
strvec_push() for now.

We'll also need to free that "procfs_path" strbuf whether or not
strbuf_read_file() succeeds, which was another source of memory leaks
in 2f732bf15e, i.e. we'd leak that memory as well if we weren't on a
system where we could read the file from procfs.

Let's move all the freeing of the memory to the end of the
function. If we're still at STRBUF_INIT with "name" due to not having
taken the branch where the strbuf_read_file() succeeds freeing it is
redundant. So we could move it into the body of the "if", but just
handling freeing the same way for all branches of the function makes
it more readable.

In combination with the preceding commit this makes all of
t[0-9]*trace2*.sh pass under SANITIZE=leak on Linux.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 11:07:59 -07:00
Ævar Arnfjörð Bjarmason
48f68715b1 tr2: stop leaking "thread_name" memory
Fix a memory leak introduced in ee4512ed48 (trace2: create new
combined trace facility, 2019-02-22), we were doing a free() of other
memory allocated in tr2tls_create_self(), but not the "thread_name"
"struct strbuf".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 11:07:59 -07:00
Ævar Arnfjörð Bjarmason
f2cc8881d7 tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux
Rewrite a comment added in 2f732bf15e (tr2: log parent process name,
2021-07-21) to describe what we might do under
TRACE2_PROCESS_INFO_EXIT in the future, instead of vaguely referring
to "something extra".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 11:07:59 -07:00
Ævar Arnfjörð Bjarmason
7d9c80f626 tr2: remove NEEDSWORK comment for "non-procfs" implementations
I'm fairly sure that there is no way on Linux to inspect the process
tree without using procfs, any tool such as ps(1), top(1) etc. that
shows this sort of information ultimately looks the information up in
procfs.

So let's remove this comment added in 2f732bf15e (tr2: log parent
process name, 2021-07-21), it's setting us up for an impossible task.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 11:07:59 -07:00
Ævar Arnfjörð Bjarmason
d941cc4c34 bundle: show progress on "unbundle"
The "unbundle" command added in 2e0afafebd (Add git-bundle: move
objects and references by archive, 2007-02-22) did not show progress
output, even though the underlying API learned how to show progress in
be042aff24 (Teach progress eye-candy to fetch_refs_from_bundle(),
2011-09-18).

Now we'll show "Unbundling objects" using the new --progress-title
option to "git index-pack", to go with its existing "Receiving
objects" and "Indexing objects" (which it shows when invoked with
"--stdin", and with a pack file, respectively).

Unlike "git bundle create" we don't handle "--quiet" here, nor
"--all-progress" and "--all-progress-implied". Those are all specific
to "create" (and "verify", in the case of "--quiet").

The structure of the existing documentation is a bit unclear, e.g. the
documentation for the "--quiet" option added in
79862b6b77 (bundle-create: progress output control, 2019-11-10) only
describes how it works for "create", and not for "verify". That and
other issues in it should be fixed, but I'd like to avoid untangling
that mess right now. Let's just support the standard "--no-progress"
implicitly here, and leave cleaning up the general behavior of "git
bundle" for a later change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 10:59:23 -07:00
Ævar Arnfjörð Bjarmason
f46c46e4f2 index-pack: add --progress-title option
Add a --progress-title option to index-pack, when data is piped into
index-pack its progress is a proxy for whatever's feeding it data.

This option will allow us to set a more relevant progress bar title in
"git bundle unbundle", and is also used in my "bundle-uri" RFC
patches[1] by a new caller in fetch-pack.c.

The code change in cmd_index_pack() won't handle
"--progress-title=xyz", only "--progress-title xyz", and the "(i+1)"
style (as opposed to "i + 1") is a bit odd.

Not using the "--long-option=value" style is inconsistent with
existing long options handled by cmd_index_pack(), but makes the code
that needs to call it better (two strvec_push(), instead of needing a
strvec_pushf()). Since the option is internal-only the inconsistency
shouldn't matter.

I'm copying the pattern to handle it as-is from the handling of the
existing "-o" option in the same function, see 9cf6d3357a (Add
git-index-pack utility, 2005-10-12) for its addition. That's a short
option, but the code to implement the two is the same in functionality
and style. Eventually we'd like to migrate all of this this to
parse_options(), which would make these differences in behavior go
away.

1. https://lore.kernel.org/git/RFC-cover-00.13-0000000000-20210805T150534Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 10:59:23 -07:00
Ævar Arnfjörð Bjarmason
7366096de9 bundle API: change "flags" to be "extra_index_pack_args"
Since the "flags" parameter was added in be042aff24 (Teach progress
eye-candy to fetch_refs_from_bundle(), 2011-09-18) there's never been
more than the one flag: BUNDLE_VERBOSE.

Let's have the only caller who cares about that pass "-v" itself
instead through new "extra_index_pack_args" parameter. The flexibility
of being able to pass arbitrary arguments to "unbundle" will be used
in a subsequent commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 10:59:23 -07:00
Lénaïc Huard
b681b191f9 maintenance: add support for systemd timers on Linux
The existing mechanism for scheduling background maintenance is done
through cron. On Linux systems managed by systemd, systemd provides an
alternative to schedule recurring tasks: systemd timers.

The main motivations to implement systemd timers in addition to cron
are:
* cron is optional and Linux systems running systemd might not have it
  installed.
* The execution of `crontab -l` can tell us if cron is installed but not
  if the daemon is actually running.
* With systemd, each service is run in its own cgroup and its logs are
  tagged by the service inside journald. With cron, all scheduled tasks
  are running in the cron daemon cgroup and all the logs of the
  user-scheduled tasks are pretended to belong to the system cron
  service.
  Concretely, a user that doesn’t have access to the system logs won’t
  have access to the log of their own tasks scheduled by cron whereas
  they will have access to the log of their own tasks scheduled by
  systemd timer.
  Although `cron` attempts to send email, that email may go unseen by
  the user because these days, local mailboxes are not heavily used
  anymore.

In order to schedule git maintenance, we need two unit template files:
* ~/.config/systemd/user/git-maintenance@.service
  to define the command to be started by systemd and
* ~/.config/systemd/user/git-maintenance@.timer
  to define the schedule at which the command should be run.

Those units are templates that are parameterized by the frequency.

Based on those templates, 3 timers are started:
* git-maintenance@hourly.timer
* git-maintenance@daily.timer
* git-maintenance@weekly.timer

The command launched by those three timers are the same as with the
other scheduling methods:

/path/to/git for-each-repo --exec-path=/path/to
--config=maintenance.repo maintenance run --schedule=%i

with the full path for git to ensure that the version of git launched
for the scheduled maintenance is the same as the one used to run
`maintenance start`.

The timer unit contains `Persistent=true` so that, if the computer is
powered down when a maintenance task should run, the task will be run
when the computer is back powered on.

Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 10:57:04 -07:00
Lénaïc Huard
eba1ba9d32 maintenance: git maintenance run learned --scheduler=<scheduler>
Depending on the system, different schedulers can be used to schedule
the hourly, daily and weekly executions of `git maintenance run`:
* `launchctl` for MacOS,
* `schtasks` for Windows and
* `crontab` for everything else.

`git maintenance run` now has an option to let the end-user explicitly
choose which scheduler he wants to use:
`--scheduler=auto|crontab|launchctl|schtasks`.

When `git maintenance start --scheduler=XXX` is run, it not only
registers `git maintenance run` tasks in the scheduler XXX, it also
removes the `git maintenance run` tasks from all the other schedulers to
ensure we cannot have two schedulers launching concurrent identical
tasks.

The default value is `auto` which chooses a suitable scheduler for the
system.

`git maintenance stop` doesn't have any `--scheduler` parameter because
this command will try to remove the `git maintenance run` tasks from all
the available schedulers.

Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 10:57:04 -07:00
Lénaïc Huard
cb7db5bbd5 cache.h: Introduce a generic "xdg_config_home_for(…)" function
Current implementation of `xdg_config_home(filename)` returns
`$XDG_CONFIG_HOME/git/$filename`, with the `git` subdirectory inserted
between the `XDG_CONFIG_HOME` environment variable and the parameter.

This patch introduces a `xdg_config_home_for(subdir, filename)` function
which is more generic. It only concatenates "$XDG_CONFIG_HOME", or
"$HOME/.config" if the former isn’t defined, with the parameters,
without adding `git` in between.

`xdg_config_home(filename)` is now implemented by calling
`xdg_config_home_for("git", filename)` but this new generic function can
be used to compute the configuration directory of other programs.

Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 10:57:04 -07:00
Philippe Blain
01c381037c test-lib-functions: keep user's debugger config files and TERM in 'debug'
The 'debug' function in test-lib-functions.sh is used to invoke a
debugger at a specific line in a test. It inherits the value of HOME and
TERM set by 'test-lib.sh': HOME="$TRASH_DIRECTORY" and TERM=dumb.

Changing the value of HOME means that any customization configured in a
developers' debugger configuration file (like $HOME/.gdbinit or
$HOME/.lldbinit) are not available in the debugger invoked by
'test_pause'.

Changing the value of TERM to 'dumb' means that colored output
is disabled in the debugger.

To make the debugging experience with 'debug' more pleasant, leverage
the variable USER_HOME, added in the previous commit, to copy a
developer's ~/.gdbinit and ~/.lldbinit to the test HOME. We do not set
HOME to USER_HOME as in 'test_pause' to avoid user configuration in
$USER_HOME/.gitconfig from interfering with the command being debugged.

Also, add a flag to launch the debugger with the original value of
TERM, and add the same warning as for 'test_pause'.

Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 10:53:39 -07:00
Philippe Blain
add5240fa5 test-lib-functions: optionally keep HOME, TERM and SHELL in 'test_pause'
The 'test_pause' function, which is designed to help interactive
debugging and exploration of tests, currently inherits the value of HOME
and TERM set by 'test-lib.sh': HOME="$TRASH_DIRECTORY" and TERM=dumb. It
also invokes the shell defined by TEST_SHELL_PATH, which defaults to
/bin/sh (through SHELL_PATH).

Changing the value of HOME means that any customization configured in a
developers' shell startup files and any Git aliases defined in their
global Git configuration file are not available in the shell invoked by
'test_pause'.

Changing the value of TERM to 'dumb' means that colored output
is disabled for all commands in that shell.

Using /bin/sh as the shell invoked by 'test_pause' is not ideal since
some platforms (i.e. Debian and derivatives) use Dash as /bin/sh, and
this shell is usually compiled without readline support, which makes for
a poor interactive command line experience.

To make the interactive command line experience in the shell invoked by
'test_pause' more pleasant, save the values of HOME and TERM in
USER_HOME and USER_TERM before changing them in test-lib.sh, and add
options to 'test_pause' to optionally use these variables to invoke the
shell. Also add an option to invoke SHELL instead of TEST_SHELL_PATH, so
that developer's interactive shell is used.

We use options instead of changing the behaviour unconditionally since
these three variables can slightly change command behaviour. Moreover,
using the original HOME means commands could overwrite files in a user's
home directory. Be explicit about these caveats in the new 'Usage'
section in test-lib-functions.sh.

Finally, add '[options]' to the test_pause synopsys in t/README, and
mention that the full list of helper functions and their options can be
found in test-lib-functions.sh.

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07 10:53:39 -07:00