common-main: call sanitize_stdfds()

This is setup that should be done in every program for
safety, but we never got around to adding it everywhere (so
builtins benefited from the call in git.c, but any external
commands did not). Putting it in the common main() gives us
this safety everywhere.

Note that the case in daemon.c is a little funny. We wait
until we know whether we want to daemonize, and then either:

 - call daemonize(), which will close stdio and reopen it to
   /dev/null under the hood

 - sanitize_stdfds(), to fix up any odd cases

But that is way too late; the point of sanitizing is to give
us reliable descriptors on 0/1/2, and we will already have
executed code, possibly called die(), etc. The sanitizing
should be the very first thing that happens.

With this patch, git-daemon will sanitize first, and can
remove the call in the non-daemonize case. It does mean that
daemonize() may just end up closing the descriptors we
opened, but that's not a big deal (it's not wrong to do so,
nor is it really less optimal than the case where our parent
process redirected us from /dev/null ahead of time).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
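
To make "reliable descriptors on 0/1/2" concrete, here is a minimal sketch of the idea. It is an illustration under assumptions, not Git's own function: the name sanitize_stdfds_sketch() and the int return value are hypothetical, and it uses plain open()/dup() where the real code may use its own wrappers. It relies on the POSIX guarantee that open() and dup() return the lowest unused descriptor.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for sanitize_stdfds(); the real function may
 * differ in detail. Returns 0 on success, -1 if /dev/null cannot be
 * opened or duplicated.
 */
int sanitize_stdfds_sketch(void)
{
	int fd = open("/dev/null", O_RDWR);

	/*
	 * A result below 2 means one of 0/1/2 was closed on entry.
	 * Keep duplicating until the new descriptor is 2 or higher,
	 * which fills every gap below it.
	 */
	while (fd >= 0 && fd < 2)
		fd = dup(fd);
	if (fd < 0)
		return -1;
	/* Anything above 2 is a spare copy we do not need. */
	if (fd > 2)
		close(fd);
	return 0;
}
```

Calling this before any other code runs means a later open() can never land on descriptor 2 and then be overwritten by an error message written to stderr.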
2016-07-01 08:06:02 +02:00
#include "cache.h"
2018-04-10 23:26:18 +02:00
#include "exec-cmd.h"
2023-03-21 07:25:54 +01:00
#include "gettext.h"
2017-01-28 03:02:01 +01:00
#include "attr.h"
2023-03-21 07:26:05 +01:00
#include "setup.h"
2023-04-11 05:00:38 +02:00
#include "trace2.h"
add an extra level of indirection to main()

There are certain startup tasks that we expect every git
process to do. In some cases this is just to improve the
quality of the program (e.g., setting up gettext()). In
others it is a requirement for using certain functions in
libgit.a (e.g., system_path() expects that you have called
git_extract_argv0_path()).

Most commands are builtins and are covered by the git.c
version of main(). However, there are still a few external
commands that use their own main(). Each of these has to
remember to include the correct startup sequence, and we are
not always consistent.

Rather than just fix the inconsistencies, let's make this
harder to get wrong by providing a common main() that can
run this standard startup.

We basically have two options to do this:

 - the compat/mingw.h file already does something like this by
   adding a #define that replaces the definition of main with a
   wrapper that calls mingw_startup().

   The upside is that the code in each program doesn't need
   to be changed at all; it's rewritten on the fly by the
   preprocessor.

   The downside is that it may make debugging of the startup
   sequence a bit more confusing, as the preprocessor is
   quietly inserting new code.

 - the builtin functions are all of the form cmd_foo(),
   and git.c's main() calls them.

   This is much more explicit, which may make things more
   obvious to somebody reading the code. It's also more
   flexible (because of course we have to figure out _which_
   cmd_foo() to call).

   The downside is that each of the builtins must define
   cmd_foo(), instead of just main().

This patch chooses the latter option, preferring the more
explicit approach, even though it is more invasive. We
introduce a new file common-main.c, with the "real" main. It
expects to call cmd_main() from whatever other objects it is
linked against.

We link common-main.o against anything that links against
libgit.a, since we know that such programs will need to do
this setup. Note that common-main.o can't actually go inside
libgit.a, as the linker would not pick up its main()
function automatically (it has no callers).

The rest of the patch is just adjusting all of the various
external programs (mostly in t/helper) to use cmd_main().
I've provided a global declaration for cmd_main(), which
means that all of the programs also need to match its
signature. In particular, many functions need to switch to
"const char **" instead of "char **" for argv. This effect
ripples out to a few other variables and functions, as well.

This makes the patch even more invasive, but the end result
is much better. We should be treating argv strings as const
anyway, and now all programs conform to the same signature
(which also matches the way builtins are defined).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
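
The explicit option described in this message can be sketched in a few lines. This is only an illustration under assumptions: common_startup() and the startup_done flag are hypothetical placeholders for the shared setup, and the entry point is named common_main() so that the sketch can be called from a test, whereas in the layout the message describes it would be main() in common-main.c.

```c
#include <stdio.h>

/* Hypothetical flag standing in for the state the shared startup sets up. */
static int startup_done;

/* Placeholder for the tasks every program needs (gettext, argv0 path, ...). */
static void common_startup(const char *argv0)
{
	(void)argv0;
	startup_done = 1;
}

/* Each program defines cmd_main() with this signature instead of main(). */
int cmd_main(int argc, const char **argv)
{
	if (!startup_done)
		return 1;	/* library code would not be usable yet */
	printf("%s ran with %d argument(s)\n", argv[0], argc - 1);
	return 0;
}

/*
 * Would be main() in common-main.c; named common_main() here only so
 * that the sketch can be driven from a test.
 */
int common_main(int argc, const char **argv)
{
	common_startup(argv[0]);
	return cmd_main(argc, argv);
}
```

Because the startup lives in exactly one function, a new external program gets it by linking against that object and defining cmd_main(); it has nothing to remember.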
2016-07-01 07:58:58 +02:00

2016-07-01 08:06:35 +02:00
/*
 * Many parts of Git have subprograms communicate via pipe, expect the
 * upstream of a pipe to die with SIGPIPE when the downstream of a
 * pipe does not need to read all that is written. Some third-party
 * programs that ignore or block SIGPIPE for their own reason forget
 * to restore SIGPIPE handling to the default before spawning Git and
 * break this carefully orchestrated machinery.
 *
 * Restore the way SIGPIPE is handled to default, which is what we
 * expect.
 */
static void restore_sigpipe_to_default(void)
{
	sigset_t unblock;

	sigemptyset(&unblock);
	sigaddset(&unblock, SIGPIPE);
	sigprocmask(SIG_UNBLOCK, &unblock, NULL);
	signal(SIGPIPE, SIG_DFL);
}

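
The effect of restore_sigpipe_to_default() can be observed with a small standalone sketch. The helper names below (restore_sigpipe_sketch(), sigpipe_is_default(), break_sigpipe()) are hypothetical and exist only for this illustration; the restore steps are copied from the function in this file, and break_sigpipe() imitates a parent process that both ignored and blocked SIGPIPE before spawning us.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stddef.h>

/* Same steps as restore_sigpipe_to_default(). */
void restore_sigpipe_sketch(void)
{
	sigset_t unblock;

	sigemptyset(&unblock);
	sigaddset(&unblock, SIGPIPE);
	sigprocmask(SIG_UNBLOCK, &unblock, NULL);
	signal(SIGPIPE, SIG_DFL);
}

/* Hypothetical helper: 1 if SIGPIPE is unblocked and has the default action. */
int sigpipe_is_default(void)
{
	struct sigaction sa;
	sigset_t cur;

	/* A NULL "set" argument only queries the current mask. */
	if (sigaction(SIGPIPE, NULL, &sa) < 0 ||
	    sigprocmask(SIG_BLOCK, NULL, &cur) < 0)
		return 0;
	return sa.sa_handler == SIG_DFL && !sigismember(&cur, SIGPIPE);
}

/* Hypothetical helper: imitate a parent that ignored and blocked SIGPIPE. */
void break_sigpipe(void)
{
	sigset_t block;

	signal(SIGPIPE, SIG_IGN);
	sigemptyset(&block);
	sigaddset(&block, SIGPIPE);
	sigprocmask(SIG_BLOCK, &block, NULL);
}
```

Both steps are needed: resetting the disposition alone leaves a blocked signal pending forever, and unblocking alone leaves an ignored signal ignored.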
2016-07-01 15:01:28 +02:00
int main(int argc, const char **argv)
2016-07-01 07:58:58 +02:00
{
2019-02-22 23:25:01 +01:00
	int result;
2021-12-09 06:08:26 +01:00
	struct strbuf tmp = STRBUF_INIT;
2019-02-22 23:25:01 +01:00

2019-04-15 22:39:43 +02:00
	trace2_initialize_clock();

2016-07-01 08:06:02 +02:00
	/*
	 * Always open file descriptors 0/1/2 to avoid clobbering files
	 * in die(). It also avoids messing up when the pipes are dup'ed
	 * onto stdin/stdout/stderr in the child processes we spawn.
	 */
	sanitize_stdfds();
2019-02-22 23:25:01 +01:00
	restore_sigpipe_to_default();

2019-04-15 22:39:45 +02:00
	git_resolve_executable_dir(argv[0]);

grep: fix multibyte regex handling under macOS

The commit 29de20504e (Makefile: fix default regex settings on
Darwin, 2013-05-11) fixed t0070-fundamental.sh under Darwin (macOS) by
adopting Git's regex library. However, this library is compiled with
NO_MBSUPPORT, which causes git-grep to work incorrectly on multibyte
(e.g. UTF-8) files. Current macOS versions pass t0070-fundamental.sh
with the native macOS regex library, which also supports multibyte
characters.

Adjust the Makefile to use the native regex library, and call
setlocale(3) to set CTYPE according to the user's preference.
The setlocale call is required on all platforms, but on platforms
supporting gettext(3), setlocale was called as a side effect of
initializing gettext. Therefore, move the CTYPE setlocale call from
gettext.c to common-main.c and the corresponding locale.h include
into git-compat-util.h.

Thanks to the global initialization of CTYPE setlocale, the test-tool
regex command now works correctly with supported multibyte regexes, and
is used to set the MB_REGEX test prerequisite by assessing a platform's
support for them.

Signed-off-by: Diomidis Spinellis <dds@aueb.gr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
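
Why the CTYPE call matters can be illustrated outside of regex code. The sketch below is an assumption-laden example, not part of the patch: count_chars() and use_user_ctype() are hypothetical helpers, and mbstowcs() with a NULL destination is the POSIX way of asking how many characters a multibyte string holds. A program starts in the "C" locale, so multibyte-aware library routines only follow the user's encoding after setlocale(LC_CTYPE, "") has run.

```c
#include <locale.h>
#include <stdlib.h>

/*
 * Hypothetical helper: number of characters in s under the current
 * LC_CTYPE, or -1 if s is not valid in that encoding.
 */
long count_chars(const char *s)
{
	size_t n = mbstowcs(NULL, s, 0);

	return n == (size_t)-1 ? -1 : (long)n;
}

/* Hypothetical helper: adopt the user's LC_CTYPE; 1 on success, 0 otherwise. */
int use_user_ctype(void)
{
	return setlocale(LC_CTYPE, "") != NULL;
}
```

For plain ASCII the count is the same before and after use_user_ctype(). For a UTF-8 string with non-ASCII characters, the result depends on the locale in effect, which is the same dependency a native multibyte regex library has.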
2022-08-26 10:58:15 +02:00
	setlocale(LC_CTYPE, "");
2016-07-01 08:07:01 +02:00
	git_setup_gettext();

2018-03-03 12:35:54 +01:00
	initialize_the_repository();

2017-01-28 03:02:01 +01:00
	attr_start();

2019-08-06 14:27:26 +02:00
	trace2_initialize();
	trace2_cmd_start(argv);
	trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP);

2021-12-09 06:08:26 +01:00
	if (!strbuf_getcwd(&tmp))
		tmp_original_cwd = strbuf_detach(&tmp, NULL);

2019-02-22 23:25:01 +01:00
	result = cmd_main(argc, argv);

2022-06-02 14:25:32 +02:00
	/* Not exit(3), but a wrapper calling our common_exit() */
	exit(result);
}

2022-06-02 14:25:33 +02:00
static void check_bug_if_BUG(void)
{
	if (!bug_called_must_BUG)
		return;
	BUG("on exit(): had bug() call(s) in this process without explicit BUG_if_bug()");
}

2022-06-02 14:25:32 +02:00
/* We wrap exit() to call common_exit() in git-compat-util.h */
int common_exit(const char *file, int line, int code)
{
common-main.c: call exit(), don't return

Change the main() function to call "exit()" instead of ending with a
"return" statement. The "exit()" function is our own wrapper that
calls trace2_cmd_exit_fl() for us, from git-compat-util.h:

    #define exit(code) exit(trace2_cmd_exit_fl(__FILE__, __LINE__, (code)))

That "exit()" wrapper has been in use ever since ee4512ed481 (trace2:
create new combined trace facility, 2019-02-22).

This changes nothing about how we "exit()", as we'd invoke
"trace2_cmd_exit_fl()" in both cases due to the wrapper. It does
make it easier to reason about this code, as we're now always
obviously relying on our "exit()" wrapper.

There is already code immediately downstream of our "main()" which has
a hard reliance on that, e.g. the various "exit()" calls downstream of
"cmd_main()" in "git.c".

We even had a comment in "t/helper/test-trace2.c" that seemed to be
confused about how the "exit()" wrapper interacted with uses of
"return", even though it was introduced in the same trace2 series in
a15860dca3f (trace2: t/helper/test-trace2, t0210.sh, t0211.sh,
t0212.sh, 2019-02-22), after the aforementioned ee4512ed481. Perhaps
it pre-dated the "exit()" wrapper?

This change makes the "trace2_cmd_exit()" macro orphaned, as we now
always use "trace2_cmd_exit_fl()" directly, but let's keep that
simpler example in place. Even if we're unlikely to get another
"main()" other than the one in our "common-main.c", there's some value
in having the API documentation and example discuss a simpler version
that doesn't require an "exit()" wrapper macro.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
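
The wrapper pattern quoted in this message can be tried in isolation. The sketch below is a hedged illustration: exit_hook_sketch() is a hypothetical stand-in for the function the real macro calls, and it only logs the call site and masks the code to its low 8 bits. It works because a function-like macro is not expanded again inside its own replacement text, so the inner exit() still refers to the C library function.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the hook behind the wrapper: note where
 * the exit came from and hand back the code exit(3) should receive.
 */
int exit_hook_sketch(const char *file, int line, int code)
{
	code &= 0xff;	/* keep the low 8 bits, so -1 becomes 255 */
	fprintf(stderr, "exit(%d) requested at %s:%d\n", code, file, line);
	return code;
}

/*
 * The macro name is not re-expanded within its own replacement, so
 * the inner exit() is the library function. Must come after the
 * include of <stdlib.h>, or it would mangle the declaration there.
 */
#define exit(code) exit(exit_hook_sketch(__FILE__, __LINE__, (code)))
```

With this in place, every exit(n) written after the macro definition passes through the hook first, which is why ending main() with exit(result) rather than "return result" makes the reliance on the wrapper visible.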
2021-12-07 11:13:51 +01:00
	/*
2022-06-02 14:25:32 +02:00
	 * For non-POSIX systems: Take the lowest 8 bits of the "code"
	 * to e.g. turn -1 into 255. On a POSIX system this is
	 * redundant, see exit(3) and wait(2), but as it doesn't harm
	 * anything there we don't need to guard this with an "ifdef".
2021-12-07 11:13:51 +01:00
	 */
2022-06-02 14:25:32 +02:00
	code &= 0xff;

2022-06-02 14:25:33 +02:00
	check_bug_if_BUG();
2022-06-02 14:25:32 +02:00
	trace2_cmd_exit_fl(file, line, code);

	return code;
2016-07-01 07:58:58 +02:00
}