git-commit-vandalism/tempfile.h
Jeff King 77a42b3b84 tempfile: drop active flag
Our tempfile struct contains an "active" flag. Long ago, this flag was
important: tempfile structs were always allocated for the lifetime of
the program and added to a global linked list, and the active flag was
what told us whether a struct's tempfile needed to be cleaned up on
exit.

But since 422a21c6a0 (tempfile: remove deactivated list entries,
2017-09-05) and 076aa2cbda (tempfile: auto-allocate tempfiles on heap,
2017-09-05), we actually remove items from the list, and the active flag
is almost always set to true for any allocated struct. We set it to
true in all of the creation functions, and in the normal code flow it
becomes false only in deactivate_tempfile(), which then immediately
frees the struct.

So the flag isn't performing that role anymore, and in fact makes things
more confusing. Dscho noted that delete_tempfile() is a noop for an
inactive struct. Since 076aa2cbda taught it to free the struct when
deactivating, we'd leak any struct whose active flag is unset. But in
practice it's not a leak, because again, we'll free when we unset the
flag, and never see the allocated-but-inactive state.

Can we just get rid of the flag? The answer is yes, but it requires
looking at a few other spots:

  1. I said above that the flag only becomes false before we deallocate,
     but there's one exception: when we call remove_tempfiles() from a
     signal or atexit handler, we unset the active flag as we remove
     each file. This isn't important for delete_tempfile(), as nobody
     would call it anymore, since we're exiting.

     It does in theory provide us some protection against racily
     double-removing a tempfile. If we receive a second signal while we
     are already in the cleanup routines, we'll start the cleanup loop
     again, and may visit the same tempfile. But this race already
     exists, because calling unlink() and unsetting the active flag
     aren't atomic! And it's OK in practice, because unlink() is
     idempotent (barring the unlikely event that some other process
     chooses our exact temp filename in that instant).

     So dropping the active flag widens the race a bit, but it was
     already there, and is fairly harmless in practice. If we really
     care about addressing it, the right thing is probably to block
     further signals while we're doing our cleanup (which we could
     actually do atomically).

  2. The active flag is declared as "volatile sig_atomic_t". The idea is
     that it's the final bit that gets set to tell the cleanup routines
     that the tempfile is ready to be used (or not used), and it's safe
     to receive a signal racing with regular code which adds or removes
     a tempfile from the list.

     In practice, I don't think this is buying us anything. The presence
     on the linked list is really what tells the cleanup routines to
     look at the struct. That is already marked as "volatile". It's not
     a sig_atomic_t, so it's possible that we could see a sheared write
     there as an entry is added or removed. But that is true of the
     current code, too! Before we can even look at the "active" flag,
     we'd have to follow a link to the struct itself. If we see a
     sheared write in the pointer to the struct, then we'll look at
     garbage memory anyway, and there's not much we can do.
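
The suggestion at the end of item 1, blocking further signals while the
cleanup loop runs, could be sketched roughly as follows. This is not
git's implementation: the fixed-size `sketch_paths` array and the
`sketch_register()` / `sketch_remove_all()` names are hypothetical
stand-ins for the real linked list and remove_tempfiles(), and the
sketch assumes a single-threaded POSIX program. It also shows why a
repeated unlink() is harmless: the second attempt simply fails.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical registry standing in for the global tempfile list. */
#define SKETCH_MAX 8
static const char *volatile sketch_paths[SKETCH_MAX];

/* Record a path for later cleanup; the caller keeps the string alive. */
static int sketch_register(const char *path)
{
	assert(path != NULL);
	for (int i = 0; i < SKETCH_MAX; i++) {
		if (!sketch_paths[i]) {
			sketch_paths[i] = path;
			return 0;
		}
	}
	return -1; /* registry full */
}

/*
 * Remove every registered path with all signals blocked, so a second
 * signal cannot restart the loop halfway through. Returns the number
 * of files actually unlinked, or -1 if the mask could not be changed.
 */
static int sketch_remove_all(void)
{
	sigset_t all, old;
	int removed = 0;

	sigfillset(&all);
	if (sigprocmask(SIG_BLOCK, &all, &old) < 0)
		return -1;

	for (int i = 0; i < SKETCH_MAX; i++) {
		const char *path = sketch_paths[i];
		if (!path)
			continue;
		/* A path that is already gone just makes unlink() fail. */
		if (unlink(path) == 0)
			removed++;
		sketch_paths[i] = NULL;
	}

	/* Restore the previous mask; pending signals are delivered now. */
	sigprocmask(SIG_SETMASK, &old, NULL);
	return removed;
}
```

Registering the same path twice and then calling `sketch_remove_all()`
removes the file once; the duplicate entry only produces a failed
unlink(), which the loop ignores.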

This patch removes the active flag entirely, using presence on the
global linked list as an indicator that a tempfile ought to be cleaned
up. We are already careful to add to the list as the final step in
activating. On deactivation, we'll make sure to remove from the list as
the first step, before freeing any fields. The use of the volatile
keyword should mean that those things happen in the expected order.
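
The ordering described above (publish on the list last, unpublish
first) can be illustrated with a small standalone sketch. It uses a
hypothetical singly linked `sketch_tempfile` list rather than git's
volatile_list_head, and all `sketch_*` names are invented for the
example. It only demonstrates the order of operations in ordinary
single-threaded code; it does not by itself prove anything about
signal safety.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct tempfile. */
struct sketch_tempfile {
	struct sketch_tempfile *volatile next;
	char *filename;
};

/* Presence on this list is the only "active" indicator. */
static struct sketch_tempfile *volatile sketch_list;

static struct sketch_tempfile *sketch_activate(const char *path)
{
	struct sketch_tempfile *t = malloc(sizeof(*t));
	if (!t)
		return NULL;
	t->filename = malloc(strlen(path) + 1);
	if (!t->filename) {
		free(t);
		return NULL;
	}
	strcpy(t->filename, path);

	/* Final step: the fully initialized struct becomes visible. */
	t->next = sketch_list;
	sketch_list = t;
	return t;
}

static void sketch_deactivate(struct sketch_tempfile **tp)
{
	struct sketch_tempfile *t = *tp;
	struct sketch_tempfile *volatile *pp;

	if (!t)
		return; /* already deactivated: nothing to do */

	/* First step: take the struct off the list... */
	for (pp = &sketch_list; *pp; pp = &(*pp)->next) {
		if (*pp == t) {
			*pp = t->next;
			break;
		}
	}
	/* ...and only then free its fields and the struct itself. */
	free(t->filename);
	free(t);
	*tp = NULL;
}

/* What a cleanup routine would see: every listed entry is usable. */
static int sketch_count_listed(void)
{
	int n = 0;
	for (struct sketch_tempfile *t = sketch_list; t; t = t->next) {
		assert(t->filename != NULL);
		n++;
	}
	return n;
}
```

Because `sketch_deactivate()` takes a pointer to the caller's pointer
and sets it to NULL, calling it a second time is a no-op, and the
allocated-but-inactive state never exists.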

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-30 14:16:49 -07:00


#ifndef TEMPFILE_H
#define TEMPFILE_H
#include "list.h"
#include "strbuf.h"
/*
* Handle temporary files.
*
* The tempfile API allows temporary files to be created, deleted, and
* atomically renamed. Temporary files that are still active when the
* program ends are cleaned up automatically. Lockfiles (see
* "lockfile.h") are built on top of this API.
*
*
* Calling sequence
* ----------------
*
* The caller:
*
* * Attempts to create a temporary file by calling
*   `create_tempfile()`. The resources used for the temporary file are
*   managed by the tempfile API.
*
* * Writes new content to the file by either:
*
*   * writing to the `tempfile->fd` file descriptor
*
*   * calling `fdopen_tempfile()` to get a `FILE` pointer for the
*     open file and writing to the file using stdio.
*
*   Note that the file descriptor created by create_tempfile()
*   is marked O_CLOEXEC, so the new contents must be written by
*   the current process, not any spawned one.
*
* When finished writing, the caller can:
*
* * Close the file descriptor and remove the temporary file by
*   calling `delete_tempfile()`.
*
* * Close the temporary file and rename it atomically to a specified
*   filename by calling `rename_tempfile()`. This relinquishes
*   control of the file.
*
* * Close the file descriptor without removing or renaming the
*   temporary file by calling `close_tempfile_gently()`, and later call
*   `delete_tempfile()` or `rename_tempfile()`.
*
* After the temporary file is renamed or deleted, the `tempfile`
* object is no longer valid and should not be reused.
*
* If the program exits before `rename_tempfile()` or
* `delete_tempfile()` is called, an `atexit(3)` handler will close
* and remove the temporary file.
*
* If you need to close the file descriptor yourself, do so by calling
* `close_tempfile_gently()`. You should never call `close(2)` or `fclose(3)`
* yourself, otherwise the `struct tempfile` structure would still
* think that the file descriptor needs to be closed, and a later
* cleanup would result in duplicate calls to `close(2)`. Worse yet,
* if you close and then later open another file descriptor for a
* completely different purpose, then the unrelated file descriptor
* might get closed.
*
*
* Error handling
* --------------
*
* `create_tempfile()` returns an allocated tempfile on success or NULL
* on failure. On errors, `errno` describes the reason for failure.
*
* `rename_tempfile()` and `close_tempfile_gently()` return 0 on success.
* On failure they set `errno` appropriately and return -1.
* `delete_tempfile()` and `rename_tempfile()` (but not
* `close_tempfile_gently()`) do their best to delete the temporary
* file before returning.
*/
struct tempfile {
	volatile struct volatile_list_head list;
	volatile int fd;
	FILE *volatile fp;
	volatile pid_t owner;
	struct strbuf filename;
	char *directory;
};
/*
* Attempt to create a temporary file at the specified `path`. Return
* a tempfile (whose "fd" member can be used for writing to it), or
* NULL on error. It is an error if a file already exists at that path.
* Note that `mode` will be further modified by the umask, and possibly
* `core.sharedRepository`, so it is not guaranteed to have the given
* mode.
*/
struct tempfile *create_tempfile_mode(const char *path, int mode);
static inline struct tempfile *create_tempfile(const char *path)
{
	return create_tempfile_mode(path, 0666);
}
/*
* Register an existing file as a tempfile, meaning that it will be
* deleted when the program exits. The tempfile is considered closed,
* but it can be worked with like any other closed tempfile (for
* example, it can be opened using reopen_tempfile()).
*/
struct tempfile *register_tempfile(const char *path);
/*
* mks_tempfile functions
*
* The following functions attempt to create and open temporary files
* with names derived automatically from a template, in the manner of
* mkstemps(), and arrange for them to be deleted if the program ends
* before they are deleted explicitly. There is a whole family of such
* functions, named according to the following pattern:
*
*   x?mks_tempfile_t?s?m?()
*
* The optional letters have the following meanings:
*
*   x - die if the temporary file cannot be created.
*
*   t - create the temporary file under $TMPDIR (as opposed to
*       relative to the current directory). When these variants are
*       used, template should be the pattern for the filename alone,
*       without a path.
*
*   s - template includes a suffix that is suffixlen characters long.
*
*   m - the temporary file should be created with the specified mode
*       (otherwise, the mode is set to 0600).
*
* None of these functions modify template. If the caller wants to
* know the (absolute) path of the file that was created, it can be
* read from tempfile->filename.
*
* On success, the functions return a tempfile whose "fd" member is open
* for writing the temporary file. On errors, they return NULL and set
* errno appropriately (except for the "x" variants, which die() on
* errors).
*/
/* See "mks_tempfile functions" above. */
struct tempfile *mks_tempfile_sm(const char *filename_template,
				 int suffixlen, int mode);
/* See "mks_tempfile functions" above. */
static inline struct tempfile *mks_tempfile_s(const char *filename_template,
					      int suffixlen)
{
	return mks_tempfile_sm(filename_template, suffixlen, 0600);
}
/* See "mks_tempfile functions" above. */
static inline struct tempfile *mks_tempfile_m(const char *filename_template, int mode)
{
	return mks_tempfile_sm(filename_template, 0, mode);
}
/* See "mks_tempfile functions" above. */
static inline struct tempfile *mks_tempfile(const char *filename_template)
{
	return mks_tempfile_sm(filename_template, 0, 0600);
}
/* See "mks_tempfile functions" above. */
struct tempfile *mks_tempfile_tsm(const char *filename_template,
				  int suffixlen, int mode);
/* See "mks_tempfile functions" above. */
static inline struct tempfile *mks_tempfile_ts(const char *filename_template,
					       int suffixlen)
{
	return mks_tempfile_tsm(filename_template, suffixlen, 0600);
}
/* See "mks_tempfile functions" above. */
static inline struct tempfile *mks_tempfile_tm(const char *filename_template, int mode)
{
	return mks_tempfile_tsm(filename_template, 0, mode);
}
/* See "mks_tempfile functions" above. */
static inline struct tempfile *mks_tempfile_t(const char *filename_template)
{
	return mks_tempfile_tsm(filename_template, 0, 0600);
}
/* See "mks_tempfile functions" above. */
struct tempfile *xmks_tempfile_m(const char *filename_template, int mode);
/* See "mks_tempfile functions" above. */
static inline struct tempfile *xmks_tempfile(const char *filename_template)
{
	return xmks_tempfile_m(filename_template, 0600);
}
/*
* Attempt to create a temporary directory in $TMPDIR and to create and
* open a file in that new directory. Derive the directory name from the
* template in the manner of mkdtemp(). Arrange for directory and file
* to be deleted if the program exits before they are deleted
* explicitly. On success return a tempfile whose "filename" member
* contains the full path of the file and its "fd" member is open for
* writing the file. On error return NULL and set errno appropriately.
*/
struct tempfile *mks_tempfile_dt(const char *directory_template,
				 const char *filename);
/*
* Associate a stdio stream with the temporary file (which must still
* be open). Return `NULL` (*without* deleting the file) on error. The
* stream is closed automatically when `close_tempfile_gently()` is called or
* when the file is deleted or renamed.
*/
FILE *fdopen_tempfile(struct tempfile *tempfile, const char *mode);
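/*
* Return true if `tempfile` is active. A tempfile struct exists only
* while it is active, so any non-NULL pointer counts as active.
*/
static inline int is_tempfile_active(struct tempfile *tempfile)
{
	return !!tempfile;
}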
/*
* Return the path of the temporary file. The return value is a pointer
* to a field within the tempfile object and should not be freed.
*/
const char *get_tempfile_path(struct tempfile *tempfile);
int get_tempfile_fd(struct tempfile *tempfile);
FILE *get_tempfile_fp(struct tempfile *tempfile);
/*
* If the temporary file is still open, close it (and the file pointer
* too, if it has been opened using `fdopen_tempfile()`) without
* deleting the file. Return 0 upon success. On failure to `close(2)`,
* return a negative value. Usually `delete_tempfile()` or `rename_tempfile()`
* should eventually be called regardless of whether `close_tempfile_gently()`
* succeeds.
*/
int close_tempfile_gently(struct tempfile *tempfile);
/*
* Re-open a temporary file that has been closed using
* `close_tempfile_gently()` but not yet deleted or renamed. This can be used
* to implement a sequence of operations like the following:
*
* * Create temporary file.
*
* * Write new contents to file, then `close_tempfile_gently()` to cause the
*   contents to be written to disk.
*
* * Pass the name of the temporary file to another program to allow
*   it (and nobody else) to inspect or even modify the file's
*   contents.
*
* * `reopen_tempfile()` to reopen the temporary file, truncating the existing
*   contents. Write out the new contents.
*
* * `rename_tempfile()` to move the file to its permanent location.
*/
int reopen_tempfile(struct tempfile *tempfile);
/*
* Close the file descriptor and/or file pointer and remove the
* temporary file associated with `tempfile`. It is a NOOP to call
* `delete_tempfile()` for a `tempfile` object that has already been
* deleted or renamed.
*/
void delete_tempfile(struct tempfile **tempfile_p);
/*
* Close the file descriptor and/or file pointer if they are still
* open, and atomically rename the temporary file to `path`. `path`
* must be on the same filesystem as the temporary file. Return 0 on
* success. On failure, delete the temporary file and return -1, with
* `errno` set to the value from the failing call to `close(2)` or
* `rename(2)`. It is a bug to call `rename_tempfile()` for a
* `tempfile` object that is not currently active.
*/
int rename_tempfile(struct tempfile **tempfile_p, const char *path);
#endif /* TEMPFILE_H */
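
The calling sequence documented in this header (create, write, then
atomically rename, deleting the file on any failure) can be illustrated
without git's internals. The sketch below uses plain POSIX calls in
place of the tempfile API; `write_file_atomically()` is a hypothetical
helper, not a function from git, and it treats a short write as an
error to keep the example brief.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Write `data` to a temporary file next to `dest`, then rename it
 * into place. On any failure the temporary file is removed and -1 is
 * returned, mirroring the "delete on error" behavior described for
 * rename_tempfile().
 */
static int write_file_atomically(const char *dest, const char *data)
{
	char tmp[4096];
	size_t len;
	int fd, n;

	assert(dest != NULL && data != NULL);
	len = strlen(data);

	/* Same directory as dest, so rename() stays on one filesystem. */
	n = snprintf(tmp, sizeof(tmp), "%s.tmp_XXXXXX", dest);
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1;

	fd = mkstemp(tmp);		/* roughly: create the tempfile */
	if (fd < 0)
		return -1;

	if (write(fd, data, len) != (ssize_t)len) {
		close(fd);
		unlink(tmp);		/* roughly: delete_tempfile() */
		return -1;
	}
	if (close(fd) < 0) {
		unlink(tmp);
		return -1;
	}
	if (rename(tmp, dest) < 0) {	/* roughly: rename_tempfile() */
		unlink(tmp);
		return -1;
	}
	return 0;
}
```

Unlike the real API, this sketch registers nothing for cleanup at
exit, so a crash between mkstemp() and rename() would leave the
temporary file behind. Covering that case is what the global list and
the atexit/signal handlers in the tempfile API are for.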