* maint-2.17: (32 commits)
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
...
* maint-2.16: (31 commits)
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
path: safeguard `.git` against NTFS Alternate Streams Accesses
...
* maint-2.15: (29 commits)
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
path: safeguard `.git` against NTFS Alternate Streams Accesses
clone --recurse-submodules: prevent name squatting on Windows
is_ntfs_dotgit(): only verify the leading segment
...
* maint-2.14: (28 commits)
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
path: safeguard `.git` against NTFS Alternate Streams Accesses
clone --recurse-submodules: prevent name squatting on Windows
is_ntfs_dotgit(): only verify the leading segment
test-path-utils: offer to run a protectNTFS/protectHFS benchmark
...
As with export-marks in the previous commit, import-marks can access the
filesystem. This is significantly less dangerous than export-marks
because it only involves reading from arbitrary paths, rather than
writing them. However, it could still be surprising and have security
implications (e.g., exfiltrating data from a service that accepts
fast-import streams).
Let's lump it (and its "if-exists" counterpart) in with export-marks,
and enable the in-stream version only if --allow-unsafe-features is set.
Signed-off-by: Jeff King <peff@peff.net>
The fast-import stream command "feature export-marks=<path>" lets the
stream write marks to an arbitrary path. This may be surprising if you
are running fast-import against an untrusted input (which otherwise
cannot do anything except update Git objects and refs).
Let's disallow the use of this feature by default, and provide a
command-line option to re-enable it (you can always just use the
command-line --export-marks as well, but the in-stream version provides
an easy way for exporters to control the process).
This is a backwards-incompatible change, since the default is flipping
to the new, safer behavior. However, since the main users of the
in-stream versions would be import/export-based remote helpers, and
since we trust remote helpers already (which are already running
arbitrary code), we'll pass the new option by default when reading a
remote helper's stream. This should minimize the impact.
Note that the implementation isn't totally simple, as we have to work
around the fact that fast-import doesn't parse its command-line options
until after it has read any "feature" lines from the stream. This is how
it lets command-line options override in-stream. But in our case, it's
important to parse the new --allow-unsafe-features first.
There are three options for resolving this:
1. Do a separate "early" pass over the options. This is easy for us to
do because there are no command-line options that allow the
"unstuck" form (so there's no chance of us mistaking an argument
for an option), though it does introduce a risk of incorrect
parsing later (e.g,. if we convert to parse-options).
2. Move the option parsing phase back to the start of the program, but
teach the stream-reading code never to override an existing value.
This is tricky, because stream "feature" lines override each other
(meaning we'd have to start tracking the source for every option).
3. Accept that we might parse a "feature export-marks" line that is
forbidden, as long we don't _act_ on it until after we've parsed
the command line options.
This would, in fact, work with the current code, but only because
the previous patch fixed the export-marks parser to avoid touching
the filesystem.
So while it works, it does carry risk of somebody getting it wrong
in the future in a rather subtle and unsafe way.
I've gone with option (1) here as simple, safe, and unlikely to cause
regressions.
This fixes CVE-2019-1348.
Signed-off-by: Jeff King <peff@peff.net>
When we parse the --export-marks option, we don't immediately open the
file, but we do create any leading directories. This can be especially
confusing when a command-line option overrides an in-stream one, in
which case we'd create the leading directory for the in-stream file,
even though we never actually write the file.
Let's instead create the directories just before opening the file, which
means we'll create only useful directories. Note that this could change
the handling of relative paths if we chdir() in between, but we don't
actually do so; the only permanent chdir is from setup_git_directory()
which runs before either code path (potentially we should take the
pre-setup dir into account to avoid surprising the user, but that's an
orthogonal change).
The test just adapts the existing "override" test to use paths with
leading directories. This checks both that the correct directory is
created (which worked before but was not tested), and that the
overridden one is not (our new fix here).
While we're here, let's also check the error result of
safe_create_leading_directories(). We'd presumably notice any failure
immediately after when we try to open the file itself, but we can give a
more specific error message in this case.
Signed-off-by: Jeff King <peff@peff.net>
When asked to import marks from "subdir/file.marks", we create the
leading directory "subdir" if it doesn't exist. This makes no sense for
importing marks, where we only ever open the path for reading.
Most of the time this would be a noop, since if the marks file exists,
then the leading directories exist, too. But if it doesn't (e.g.,
because --import-marks-if-exists was used), then we'd create the useless
directory.
This dates back to 580d5f83e7 (fast-import: always create marks_file
directories, 2010-03-29). Even then it was useless, so it seems to have
been added in error alongside the --export-marks case (which _is_
helpful).
Signed-off-by: Jeff King <peff@peff.net>
We parse options like "--max-pack-size=" using skip_prefix(), which
makes sense to get at the bytes after the "=". However, we also parse
"--quiet" and "--stats" with skip_prefix(), which allows things like
"--quiet-nonsense" to behave like "--quiet".
This was a mistaken conversion in 0f6927c229 (fast-import: put option
parsing code in separate functions, 2009-12-04). Let's tighten this to
an exact match, which was the original intent.
Signed-off-by: Jeff King <peff@peff.net>
Code clean-up to adjust to a more recent lockfile API convention that
allows lockfile instances kept on the stack.
* ma/lockfile-cleanup:
lock_file: move static locks into functions
lock_file: make function-local locks non-static
refs.c: do not die if locking fails in `delete_pseudoref()`
refs.c: do not die if locking fails in `write_pseudoref()`
t/helper/test-write-cache: clean up lock-handling
The codepath around object-info API has been taught to take the
repository object (which in turn tells the API which object store
the objects are to be located).
* sb/oid-object-info:
cache.h: allow oid_object_info to handle arbitrary repositories
packfile: add repository argument to cache_or_unpack_entry
packfile: add repository argument to unpack_entry
packfile: add repository argument to read_object
packfile: add repository argument to packed_object_info
packfile: add repository argument to packed_to_object_type
packfile: add repository argument to retry_bad_packed_offset
cache.h: add repository argument to oid_object_info
cache.h: add repository argument to oid_object_info_extended
Placing `struct lock_file`s on the stack used to be a bad idea, because
the temp- and lockfile-machinery would keep a pointer into the struct.
But after 076aa2cbd (tempfile: auto-allocate tempfiles on heap,
2017-09-05), we can safely have lockfiles on the stack. (This applies
even if a user returns early, leaving a locked lock behind.)
These `struct lock_file`s are local to their respective functions and we
can drop their staticness.
For good measure, I have inspected these sites and come to believe that
they always release the lock, with the possible exception of bailing out
using `die()` or `exit()` or by returning from a `cmd_foo()`.
As pointed out by Jeff King, it would be bad if someone held on to a
`struct lock_file *` for some reason. After some grepping, I agree with
his findings: no-one appears to be doing that.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Precompute and store information necessary for ancestry traversal
in a separate file to optimize graph walking.
* ds/commit-graph:
commit-graph: implement "--append" option
commit-graph: build graph from starting commits
commit-graph: read only from specific pack-indexes
commit: integrate commit graph with commit parsing
commit-graph: close under reachability
commit-graph: add core.commitGraph setting
commit-graph: implement git commit-graph read
commit-graph: implement git-commit-graph write
commit-graph: implement write_commit_graph()
commit-graph: create git-commit-graph builtin
graph: add commit graph design document
commit-graph: add format document
csum-file: refactor finalize_hashfile() method
csum-file: rename hashclose() to finalize_hashfile()
Add a repository argument to allow the callers of unpack_entry
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of oid_object_info
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An reusable "memory pool" implementation has been extracted from
fast-import.c, which in turn has become the first user of the
mem-pool API.
* jm/mem-pool:
mem-pool: move reusable parts of memory pool into its own file
fast-import: introduce mem_pool type
fast-import: rename mem_pool type to mp_block
This moves the reusable parts of the memory pool logic used by
fast-import.c into its own file for use by other components.
Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce the mem_pool type which encapsulates all the information necessary to
manage a pool of memory. This change moves the existing variables in
fast-import used to support the global memory pool to use this structure. It
also renames variables that are no longer used by memory pools to reflect their
more scoped usage.
These changes allow for the multiple instances of a memory pool to
exist and be reused outside of fast-import. In a future commit the
mem_pool type will be moved to its own file.
Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is part of a patch series to extract the memory pool logic in
fast-import into a more generalized version. The existing mem_pool type
maps more closely to a "block of memory" (mp_block) in the more
generalized memory pool. This commit renames the mem_pool to mp_block to
reduce churn in future patches.
Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactoring the internal global data structure to make it possible
to open multiple repositories, work with and then close them.
Rerolled by Duy on top of a separate preliminary clean-up topic.
The resulting structure of the topics looked very sensible.
* sb/object-store: (27 commits)
sha1_file: allow sha1_loose_object_info to handle arbitrary repositories
sha1_file: allow map_sha1_file to handle arbitrary repositories
sha1_file: allow map_sha1_file_1 to handle arbitrary repositories
sha1_file: allow open_sha1_file to handle arbitrary repositories
sha1_file: allow stat_sha1_file to handle arbitrary repositories
sha1_file: allow sha1_file_name to handle arbitrary repositories
sha1_file: add repository argument to sha1_loose_object_info
sha1_file: add repository argument to map_sha1_file
sha1_file: add repository argument to map_sha1_file_1
sha1_file: add repository argument to open_sha1_file
sha1_file: add repository argument to stat_sha1_file
sha1_file: add repository argument to sha1_file_name
sha1_file: allow prepare_alt_odb to handle arbitrary repositories
sha1_file: allow link_alt_odb_entries to handle arbitrary repositories
sha1_file: add repository argument to prepare_alt_odb
sha1_file: add repository argument to link_alt_odb_entries
sha1_file: add repository argument to read_info_alternates
sha1_file: add repository argument to link_alt_odb_entry
sha1_file: add raw_object_store argument to alt_odb_usable
pack: move approximate object count to object store
...
Conversion from uchar[20] to struct object_id continues.
* bc/object-id: (36 commits)
convert: convert to struct object_id
sha1_file: introduce a constant for max header length
Convert lookup_replace_object to struct object_id
sha1_file: convert read_sha1_file to struct object_id
sha1_file: convert read_object_with_reference to object_id
tree-walk: convert tree entry functions to object_id
streaming: convert istream internals to struct object_id
tree-walk: convert get_tree_entry_follow_symlinks internals to object_id
builtin/notes: convert static functions to object_id
builtin/fmt-merge-msg: convert remaining code to object_id
sha1_file: convert sha1_object_info* to object_id
Convert remaining callers of sha1_object_info_extended to object_id
packfile: convert unpack_entry to struct object_id
sha1_file: convert retry_bad_packed_offset to struct object_id
sha1_file: convert assert_sha1_type to object_id
builtin/mktree: convert to struct object_id
streaming: convert open_istream to use struct object_id
sha1_file: convert check_sha1_signature to struct object_id
sha1_file: convert read_loose_object to use struct object_id
builtin/index-pack: convert struct ref_delta_entry to object_id
...
The hashclose() method behaves very differently depending on the flags
parameter. In particular, the file descriptor is not always closed.
Perform a simple rename of "hashclose()" to "finalize_hashfile()" in
preparation for functional changes.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reason callers have to call this is to make sure either packed_git
or packed_git_mru pointers are initialized since we don't do that by
default. Sometimes it's hard to see this connection between where the
function is called and where packed_git pointer is used (sometimes in
separate functions).
Keep this dependency internal because now all access to packed_git and
packed_git_mru must go through get_xxx() wrappers.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow prepare_packed_git callers to be
more specific about which repository to handle. See commit "sha1_file:
add repository argument to link_alt_odb_entry" for an explanation of
the #define trick.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This conversion was done without the #define trick used in the earlier
series refactoring to have better repository access, because this function
is easy to review, as it only has one caller and all lines but the first
two are converted.
We must not convert 'pack_open_fds' to be a repository specific variable,
as it is used to monitor resource usage of the machine that Git executes
on.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a process with multiple repositories open, packfile accessors
should be associated to a single repository and not shared globally.
Move packed_git and packed_git_mru into the_repository and adjust
callers to reflect this.
[nd: while at there, wrap access to these two fields in get_packed_git()
and get_packed_git_mru(). This allows us to lazily initialize these
fields without caller doing that explicitly]
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'self-initialised' variables construct (ie <type> var = var;) has
been used to silence gcc '-W[maybe-]uninitialized' warnings. This has,
unfortunately, caused MSVC to issue 'uninitialized variable' warnings.
Also, using clang static analysis causes complaints about an 'Assigned
value is garbage or undefined'.
There are six such constructs in the current codebase. Only one of the
six causes gcc to issue a '-Wmaybe-uninitialized' warning (which will
be addressed elsewhere). The remaining five 'init-self' gcc workarounds
are noted below, along with the commit which introduced them:
1. builtin/rev-list.c: 'reaches' and 'all', see commit 457f08a030
("git-rev-list: add --bisect-vars option.", 2007-03-21).
2. merge-recursive.c:2064 'mrtree', see commit f120ae2a8e ("merge-
recursive.c: mrtree in merge() is not used before set", 2007-10-29).
3. fast-import.c:3023 'oe', see commit 85c62395b1 ("fast-import: let
importers retrieve blobs", 2010-11-28).
4. fast-import.c:3006 'oe', see commit 28c7b1f7b7 ("fast-import: add a
get-mark command", 2015-07-01).
Remove the 'self-initialised' variable constructs noted above.
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert read_sha1_file to take a pointer to struct object_id and rename
it read_object_file. Do the same for read_sha1_file_extended.
Convert one use in grep.c to use the new function without any other code
change, since the pointer being passed is a void pointer that is already
initialized with a pointer to struct object_id. Update the declaration
and definitions of the modified functions, and apply the following
semantic patch to convert the remaining callers:
@@
expression E1, E2, E3;
@@
- read_sha1_file(E1.hash, E2, E3)
+ read_object_file(&E1, E2, E3)
@@
expression E1, E2, E3;
@@
- read_sha1_file(E1->hash, E2, E3)
+ read_object_file(E1, E2, E3)
@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1.hash, E2, E3, E4)
+ read_object_file_extended(&E1, E2, E3, E4)
@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1->hash, E2, E3, E4)
+ read_object_file_extended(E1, E2, E3, E4)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert read_object_with_reference to take pointers to struct object_id.
Update the internals of the function accordingly.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert sha1_object_info and sha1_object_info_extended to take pointers
to struct object_id and rename them to use "oid" instead of "sha1" in
their names. Update the declaration and definition and apply the
following semantic patch, plus the standard object_id transforms:
@@
expression E1, E2;
@@
- sha1_object_info(E1.hash, E2)
+ oid_object_info(&E1, E2)
@@
expression E1, E2;
@@
- sha1_object_info(E1->hash, E2)
+ oid_object_info(E1, E2)
@@
expression E1, E2, E3;
@@
- sha1_object_info_extended(E1.hash, E2, E3)
+ oid_object_info_extended(&E1, E2, E3)
@@
expression E1, E2, E3;
@@
- sha1_object_info_extended(E1->hash, E2, E3)
+ oid_object_info_extended(E1, E2, E3)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename C++ keyword in order to bring the codebase closer to being able
to be compiled with a C++ compiler.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename struct sha1file to struct hashfile, along with all of its related
functions.
The transformation in this commit was made by global search-and-replace.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Switch various uses of explicit calls to SHA-1 to use the_hash_algo.
Convert various uses of 20 and the GIT_SHA1 constants as well.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many codepaths did not diagnose write failures correctly when disks
go full, due to their misuse of write_in_full() helper function,
which have been corrected.
* jk/write-in-full-fix:
read_pack_header: handle signed/unsigned comparison in read result
config: flip return value of store_write_*()
notes-merge: use ssize_t for write_in_full() return value
pkt-line: check write_in_full() errors against "< 0"
convert less-trivial versions of "write_in_full() != len"
avoid "write_in_full(fd, buf, len) != len" pattern
get-tar-commit-id: check write_in_full() return against 0
config: avoid "write_in_full(fd, buf, len) < len" pattern
All but two of the call sites already have parameters using the hash
parameter of struct object_id, so convert them to take a pointer to the
struct directly. Also convert refs_read_refs_full, the underlying
implementation.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the ref transaction code to use struct object_id. Remove one
NULL pointer check which was previously inserted around a dereference;
since we now pass a pointer to struct object_id directly through, the
code we're calling handles this for us.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The checkpoint command "git fast-import" did not flush updates to
refs and marks unless at least one object was created since the
last checkpoint, which has been corrected, as these things can
happen without any new object getting created.
* er/fast-import-dump-refs-on-checkpoint:
fast-import: checkpoint: dump branches/tags/marks even if object_count==0
The checkpoint command cycles packfiles if object_count != 0, a sensible
test or there would be no pack files to write. Since 820b931012, the
command also dumps branches, tags and marks, but still conditionally.
However, it is possible for a command stream to modify refs or create
marks without creating any new objects.
For example, reset a branch (and keep fast-import running):
$ git fast-import
reset refs/heads/master
from refs/heads/master^
checkpoint
but refs/heads/master remains unchanged.
Other example: a commit command that re-creates an object that already
exists in the object database.
The man page also states that checkpoint "updates the refs" and that
"placing a progress command immediately after a checkpoint will inform
the reader when the checkpoint has been completed and it can safely
access the refs that fast-import updated". This wasn't always true
without this patch.
This fix unconditionally calls dump_{branches,tags,marks}() for all
checkpoint commands. dump_branches() and dump_tags() are cheap to call
in the case of a no-op.
Add tests to t9300 that observe the (non-packfiles) effects of
checkpoint.
Signed-off-by: Eric Rannaud <e@nanocritical.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many codepaths did not diagnose write failures correctly when disks
go full, due to their misuse of write_in_full() helper function,
which have been corrected.
* jk/write-in-full-fix:
read_pack_header: handle signed/unsigned comparison in read result
config: flip return value of store_write_*()
notes-merge: use ssize_t for write_in_full() return value
pkt-line: check write_in_full() errors against "< 0"
convert less-trivial versions of "write_in_full() != len"
avoid "write_in_full(fd, buf, len) != len" pattern
get-tar-commit-id: check write_in_full() return against 0
config: avoid "write_in_full(fd, buf, len) < len" pattern
The return value of write_in_full() is either "-1", or the
requested number of bytes[1]. If we make a partial write
before seeing an error, we still return -1, not a partial
value. This goes back to f6aa66cb95 (write_in_full: really
write in full or return error on disk full., 2007-01-11).
So checking anything except "was the return value negative"
is pointless. And there are a couple of reasons not to do
so:
1. It can do a funny signed/unsigned comparison. If your
"len" is signed (e.g., a size_t) then the compiler will
promote the "-1" to its unsigned variant.
This works out for "!= len" (unless you really were
trying to write the maximum size_t bytes), but is a
bug if you check "< len" (an example of which was fixed
recently in config.c).
We should avoid promoting the mental model that you
need to check the length at all, so that new sites are
not tempted to copy us.
2. Checking for a negative value is shorter to type,
especially when the length is an expression.
3. Linus says so. In d34cf19b89 (Clean up write_in_full()
users, 2007-01-11), right after the write_in_full()
semantics were changed, he wrote:
I really wish every "write_in_full()" user would just
check against "<0" now, but this fixes the nasty and
stupid ones.
Appeals to authority aside, this makes it clear that
writing it this way does not have an intentional
benefit. It's a historical curiosity that we never
bothered to clean up (and which was undoubtedly
cargo-culted into new sites).
So let's convert these obviously-correct cases (this
includes write_str_in_full(), which is just a wrapper for
write_in_full()).
[1] A careful reader may notice there is one way that
write_in_full() can return a different value. If we ask
write() to write N bytes and get a return value that is
_larger_ than N, we could return a larger total. But
besides the fact that this would imply a totally broken
version of write(), it would already invoke undefined
behavior. Our internal remaining counter is an unsigned
size_t, which means that subtracting too many byte will
wrap it around to a very large number. So we'll instantly
begin reading off the end of the buffer, trying to write
gigabytes (or petabytes) of data.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, sha1_file.c and cache.h contain many functions, both related
to and unrelated to packfiles. This makes both files very large and
causes an unclear separation of concerns.
Create a new file, packfile.c, to hold all packfile-related functions
currently in sha1_file.c. It has a corresponding header packfile.h.
In this commit, the pack name-related functions are moved. Subsequent
commits will move the other functions.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A common pattern to free a piece of memory and assign NULL to the
pointer that used to point at it has been replaced with a new
FREE_AND_NULL() macro.
* ab/free-and-null:
*.[ch] refactoring: make use of the FREE_AND_NULL() macro
coccinelle: make use of the "expression" FREE_AND_NULL() rule
coccinelle: add a rule to make "expression" code use FREE_AND_NULL()
coccinelle: make use of the "type" FREE_AND_NULL() rule
coccinelle: add a rule to make "type" code use FREE_AND_NULL()
git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.
* bw/config-h:
config: don't implicitly use gitdir or commondir
config: respect commondir
setup: teach discover_git_directory to respect the commondir
config: don't include config.h by default
config: remove git_config_iter
config: create config.h
"fast-import" uses a default pack chain depth that is consistent
with other parts of the system.
* mh/fast-import-raise-default-depth:
fast-import: increase the default pack depth to 50
Apply the result of the just-added coccinelle rule. This manually
excludes a few occurrences, mostly things that resulted in many
FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent
change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop including config.h by default in cache.h. Instead only include
config.h in those files which require use of the config system.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We often try to open a file for reading whose existence is
optional, and silently ignore errors from open/fopen; report such
errors if they are not due to missing files.
* nd/fopen-errors:
mingw_fopen: report ENOENT for invalid file names
mingw: verify that paths are not mistaken for remote nicknames
log: fix memory leak in open_next_file()
rerere.c: move error_errno() closer to the source system call
print errno when reporting a system call error
wrapper.c: make warn_on_inaccessible() static
wrapper.c: add and use fopen_or_warn()
wrapper.c: add and use warn_on_fopen_errors()
config.mak.uname: set FREAD_READS_DIRECTORIES for Darwin, too
config.mak.uname: set FREAD_READS_DIRECTORIES for Linux and FreeBSD
clone: use xfopen() instead of fopen()
use xfopen() in more places
git_fopen: fix a sparse 'not declared' warning