Commit Graph

49067 Commits

Author SHA1 Message Date
Junio C Hamano
9ddaf86b06 The eighth batch for 2.15
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-19 10:55:19 +09:00
Junio C Hamano
4d46bce6b0 Merge branch 'rk/commit-tree-make-F-verbatim'
Unlike "git commit-tree < file", "git commit-tree -F file" did not
pass the contents of the file verbatim and instead completed an
incomplete line at the end, if exists.  The latter has been updated
to match the behaviour of the former.

* rk/commit-tree-make-F-verbatim:
  commit-tree: do not complete line in -F input
2017-09-19 10:47:57 +09:00
Junio C Hamano
d811ba1897 Merge branch 'rs/strbuf-leakfix'
Many leaks of strbuf have been fixed.

* rs/strbuf-leakfix: (34 commits)
  wt-status: release strbuf after use in wt_longstatus_print_tracking()
  wt-status: release strbuf after use in read_rebase_todolist()
  vcs-svn: release strbuf after use in end_revision()
  utf8: release strbuf on error return in strbuf_utf8_replace()
  userdiff: release strbuf after use in userdiff_get_textconv()
  transport-helper: release strbuf after use in process_connect_service()
  sequencer: release strbuf after use in save_head()
  shortlog: release strbuf after use in insert_one_record()
  sha1_file: release strbuf on error return in index_path()
  send-pack: release strbuf on error return in send_pack()
  remote: release strbuf after use in set_url()
  remote: release strbuf after use in migrate_file()
  remote: release strbuf after use in read_remote_branches()
  refs: release strbuf on error return in write_pseudoref()
  notes: release strbuf after use in notes_copy_from_stdin()
  merge: release strbuf after use in write_merge_heads()
  merge: release strbuf after use in save_state()
  mailinfo: release strbuf on error return in handle_boundary()
  mailinfo: release strbuf after use in handle_from()
  help: release strbuf on error return in exec_woman_emacs()
  ...
2017-09-19 10:47:57 +09:00
Junio C Hamano
17cb5f85d0 Merge branch 'jk/shortlog-ident-cleanup'
Code clean-up.

* jk/shortlog-ident-cleanup:
  shortlog: skip format/parse roundtrip for internal traversal
2017-09-19 10:47:56 +09:00
Junio C Hamano
07f0542da3 Merge branch 'mh/packed-ref-transactions'
Implement transactional update to the packed-ref representation of
references.

* mh/packed-ref-transactions:
  files_transaction_finish(): delete reflogs before references
  packed-backend: rip out some now-unused code
  files_ref_store: use a transaction to update packed refs
  t1404: demonstrate two problems with reference transactions
  files_initial_transaction_commit(): use a transaction for packed refs
  prune_refs(): also free the linked list
  files_pack_refs(): use a reference transaction to write packed refs
  packed_delete_refs(): implement method
  packed_ref_store: implement reference transactions
  struct ref_transaction: add a place for backends to store data
  packed-backend: don't adjust the reference count on lock/unlock
2017-09-19 10:47:56 +09:00
Junio C Hamano
6701263956 Merge branch 'kw/merge-recursive-cleanup'
A leakfix and code clean-up.

* kw/merge-recursive-cleanup:
  merge-recursive: change current file dir string_lists to hashmap
  merge-recursive: remove return value from get_files_dirs
  merge-recursive: fix memory leak
2017-09-19 10:47:56 +09:00
Junio C Hamano
0543de438f Merge branch 'sb/merge-commit-msg-hook'
As "git commit" to conclude a conflicted "git merge" honors the
commit-msg hook, "git merge" that recoreds a merge commit that
cleanly auto-merges should, but it didn't.

* sb/merge-commit-msg-hook:
  builtin/merge: honor commit-msg hook for merges
2017-09-19 10:47:56 +09:00
Junio C Hamano
09595ab381 Merge branch 'jk/leak-checkers'
Many of our programs consider that it is OK to release dynamic
storage that is used throughout the life of the program by simply
exiting, but this makes it harder to leak detection tools to avoid
reporting false positives.  Plug many existing leaks and introduce
a mechanism for developers to mark that the region of memory
pointed by a pointer is not lost/leaking to help these tools.

* jk/leak-checkers:
  add UNLEAK annotation for reducing leak false positives
  set_git_dir: handle feeding gitdir to itself
  repository: free fields before overwriting them
  reset: free allocated tree buffers
  reset: make tree counting less confusing
  config: plug user_config leak
  update-index: fix cache entry leak in add_one_file()
  add: free leaked pathspec after add_files_to_cache()
  test-lib: set LSAN_OPTIONS to abort by default
  test-lib: --valgrind should not override --verbose-log
2017-09-19 10:47:55 +09:00
Junio C Hamano
df80c5760c Merge branch 'nm/pull-submodule-recurse-config'
"git -c submodule.recurse=yes pull" did not work as if the
"--recurse-submodules" option was given from the command line.
This has been corrected.

* nm/pull-submodule-recurse-config:
  pull: honor submodule.recurse config option
  pull: fix cli and config option parsing order
2017-09-19 10:47:55 +09:00
Junio C Hamano
daafb5062c Merge branch 'mh/packed-ref-store-prep'
Fix regression to "gitk --bisect" by a recent update.

* mh/packed-ref-store-prep:
  rev-parse: don't trim bisect refnames
2017-09-19 10:47:55 +09:00
Junio C Hamano
c39da2c08e Merge branch 'ma/remove-config-maybe-bool'
Finishing touches to a recent topic.

* ma/remove-config-maybe-bool:
  config: remove git_config_maybe_bool
2017-09-19 10:47:55 +09:00
Junio C Hamano
f2ab3a10b5 Merge branch 'jk/system-path-cleanup'
Code clean-up.

* jk/system-path-cleanup:
  git_extract_argv0_path: do nothing without RUNTIME_PREFIX
  system_path: move RUNTIME_PREFIX to a sub-function
2017-09-19 10:47:55 +09:00
Junio C Hamano
b86e112056 Merge branch 'jh/hashmap-disable-counting'
Our hashmap implementation in hashmap.[ch] is not thread-safe when
adding a new item needs to expand the hashtable by rehashing; add
an API to disable the automatic rehashing to work it around.

* jh/hashmap-disable-counting:
  hashmap: add API to disable item counting when threaded
2017-09-19 10:47:54 +09:00
Junio C Hamano
0517ae0ba6 Merge branch 'bb/doc-eol-dirty'
Doc update.

* bb/doc-eol-dirty:
  Documentation: mention that `eol` can change the dirty status of paths
2017-09-19 10:47:54 +09:00
Junio C Hamano
281b1cf856 Merge branch 'jt/packmigrate'
Remove unneeded file added by a topic already in 'master'.

* jt/packmigrate:
  Remove inadvertently added outgoing/packfile.h
2017-09-19 10:47:53 +09:00
Junio C Hamano
89563ec379 Merge branch 'jk/incore-lockfile-removal'
The long-standing rule that an in-core lockfile instance, once it
is used, must not be freed, has been lifted and the lockfile and
tempfile APIs have been updated to reduce the chance of programming
errors.

* jk/incore-lockfile-removal:
  stop leaking lock structs in some simple cases
  ref_lock: stop leaking lock_files
  lockfile: update lifetime requirements in documentation
  tempfile: auto-allocate tempfiles on heap
  tempfile: remove deactivated list entries
  tempfile: use list.h for linked list
  tempfile: release deactivated strbufs instead of resetting
  tempfile: robustify cleanup handler
  tempfile: factor out deactivation
  tempfile: factor out activation
  tempfile: replace die("BUG") with BUG()
  tempfile: handle NULL tempfile pointers gracefully
  tempfile: prefer is_tempfile_active to bare access
  lockfile: do not rollback lock on failed close
  tempfile: do not delete tempfile on failed close
  always check return value of close_tempfile
  verify_signed_buffer: prefer close_tempfile() to close()
  setup_temporary_shallow: move tempfile struct into function
  setup_temporary_shallow: avoid using inactive tempfile
  write_index_as_tree: cleanup tempfile on error
2017-09-19 10:47:53 +09:00
Junio C Hamano
8a044c7f1d Merge branch 'nd/prune-in-worktree'
"git gc" and friends when multiple worktrees are used off of a
single repository did not consider the index and per-worktree refs
of other worktrees as the root for reachability traversal, making
objects that are in use only in other worktrees to be subject to
garbage collection.

* nd/prune-in-worktree:
  refs.c: reindent get_submodule_ref_store()
  refs.c: remove fallback-to-main-store code get_submodule_ref_store()
  rev-list: expose and document --single-worktree
  revision.c: --reflog add HEAD reflog from all worktrees
  files-backend: make reflog iterator go through per-worktree reflog
  revision.c: --all adds HEAD from all worktrees
  refs: remove dead for_each_*_submodule()
  refs.c: move for_each_remote_ref_submodule() to submodule.c
  revision.c: use refs_for_each*() instead of for_each_*_submodule()
  refs: add refs_head_ref()
  refs: move submodule slash stripping code to get_submodule_ref_store
  refs.c: refactor get_submodule_ref_store(), share common free block
  revision.c: --indexed-objects add objects from all worktrees
  revision.c: refactor add_index_objects_to_pending()
  refs.c: use is_dir_sep() in resolve_gitlink_ref()
  revision.h: new flag in struct rev_info wrt. worktree-related refs
2017-09-19 10:47:53 +09:00
Junio C Hamano
dafbe1993e Merge branch 'ma/split-symref-update-fix'
A leakfix.

* ma/split-symref-update-fix:
  refs/files-backend: add `refname`, not "HEAD", to list
  refs/files-backend: correct return value in lock_ref_for_update
  refs/files-backend: fix memory leak in lock_ref_for_update
  refs/files-backend: add longer-scoped copy of string to list
2017-09-19 10:47:53 +09:00
Junio C Hamano
30675f7021 Merge branch 'mh/notes-cleanup'
Code clean-up.

* mh/notes-cleanup:
  load_subtree(): check that `prefix_len` is in the expected range
  load_subtree(): declare some variables to be `size_t`
  hex_to_bytes(): simpler replacement for `get_oid_hex_segment()`
  get_oid_hex_segment(): don't pad the rest of `oid`
  load_subtree(): combine some common code
  get_oid_hex_segment(): return 0 on success
  load_subtree(): only consider blobs to be potential notes
  load_subtree(): check earlier whether an internal node is a tree entry
  load_subtree(): separate logic for internal vs. terminal entries
  load_subtree(): fix incorrect comment
  load_subtree(): reduce the scope of some local variables
  load_subtree(): remove unnecessary conditional
  notes: make GET_NIBBLE macro more robust
2017-09-19 10:47:52 +09:00
Junio C Hamano
eb066429e7 Merge branch 'mg/timestamp-t-fix'
A mismerge fix.

* mg/timestamp-t-fix:
  name-rev: change ULONG_MAX to TIME_MAX
2017-09-19 10:47:52 +09:00
Junio C Hamano
c78e182d55 Merge branch 'ma/pkt-line-leakfix'
A leakfix.

* ma/pkt-line-leakfix:
  pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
2017-09-19 10:47:52 +09:00
Junio C Hamano
b0727e2439 Merge branch 'jk/config-lockfile-leak-fix'
A leakfix.

* jk/config-lockfile-leak-fix:
  config: use a static lock_file struct
2017-09-19 10:47:51 +09:00
Junio C Hamano
1f1ea329b9 Merge branch 'dw/diff-highlight-makefile-fix'
Build clean-up.

* dw/diff-highlight-makefile-fix:
  diff-highlight: add clean target to Makefile
2017-09-19 10:47:50 +09:00
Junio C Hamano
cb6ec86d29 Merge branch 'ti/external-sha1dc'
Platforms that ship with a separate sha1 with collision detection
library can link to it instead of using the copy we ship as part of
our source tree.

* ti/external-sha1dc:
  sha1dc: allow building with the external sha1dc library
  sha1dc: build git plumbing code more explicitly
2017-09-19 10:47:50 +09:00
Junio C Hamano
afe2fab72c gc: call fscanf() with %<len>s, not %<len>c, when reading hostname
Earlier in this codepath, we (ab)used "%<len>c" to read the hostname
recorded in the lockfile into locking_host[HOST_NAME_MAX + 1] while
substituting <len> with the actual value of HOST_NAME_MAX.

This turns out to be incorrect, as it is an instruction to read
exactly the specified number of bytes.  Because we are trying to
read at most that many bytes, we should be using "%<len>s" instead.

Helped-by: A. Wilcox <awilfox@adelielinux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-17 13:21:44 +09:00
Max Kirillov
da769d2986 describe: fix matching to actually match all patterns
`git describe --match` with multiple patterns matches only first pattern.
If it fails, next patterns are not tried.

Fix it, add test cases and update existing test which has wrong
expectation.

Signed-off-by: Max Kirillov <max@max630.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-17 10:21:12 +09:00
Heiko Voigt
c514167df2 add test for bug in git-mv for recursive submodules
When using git-mv with a submodule it will detect that and update the
paths for its configurations (.gitmodules, worktree and gitfile). This
does not work for recursive submodules where a user renames the root
submodule.

We discovered this fact when working on on-demand fetch for renamed
submodules. Lets add a test to document.

Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-17 09:37:34 +09:00
Nicolas Morey-Chaisemartin
dbba42bb32 imap-send: use curl by default when possible
Set curl as the runtime default when it is available.
When linked against older curl versions (< 7_34_0) or without curl,
use the legacy imap implementation.

The goal is to validate feature parity between the legacy and
the curl implementation, deprecate the legacy implementation
later on and in the long term, hopefully drop it altogether.

Signed-off-by: Nicolas Morey-Chaisemartin <nicolas@morey-chaisemartin.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-15 13:46:19 +09:00
Nicolas Morey-Chaisemartin
19079b3e7c imap_send: setup_curl: retreive credentials if not set in config file
Up to this point, the curl mode only supported getting the username
and password from the gitconfig file while the legacy mode could also
fetch them using the credential API.

Signed-off-by: Nicolas Morey-Chaisemartin <nicolas@morey-chaisemartin.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-15 13:45:37 +09:00
Nicolas Morey-Chaisemartin
690307f3d1 imap-send: add wrapper to get server credentials if needed
Signed-off-by: Nicolas Morey-Chaisemartin <nicolas@morey-chaisemartin.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-15 13:32:02 +09:00
Nicolas Morey-Chaisemartin
200bc38bf5 imap-send: return with error if curl failed
curl_append_msgs_to_imap always returned 0, whether curl failed or not.
Return a proper status so git imap-send will exit with an error code
if something wrong happened.

Signed-off-by: Nicolas Morey-Chaisemartin <nicolas@morey-chaisemartin.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-15 13:31:45 +09:00
Kaartic Sivaraam
8c4b1a3593 commit-template: change a message to be more intuitive
It's not good to use the phrase 'do not touch' to convey the
information that the cut-line should not be modified or removed as
it could possibly be mis-interpreted by a person who doesn't know
that the word 'touch' has the meaning of 'tamper with'. Further, it
could make translations a little difficult as it might not have the
intended meaning in a few languages when translated as such.

So, use more intuitive terms in the sentence.

Signed-off-by: Kaartic Sivaraam <kaarticsivaraam91196@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-15 12:34:33 +09:00
Ramsay Jones
21dac1deee test-lib: don't use ulimit in test prerequisites on cygwin
On cygwin (and MinGW), the 'ulimit' built-in bash command does not have
the desired effect of limiting the resources of new processes, at least
for the stack and file descriptors. However, it always returns success
and leads to several test prerequisites being erroneously set to true.

Add a check for cygwin and MinGW to the prerequisite expressions, using
a 'test_have_prereq !MINGW,!CYGWIN' clause, to guard against using ulimit.
This affects the prerequisite expressions for the ULIMIT_STACK_SIZE,
CMDLINE_LIMIT and ULIMIT_FILE_DESCRIPTORS prerequisites.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-15 11:52:00 +09:00
Michael Haggerty
a8811695e3 read_packed_refs(): make parsing of the header line more robust
The old code parsed the traits in the `packed-refs` header by looking
for the string " trait " (i.e., the name of the trait with a space on
either side) in the header line. This is fragile, because if any other
implementation of Git forgets to write the trailing space, the last
trait would silently be ignored (and the error might never be
noticed).

So instead, use `string_list_split_in_place()` to split the traits
into tokens then use `unsorted_string_list_has_string()` to look for
the tokens we are interested in. This means that we can read the
traits correctly even if the header line is missing a trailing
space (or indeed, if it is missing the space after the colon, or if it
has multiple spaces somewhere).

However, older Git clients (and perhaps other Git implementations)
still require the surrounding spaces, so we still have to output the
header with a trailing space.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Michael Haggerty
36f23534ae read_packed_refs(): only check for a header at the top of the file
This tightens up the parsing a bit; previously, stray header-looking
lines would have been processed.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Michael Haggerty
49a03ef466 read_packed_refs(): use mmap to read the packed-refs file
It's still done in a pretty stupid way, involving more data copying
than necessary. That will improve in future commits.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Michael Haggerty
735267aa10 die_unterminated_line(), die_invalid_line(): new functions
Extract some helper functions for reporting errors. While we're at it,
prevent them from spewing unlimited output to the terminal. These
functions will soon have more callers.

These functions accept the problematic line as a `(ptr, len)` pair
rather than a NUL-terminated string, and `die_invalid_line()` checks
for an EOL itself, because these calling conventions will be
convenient for future callers. (Efficiency is not a concern here
because these functions are only ever called if the `packed-refs` file
is corrupt.)

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Michael Haggerty
f0a7dc86d2 packed_ref_cache: add a backlink to the associated packed_ref_store
It will prove convenient in upcoming patches.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Jeff King
157113c614 prefix_ref_iterator: break when we leave the prefix
If the underlying iterator is ordered, then `prefix_ref_iterator` can
stop as soon as it sees a refname that comes after the prefix. This
will rarely make a big difference now, because `ref_cache_iterator`
only iterates over the directory containing the prefix (and usually
the prefix will span a whole directory anyway). But if *hint, hint* a
future reference backend doesn't itself know where to stop the
iteration, then this optimization will be a big win.

Note that there is no guarantee that the underlying iterator doesn't
include output preceding the prefix, so we have to skip over any
unwanted references before we get to the ones that we want.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Michael Haggerty
8738a8a4df ref_iterator: keep track of whether the iterator output is ordered
References are iterated over in order by refname, but reflogs are not.
Some consumers of reference iteration care about the difference. Teach
each `ref_iterator` to keep track of whether its output is ordered.

`overlay_ref_iterator` is one of the picky consumers. Add a sanity
check in `overlay_ref_iterator_begin()` to verify that its inputs are
ordered.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Jeff King
f48ecd38cb read_pack_header: handle signed/unsigned comparison in read result
The result of read_in_full() may be -1 if we saw an error.
But in comparing it to a sizeof() result, that "-1" will be
promoted to size_t. In fact, the largest possible size_t
which is much bigger than our struct size. This means that
our "< sizeof(header)" error check won't trigger.

In practice, we'd go on to read uninitialized memory and
compare it to the PACK signature, which is likely to fail.
But we shouldn't get there.

We can fix this by making a direct "!=" comparison to the
requested size, rather than "<". This means that errors get
lumped in with short reads, but that's sufficient for our
purposes here. There's no PH_ERROR tp represent our case.
And anyway, this function reads from pipes and network
sockets. A network error may racily appear as EOF to us
anyway if there's data left in the socket buffers.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:18:00 +09:00
Jeff King
d9bd4cbb9c config: flip return value of store_write_*()
The store_write_section() and store_write_pairs() functions
are basically high-level wrappers around write(). But their
return values are flipped from our usual convention, using
"1" for success and "0" for failure.

Let's flip them to follow the usual write() conventions and
update all callers. As these are local to config.c, it's
unlikely that we'd have new callers in any topics in flight
(which would be silently broken by our change). But just to
be on the safe side, let's rename them to just
write_section() and write_pairs().  That also accentuates
their relationship with write().

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:18:00 +09:00
Jeff King
634eb82b1a notes-merge: use ssize_t for write_in_full() return value
We store the return value of write_in_full() in a long,
though the return is actually an ssize_t. This probably
doesn't matter much in practice (since the buffer size is
alredy an unsigned long), but it might if the size if
between what can be represented in "long" and "unsigned
long", and if your size_t is larger than a "long" (as it is
on 64-bit Windows).

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:17:59 +09:00
Jeff King
4c95e3dd28 pkt-line: check write_in_full() errors against "< 0"
As with the previous two commits, we prefer to check
write_in_full()'s return value to see if it is negative,
rather than comparing it to the input length.

These cases actually flip the logic to check for success,
making conversion a little different than in other cases. We
could of course write:

  if (write_in_full(...) >= 0)
          return 0;
  return error(...);

But our usual method of spelling write() error checks is
just "< 0". So let's flip the logic for each of these
conditionals to our usual style.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:17:59 +09:00
Jeff King
564bde9ae6 convert less-trivial versions of "write_in_full() != len"
The prior commit converted many sites to check the return
value of write_in_full() for negativity, rather than a
mismatch with the input length. This patch covers similar
cases, but where the return value is stored in an
intermediate variable. These should get the same treatment,
but they need to be reviewed more carefully since it would
be a bug if the return value is stored in an unsigned type
(which indeed, it is in one of the cases).

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:17:59 +09:00
Jeff King
06f46f237a avoid "write_in_full(fd, buf, len) != len" pattern
The return value of write_in_full() is either "-1", or the
requested number of bytes[1]. If we make a partial write
before seeing an error, we still return -1, not a partial
value. This goes back to f6aa66cb95 (write_in_full: really
write in full or return error on disk full., 2007-01-11).

So checking anything except "was the return value negative"
is pointless. And there are a couple of reasons not to do
so:

  1. It can do a funny signed/unsigned comparison. If your
     "len" is signed (e.g., a size_t) then the compiler will
     promote the "-1" to its unsigned variant.

     This works out for "!= len" (unless you really were
     trying to write the maximum size_t bytes), but is a
     bug if you check "< len" (an example of which was fixed
     recently in config.c).

     We should avoid promoting the mental model that you
     need to check the length at all, so that new sites are
     not tempted to copy us.

  2. Checking for a negative value is shorter to type,
     especially when the length is an expression.

  3. Linus says so. In d34cf19b89 (Clean up write_in_full()
     users, 2007-01-11), right after the write_in_full()
     semantics were changed, he wrote:

       I really wish every "write_in_full()" user would just
       check against "<0" now, but this fixes the nasty and
       stupid ones.

     Appeals to authority aside, this makes it clear that
     writing it this way does not have an intentional
     benefit. It's a historical curiosity that we never
     bothered to clean up (and which was undoubtedly
     cargo-culted into new sites).

So let's convert these obviously-correct cases (this
includes write_str_in_full(), which is just a wrapper for
write_in_full()).

[1] A careful reader may notice there is one way that
    write_in_full() can return a different value. If we ask
    write() to write N bytes and get a return value that is
    _larger_ than N, we could return a larger total. But
    besides the fact that this would imply a totally broken
    version of write(), it would already invoke undefined
    behavior. Our internal remaining counter is an unsigned
    size_t, which means that subtracting too many byte will
    wrap it around to a very large number. So we'll instantly
    begin reading off the end of the buffer, trying to write
    gigabytes (or petabytes) of data.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:17:59 +09:00
Jeff King
68a423ab3e get-tar-commit-id: check write_in_full() return against 0
We ask to write 41 bytes and make sure that the return value
is at least 41. This is the same "dangerous" pattern that
was fixed in the prior commit (wherein a negative return
value is promoted to unsigned), though it is not dangerous
here because our "41" is a constant, not an unsigned
variable.

But we should convert it anyway to avoid modeling a
dangerous construct.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:16:21 +09:00
Jeff King
efacf609c8 config: avoid "write_in_full(fd, buf, len) < len" pattern
The return type of write_in_full() is a signed ssize_t,
because we may return "-1" on failure (even if we succeeded
in writing some bytes). But "len" itself is may be an
unsigned type (the function takes a size_t, but of course we
may have something else in the calling function). So while
it seems like:

  if (write_in_full(fd, buf, len) < len)
	die_errno("write error");

would trigger on error, it won't if "len" is unsigned.  The
compiler sees a signed/unsigned comparison and promotes the
signed value, resulting in (size_t)-1, the highest possible
size_t (or again, whatever type the caller has). This cannot
possibly be smaller than "len", and so the conditional can
never trigger.

I scoured the code base for cases of this, but it turns out
that these two in git_config_set_multivar_in_file_gently()
are the only ones. Here our "len" is the difference between
two size_t variables, making the result an unsigned size_t.
We can fix this by just checking for a negative return value
directly, as write_in_full() will never return any value
except -1 or the full count.

There's no addition to the test suite here, since you need
to convince write() to fail in order to see the problem. The
simplest reproduction recipe I came up with is to trigger
ENOSPC:

  # make a limited-size filesystem
  dd if=/dev/zero of=small.disk bs=1M count=1
  mke2fs small.disk
  mkdir mnt
  sudo mount -o loop small.disk mnt
  cd mnt
  sudo chown $USER:$USER .

  # make a config file with some content
  git config --file=config one.key value
  git config --file=config two.key value

  # now fill up the disk
  dd if=/dev/zero of=fill

  # and try to delete a key, which requires copying the rest
  # of the file to config.lock, and will fail on write()
  git config --file=config --unset two.key

That final command should (and does after this patch)
produce an error message due to the failed write, and leave
the file intact. Instead, it silently ignores the failure
and renames config.lock into place, leaving you with a
totally empty config file!

Reported-by: demerphq <demerphq@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:10:45 +09:00
Evan Zacks
be94568bc7 doc: fix minor typos (extra/duplicated words)
Following are several fixes for duplicated words ("of of") and one
case where an extra article ("a") slipped in.

Signed-off-by: Evan Zacks <zackse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:09:02 +09:00
René Scharfe
4318094047 archive: don't add empty directories to archives
While git doesn't track empty directories, git archive can be tricked
into putting some into archives.  One way is to construct an empty tree
object, as t5004 does.  While that is supported by the object database,
it can't be represented in the index and thus it's unlikely to occur in
the wild.

Another way is using the literal name of a directory in an exclude
pathspec -- its contents are are excluded, but the directory stub is
included.  That's inconsistent: exclude pathspecs containing wildcards
don't leave empty directories in the archive.

Yet another way is have a few levels of nested subdirectories (e.g.
d1/d2/d3/file1) and ignoring the entries at the leaves (e.g. file1).
The directories with the ignored content are ignored as well (e.g. d3),
but their empty parents are included (e.g. d2).

As empty directories are not supported by git, they should also not be
written into archives.  If an empty directory is really needed then it
can be tracked and archived by placing an empty .gitignore file in it.

There already is a mechanism in place for suppressing empty directories.
When read_tree_recursive() encounters a directory excluded by a pathspec
then it enters it anyway because it might contain included entries.  It
calls the callback function before it is able to decide if the directory
is actually needed.  For that reason git archive adds directories to a
queue and writes entries for them only when it encounters the first
child item -- but currently only if pathspecs with wildcards are used.

Queue *all* directories, no matter if there even are pathspecs present.
This prevents git archive from writing entries for empty directories in
all cases.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:08:22 +09:00