Commit Graph

23 Commits

Author SHA1 Message Date
Jeff King
e4da43b1f0 prefix_filename: return newly allocated string
The prefix_filename() function returns a pointer to static
storage, which makes it easy to use dangerously. We already
fixed one buggy caller in hash-object recently, and the
calls in apply.c are suspicious (I didn't dig in enough to
confirm that there is a bug, but we call the function once
in apply_all_patches() and then again indirectly from
parse_chunk()).

Let's make it harder to get wrong by allocating the return
value. For simplicity, we'll do this even when the prefix is
empty (and we could just return the original file pointer).
That will cause us to allocate sometimes when we wouldn't
otherwise need to, but this function isn't called in
performance critical code-paths (and it already _might_
allocate on any given call, so a caller that cares about
performance is questionable anyway).

The downside is that the callers need to remember to free()
the result to avoid leaking. Most of them already used
xstrdup() on the result, so we know they are OK. The
remainder have been converted to use free() as appropriate.

I considered retaining a prefix_filename_unsafe() for cases
where we know the static lifetime is OK (and handling the
cleanup is awkward). This is only a handful of cases,
though, and it's not worth the mental energy in worrying
about whether the "unsafe" variant is OK to use in any
situation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-21 11:18:41 -07:00
Jeff King
116fb64e43 prefix_filename: drop length parameter
This function takes the prefix as a ptr/len pair, but in
every caller the length is exactly strlen(ptr). Let's
simplify the interface and just take the string. This saves
callers specifying it (and in some cases handling a NULL
prefix).

In a handful of cases we had the length already without
calling strlen, so this is technically slower. But it's not
likely to matter (after all, if the prefix is non-empty
we'll allocate and copy it into a buffer anyway).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-21 11:12:53 -07:00
Jeff King
a1be47e4ca hash-object: fix buffer reuse with --path in a subdirectory
The hash-object command uses prefix_filename() without
duplicating its return value. Since that function returns a
static buffer, the value is overwritten by subsequent calls.

This can cause incorrect results when we use --path along
with hashing a file by its relative path, both of which need
to call prefix_filename(). We overwrite the filename
computed for --path, effectively ignoring it.

We can fix this by calling xstrdup on the return value. Note
that we don't bother freeing the "vpath" instance, as it
remains valid until the program exit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-21 11:12:52 -07:00
Jeff King
0e94ee9415 hash-object: always try to set up the git repository
When "hash-object" is run without "-w", we don't need to be
in a git repository at all; we can just hash the object and
write its sha1 to stdout. However, if we _are_ in a git
repository, we would want to know that so we can follow the
normal rules for respecting config, .gitattributes, etc.

This happens to work at the top-level of a git repository
because we blindly read ".git/config", but as the included
test shows, it does not work when you are in a subdirectory.

The solution is to just do a "gentle" setup in this case. We
already take care to use prefix_filename() on any filename
arguments we get (to handle the "-w" case), so we don't need
to do anything extra to handle the side effects of repo
setup.

An alternative would be to specify RUN_SETUP_GENTLY for this
command in git.c, and then die if "-w" is set but we are not
in a repository. However, the error messages generated at
the time of setup_git_directory() are more detailed, so it's
better to find out which mode we are in, and then call the
appropriate function.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-13 15:45:45 -07:00
Junio C Hamano
722c924445 Merge branch 'jk/options-cleanup'
Various clean-ups to the command line option parsing.

* jk/options-cleanup:
  apply, ls-files: simplify "-z" parsing
  checkout-index: disallow "--no-stage" option
  checkout-index: handle "--no-index" option
  checkout-index: handle "--no-prefix" option
  checkout-index: simplify "-z" option parsing
  give "nbuf" strbuf a more meaningful name
2016-02-10 14:20:08 -08:00
Jeff King
0d4cc1b45b give "nbuf" strbuf a more meaningful name
It's a common pattern in our code to read paths from stdin,
separated either by newlines or NULs, and unquote as
necessary. In each of these five cases we use "nbuf" to
temporarily store the unquoted value. Let's give it the more
meaningful name "unquoted", which makes it easier to
understand the purpose of the variable.

While we're at it, let's also static-initialize all of our
strbufs. It's not wrong to call strbuf_init, but it
increases the cognitive load on the reader, who might wonder
"do we sometimes avoid initializing them?  why?".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-01 13:43:02 -08:00
Junio C Hamano
c0353c78e8 hash-object: read --stdin-paths with strbuf_getline()
The list of paths could have been written with a DOS editor.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-15 10:24:34 -08:00
Junio C Hamano
8f309aeb82 strbuf: introduce strbuf_getline_{lf,nul}()
The strbuf_getline() interface allows a byte other than LF or NUL as
the line terminator, but this is only because I wrote these
codepaths anticipating that there might be a value other than NUL
and LF that could be useful when I introduced line_termination long
time ago.  No useful caller that uses other value has emerged.

By now, it is clear that the interface is overly broad without a
good reason.  Many codepaths have hardcoded preference to read
either LF terminated or NUL terminated records from their input, and
then call strbuf_getline() with LF or NUL as the third parameter.

This step introduces two thin wrappers around strbuf_getline(),
namely, strbuf_getline_lf() and strbuf_getline_nul(), and
mechanically rewrites these call sites to call either one of
them.  The changes contained in this patch are:

 * introduction of these two functions in strbuf.[ch]

 * mechanical conversion of all callers to strbuf_getline() with
   either '\n' or '\0' as the third parameter to instead call the
   respective thin wrapper.

After this step, output from "git grep 'strbuf_getline('" would
become a lot smaller.  An interim goal of this series is to make
this an empty set, so that we can have strbuf_getline_crlf() take
over the shorter name strbuf_getline().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-15 10:12:51 -08:00
Junio C Hamano
33e8fc8740 usage: do not insist that standard input must come from a file
The synopsys text and the usage string of subcommands that read list
of things from the standard input are often shown like this:

	git gostak [--distim] < <list-of-doshes>

This is problematic in a number of ways:

 * The way to use these commands is more often to feed them the
   output from another command, not feed them from a file.

 * Manual pages outside Git, commands that operate on the data read
   from the standard input, e.g "sort", "grep", "sed", etc., are not
   described with such a "< redirection-from-file" in their synopsys
   text.  Our doing so introduces inconsistency.

 * We do not insist on where the output should go, by saying

	git gostak [--distim] < <list-of-doshes> > <output>

 * As it is our convention to enclose placeholders inside <braket>,
   the redirection operator followed by a placeholder filename
   becomes very hard to read, both in the documentation and in the
   help text.

Let's clean them all up, after making sure that the documentation
clearly describes the modes that take information from the standard
input and what kind of things are expected on the input.

[jc: stole example for fmt-merge-msg from Jonathan]

Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-16 15:27:52 -07:00
Junio C Hamano
051086b947 Merge branch 'jc/hash-object'
"hash-object --literally" introduced in v2.2 was not prepared to
take a really long object type name.

* jc/hash-object:
  write_sha1_file(): do not use a separate sha1[] array
  t1007: add hash-object --literally tests
  hash-object --literally: fix buffer overrun with extra-long object type
  git-hash-object.txt: document --literally option
2015-05-11 14:23:59 -07:00
Eric Sunshine
0c3db67cc8 hash-object --literally: fix buffer overrun with extra-long object type
"hash-object" learned in 5ba9a93 (hash-object: add --literally
option, 2014-09-11) to allow crafting a corrupt/broken object of
unknown type.

When the user-provided type is particularly long, however, it can
overflow the relatively small stack-based character array handed to
write_sha1_file_prepare() by hash_sha1_file() and write_sha1_file(),
leading to stack corruption (and crash).  Introduce a custom helper
to allow arbitrarily long typenames just for "hash-object --literally".

[jc: Eric's original used a strbuf in the more common codepaths, and
I rewrote it to avoid penalizing the non-literally code. Bugs are mine]

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-05 10:14:18 -07:00
Alex Henrie
9c9b4f2f8b standardize usage info string format
This patch puts the usage info strings that were not already in docopt-
like format into docopt-like format, which will be a litle easier for
end users and a lot easier for translators. Changes include:

- Placing angle brackets around fill-in-the-blank parameters
- Putting dashes in multiword parameter names
- Adding spaces to [-f|--foobar] to make [-f | --foobar]
- Replacing <foobar>* with [<foobar>...]

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-14 09:32:04 -08:00
Junio C Hamano
5ba9a93b39 hash-object: add --literally option
This allows "hash-object --stdin" to just hash any garbage into a
"loose object" that may not pass the standard object parsing check
or fsck, so that different kind of corrupt objects we may encounter
in the field can be imitated in our test suite.  That would in turn
allow us to test features that catch these corrupt objects.

Note that "cat-file" may need to learn "--literally" option to allow
us peek into a truly broken object.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-11 14:23:51 -07:00
Junio C Hamano
17b787f603 hash-object: pass 'write_object' as a flag
Instead of forcing callers of lower level functions write
(write_object ? HASH_WRITE_OBJECT : 0), prepare the flag to be
passed down in the callchain from the command line parser.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-11 12:48:01 -07:00
Junio C Hamano
b64a984606 hash-object: reduce file-scope statics
Most of the knobs that affect helper functions called from
cmd_hash_object() were passed to them as parameters already, and the
only effect of having them as file-scope statics was to make the
reader wonder if the parameters are hiding the file-scope global
values by accident.  Adjust their initialisation and make them
function-local variables.

The only exception was no_filters hash_stdin_paths() peeked from the
file-scope global, which was converted to a parameter to the helper
function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-11 12:23:42 -07:00
Stefan Beller
c83e8c1768 hash-object: replace stdin parsing OPT_BOOLEAN by OPT_COUNTUP
This task emerged from b04ba2bb (parse-options: deprecate OPT_BOOLEAN,
2011-09-27). hash-object is a plumbing layer command, so better
not change the input/output behavior for now.

Unfortunately we have these lines relying on the count up mechanism of
OPT_BOOLEAN:

	if (hashstdin > 1)
		errstr = "Multiple --stdin arguments are not supported";

Using OPT_BOOL will make "git hash-object --stdin --stdin" the same
as "git hash-object --stdin", resulting in just one object, which
will surprise users with an expectation to see two objects hashed.

Because it is not good to silently succeed and give an unexpected
result, even when the expectation is unrealistic, we use COUNTUP to
explicitly catch such an error.

Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-07 08:30:55 -07:00
Stefan Beller
d5d09d4754 Replace deprecated OPT_BOOLEAN by OPT_BOOL
This task emerged from b04ba2bb (parse-options: deprecate OPT_BOOLEAN,
2011-09-27). All occurrences of the respective variables have
been reviewed and none of them relied on the counting up mechanism,
but all of them were using the variable as a true boolean.

This patch does not change semantics of any command intentionally.

Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-05 11:32:19 -07:00
Nguyễn Thái Ngọc Duy
483bbf41ca i18n: hash-object: mark parseopt strings for translation
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-20 12:23:17 -07:00
Junio C Hamano
c4ce46fc7a index_fd(): turn write_object and format_check arguments into one flag
The "format_check" parameter tucked after the existing parameters is too
ugly an afterthought to live in any reasonable API.

Combine it with the other boolean parameter "write_object" into a single
"flags" parameter.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-09 11:58:19 -07:00
Stephen Boyd
c2e86addb8 Fix sparse warnings
Fix warnings from 'make check'.

 - These files don't include 'builtin.h' causing sparse to complain that
   cmd_* isn't declared:

   builtin/clone.c:364, builtin/fetch-pack.c:797,
   builtin/fmt-merge-msg.c:34, builtin/hash-object.c:78,
   builtin/merge-index.c:69, builtin/merge-recursive.c:22
   builtin/merge-tree.c:341, builtin/mktag.c:156, builtin/notes.c:426
   builtin/notes.c:822, builtin/pack-redundant.c:596,
   builtin/pack-refs.c:10, builtin/patch-id.c:60, builtin/patch-id.c:149,
   builtin/remote.c:1512, builtin/remote-ext.c:240,
   builtin/remote-fd.c:53, builtin/reset.c:236, builtin/send-pack.c:384,
   builtin/unpack-file.c:25, builtin/var.c:75

 - These files have symbols which should be marked static since they're
   only file scope:

   submodule.c:12, diff.c:631, replace_object.c:92, submodule.c:13,
   submodule.c:14, trace.c:78, transport.c:195, transport-helper.c:79,
   unpack-trees.c:19, url.c:3, url.c:18, url.c:104, url.c:117, url.c:123,
   url.c:129, url.c:136, thread-utils.c:21, thread-utils.c:48

 - These files redeclare symbols to be different types:

   builtin/index-pack.c:210, parse-options.c:564, parse-options.c:571,
   usage.c:49, usage.c:58, usage.c:63, usage.c:72

 - These files use a literal integer 0 when they really should use a NULL
   pointer:

   daemon.c:663, fast-import.c:2942, imap-send.c:1072, notes-merge.c:362

While we're in the area, clean up some unused #includes in builtin files
(mostly exec_cmd.h).

Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-22 10:16:54 -07:00
Nguyễn Thái Ngọc Duy
c879daa237 Make hash-object more robust against malformed objects
Commits, trees and tags have structure. Don't let users feed git
with malformed ones. Sooner or later git will die() when
encountering them.

Note that this patch does not check semantics. A tree that points
to non-existent objects is perfectly OK (and should be so, users
may choose to add commit first, then its associated tree for example).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-07 15:05:25 -08:00
Junio C Hamano
2e0e8b68e3 Merge branch 'lt/deepen-builtin-source'
* lt/deepen-builtin-source:
  Move 'builtin-*' into a 'builtin/' subdirectory

Conflicts:
	Makefile
2010-03-10 15:25:18 -08:00
Linus Torvalds
81b50f3ce4 Move 'builtin-*' into a 'builtin/' subdirectory
This shrinks the top-level directory a bit, and makes it much more
pleasant to use auto-completion on the thing. Instead of

	[torvalds@nehalem git]$ em buil<tab>
	Display all 180 possibilities? (y or n)
	[torvalds@nehalem git]$ em builtin-sh
	builtin-shortlog.c     builtin-show-branch.c  builtin-show-ref.c
	builtin-shortlog.o     builtin-show-branch.o  builtin-show-ref.o
	[torvalds@nehalem git]$ em builtin-shor<tab>
	builtin-shortlog.c  builtin-shortlog.o
	[torvalds@nehalem git]$ em builtin-shortlog.c

you get

	[torvalds@nehalem git]$ em buil<tab>		[type]
	builtin/   builtin.h
	[torvalds@nehalem git]$ em builtin		[auto-completes to]
	[torvalds@nehalem git]$ em builtin/sh<tab>	[type]
	shortlog.c     shortlog.o     show-branch.c  show-branch.o  show-ref.c     show-ref.o
	[torvalds@nehalem git]$ em builtin/sho		[auto-completes to]
	[torvalds@nehalem git]$ em builtin/shor<tab>	[type]
	shortlog.c  shortlog.o
	[torvalds@nehalem git]$ em builtin/shortlog.c

which doesn't seem all that different, but not having that annoying
break in "Display all 180 possibilities?" is quite a relief.

NOTE! If you do this in a clean tree (no object files etc), or using an
editor that has auto-completion rules that ignores '*.o' files, you
won't see that annoying 'Display all 180 possibilities?' message - it
will just show the choices instead.  I think bash has some cut-off
around 100 choices or something.

So the reason I see this is that I'm using an odd editory, and thus
don't have the rules to cut down on auto-completion.  But you can
simulate that by using 'ls' instead, or something similar.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-22 14:29:41 -08:00