Commit Graph

334 Commits

Author SHA1 Message Date
Junio C Hamano
11529ecec9 Merge branch 'jk/tighten-alloc'
Update various codepaths to avoid manually-counted malloc().

* jk/tighten-alloc: (22 commits)
  ewah: convert to REALLOC_ARRAY, etc
  convert ewah/bitmap code to use xmalloc
  diff_populate_gitlink: use a strbuf
  transport_anonymize_url: use xstrfmt
  git-compat-util: drop mempcpy compat code
  sequencer: simplify memory allocation of get_message
  test-path-utils: fix normalize_path_copy output buffer size
  fetch-pack: simplify add_sought_entry
  fast-import: simplify allocation in start_packfile
  write_untracked_extension: use FLEX_ALLOC helper
  prepare_{git,shell}_cmd: use argv_array
  use st_add and st_mult for allocation size computation
  convert trivial cases to FLEX_ARRAY macros
  use xmallocz to avoid size arithmetic
  convert trivial cases to ALLOC_ARRAY
  convert manual allocations to argv_array
  argv-array: add detach function
  add helpers for allocating flex-array structs
  harden REALLOC_ARRAY and xcalloc against size_t overflow
  tree-diff: catch integer overflow in combine_diff_path allocation
  ...
2016-02-26 13:37:16 -08:00
Junio C Hamano
e6a6a768ca Merge branch 'nd/dwim-wildcards-as-pathspecs'
"git show 'HEAD:Foo[BAR]Baz'" did not interpret the argument as a
rev, i.e. the object named by the the pathname with wildcard
characters in a tree object.

* nd/dwim-wildcards-as-pathspecs:
  get_sha1: don't die() on bogus search strings
  check_filename: tighten dwim-wildcard ambiguity
  checkout: reorder check_filename conditional
2016-02-24 13:25:52 -08:00
Jeff King
50a6c8efa2 use st_add and st_mult for allocation size computation
If our size computation overflows size_t, we may allocate a
much smaller buffer than we expected and overflow it. It's
probably impossible to trigger an overflow in most of these
sites in practice, but it is easy enough convert their
additions and multiplications into overflow-checking
variants. This may be fixing real bugs, and it makes
auditing the code easier.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:51:09 -08:00
Junio C Hamano
fb795323ce Merge branch 'wp/sha1-name-negative-match'
A new "<branch>^{/!-<pattern>}" notation can be used to name a
commit that is reachable from <branch> that does not match the
given <pattern>.

* wp/sha1-name-negative-match:
  object name: introduce '^{/!-<negative pattern>}' notation
  test for '!' handling in rev-parse's named commits
2016-02-10 14:20:10 -08:00
Jeff King
aac4fac168 get_sha1: don't die() on bogus search strings
The get_sha1() function generally returns an error code
rather than dying, and we sometimes speculatively call it
with something that may be a revision or a pathspec, in
order to see which one it might be.

If it sees a bogus ":/" search string, though, it complains,
without giving the caller the opportunity to recover. We can
demonstrate this in t6133 by looking for ":/*.t", which
should mean "*.t at the root of the tree", but instead dies
because of the invalid regex (the "*" has nothing to operate
on).

We can fix this by returning an error rather than calling
die(). Unfortunately, the tradeoff is that the error message
is slightly worse in cases where we _do_ know we have a rev.
E.g., running "git log ':/*.t' --" before yielded:

  fatal: Invalid search pattern: *.t

and now we get only:

  fatal: bad revision ':/*.t'

There's not a simple way to fix this short of passing a
"quiet" flag all the way through the get_sha1() stack.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-10 13:53:21 -08:00
Will Palmer
0769854f3d object name: introduce '^{/!-<negative pattern>}' notation
To name a commit, you can now use the :/!-<negative pattern> regex
style, and consequentially, say

    $ git rev-parse HEAD^{/!-foo}

and it will return the hash of the first commit reachable from HEAD,
whose commit message does not contain "foo". This is the opposite of the
existing <rev>^{/<pattern>} syntax.

The specific use-case this is intended for is to perform an operation,
excluding the most-recent commits containing a particular marker. For
example, if you tend to make "work in progress" commits, with messages
beginning with "WIP", you work, then it could be useful to diff against
"the most recent commit which was not a WIP commit". That sort of thing
now possible, via commands such as:

    $ git diff @^{/!-^WIP}

The leader '/!-', rather than simply '/!', to denote a negative match,
is chosen to leave room for additional modifiers in the future.

Signed-off-by: Will Palmer <wmpalmer@gmail.com>
Signed-off-by: Stephen P. Smith <ischis2@cox.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-01 13:40:37 -08:00
brian m. carlson
ed1c9977cb Remove get_object_hash.
Convert all instances of get_object_hash to use an appropriate reference
to the hash member of the oid member of struct object.  This provides no
functional change, as it is essentially a macro substitution.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
brian m. carlson
f2fd0760f6 Convert struct object to object_id
struct object is one of the major data structures dealing with object
IDs.  Convert it to use struct object_id instead of an unsigned char
array.  Convert get_object_hash to refer to the new member as well.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
brian m. carlson
7999b2cf77 Add several uses of get_object_hash.
Convert most instances where the sha1 member of struct object is
dereferenced to use get_object_hash.  Most instances that are passed to
functions that have versions taking struct object_id, such as
get_sha1_hex/get_oid_hex, or instances that can be trivially converted
to use struct object_id instead, are not converted.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
Jeff King
43bb66ae0b diagnose_invalid_index_path: use strbuf to avoid strcpy/strcat
We dynamically allocate a buffer and then strcpy and strcat
into it. This isn't buggy, but we'd prefer to avoid these
suspicious functions.

This would be a good candidate for converstion to xstrfmt,
but we need to record the length for dealing with index
entries. A strbuf handles that for us.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-05 11:08:05 -07:00
Jeff King
c3bb0ac796 find_short_object_filename: convert sprintf to xsnprintf
We use sprintf() to format some hex data into a buffer. The
buffer is clearly long enough, and using snprintf here is
not necessary. And in fact, it does not really make anything
easier to audit, as the size we feed to snprintf accounts
for the magic extra 42 bytes found in each alt->name field
of struct alternate_object_database (which is there exactly
to do this formatting).

Still, it is nice to remove an sprintf call and replace it
with an xsnprintf and explanatory comment, which makes it
easier to audit the code base for overflows.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 10:18:18 -07:00
Jeff King
af49c6d091 add reentrant variants of sha1_to_hex and find_unique_abbrev
The sha1_to_hex and find_unique_abbrev functions always
write into reusable static buffers. There are a few problems
with this:

  - future calls overwrite our result. This is especially
    annoying with find_unique_abbrev, which does not have a
    ring of buffers, so you cannot even printf() a result
    that has two abbreviated sha1s.

  - if you want to put the result into another buffer, we
    often strcpy, which looks suspicious when auditing for
    overflows.

This patch introduces sha1_to_hex_r and find_unique_abbrev_r,
which write into a user-provided buffer. Of course this is
just punting on the overflow-auditing, as the buffer
obviously needs to be GIT_SHA1_HEXSZ + 1 bytes. But it is
much easier to audit, since that is a well-known size.

We retain the non-reentrant forms, which just become thin
wrappers around the reentrant ones. This patch also adds a
strbuf variant of find_unique_abbrev, which will be handy in
later patches.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 10:18:18 -07:00
Jeff King
a5481a6c94 convert "enum date_mode" into a struct
In preparation for adding date modes that may carry extra
information beyond the mode itself, this patch converts the
date_mode enum into a struct.

Most of the conversion is fairly straightforward; we pass
the struct as a pointer and dereference the type field where
necessary. Locations that declare a date_mode can use a "{}"
constructor.  However, the tricky case is where we use the
enum labels as constants, like:

  show_date(t, tz, DATE_NORMAL);

Ideally we could say:

  show_date(t, tz, &{ DATE_NORMAL });

but of course C does not allow that. Likewise, we cannot
cast the constant to a struct, because we need to pass an
actual address. Our options are basically:

  1. Manually add a "struct date_mode d = { DATE_NORMAL }"
     definition to each caller, and pass "&d". This makes
     the callers uglier, because they sometimes do not even
     have their own scope (e.g., they are inside a switch
     statement).

  2. Provide a pre-made global "date_normal" struct that can
     be passed by address. We'd also need "date_rfc2822",
     "date_iso8601", and so forth. But at least the ugliness
     is defined in one place.

  3. Provide a wrapper that generates the correct struct on
     the fly. The big downside is that we end up pointing to
     a single global, which makes our wrapper non-reentrant.
     But show_date is already not reentrant, so it does not
     matter.

This patch implements 3, along with a minor macro to keep
the size of the callers sane.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-29 11:39:07 -07:00
Junio C Hamano
5455ee0573 Merge branch 'bc/object-id'
for_each_ref() callback functions were taught to name the objects
not with "unsigned char sha1[20]" but with "struct object_id".

* bc/object-id: (56 commits)
  struct ref_lock: convert old_sha1 member to object_id
  warn_if_dangling_symref(): convert local variable "junk" to object_id
  each_ref_fn_adapter(): remove adapter
  rev_list_insert_ref(): remove unneeded arguments
  rev_list_insert_ref_oid(): new function, taking an object_oid
  mark_complete(): remove unneeded arguments
  mark_complete_oid(): new function, taking an object_oid
  clear_marks(): rewrite to take an object_id argument
  mark_complete(): rewrite to take an object_id argument
  send_ref(): convert local variable "peeled" to object_id
  upload-pack: rewrite functions to take object_id arguments
  find_symref(): convert local variable "unused" to object_id
  find_symref(): rewrite to take an object_id argument
  write_one_ref(): rewrite to take an object_id argument
  write_refs_to_temp_dir(): convert local variable sha1 to object_id
  submodule: rewrite to take an object_id argument
  shallow: rewrite functions to take object_id arguments
  handle_one_ref(): rewrite to take an object_id argument
  add_info_ref(): rewrite to take an object_id argument
  handle_one_reflog(): rewrite to take an object_id argument
  ...
2015-06-05 12:17:37 -07:00
Junio C Hamano
c4a8354bc1 Merge branch 'jk/at-push-sha1'
Introduce <branch>@{push} short-hand to denote the remote-tracking
branch that tracks the branch at the remote the <branch> would be
pushed to.

* jk/at-push-sha1:
  for-each-ref: accept "%(push)" format
  for-each-ref: use skip_prefix instead of starts_with
  sha1_name: implement @{push} shorthand
  sha1_name: refactor interpret_upstream_mark
  sha1_name: refactor upstream_mark
  remote.c: add branch_get_push
  remote.c: return upstream name from stat_tracking_info
  remote.c: untangle error logic in branch_get_upstream
  remote.c: report specific errors from branch_get_upstream
  remote.c: introduce branch_get_upstream helper
  remote.c: hoist read_config into remote_get_1
  remote.c: provide per-branch pushremote name
  remote.c: hoist branch.*.remote lookup out of remote_get_1
  remote.c: drop "remote" pointer from "struct branch"
  remote.c: refactor setup of branch->merge list
  remote.c: drop default_remote_name variable
2015-06-05 12:17:36 -07:00
Junio C Hamano
67f0b6f3b2 Merge branch 'dt/cat-file-follow-symlinks'
"git cat-file --batch(-check)" learned the "--follow-symlinks"
option that follows an in-tree symbolic link when asked about an
object via extended SHA-1 syntax, e.g. HEAD:RelNotes that points at
Documentation/RelNotes/2.5.0.txt.  With the new option, the command
behaves as if HEAD:Documentation/RelNotes/2.5.0.txt was given as
input instead.

* dt/cat-file-follow-symlinks:
  cat-file: add --follow-symlinks to --batch
  sha1_name: get_sha1_with_context learns to follow symlinks
  tree-walk: learn get_tree_entry_follow_symlinks
2015-06-01 12:45:16 -07:00
Michael Haggerty
9c5fe0b846 handle_one_ref(): rewrite to take an object_id argument
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 12:19:35 -07:00
Michael Haggerty
2b2a5be394 each_ref_fn: change to take an object_id parameter
Change typedef each_ref_fn to take a "const struct object_id *oid"
parameter instead of "const unsigned char *sha1".

To aid this transition, implement an adapter that can be used to wrap
old-style functions matching the old typedef, which is now called
"each_ref_sha1_fn"), and make such functions callable via the new
interface. This requires the old function and its cb_data to be
wrapped in a "struct each_ref_fn_sha1_adapter", and that object to be
used as the cb_data for an adapter function, each_ref_fn_adapter().

This is an enormous diff, but most of it consists of simple,
mechanical changes to the sites that call any of the "for_each_ref"
family of functions. Subsequent to this change, the call sites can be
rewritten one by one to use the new interface.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 12:19:27 -07:00
Jeff King
adfe5d0434 sha1_name: implement @{push} shorthand
In a triangular workflow, each branch may have two distinct
points of interest: the @{upstream} that you normally pull
from, and the destination that you normally push to. There
isn't a shorthand for the latter, but it's useful to have.

For instance, you may want to know which commits you haven't
pushed yet:

  git log @{push}..

Or as a more complicated example, imagine that you normally
pull changes from origin/master (which you set as your
@{upstream}), and push changes to your own personal fork
(e.g., as myfork/topic). You may push to your fork from
multiple machines, requiring you to integrate the changes
from the push destination, rather than upstream. With this
patch, you can just do:

  git rebase @{push}

rather than typing out the full name.

The heavy lifting is all done by branch_get_push; here we
just wire it up to the "@{push}" syntax.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-22 09:33:08 -07:00
Jeff King
48c58471c2 sha1_name: refactor interpret_upstream_mark
Now that most of the logic for our local get_upstream_branch
has been pushed into the generic branch_get_upstream, we can
fold the remainder into interpret_upstream_mark.

Furthermore, what remains is generic to any branch-related
"@{foo}" we might add in the future, and there's enough
boilerplate that we'd like to reuse it. Let's parameterize
the two operations (parsing the mark and computing its
value) so that we can reuse this for "@{push}" in the near
future.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-22 09:33:08 -07:00
Jeff King
a1ad0eb0cb sha1_name: refactor upstream_mark
We will be adding new mark types in the future, so separate
the suffix data from the logic.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-22 09:33:08 -07:00
Jeff King
3a429d0af3 remote.c: report specific errors from branch_get_upstream
When the previous commit introduced the branch_get_upstream
helper, there was one call-site that could not be converted:
the one in sha1_name.c, which gives detailed error messages
for each possible failure.

Let's teach the helper to optionally report these specific
errors. This lets us convert another callsite, and means we
can use the helper in other locations that want to give the
same error messages.

The logic and error messages come straight from sha1_name.c,
with the exception that we start each error with a lowercase
letter, as is our usual style (note that a few tests need
updated as a result).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-21 11:07:46 -07:00
René Scharfe
dbe44faadb use file_exists() to check if a file exists in the worktree
Call file_exists() instead of open-coding it.  That's shorter, simpler
and the intent becomes clearer.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 13:49:10 -07:00
David Turner
c4ec96774b sha1_name: get_sha1_with_context learns to follow symlinks
Wire up get_sha1_with_context to call get_tree_entry_follow_symlinks
when GET_SHA1_FOLLOW_SYMLINKS is passed in flags. G_S_FOLLOW_SYMLINKS
is incompatible with G_S_ONLY_TO_DIE because the diagnosis
that ONLY_TO_DIE triggers does not at present consider symlinks, and
it would be a significant amount of additional code to allow it to
do so.

Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 13:46:13 -07:00
Junio C Hamano
daea6fca35 Merge branch 'rs/use-isxdigit'
Code cleanup.

* rs/use-isxdigit:
  use isxdigit() for checking if a character is a hexadecimal digit
2015-03-20 13:11:52 -07:00
René Scharfe
6f75d45b24 use isxdigit() for checking if a character is a hexadecimal digit
Use the standard function isxdigit() to make the intent clearer and
avoid using magic constants.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-10 15:44:41 -07:00
Junio C Hamano
8a6444d50e Merge branch 'rs/simple-cleanups'
Code cleanups.

* rs/simple-cleanups:
  sha1_name: use strlcpy() to copy strings
  pretty: use starts_with() to check for a prefix
  for-each-ref: use skip_prefix() to avoid duplicate string comparison
  connect: use strcmp() for string comparison
2015-03-05 12:45:42 -08:00
René Scharfe
2ce63e9fac sha1_name: use strlcpy() to copy strings
Use strlcpy() instead of calling strncpy() and then setting the last
byte of the target buffer to NUL explicitly.  This shortens and
simplifies the code a bit.

Signed-of-by: Rene Scharfe <l.s.r@web.de>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-22 12:01:38 -08:00
Junio C Hamano
098501527f Merge branch 'jc/merge-bases'
The get_merge_bases*() API was easy to misuse by careless
copy&paste coders, leaving object flags tainted in the commits that
needed to be traversed.

* jc/merge-bases:
  get_merge_bases(): always clean-up object flags
  bisect: clean flags after checking merge bases
2015-01-07 12:55:05 -08:00
Junio C Hamano
5109f2aaab Merge branch 'mh/find-uniq-abbrev'
The code to abbreviate an object name to its short unique prefix
has been optimized when no abbreviation was requested.

* mh/find-uniq-abbrev:
  sha1_name: avoid unnecessary sha1 lookup in find_unique_abbrev
2014-12-22 12:26:58 -08:00
Mike Hommey
61e704e38a sha1_name: avoid unnecessary sha1 lookup in find_unique_abbrev
An example where this happens is when doing an ls-tree on a tree that
contains a commit link. In that case, find_unique_abbrev is called
to get a non-abbreviated hex sha1, but still, a lookup is done as
to whether the sha1 is in the repository (which ends up looking for
a loose object in .git/objects), while the result of that lookup is
not used when returning a non-abbreviated hex sha1.

Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-26 10:51:05 -08:00
Junio C Hamano
2ce406ccb8 get_merge_bases(): always clean-up object flags
The callers of get_merge_bases() can choose to leave object flags
used during the merge-base traversal by passing cleanup=0 as a
parameter, but in practice a very few callers can afford to do so
(namely, "git merge-base"), as they need to compute merge base in
preparation for other processing of their own and they need to see
the object without contaminate flags.

Change the function signature of get_merge_bases_many() and
get_merge_bases() to drop the cleanup parameter, so that the
majority of the callers do not have to say ", 1" at the end.

Give a new get_merge_bases_many_dirty() API to support only a few
callers that know they do not need to spend cycles cleaning up the
object flags.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-30 12:51:10 -07:00
David Aguilar
c41a87dd80 refs: make rev-parse --quiet actually quiet
When a reflog is deleted, e.g. when "git stash" clears its stashes,
"git rev-parse --verify --quiet" dies:

	fatal: Log for refs/stash is empty.

The reason is that the get_sha1() code path does not allow us
to suppress this message.

Pass the flags bitfield through get_sha1_with_context() so that
read_ref_at() can suppress the message.

Use get_sha1_with_context1() instead of get_sha1() in rev-parse
so that the --quiet flag is honored.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-19 10:46:15 -07:00
Junio C Hamano
294792326a Merge branch 'rs/list-optim'
Fix a couple of "accumulate into a sorted list" to "accumulate and
then sort the list".

* rs/list-optim:
  walker: avoid quadratic list insertion in mark_complete
  sha1_name: avoid quadratic list insertion in handle_one_ref
2014-09-11 10:33:35 -07:00
René Scharfe
e8d1dfe639 sha1_name: avoid quadratic list insertion in handle_one_ref
Similar to 16445242 (fetch-pack: avoid quadratic list insertion in
mark_complete), sort only after all refs are collected instead of while
inserting.  The result is the same, but it's more efficient that way.
The difference will only be measurable in repositories with a large
number of refs.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-25 10:27:52 -07:00
Junio C Hamano
ad524f834a Merge branch 'jk/misc-fixes-maint'
* jk/misc-fixes-maint:
  apply: avoid possible bogus pointer
  fix memory leak parsing core.commentchar
  transport: fix leaks in refs_from_alternate_cb
  free ref string returned by dwim_ref
  receive-pack: don't copy "dir" parameter
2014-07-28 11:30:41 -07:00
Jeff King
28b3563241 free ref string returned by dwim_ref
A call to "dwim_ref(name, len, flags, &ref)" will allocate a
new string in "ref" to return the exact ref we found. We do
not consistently free it in all code paths, leading to small
leaks. The worst is in get_sha1_basic, which may be called
many times (e.g., by "cat-file --batch"), though it is
relatively unlikely, as it only triggers on a bogus reflog
specification.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-24 13:57:49 -07:00
René Scharfe
e992d1eb39 use strbuf_addbuf for adding strbufs
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-10 14:06:45 -07:00
Junio C Hamano
3b8e8af187 Merge branch 'jk/xstrfmt'
* jk/xstrfmt:
  setup_git_env(): introduce git_path_from_env() helper
  unique_path: fix unlikely heap overflow
  walker_fetch: fix minor memory leak
  merge: use argv_array when spawning merge strategy
  sequencer: use argv_array_pushf
  setup_git_env: use git_pathdup instead of xmalloc + sprintf
  use xstrfmt to replace xmalloc + strcpy/strcat
  use xstrfmt to replace xmalloc + sprintf
  use xstrdup instead of xmalloc + strcpy
  use xstrfmt in favor of manual size calculations
  strbuf: add xstrfmt helper
2014-07-09 11:34:05 -07:00
Junio C Hamano
e91ae32a01 Merge branch 'jk/skip-prefix'
* jk/skip-prefix:
  http-push: refactor parsing of remote object names
  imap-send: use skip_prefix instead of using magic numbers
  use skip_prefix to avoid repeated calculations
  git: avoid magic number with skip_prefix
  fetch-pack: refactor parsing in get_ack
  fast-import: refactor parsing of spaces
  stat_opt: check extra strlen call
  daemon: use skip_prefix to avoid magic numbers
  fast-import: use skip_prefix for parsing input
  use skip_prefix to avoid repeating strings
  use skip_prefix to avoid magic numbers
  transport-helper: avoid reading past end-of-string
  fast-import: fix read of uninitialized argv memory
  apply: use skip_prefix instead of raw addition
  refactor skip_prefix to return a boolean
  avoid using skip_prefix as a boolean
  daemon: mark some strings as const
  parse_diff_color_slot: drop ofs parameter
2014-07-09 11:33:28 -07:00
Jeff King
95b567c7c3 use skip_prefix to avoid repeating strings
It's a common idiom to match a prefix and then skip past it
with strlen, like:

  if (starts_with(foo, "bar"))
	  foo += strlen("bar");

This avoids magic numbers, but means we have to repeat the
string (and there is no compiler check that we didn't make a
typo in one of the strings).

We can use skip_prefix to handle this case without repeating
ourselves.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20 10:44:45 -07:00
Jeff King
b2724c8787 use xstrfmt to replace xmalloc + strcpy/strcat
It's easy to get manual allocation calculations wrong, and
the use of strcpy/strcat raise red flags for people looking
for buffer overflows (though in this case each site was
fine).

It's also shorter to use xstrfmt, and the printf-format
tends to be easier for a reader to see what the final string
will look like.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-19 15:20:54 -07:00
Jeff King
8597ea3afe commit: record buffer length in cache
Most callsites which use the commit buffer try to use the
cached version attached to the commit, rather than
re-reading from disk. Unfortunately, that interface provides
only a pointer to the NUL-terminated buffer, with no
indication of the original length.

For the most part, this doesn't matter. People do not put
NULs in their commit messages, and the log code is happy to
treat it all as a NUL-terminated string. However, some code
paths do care. For example, when checking signatures, we
want to be very careful that we verify all the bytes to
avoid malicious trickery.

This patch just adds an optional "size" out-pointer to
get_commit_buffer and friends. The existing callers all pass
NULL (there did not seem to be any obvious sites where we
could avoid an immediate strlen() call, though perhaps with
some further refactoring we could).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 12:09:38 -07:00
Jeff King
ba41c1c93f use get_commit_buffer to avoid duplicate code
For both of these sites, we already do the "fallback to
read_sha1_file" trick. But we can shorten the code by just
using get_commit_buffer.

Note that the error cases are slightly different when
read_sha1_file fails. get_commit_buffer will die() if the
object cannot be loaded, or is a non-commit.

For get_sha1_oneline, this will almost certainly never
happen, as we will have just called parse_object (and if it
does, it's probably worth complaining about).

For record_author_date, the new behavior is probably better;
we notify the user of the error instead of silently ignoring
it. And because it's used only for sorting by author-date,
somebody examining a corrupt repo can fallback to the
regular traversal order.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 12:08:17 -07:00
Junio C Hamano
b407d40933 Merge branch 'nd/log-show-linear-break'
Attempts to show where a single-strand-of-pearls break in "git log"
output.

* nd/log-show-linear-break:
  log: add --show-linear-break to help see non-linear history
  object.h: centralize object flag allocation
2014-04-03 12:38:11 -07:00
Nguyễn Thái Ngọc Duy
208acbfb82 object.h: centralize object flag allocation
While the field "flags" is mainly used by the revision walker, it is
also used in many other places. Centralize the whole flag allocation to
one place for a better overview (and easier to move flags if we have
too).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-25 15:09:24 -07:00
Junio C Hamano
4e9f9320e3 Merge branch 'jk/interpret-branch-name-fix'
Fix a handful of bugs around interpreting $branch@{upstream}
notation and its lookalike, when $branch part has interesting
characters, e.g. "@", and ":".

* jk/interpret-branch-name-fix:
  interpret_branch_name: find all possible @-marks
  interpret_branch_name: avoid @{upstream} past colon
  interpret_branch_name: always respect "namelen" parameter
  interpret_branch_name: rename "cp" variable to "at"
  interpret_branch_name: factor out upstream handling
2014-01-27 10:44:21 -08:00
Jeff King
9892d5d454 interpret_branch_name: find all possible @-marks
When we parse a string like "foo@{upstream}", we look for
the first "@"-sign, and check to see if it is an upstream
mark. However, since branch names can contain an @, we may
also see "@foo@{upstream}". In this case, we check only the
first @, and ignore the second. As a result, we do not find
the upstream.

We can solve this by iterating through all @-marks in the
string, and seeing if any is a legitimate upstream or
empty-at mark.

Another strategy would be to parse from the right-hand side
of the string. However, that does not work for the
"empty_at" case, which allows "@@{upstream}". We need to
find the left-most one in this case (and we then recurse as
"HEAD@{upstream}").

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-15 12:51:14 -08:00
Jeff King
3f6eb30f1d interpret_branch_name: avoid @{upstream} past colon
get_sha1() cannot currently parse a valid object name like
"HEAD:@{upstream}" (assuming that such an oddly named file
exists in the HEAD commit). It takes two passes to parse the
string:

  1. It first considers the whole thing as a ref, which
     results in looking for the upstream of "HEAD:".

  2. It finds the colon, parses "HEAD" as a tree-ish, and then
     finds the path "@{upstream}" in the tree.

For a path that looks like a normal reflog (e.g.,
"HEAD:@{yesterday}"), the first pass is a no-op. We try to
dwim_ref("HEAD:"), that returns zero refs, and we proceed
with colon-parsing.

For "HEAD:@{upstream}", though, the first pass ends up in
interpret_upstream_mark, which tries to find the branch
"HEAD:". When it sees that the branch does not exist, it
actually dies rather than returning an error to the caller.
As a result, we never make it to the second pass.

One obvious way of fixing this would be to teach
interpret_upstream_mark to simply report "no, this isn't an
upstream" in such a case. However, that would make the
error-reporting for legitimate upstream cases significantly
worse. Something like "bogus@{upstream}" would simply report
"unknown revision: bogus@{upstream}", while the current code
diagnoses a wide variety of possible misconfigurations (no
such branch, branch exists but does not have upstream, etc).

However, we can take advantage of the fact that a branch
name cannot contain a colon. Therefore even if we find an
upstream mark, any prefix with a colon must mean that
the upstream mark we found is actually a pathname, and
should be disregarded completely. This patch implements that
logic.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-15 12:43:29 -08:00
Jeff King
8cd4249c4c interpret_branch_name: always respect "namelen" parameter
interpret_branch_name gets passed a "name" buffer to parse,
along with a "namelen" parameter representing its length. If
"namelen" is zero, we fallback to the NUL-terminated
string-length of "name".

However, it does not necessarily follow that if we have
gotten a non-zero "namelen", it is the NUL-terminated
string-length of "name". E.g., when get_sha1() is parsing
"foo:bar", we will be asked to operate only on the first
three characters.

Yet in interpret_branch_name and its helpers, we use string
functions like strchr() to operate on "name", looking past
the length we were given.  This can result in us mis-parsing
object names.  We should instead be limiting our search to
"namelen" bytes.

There are three distinct types of object names this patch
addresses:

  - The intrepret_empty_at helper uses strchr to find the
    next @-expression after our potential empty-at.  In an
    expression like "@:foo@bar", it erroneously thinks that
    the second "@" is relevant, even if we were asked only
    to look at the first character. This case is easy to
    trigger (and we test it in this patch).

  - When finding the initial @-mark for @{upstream}, we use
    strchr.  This means we might treat "foo:@{upstream}" as
    the upstream for "foo:", even though we were asked only
    to look at "foo". We cannot test this one in practice,
    because it is masked by another bug (which is fixed in
    the next patch).

  - The interpret_nth_prior_checkout helper did not receive
    the name length at all. This turns out not to be a
    problem in practice, though, because its parsing is so
    limited: it always starts from the far-left of the
    string, and will not tolerate a colon (which is
    currently the only way to get a smaller-than-strlen
    "namelen"). However, it's still worth fixing to make the
    code more obviously correct, and to future-proof us
    against callers with more exotic buffers.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-15 12:41:03 -08:00
Jeff King
f278f40f09 interpret_branch_name: rename "cp" variable to "at"
In the original version of this function, "cp" acted as a
pointer to many different things. Since the refactoring in
the last patch, it only marks the at-sign in the string.
Let's use a more descriptive variable name.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-15 12:38:47 -08:00
Jeff King
a39c14af82 interpret_branch_name: factor out upstream handling
This function checks a few different @{}-constructs. The
early part checks for and dispatches us to helpers for each
construct, but the code for handling @{upstream} is inline.

Let's factor this out into its own function. This makes
interpret_branch_name more readable, and will make it much
simpler to further refactor the function in future patches.

While we're at it, let's also break apart the refactored
code into a few helper functions. These will be useful if we
eventually implement similar @{upstream}-like constructs.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-15 12:38:30 -08:00
Junio C Hamano
0a8cb03555 Merge branch 'br/sha1-name-40-hex-no-disambiguation'
When parsing a 40-hex string into the object name, the string is
checked to see if it can be interpreted as a ref so that a warning
can be given for ambiguity. The code kicked in even when the
core.warnambiguousrefs is set to false to squelch this warning, in
which case the cycles spent to look at the ref namespace were an
expensive no-op, as the result was discarded without being used.

* br/sha1-name-40-hex-no-disambiguation:
  sha1_name: don't resolve refs when core.warnambiguousrefs is false
2014-01-13 11:33:29 -08:00
Brodie Rao
832cf74c07 sha1_name: don't resolve refs when core.warnambiguousrefs is false
When seeing a full 40-hex object name, get_sha1_basic()
unconditionally checks if the string can also be interpreted as a
refname, but the result will not be used unless warn_ambiguous_refs
is in effect.

Omitting this unnecessary ref resolution provides a substantial
performance improvement, especially when passing many hashes to a
command (like "git rev-list --stdin") and core.warnambiguousrefs is
set to false.  The check incurs 6 stat()s for every hash supplied,
which can be costly over NFS.

Signed-off-by: Brodie Rao <brodie@sf.io>
Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-07 09:51:56 -08:00
Junio C Hamano
ad70448576 Merge branch 'cc/starts-n-ends-with'
Remove a few duplicate implementations of prefix/suffix comparison
functions, and rename them to starts_with and ends_with.

* cc/starts-n-ends-with:
  replace {pre,suf}fixcmp() with {starts,ends}_with()
  strbuf: introduce starts_with() and ends_with()
  builtin/remote: remove postfixcmp() and use suffixcmp() instead
  environment: normalize use of prefixcmp() by removing " != 0"
2013-12-17 12:02:44 -08:00
Christian Couder
5955654823 replace {pre,suf}fixcmp() with {starts,ends}_with()
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.

The change can be recreated by mechanically applying this:

    $ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
      grep -v strbuf\\.c |
      xargs perl -pi -e '
        s|!prefixcmp\(|starts_with\(|g;
        s|prefixcmp\(|!starts_with\(|g;
        s|!suffixcmp\(|ends_with\(|g;
        s|suffixcmp\(|!ends_with\(|g;
      '

on the result of preparatory changes in this series.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05 14:13:21 -08:00
Junio C Hamano
5bb62059f2 Merge branch 'jk/robustify-parse-commit'
* jk/robustify-parse-commit:
  checkout: do not die when leaving broken detached HEAD
  use parse_commit_or_die instead of custom message
  use parse_commit_or_die instead of segfaulting
  assume parse_commit checks for NULL commit
  assume parse_commit checks commit->object.parsed
  log_tree_diff: die when we fail to parse a commit
2013-12-05 12:54:01 -08:00
Felipe Contreras
57b15ead77 sha1-name: trivial style cleanup
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-31 13:47:19 -07:00
Jeff King
5e7d4d3e93 assume parse_commit checks for NULL commit
The parse_commit function will check whether it was passed a
NULL commit pointer, and if so, return an error. There is no
need for callers to check this separately.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-24 15:43:50 -07:00
Junio C Hamano
f406140baa Merge branch 'fc/at-head'
Instead of typing four capital letters "HEAD", you can say "@" now,
e.g. "git log @".

* fc/at-head:
  Add new @ shortcut for HEAD
  sha1-name: pass len argument to interpret_branch_name()
2013-09-20 12:38:10 -07:00
Junio C Hamano
638924fec2 Merge branch 'rh/peeling-tag-to-tag'
Make "foo^{tag}" to peel a tag to itself, i.e. no-op., and fail if
"foo" is not a tag.  "git rev-parse --verify v1.0^{tag}" would be a
more convenient way to say "test $(git cat-file -t v1.0) = tag".

* rh/peeling-tag-to-tag:
  peel_onion: do not assume length of x_type globals
  peel_onion(): add support for <rev>^{tag}
2013-09-20 12:27:18 -07:00
Felipe Contreras
9ba89f484e Add new @ shortcut for HEAD
Typing 'HEAD' is tedious, especially when we can use '@' instead.

The reason for choosing '@' is that it follows naturally from the
ref@op syntax (e.g. HEAD@{u}), except we have no ref, and no
operation, and when we don't have those, it makes sens to assume
'HEAD'.

So now we can use 'git show @~1', and all that goody goodness.

Until now '@' was a valid name, but it conflicts with this idea, so
let's make it invalid. Probably very few people, if any, used this name.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-12 14:39:34 -07:00
Richard Hansen
a8a5406ab3 use 'commit-ish' instead of 'committish'
Replace 'committish' in documentation and comments with 'commit-ish'
to match gitglossary(7) and to be consistent with 'tree-ish'.

The only remaining instances of 'committish' are:
  * variable, function, and macro names
  * "(also committish)" in the definition of commit-ish in
    gitglossary[7]

Signed-off-by: Richard Hansen <rhansen@bbn.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-04 15:03:03 -07:00
Jeff King
c969b6a18d peel_onion: do not assume length of x_type globals
When we are parsing "rev^{foo}", we check "foo" against the
various global type strings, like "commit_type",
"tree_type", etc. This is nicely abstracted, but then we
destroy the abstraction completely by using magic numbers
that must match the length of the type strings.

We could avoid these magic numbers by using skip_prefix. But
taking a step back, we can realize that using the
"commit_type" global is not really buying us anything. It is
not ever going to change from being "commit" without causing
severe breakage to existing uses. And even if it did change
for some crazy reason, we would want to evaluate its effects
on the "rev^{}" syntax, anyway.

Let's just switch these to using a custom string literal, as
we do for "rev^{object}". The resulting code is more robust
to changes in the type strings, and is more readable.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03 13:45:38 -07:00
Richard Hansen
75aa26d34c peel_onion(): add support for <rev>^{tag}
Complete the <rev>^{<type>} family of object descriptors by having
<rev>^{tag} dereference <rev> until a tag object is found (or fail if
unable).

At first glance this may not seem very useful, as commits, trees, and
blobs cannot be peeled to a tag, and a tag would just peel to itself.
However, this can be used to ensure that <rev> names a tag object:

    $ git rev-parse --verify v1.8.4^{tag}
    04f013dc38d7512eadb915eba22efc414f18b869
    $ git rev-parse --verify master^{tag}
    error: master^{tag}: expected tag type, but the object dereferences to tree type
    fatal: Needed a single revision

Users can already ensure that <rev> is a tag object by checking the
output of 'git cat-file -t <rev>', but:
  * users may expect <rev>^{tag} to exist given that <rev>^{commit},
    <rev>^{tree}, and <rev>^{blob} all exist
  * this syntax is more convenient/natural in some circumstances

Signed-off-by: Richard Hansen <rhansen@bbn.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03 13:09:17 -07:00
Felipe Contreras
cf99a761d3 sha1-name: pass len argument to interpret_branch_name()
This is useful to make sure we don't step outside the boundaries of what
we are interpreting at the moment. For example while interpreting
foobar@{u}~1, the job of interpret_branch_name() ends right before ~1,
but there's no way to figure that out inside the function, unless the
len argument is passed.

So let's do that.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03 11:33:00 -07:00
Junio C Hamano
2c2b6646c2 Revert "Add new @ shortcut for HEAD"
This reverts commit cdfd94837b, as it
does not just apply to "@" (and forms with modifiers like @{u}
applied to it), but also affects e.g. "refs/heads/@/foo", which it
shouldn't.

The basic idea of giving a short-hand might be good, and the topic
can be retried later, but let's revert to avoid affecting existing
use cases for now for the upcoming release.
2013-08-14 15:04:24 -07:00
Thomas Rast
8dc84fdc48 Rename advice.object_name_warning to objectNameWarning
We spell config variables in camelCase instead of with_underscores.

Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-31 15:20:07 -07:00
Junio C Hamano
0def7126fd Merge branch 'ob/typofixes'
* ob/typofixes:
  typofix: in-code comments
  typofix: documentation
  typofix: release notes
2013-07-24 19:23:01 -07:00
Junio C Hamano
356df9bd8d Merge branch 'jk/cat-file-batch-optim'
If somebody wants to only know on-disk footprint of an object
without having to know its type or payload size, we can bypass a
lot of code to cheaply learn it.

* jk/cat-file-batch-optim:
  Fix some sparse warnings
  sha1_object_info_extended: pass object_info to helpers
  sha1_object_info_extended: make type calculation optional
  packed_object_info: make type lookup optional
  packed_object_info: hoist delta type resolution to helper
  sha1_loose_object_info: make type lookup optional
  sha1_object_info_extended: rename "status" to "type"
  cat-file: disable object/refname ambiguity check for batch mode
2013-07-24 19:21:21 -07:00
Ondřej Bílka
749f763dbb typofix: in-code comments
Signed-off-by: Ondřej Bílka <neleai@seznam.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-22 16:06:49 -07:00
Junio C Hamano
d3aeb31dc4 Merge branch 'nd/const-struct-cache-entry'
* nd/const-struct-cache-entry:
  Convert "struct cache_entry *" to "const ..." wherever possible
2013-07-22 11:24:01 -07:00
Jeff King
25fba78d36 cat-file: disable object/refname ambiguity check for batch mode
A common use of "cat-file --batch-check" is to feed a list
of objects from "rev-list --objects" or a similar command.
In this instance, all of our input objects are 40-byte sha1
ids. However, cat-file has always allowed arbitrary revision
specifiers, and feeds the result to get_sha1().

Fortunately, get_sha1() recognizes a 40-byte sha1 before
doing any hard work trying to look up refs, meaning this
scenario should end up spending very little time converting
the input into an object sha1. However, since 798c35f
(get_sha1: warn about full or short object names that look
like refs, 2013-05-29), when we encounter this case, we
spend the extra effort to do a refname lookup anyway, just
to print a warning. This is further exacerbated by ca91993
(get_packed_ref_cache: reload packed-refs file when it
changes, 2013-06-20), which makes individual ref lookup more
expensive by requiring a stat() of the packed-refs file for
each missing ref.

With no patches, this is the time it takes to run:

  $ git rev-list --objects --all >objects
  $ time git cat-file --batch-check='%(objectname)' <objects

on the linux.git repository:

  real    1m13.494s
  user    0m25.924s
  sys     0m47.532s

If we revert ca91993, the packed-refs up-to-date check, it
gets a little better:

  real    0m54.697s
  user    0m21.692s
  sys     0m32.916s

but we are still spending quite a bit of time on ref lookup
(and we would not want to revert that patch, anyway, which
has correctness issues).  If we revert 798c35f, disabling
the warning entirely, we get a much more reasonable time:

  real    0m7.452s
  user    0m6.836s
  sys     0m0.608s

This patch does the moral equivalent of this final case (and
gets similar speedups). We introduce a global flag that
callers of get_sha1() can use to avoid paying the price for
the warning.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-12 10:09:56 -07:00
Junio C Hamano
ee6e5843c1 Merge branch 'nd/warn-ambiguous-object-name' into jk/cat-file-batch-optim
* nd/warn-ambiguous-object-name:
  get_sha1: warn about full or short object names that look like refs
2013-07-12 10:09:50 -07:00
Junio C Hamano
eb40e51597 Merge branch 'jc/t1512-fix'
A test that should have failed but didn't revealed a bug that needs
to be corrected.

* jc/t1512-fix:
  get_short_sha1(): correctly disambiguate type-limited abbreviation
  t1512: correct leftover constants from earlier edition
2013-07-11 13:06:11 -07:00
Nguyễn Thái Ngọc Duy
9c5e6c802c Convert "struct cache_entry *" to "const ..." wherever possible
I attempted to make index_state->cache[] a "const struct cache_entry **"
to find out how existing entries in index are modified and where. The
question I have is what do we do if we really need to keep track of on-disk
changes in the index. The result is

 - diff-lib.c: setting CE_UPTODATE

 - name-hash.c: setting CE_HASHED

 - preload-index.c, read-cache.c, unpack-trees.c and
   builtin/update-index: obvious

 - entry.c: write_entry() may refresh the checked out entry via
   fill_stat_cache_info(). This causes "non-const struct cache_entry
   *" in builtin/apply.c, builtin/checkout-index.c and
   builtin/checkout.c

 - builtin/ls-files.c: --with-tree changes stagemask and may set
   CE_UPDATE

Of these, write_entry() and its call sites are probably most
interesting because it modifies on-disk info. But this is stat info
and can be retrieved via refresh, at least for porcelain
commands. Other just uses ce_flags for local purposes.

So, keeping track of "dirty" entries is just a matter of setting a
flag in index modification functions exposed by read-cache.c. Except
unpack-trees, the rest of the code base does not do anything funny
behind read-cache's back.

The actual patch is less valueable than the summary above. But if
anyone wants to re-identify the above sites. Applying this patch, then
this:

    diff --git a/cache.h b/cache.h
    index 430d021..1692891 100644
    --- a/cache.h
    +++ b/cache.h
    @@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode)
     #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)

     struct index_state {
    -	struct cache_entry **cache;
    +	const struct cache_entry **cache;
     	unsigned int version;
     	unsigned int cache_nr, cache_alloc, cache_changed;
     	struct string_list *resolve_undo;

will help quickly identify them without bogus warnings.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-09 09:12:48 -07:00
Junio C Hamano
94d75d1ed5 get_short_sha1(): correctly disambiguate type-limited abbreviation
One test in t1512 that expects a failure incorrectly passed.  The
test prepares a commit whose object name begins with ten "0"s, and
also prepares a tag that points at the commit.  The object name of
the tag also begins with ten "0"s.  There is no other commit-ish
object in the repository whose name begins with such a prefix.

Ideally, in such a repository:

    $ git rev-parse --verify 0000000000^{commit}

should yield that commit.  If 0000000000 is taken as the commit
0000000000e4f, peeling it to a commmit yields that commit itself,
and if 0000000000 is taken as the tag 0000000000f8f, peeling it to a
commit also yields the same commit, so in that twisted sense, the
extended SHA-1 expression 0000000000^{commit} is unambigous.  The
test that expects a failure is to check the above command.

The reason the test expects a failure is that we did not implement
such a "unification" of two candidate objects.  What we did (or at
least, meant to) implement was to recognise that a commit-ish is
required to expand 0000000000, and notice that there are two succh
commit-ish, and diagnose the request as ambiguous.

However, there was a bug in the logic to check the candidate
objects.  When the code saw 0000000000f8f (a tag) that shared the
shortened prefix (ten "0"s), it tried to make sure that the tag is a
commit-ish by looking at the tag object.  Because it incorrectly
used lookup_object() when the tag has not been parsed, however, we
incorrectly declared that the tag is _not_ a commit-ish, leaving the
sole commit in the repository, 0000000000e4f, that has the required
prefix as "unique match", causing the test to pass when it shouldn't.

This fixes the logic to inspect the type of the object a tag refers
to, to make the test that is expected to fail correctly fail.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-01 21:54:45 -07:00
Junio C Hamano
bb1c8fbcc8 Merge branch 'fc/at-head'
Instead of typing four capital letters "HEAD", you can say "@"
instead.

* fc/at-head:
  sha1_name: compare variable with constant, not constant with variable
  Add new @ shortcut for HEAD
  sha1_name: refactor reinterpret()
  sha1_name: check @{-N} errors sooner
  sha1_name: reorganize get_sha1_basic()
  sha1_name: don't waste cycles in the @-parsing loop
  sha1_name: remove unnecessary braces
  sha1_name: remove no-op
  tests: at-combinations: @{N} versus HEAD@{N}
  tests: at-combinations: increase coverage
  tests: at-combinations: improve nonsense()
  tests: at-combinations: check ref names directly
  tests: at-combinations: simplify setup
2013-06-11 13:31:23 -07:00
Junio C Hamano
f4c52a0527 Merge branch 'nd/warn-ambiguous-object-name'
"git cmd <name>", when <name> happens to be a 40-hex string,
directly uses the 40-hex string as an object name, even if a ref
"refs/<some hierarchy>/<name>" exists.  This disambiguation order
is unlikely to change, but we should warn about the ambiguity just
like we warn when more than one refs/ hierachies share the same
name.

* nd/warn-ambiguous-object-name:
  get_sha1: warn about full or short object names that look like refs
2013-06-11 13:31:07 -07:00
Junio C Hamano
03b1558208 Merge branch 'rr/die-on-missing-upstream'
When a reflog notation is used for implicit "current branch", we
did not say which branch and worse said "branch ''".

* rr/die-on-missing-upstream:
  sha1_name: fix error message for @{<N>}, @{<date>}
  sha1_name: fix error message for @{u}
2013-06-11 13:29:59 -07:00
Ramkumar Ramachandra
305ebea06d sha1_name: fix error message for @{<N>}, @{<date>}
Currently, when we try to resolve @{<N>} or @{<date>} when the reflog
doesn't go back far enough, we get errors like:

  # on branch master
  $ git show @{10000}
  fatal: Log for '' only has 7 entries.

  $ git show @{10000.days.ago}
  warning: Log for '' only goes back to Tue, 21 May 2013 14:14:45 +0530.
  ...

  # detached HEAD case
  $ git show @{10000}
  fatal: Log for '' only has 2005 entries.

  $ git show master@{10000}
  fatal: Log for 'master' only has 7 entries.

The empty string '' is confusing and does not convey information
about whose logs we are inspecting.  Change this so that we get:

  # on branch master
  $ git show @{10000}
  fatal: Log for 'master' only has 7 entries.

  $ git show @{10000.days.ago}
  warning: Log for 'master' only goes back to Tue, 21 May 2013 14:14:45 +0530.
  ...

  # detached HEAD case
  $ git show @{10000}
  fatal: Log for 'HEAD' only has 2005 entries.

  $ git show master@{10000}
  fatal: Log for 'master' only has 7 entries.

Also one of the message strings given to die() now points into
real_ref that was not used in that fashion, so stop freeing the
underlying storage for it.

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Bug-spotted-and-fixed-by: Thomas Rast
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-02 12:05:36 -07:00
Nguyễn Thái Ngọc Duy
798c35fcd8 get_sha1: warn about full or short object names that look like refs
When we get 40 hex digits, we immediately assume it's an SHA-1. This
is the right thing to do because we have no way else to specify an
object. If there is a ref with the same object name, it will be
ignored. Warn the user about this case because the ref with full
object name is likely a mistake, for example

    git checkout -b $empty_var $(git rev-parse something)

advice.object_name_warning is not documented because frankly people
should not be aware about it until they encounter this situation.

While at there, warn about ambiguation with abbreviated SHA-1 too.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-29 11:31:36 -07:00
Ramkumar Ramachandra
17bf4ff3cd sha1_name: fix error message for @{u}
Currently, when no (valid) upstream is configured for a branch, you get
an error like:

  $ git show @{u}
  error: No upstream configured for branch 'upstream-error'
  error: No upstream configured for branch 'upstream-error'
  fatal: ambiguous argument '@{u}': unknown revision or path not in the working tree.
  Use '--' to separate paths from revisions, like this:
  'git <command> [<revision>...] -- [<file>...]'

The "error: " line actually appears twice, and the rest of the error
message is useless.  In sha1_name.c:interpret_branch_name(), there is
really no point in processing further if @{u} couldn't be resolved, and
we might as well die() instead of returning an error().  After making
this change, you get:

  $ git show @{u}
  fatal: No upstream configured for branch 'upstream-error'

Also tweak a few tests in t1507 to expect this output.

This only turns error() that may be called after we know we are
dealing with an @{upstream} marker into die(), without touching
silent error returns "return -1" from the function.  Any caller that
wants to handle an error condition itself will not be hurt by this
change, unless they want to see the message from error() and then
exit silently without giving its own message, which needs to be
fixed anyway.

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-22 12:46:02 -07:00
Junio C Hamano
84cf246670 strbuf_branchname(): do not double-expand @{-1}~22
If you were on 'frotz' branch before you checked out your current
branch, "git merge @{-1}~22" means the same as "git merge frotz~22".

The strbuf_branchname() function, when interpret_branch_name() gives
up resolving "@{-1}~22" fully, returns "frotz" and tells the caller
that it only resolved "@{-1}" part of the input, mistakes this as a
total failure, and appends the whole thing to the result, yielding
"frotz@{-1}~22", which does not make any sense.

Inspect the return value from interpret_branch_name() a bit more
carefully.  When it errored out without consuming anything, we will
get -1 and we should return the whole thing.  Otherwise, we should
append the remainder (i.e. "~22" in the earlier example) to the
partially resolved name (i.e. "frotz").

The test suite adds enough number of checkout to make @{-12} in the
last test in t0100 that tried to check "we haven't flipped branches
that many times" error case succeed; raise the number to a hundred.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-16 12:53:59 -07:00
Felipe Contreras
1f27e7d56b sha1_name: compare variable with constant, not constant with variable
And restructure the if/else to factor out the common "is len positive?"
test into a single conditional.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-08 12:13:12 -07:00
Felipe Contreras
cdfd94837b Add new @ shortcut for HEAD
Typing 'HEAD' is tedious, especially when we can use '@' instead.

The reason for choosing '@' is that it follows naturally from the
ref@op syntax (e.g. HEAD@{u}), except we have no ref, and no
operation, and when we don't have those, it makes sens to assume
'HEAD'.

So now we can use 'git show @~1', and all that goody goodness.

Until now '@' was a valid name, but it conflicts with this idea, so
let's make it invalid. Probably very few people, if any, used this name.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-08 12:13:12 -07:00
Felipe Contreras
7a0a49a7ca sha1_name: refactor reinterpret()
This code essentially replaces part of ref with another ref, for example
'@{-1}@{u}' is replaced with 'master@{u}', but this can be reused for
other purposes other than nth prior checkouts.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-08 12:13:12 -07:00
Ramkumar Ramachandra
83d16bc7be sha1_name: check @{-N} errors sooner
It's trivial to check for them in the @{N} parsing loop.

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-08 12:13:12 -07:00
Felipe Contreras
128fd54dae sha1_name: reorganize get_sha1_basic()
Through the years the functionality to handle @{-N} and @{u} has moved
around the code, and as a result, code that once made sense, doesn't any
more.

There is no need to call this function recursively with the branch of
@{-N} substituted because dwim_{ref,log} already replaces it.

However, there's one corner-case where @{-N} resolves to a detached
HEAD, in which case we wouldn't get any ref back.

So we parse the nth-prior manually, and deal with it depending on
whether it's a SHA-1, or a ref.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-08 12:13:04 -07:00
Ramkumar Ramachandra
e883a057a9 sha1_name: don't waste cycles in the @-parsing loop
The @-parsing loop unnecessarily checks for the sequence "@{" from
(len - 2) unnecessarily.  We can safely check from (len - 4).

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-08 09:15:37 -07:00
Felipe Contreras
b5f769a0d7 sha1_name: remove unnecessary braces
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-08 09:15:37 -07:00
Felipe Contreras
1ef2d8dacc sha1_name: remove no-op
'at' is always 0, since we can reach this point only if
!len && reflog_len, and len=at when reflog is assigned.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-08 09:15:37 -07:00
Junio C Hamano
1b7b22bfd0 Merge branch 'jc/sha1-name-object-peeler'
There was no good way to ask "I have a random string that came from
outside world. I want to turn it into a 40-hex object name while
making sure such an object exists".  A new peeling suffix ^{object}
can be used for that purpose, together with "rev-parse --verify".

* jc/sha1-name-object-peeler:
  peel_onion(): teach $foo^{object} peeler
  peel_onion: disambiguate to favor tree-ish when we know we want a tree-ish
2013-04-03 09:34:54 -07:00
Junio C Hamano
a6a3f2cc07 peel_onion(): teach $foo^{object} peeler
A string that names an object can be suffixed with ^{type} peeler to
say "I have this object name; peel it until you get this type. If
you cannot do so, it is an error".  v1.8.2^{commit} asks for a commit
that is pointed at an annotated tag v1.8.2; v1.8.2^{tree} unwraps it
further to the top-level tree object.  A special suffix ^{} (i.e. no
type specified) means "I do not care what it unwraps to; just peel
annotated tag until you get something that is not a tag".

When you have a random user-supplied string, you can turn it to a
bare 40-hex object name, and cause it to error out if such an object
does not exist, with:

	git rev-parse --verify "$userstring^{}"

for most objects, but this does not yield the tag object name when
$userstring refers to an annotated tag.

Introduce a new suffix, ^{object}, that only makes sure the given
name refers to an existing object.  Then

	git rev-parse --verify "$userstring^{object}"

becomes a way to make sure $userstring refers to an existing object.

This is necessary because the plumbing "rev-parse --verify" is only
about "make sure the argument is something we can feed to get_sha1()
and turn it into a raw 20-byte object name SHA-1" and is not about
"make sure that 20-byte object name SHA-1 refers to an object that
exists in our object store".  When the given $userstring is already
a 40-hex, by definition "rev-parse --verify $userstring" can turn it
into a raw 20-byte object name.  With "$userstring^{object}", we can
make sure that the 40-hex string names an object that exists in our
object store before "--verify" kicks in.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-31 15:57:42 -07:00
Junio C Hamano
ed1ca6025f peel_onion: disambiguate to favor tree-ish when we know we want a tree-ish
The function already knows when interpreting $foo^{commit} to tell
the underlying get_sha1_1() to expect a commit-ish while evaluating
$foo.  Teach it to do the same when asked for $foo^{tree}; we are
expecting a tree-ish and $foo should be disambiguated in favor of a
tree-ish, discarding a possible ambiguous match with a blob object.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-31 15:19:52 -07:00
Junio C Hamano
6beb484f25 Merge branch 'jc/reflog-reverse-walk'
An internal function used to implement "git checkout @{-1}" was
hard to use correctly.

* jc/reflog-reverse-walk:
  refs.c: fix fread error handling
  reflog: add for_each_reflog_ent_reverse() API
  for_each_recent_reflog_ent(): simplify opening of a reflog file
  for_each_reflog_ent(): extract a helper to process a single entry
2013-03-26 13:15:56 -07:00
René Scharfe
b2981d0622 sha1_name: pass object name length to diagnose_invalid_sha1_path()
The only caller of diagnose_invalid_sha1_path() extracts a substring from
an object name by creating a NUL-terminated copy of the interesting part.
Add a length parameter to the function and thus avoid the need for an
allocation, thereby simplifying the code.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-17 00:10:51 -07:00
Junio C Hamano
98f85ff4b6 reflog: add for_each_reflog_ent_reverse() API
"git checkout -" is a short-hand for "git checkout @{-1}" and the
"@{nth}" notation for a negative number is to find nth previous
checkout in the reflog of the HEAD to determine the name of the
branch the user was on.  We would want to find the nth most recent
reflog entry that matches "checkout: moving from X to Y" for this.

Unfortunately, reflog is implemented as an append-only file, and the
API to iterate over its entries, for_each_reflog_ent(), reads the
file in order, giving the entries from the oldest to newer.  For the
purpose of finding nth most recent one, this API forces us to record
the last n entries in a rotating buffer and give the result out only
after we read everything.  To optimize for a common case of finding
the nth most recent one for a small value of n, we also have a side
API for_each_recent_reflog_ent() that starts reading near the end of
the file, but it still has to read the entries in the "wrong" order.
The implementation of understanding @{-1} uses this interface.

This all becomes unnecessary if we add an API to let us iterate over
reflog entries in the reverse order, from the newest to older.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-08 14:00:22 -08:00
Junio C Hamano
0958a24d73 Merge branch 'jc/sha1-name-more'
Teaches the object name parser things like a "git describe" output
is always a commit object, "A" in "git log A" must be a committish,
and "A" and "B" in "git log A...B" both must be committish, etc., to
prolong the lifetime of abbreviated object names.

* jc/sha1-name-more: (27 commits)
  t1512: match the "other" object names
  t1512: ignore whitespaces in wc -l output
  rev-parse --disambiguate=<prefix>
  rev-parse: A and B in "rev-parse A..B" refer to committish
  reset: the command takes committish
  commit-tree: the command wants a tree and commits
  apply: --build-fake-ancestor expects blobs
  sha1_name.c: add support for disambiguating other types
  revision.c: the "log" family, except for "show", takes committish
  revision.c: allow handle_revision_arg() to take other flags
  sha1_name.c: introduce get_sha1_committish()
  sha1_name.c: teach lookup context to get_sha1_with_context()
  sha1_name.c: many short names can only be committish
  sha1_name.c: get_sha1_1() takes lookup flags
  sha1_name.c: get_describe_name() by definition groks only commits
  sha1_name.c: teach get_short_sha1() a commit-only option
  sha1_name.c: allow get_short_sha1() to take other flags
  get_sha1(): fix error status regression
  sha1_name.c: restructure disambiguation of short names
  sha1_name.c: correct misnamed "canonical" and "res"
  ...
2012-07-22 12:55:07 -07:00
Junio C Hamano
957d74062c rev-parse --disambiguate=<prefix>
The new option allows you to feed an ambiguous prefix and enumerate
all the objects that share it as a prefix of their object names.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-09 16:42:23 -07:00