Commit Graph

1026 Commits

Author SHA1 Message Date
Junio C Hamano
fa2f83c654 Merge branch 'jk/suppress-clang-warning'
* jk/suppress-clang-warning:
  fix clang -Wunused-value warnings for error functions
2013-01-23 21:19:00 -08:00
Junio C Hamano
d82dd26964 Merge branch 'cr/push-force-tag-update'
Regression fix to stop "git push" complaining "target ref already
exists", when it is not the real reason the command rejected the
request (e.g. non-fast-forward).

* cr/push-force-tag-update:
  push: fix "refs/tags/ hierarchy cannot be updated without --force"
2013-01-23 21:16:49 -08:00
Junio C Hamano
256b9d70a4 push: fix "refs/tags/ hierarchy cannot be updated without --force"
When pushing to update a branch with a commit that is not a
descendant of the commit at the tip, a wrong message "already
exists" was given, instead of the correct "non-fast-forward", if we
do not have the object sitting in the destination repository at the
tip of the ref we are updating.

The primary cause of the bug is that the check in a new helper
function is_forwardable() assumed both old and new objects are
available and can be checked, which is not always the case.

The way the caller uses the result of this function is also wrong.
If the helper says "we do not want to let this push go through", the
caller unconditionally translates it into "we blocked it because the
destination already exists", which is not true at all in this case.

Fix this by doing these three things:

 * Remove unnecessary not_forwardable from "struct ref"; it is only
   used inside set_ref_status_for_push();

 * Make "refs/tags/" the only hierarchy that cannot be replaced
   without --force;

 * Remove the misguided attempt to force that everything that
   updates an existing ref has to be a commit outside "refs/tags/"
   hierarchy.

The policy last one tried to implement may later be resurrected and
extended to ensure fast-forwardness (defined as "not losing
objects", extending from the traditional "not losing commits from
the resulting history") when objects that are not commit are
involved (e.g. an annotated tag in hierarchies outside refs/tags),
but such a logic belongs to "is this a fast-forward?" check that is
done by ref_newer(); is_forwardable(), which is now removed, was not
the right place to do so.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-16 13:03:57 -08:00
Max Horn
5ded807f7c fix clang -Wunused-value warnings for error functions
Commit a469a10 wraps some error calls in macros to give the
compiler a chance to do more static analysis on their
constant -1 return value.  We limit the use of these macros
to __GNUC__, since gcc is the primary beneficiary of the new
information, and because we use GNU features for handling
variadic macros.

However, clang also defines __GNUC__, but generates warnings
with -Wunused-value when these macros are used in a void
context, because the constant "-1" ends up being useless.
Gcc does not complain about this case (though it is unclear
if it is because it is smart enough to see what we are
doing, or too dumb to realize that the -1 is unused).  We
can squelch the warning by just disabling these macros when
clang is in use.

Signed-off-by: Max Horn <max@quendi.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-16 12:47:46 -08:00
Junio C Hamano
971e829cd8 Merge branch 'jk/pathspec-literal'
Allow scripts to feed literal paths to commands that take
pathspecs, by disabling wildcard globbing.

* jk/pathspec-literal:
  add global --literal-pathspecs option

Conflicts:
	dir.c
2013-01-05 23:42:07 -08:00
Junio C Hamano
29fb151525 Merge branch 'jk/error-const-return'
Help compilers' flow analysis by making it more explicit that
error() always returns -1, to reduce false "variable used
uninitialized" warnings.  Looks somewhat ugly but not too much.

* jk/error-const-return:
  silence some -Wuninitialized false positives
  make error()'s constant return value more visible
2013-01-05 23:42:00 -08:00
Junio C Hamano
3a3100a889 Merge branch 'jk/mailmap-from-blob'
Allow us to read, and default to read, mailmap files from the tip
of the history in bare repositories.  This will help running tools
like shortlog in server settings.

* jk/mailmap-from-blob:
  mailmap: default mailmap.blob in bare repositories
  mailmap: fix some documentation loose-ends for mailmap.blob
  mailmap: clean up read_mailmap error handling
  mailmap: support reading mailmap from blobs
  mailmap: refactor mailmap parsing for non-file sources
2013-01-05 23:41:42 -08:00
Junio C Hamano
9a2c83d24c Merge branch 'cr/push-force-tag-update'
Require "-f" for push to update a tag, even if it is a fast-forward.

* cr/push-force-tag-update:
  push: allow already-exists advice to be disabled
  push: rename config variable for more general use
  push: cleanup push rules comment
  push: clarify rejection of update to non-commit-ish
  push: require force for annotated tags
  push: require force for refs under refs/tags/
  push: flag updates that require force
  push: keep track of "update" state separately
  push: add advice for rejected tag reference
  push: return reject reasons as a bitset
2013-01-05 23:41:34 -08:00
Junio C Hamano
990a4fea96 Merge branch 'nd/pathspec-wildcard'
Optimize matching paths with common forms of pathspecs that contain
wildcard characters.

* nd/pathspec-wildcard:
  tree_entry_interesting: do basedir compare on wildcard patterns when possible
  pathspec: apply "*.c" optimization from exclude
  pathspec: do exact comparison on the leading non-wildcard part
  pathspec: save the non-wildcard length part
2013-01-05 23:40:15 -08:00
Junio C Hamano
f470e901f2 Merge branch 'mh/ceiling'
An element on GIT_CEILING_DIRECTORIES list that does not name the
real path to a directory (i.e. a symbolic link) could have caused
the GIT_DIR discovery logic to escape the ceiling.

* mh/ceiling:
  string_list_longest_prefix(): remove function
  setup_git_directory_gently_1(): resolve symlinks in ceiling paths
  longest_ancestor_length(): require prefix list entries to be normalized
  longest_ancestor_length(): take a string_list argument for prefixes
  longest_ancestor_length(): use string_list_split()
  Introduce new function real_path_if_valid()
  real_path_internal(): add comment explaining use of cwd
  Introduce new static function real_path_internal()
2013-01-02 10:36:59 -08:00
Jeff King
823ab40fd4 add global --literal-pathspecs option
Git takes pathspec arguments in many places to limit the
scope of an operation. These pathspecs are treated not as
literal paths, but as glob patterns that can be fed to
fnmatch. When a user is giving a specific pattern, this is a
nice feature.

However, when programatically providing pathspecs, it can be
a nuisance. For example, to find the latest revision which
modified "$foo", one can use "git rev-list -- $foo". But if
"$foo" contains glob characters (e.g., "f*"), it will
erroneously match more entries than desired. The caller
needs to quote the characters in $foo, and even then, the
results may not be exactly the same as with a literal
pathspec. For instance, the depth checks in
match_pathspec_depth do not kick in if we match via fnmatch.

This patch introduces a global command-line option (i.e.,
one for "git" itself, not for specific commands) to turn
this behavior off. It also has a matching environment
variable, which can make it easier if you are a script or
porcelain interface that is going to issue many such
commands.

This option cannot turn off globbing for particular
pathspecs. That could eventually be done with a ":(noglob)"
magic pathspec prefix. However, that level of granularity is
more cumbersome to use for many cases, and doing ":(noglob)"
right would mean converting the whole codebase to use
"struct pathspec", as the usual "const char **pathspec"
cannot represent extra per-item flags.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-19 14:58:59 -08:00
Jeff King
a469a10193 silence some -Wuninitialized false positives
There are a few error functions that simply wrap error() and
provide a standardized message text. Like error(), they
always return -1; knowing that can help the compiler silence
some false positive -Wuninitialized warnings.

One strategy would be to just declare these as inline in the
header file so that the compiler can see that they always
return -1. However, gcc does not always inline them (e.g.,
it will not inline opterror, even with -O3), which renders
our change pointless.

Instead, let's follow the same route we did with error() in
the last patch, and define a macro that makes the constant
return value obvious to the compiler.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-15 10:45:59 -08:00
Jeff King
086109006f mailmap: support reading mailmap from blobs
In a bare repository, there isn't a simple way to respect an
in-tree mailmap without extracting it to a temporary file.
This patch provides a config variable, similar to
mailmap.file, which reads the mailmap from a blob in the
repository.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-12 11:12:35 -08:00
Chris Rorvick
dbfeddb12e push: require force for refs under refs/tags/
References are allowed to update from one commit-ish to another if the
former is an ancestor of the latter.  This behavior is oriented to
branches which are expected to move with commits.  Tag references are
expected to be static in a repository, though, thus an update to
something under refs/tags/ should be rejected unless the update is
forced.

Signed-off-by: Chris Rorvick <chris@rorvick.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-02 01:44:34 -08:00
Chris Rorvick
8c5f6f717d push: flag updates that require force
Add a flag for indicating an update to a reference requires force.
Currently the `nonfastforward` flag is used for this when generating the
status message.  A separate flag insulates dependent logic from the
details of set_ref_status_for_push().

Signed-off-by: Chris Rorvick <chris@rorvick.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-02 01:44:15 -08:00
Chris Rorvick
ffe81ef2ac push: keep track of "update" state separately
If the reference exists on the remote and it is not being removed, then
mark as an update.  This is in preparation for handling tags (lightweight
and annotated) exceptionally.

Signed-off-by: Chris Rorvick <chris@rorvick.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-02 01:43:28 -08:00
Chris Rorvick
b24e6047a8 push: add advice for rejected tag reference
Advising the user to fetch and merge only makes sense if the rejected
reference is a branch.  If none of the rejections are for branches, just
tell the user the reference already exists.

Signed-off-by: Chris Rorvick <chris@rorvick.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-02 01:39:50 -08:00
Nguyễn Thái Ngọc Duy
8c6abbcd27 pathspec: apply "*.c" optimization from exclude
When a pattern contains only a single asterisk as wildcard,
e.g. "foo*bar", after literally comparing the leading part "foo" with
the string, we can compare the tail of the string and make sure it
matches "bar", instead of running fnmatch() on "*bar" against the
remainder of the string.

-O2 build on linux-2.6, without the patch:

$ time git rev-list --quiet HEAD -- '*.c'

real    0m40.770s
user    0m40.290s
sys     0m0.256s

With the patch

$ time ~/w/git/git rev-list --quiet HEAD -- '*.c'

real    0m34.288s
user    0m33.997s
sys     0m0.205s

The above command is not supposed to be widely popular. It's chosen
because it exercises pathspec matching a lot. The point is it cuts
down matching time for popular patterns like *.c, which could be used
as pathspec in other places.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-26 11:13:13 -08:00
Nguyễn Thái Ngọc Duy
170260ae90 pathspec: save the non-wildcard length part
We mark pathspec with wildcards with the field use_wildcard. We
could do better by saving the length of the non-wildcard part, which
can be used for optimizations such as f9f6e2c (exclude: do strcmp as
much as possible before fnmatch - 2012-06-07).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-19 13:08:28 -08:00
Jeff King
d6991ceedc ident: keep separate "explicit" flags for author and committer
We keep track of whether the user ident was given to us
explicitly, or if we guessed at it from system parameters
like username and hostname. However, we kept only a single
variable. This covers the common cases (because the author
and committer will usually come from the same explicit
source), but can miss two cases:

  1. GIT_COMMITTER_* is set explicitly, but we fallback for
     GIT_AUTHOR. We claim the ident is explicit, even though
     the author is not.

  2. GIT_AUTHOR_* is set and we ask for author ident, but
     not committer ident. We will claim the ident is
     implicit, even though it is explicit.

This patch uses two variables instead of one, updates both
when we set the "fallback" values, and updates them
individually when we read from the environment.

Rather than keep user_ident_sufficiently_given as a
compatibility wrapper, we update the only two callers to
check the committer_ident, which matches their intent and
what was happening already.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-15 17:47:24 -08:00
Jeff King
452802309c ident: make user_ident_explicitly_given static
In v1.5.6-rc0~56^2 (2008-05-04) "user_ident_explicitly_given"
was introduced as a global for communication between config,
ident, and builtin-commit.  In v1.7.0-rc0~72^2 (2010-01-07)
readers switched to using the common wrapper
user_ident_sufficiently_given().  After v1.7.11-rc1~15^2~18
(2012-05-21), the var is only written in ident.c.

Now we can make it static, which will enable further
refactoring without worrying about upsetting other code.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-15 17:47:24 -08:00
Nguyễn Thái Ngọc Duy
4914c9629c Move setup_diff_pager to libgit.a
This is used by diff-no-index.c, part of libgit.a while it stays in
builtin/diff.c. Move it to diff.c so that we won't get undefined
reference if a program that uses libgit.a happens to pull it in.

While at it, move check_pager from git.c to pager.c. It makes more
sense there and pager.c is also part of libgit.a

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
2012-10-29 03:08:30 -04:00
Nguyễn Thái Ngọc Duy
db699a8a1f Move try_merge_command and checkout_fast_forward to libgit.a
These functions are called in sequencer.c, which is part of
libgit.a. This makes libgit.a potentially require builtin/merge.c for
external git commands.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
2012-10-29 03:08:30 -04:00
Michael Haggerty
31171d9e45 longest_ancestor_length(): take a string_list argument for prefixes
Change longest_ancestor_length() to take the prefixes argument as a
string_list rather than as a colon-separated string.  This will make
it easier for the caller to alter the entries before calling
longest_ancestor_length().

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
2012-10-29 02:34:58 -04:00
Michael Haggerty
e3e46cdbd4 Introduce new function real_path_if_valid()
The function is like real_path(), except that it returns NULL on error
instead of dying.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
2012-10-29 02:34:58 -04:00
Junio C Hamano
dad148c359 ident.c: mark private file-scope symbols as static
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-15 22:58:21 -07:00
Junio C Hamano
cbfb93a12b trace.c: mark a private file-scope symbol as static
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-15 22:58:21 -07:00
Junio C Hamano
357e9c69c9 read-cache.c: mark a private file-scope symbol as static
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-15 22:58:21 -07:00
Junio C Hamano
72f3196a2d symlinks.c: mark private file-scope symbols as static
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-15 22:58:21 -07:00
Junio C Hamano
3e06f5ff38 Merge branch 'jc/maint-config-exit-status'
The exit status code from "git config" was way overspecified while
being incorrect.  Update the implementation to give the documented
status for a case that was documented, and introduce a new code for
"all other errors".

* jc/maint-config-exit-status:
  config: "git config baa" should exit with status 1
2012-09-03 15:53:07 -07:00
Junio C Hamano
97349a2a74 Merge branch 'jc/capabilities'
Some capabilities were asked by fetch-pack even when upload-pack did
not advertise that they are available.  Fix fetch-pack not to do so.

* jc/capabilities:
  fetch-pack: mention server version with verbose output
  parse_feature_request: make it easier to see feature values
  fetch-pack: do not ask for unadvertised capabilities
  do not send client agent unless server does first
  send-pack: fix capability-sending logic
  include agent identifier in capability string
2012-08-29 14:50:07 -07:00
Jeff King
9442710801 parse_feature_request: make it easier to see feature values
We already take care to parse key/value capabilities like
"foo=bar", but the code does not provide a good way of
actually finding out what is on the right-hand side of the
"=".

A server using "parse_feature_request" could accomplish this
with some extra parsing. You must skip past the "key"
portion manually, check for "=" versus NUL or space, and
then find the length by searching for the next space (or
NUL).  But clients can't even do that, since the
"server_supports" interface does not even return the
pointer.

Instead, let's have our parser share more information by
providing a pointer to the value and its length. The
"parse_feature_value" function returns a pointer to the
feature's value portion, along with the length of the value.
If the feature is missing, NULL is returned. If it does not
have an "=", then a zero-length value is returned.

Similarly, "server_feature_value" behaves in the same way,
but always checks the static server_feature_list variable.

We can then implement "server_supports" in terms of
"server_feature_value". We cannot implement the original
"parse_feature_request" in terms of our new function,
because it returned a pointer to the beginning of the
feature. However, no callers actually cared about the value
of the returned pointer, so we can simplify it to a boolean
just as we do for "server_supports".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-13 21:52:36 -07:00
Junio C Hamano
9409c7a5b3 config: "git config baa" should exit with status 1
We instead failed with an undocumented exit status 255.
Also define a "catch-all" status and document it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-30 08:51:26 -07:00
Junio C Hamano
30ea575876 Merge branch 'tg/ce-namelen-field'
Split lower bits of ce_flags field and creates a new ce_namelen
field in the in-core index structure.

* tg/ce-namelen-field:
  Strip namelen out of ce_flags into a ce_namelen field
2012-07-23 20:55:21 -07:00
Junio C Hamano
0958a24d73 Merge branch 'jc/sha1-name-more'
Teaches the object name parser things like a "git describe" output
is always a commit object, "A" in "git log A" must be a committish,
and "A" and "B" in "git log A...B" both must be committish, etc., to
prolong the lifetime of abbreviated object names.

* jc/sha1-name-more: (27 commits)
  t1512: match the "other" object names
  t1512: ignore whitespaces in wc -l output
  rev-parse --disambiguate=<prefix>
  rev-parse: A and B in "rev-parse A..B" refer to committish
  reset: the command takes committish
  commit-tree: the command wants a tree and commits
  apply: --build-fake-ancestor expects blobs
  sha1_name.c: add support for disambiguating other types
  revision.c: the "log" family, except for "show", takes committish
  revision.c: allow handle_revision_arg() to take other flags
  sha1_name.c: introduce get_sha1_committish()
  sha1_name.c: teach lookup context to get_sha1_with_context()
  sha1_name.c: many short names can only be committish
  sha1_name.c: get_sha1_1() takes lookup flags
  sha1_name.c: get_describe_name() by definition groks only commits
  sha1_name.c: teach get_short_sha1() a commit-only option
  sha1_name.c: allow get_short_sha1() to take other flags
  get_sha1(): fix error status regression
  sha1_name.c: restructure disambiguation of short names
  sha1_name.c: correct misnamed "canonical" and "res"
  ...
2012-07-22 12:55:07 -07:00
Junio C Hamano
b856ad623e Merge branch 'tb/sanitize-decomposed-utf-8-pathname'
Teaches git to normalize pathnames read from readdir(3) and all
arguments from the command line into precomposed UTF-8 (assuming
that they come as decomposed UTF-8) to work around issues on Mac OS.

I think there still are other places that need conversion
(e.g. paths that are read from stdin for some commands), but this
should be a good first step in the right direction.

* tb/sanitize-decomposed-utf-8-pathname:
  git on Mac OS and precomposed unicode
2012-07-13 15:37:51 -07:00
Thomas Gummerer
b60e188c51 Strip namelen out of ce_flags into a ce_namelen field
Strip the name length from the ce_flags field and move it
into its own ce_namelen field in struct cache_entry. This
will both give us a tiny bit of a performance enhancement
when working with long pathnames and is a refactoring for
more readability of the code.

It enhances readability, by making it more clear what
is a flag, and where the length is stored and make it clear
which functions use stages in comparisions and which only
use the length.

It also makes CE_NAMEMASK private, so that users don't
mistakenly write the name length in the flags.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-11 09:42:45 -07:00
Junio C Hamano
957d74062c rev-parse --disambiguate=<prefix>
The new option allows you to feed an ambiguous prefix and enumerate
all the objects that share it as a prefix of their object names.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-09 16:42:23 -07:00
Junio C Hamano
daba53aeaf sha1_name.c: add support for disambiguating other types
This teaches the revision parser that in "$name:$path" (used for a
blob object name), "$name" must be a tree-ish.

There are many more places where we know what types of objects are
called for.  This patch adds support for "commit", "treeish", "tree",
and "blob", which could be used in the following contexts:

 - "git apply --build-fake-ancestor" reads the "index" lines from
   the patch; they must name blob objects (not even "blob-ish");

 - "git commit-tree" reads a tree object name (not "tree-ish"), and
   zero or more commit object names (not "committish");

 - "git reset $rev" wants a committish; "git reset $rev -- $path"
   wants a treeish.

They will come in later patches in the series.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-09 16:42:22 -07:00
Junio C Hamano
cd74e4733d sha1_name.c: introduce get_sha1_committish()
Many callers know that the user meant to name a committish by
syntactical positions where the object name appears.  Calling this
function allows the machinery to disambiguate shorter-than-unique
abbreviated object names between committish and others.

Note that this does NOT error out when the named object is not a
committish. It is merely to give a hint to the disambiguation
machinery.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-09 16:42:22 -07:00
Junio C Hamano
33bd598c39 sha1_name.c: teach lookup context to get_sha1_with_context()
The function takes user input string and returns the object name
(binary SHA-1) with mode bits and path when the object was looked
up in a tree.

Additionally give hints to help disambiguation of abbreviated object
names when the caller knows what it is looking for.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-09 16:42:22 -07:00
Junio C Hamano
e2643617d7 sha1_name.c: many short names can only be committish
We know that the token "$name" that appear in "$name^{commit}",
"$name^4", "$name~4" etc. can only name a committish (either a
commit or a tag that peels to a commit).  Teach get_short_sha1() to
take advantage of that knowledge when disambiguating an abbreviated
SHA-1 given as an object name.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-09 16:42:22 -07:00
Junio C Hamano
d02d7ac303 Merge branch 'mm/config-xdg'
Teach git to read various information from $XDG_CONFIG_HOME/git/ to allow
the user to avoid cluttering $HOME.

* mm/config-xdg:
  config: write to $XDG_CONFIG_HOME/git/config file when appropriate
  Let core.attributesfile default to $XDG_CONFIG_HOME/git/attributes
  Let core.excludesfile default to $XDG_CONFIG_HOME/git/ignore
  config: read (but not write) from $XDG_CONFIG_HOME/git/config file
2012-07-09 09:00:36 -07:00
Torsten Bögershausen
76759c7dff git on Mac OS and precomposed unicode
Mac OS X mangles file names containing unicode on file systems HFS+,
VFAT or SAMBA.  When a file using unicode code points outside ASCII
is created on a HFS+ drive, the file name is converted into
decomposed unicode and written to disk. No conversion is done if
the file name is already decomposed unicode.

Calling open("\xc3\x84", ...) with a precomposed "Ä" yields the same
result as open("\x41\xcc\x88",...) with a decomposed "Ä".

As a consequence, readdir() returns the file names in decomposed
unicode, even if the user expects precomposed unicode.  Unlike on
HFS+, Mac OS X stores files on a VFAT drive (e.g. an USB drive) in
precomposed unicode, but readdir() still returns file names in
decomposed unicode.  When a git repository is stored on a network
share using SAMBA, file names are send over the wire and written to
disk on the remote system in precomposed unicode, but Mac OS X
readdir() returns decomposed unicode to be compatible with its
behaviour on HFS+ and VFAT.

The unicode decomposition causes many problems:

- The names "git add" and other commands get from the end user may
  often be precomposed form (the decomposed form is not easily input
  from the keyboard), but when the commands read from the filesystem
  to see what it is going to update the index with already is on the
  filesystem, readdir() will give decomposed form, which is different.

- Similarly "git log", "git mv" and all other commands that need to
  compare pathnames found on the command line (often but not always
  precomposed form; a command line input resulting from globbing may
  be in decomposed) with pathnames found in the tree objects (should
  be precomposed form to be compatible with other systems and for
  consistency in general).

- The same for names stored in the index, which should be
  precomposed, that may need to be compared with the names read from
  readdir().

NFS mounted from Linux is fully transparent and does not suffer from
the above.

As Mac OS X treats precomposed and decomposed file names as equal,
we can

 - wrap readdir() on Mac OS X to return the precomposed form, and

 - normalize decomposed form given from the command line also to the
   precomposed form,

to ensure that all pathnames used in Git are always in the
precomposed form.  This behaviour can be requested by setting
"core.precomposedunicode" configuration variable to true.

The code in compat/precomposed_utf8.c implements basically 4 new
functions: precomposed_utf8_opendir(), precomposed_utf8_readdir(),
precomposed_utf8_closedir() and precompose_argv().  The first three
are to wrap opendir(3), readdir(3), and closedir(3) functions.

The argv[] conversion allows to use the TAB filename completion done
by the shell on command line.  It tolerates other tools which use
readdir() to feed decomposed file names into git.

When creating a new git repository with "git init" or "git clone",
"core.precomposedunicode" will be set "false".

The user needs to activate this feature manually.  She typically
sets core.precomposedunicode to "true" on HFS and VFAT, or file
systems mounted via SAMBA.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-08 22:03:46 -07:00
Junio C Hamano
60ad08bfdf Merge branch 'th/diff-no-index-fixes'
"git diff --no-index" did not correctly handle relative paths and
did not give correct exit codes when run under "--quiet" option.

* th/diff-no-index-fixes:
  diff-no-index: exit(1) if 'diff --quiet <repo file> <external file>' finds changes
  diff: handle relative paths in no-index
2012-07-04 23:40:38 -07:00
Junio C Hamano
aa1dec9ef6 sha1_name.c: teach get_short_sha1() a commit-only option
When the caller knows that the parameter is meant to name a commit,
e.g. "56789a" in describe name "v1.2.3-4-g56789a", pass that as a
hint so that lower level can use it to disambiguate objects when
there is only one commit whose name begins with 56789a even if there
are objects of other types whose names share the same prefix.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-03 11:17:59 -07:00
Junio C Hamano
37c00e5590 sha1_name.c: allow get_short_sha1() to take other flags
Instead of a separate "int quietly" argument, make it take "unsigned
flags" so that we can pass other options to it.

The bit assignment of this flag word is exposed in cache.h because
the mechanism will be exposed to callers of the higher layer in
later commits in this series.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-03 11:17:59 -07:00
Junio C Hamano
249c8f4a16 sha1_name.c: get rid of get_sha1_with_mode()
There are only two callers, and they will benefit from being able to
pass disambiguation hints to underlying get_sha1_with_context() API
once it happens.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-03 10:24:11 -07:00
Junio C Hamano
8c135ea260 sha1_name.c: get rid of get_sha1_with_mode_1()
The only external caller is setup.c that tries to give a nicer error
message when an object name is misspelt (e.g. "HEAD:cashe.h").
Retire it and give the caller a dedicated and more intuitive API
function maybe_die_on_misspelt_object_name().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-03 10:22:37 -07:00
Junio C Hamano
f01cc14c3c sha1_name.c: hide get_sha1_with_context_1() ugliness
There is no outside caller that cares about the "only-to-die" ugliness.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-02 11:22:57 -07:00