Commit Graph

179 Commits

Author SHA1 Message Date
Duy Nguyen
9db9eecfe5 attr: avoid calling find_basename() twice per path
find_basename() is only used inside collect_all_attrs(), called once
in prepare_attr_stack, then again after prepare_attr_stack()
returns. Both calls return exact same value. Reorder the code to do
the same task once. Also avoid strlen() because we knows the length
after finding basename.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-16 11:08:55 -08:00
Nguyễn Thái Ngọc Duy
712efb1a42 attr: make it build with DEBUG_ATTR again
Commit 82dce99 (attr: more matching optimizations from .gitignore -
2012-10-15) changed match_attr structure but it did not update
DEBUG_ATTR-specific code. This fixes it.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-15 10:02:08 -08:00
Nguyễn Thái Ngọc Duy
711536bd4b attr: fix off-by-one directory component length calculation
94bc671 (Add directory pattern matching to attributes - 2012-12-08)
uses find_basename() to calculate the length of directory part in
prepare_attr_stack. This function expects the directory without the
trailing slash (as "origin" field in match_attr struct is without the
trailing slash). find_basename() includes the trailing slash and
confuses push/pop algorithm.

Consider path = "abc/def" and the push down code:

	while (1) {
		len = strlen(attr_stack->origin);
		if (dirlen <= len)
			break;
		cp = memchr(path + len + 1, '/', dirlen - len - 1);
		if (!cp)
			cp = path + dirlen;

dirlen is 4, not 3, without this patch. So when attr_stack->origin is
"abc", it'll miss the exit condition because 4 <= 3 is wrong. It'll
then try to push "abc/" down the attr stack (because "cp" would be
NULL). So we have both "abc" and "abc/" in the stack.

Next time when "abc/ghi" is checked, "abc/" is popped out because of
the off-by-one dirlen, only to be pushed back in again by the above
code. This repeats for all files in the same directory. Which means
at least one failed open syscall per file, or more if .gitattributes
exists.

This is the perf result with 10 runs on git.git:

Test                                     94bc671^          94bc671                   HEAD
----------------------------------------------------------------------------------------------------------
7810.1: grep worktree, cheap regex       0.02(0.01+0.04)   0.05(0.03+0.05) +150.0%   0.02(0.01+0.04) +0.0%
7810.2: grep worktree, expensive regex   0.25(0.94+0.01)   0.26(0.94+0.02) +4.0%     0.25(0.93+0.02) +0.0%
7810.3: grep --cached, cheap regex       0.11(0.10+0.00)   0.12(0.10+0.02) +9.1%     0.10(0.10+0.00) -9.1%
7810.4: grep --cached, expensive regex   0.61(0.60+0.01)   0.62(0.61+0.01) +1.6%     0.61(0.60+0.00) +0.0%

Reported-by: Ross Lagerwall <rosslagerwall@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-15 08:17:23 -08:00
Junio C Hamano
d912b0e44f Merge branch 'as/dir-c-cleanup'
Refactor and generally clean up the directory traversal API
implementation.

* as/dir-c-cleanup:
  dir.c: rename free_excludes() to clear_exclude_list()
  dir.c: refactor is_path_excluded()
  dir.c: refactor is_excluded()
  dir.c: refactor is_excluded_from_list()
  dir.c: rename excluded() to is_excluded()
  dir.c: rename excluded_from_list() to is_excluded_from_list()
  dir.c: rename path_excluded() to is_path_excluded()
  dir.c: rename cryptic 'which' variable to more consistent name
  Improve documentation and comments regarding directory traversal API
  api-directory-listing.txt: update to match code
2013-01-10 13:47:25 -08:00
Adam Spiers
6d24e7a807 dir.c: rename excluded() to is_excluded()
Continue adopting clearer names for exclude functions.  This is_*
naming pattern for functions returning booleans was discussed here:

http://thread.gmane.org/gmane.comp.version-control.git/204661/focus=204924

Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-28 12:07:46 -08:00
Jean-Noël AVILA
94bc671a1f Add directory pattern matching to attributes
The manpage of gitattributes says: "The rules how the pattern
matches paths are the same as in .gitignore files" and the gitignore
pattern matching has a pattern ending with / for directory matching.

This rule is specifically relevant for the 'export-ignore' rule used
for git archive.

Signed-off-by: Jean-Noel Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-17 22:07:23 -08:00
Jeff King
5f836422ab Merge branch 'nd/attr-match-optim-more'
Start laying the foundation to build the "wildmatch" after we can
agree on its desired semantics.

* nd/attr-match-optim-more:
  attr: more matching optimizations from .gitignore
  gitignore: make pattern parsing code a separate function
  exclude: split pathname matching code into a separate function
  exclude: fix a bug in prefix compare optimization
  exclude: split basename matching code into a separate function
  exclude: stricten a length check in EXC_FLAG_ENDSWITH case
2012-11-09 12:42:25 -05:00
Jeff King
70d1825749 Merge branch 'nd/attr-match-optim'
Trivial and obvious optimization for finding attributes that match
a given path.

* nd/attr-match-optim:
  attr: avoid searching for basename on every match
  attr: avoid strlen() on every match
2012-10-25 06:42:36 -04:00
Nguyễn Thái Ngọc Duy
82dce998c2 attr: more matching optimizations from .gitignore
.gitattributes and .gitignore share the same pattern syntax but has
separate matching implementation. Over the years, ignore's
implementation accumulates more optimizations while attr's stays the
same.

This patch reuses the core matching functions that are also used by
excluded_from_list. excluded_from_list and path_matches can't be
merged due to differences in exclude and attr, for example:

* "!pattern" syntax is forbidden in .gitattributes.  As an attribute
  can be unset (i.e. set to a special value "false") or made back to
  unspecified (i.e. not even set to "false"), "!pattern attr" is unclear
  which one it means.

* we support attaching attributes to directories, but git-core
  internally does not currently make use of attributes on
  directories.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-15 14:57:17 -07:00
Nguyễn Thái Ngọc Duy
4742d136e2 attr: avoid searching for basename on every match
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-05 12:27:48 -07:00
Nguyễn Thái Ngọc Duy
cd6a0b265e attr: avoid strlen() on every match
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-05 12:27:35 -07:00
Junio C Hamano
d6fb62474e Merge branch 'jk/config-warn-on-inaccessible-paths'
The attribute system may be asked for a path that itself or its
leading directories no longer exists in the working tree.  Failure
to open per-directory .gitattributes with error status other than
ENOENT and ENOTDIR are diagnosed.

* jk/config-warn-on-inaccessible-paths:
  attr: failure to open a .gitattributes file is OK with ENOTDIR
2012-09-17 15:55:41 -07:00
Junio C Hamano
e6d29a4b47 Merge branch 'jc/ll-merge-binary-ours'
"git merge -Xtheirs" did not help content-level merge of binary
files; it should just take their version.  Also "*.jpg binary" in
the attributes did not imply they should use the binary ll-merge
driver.

* jc/ll-merge-binary-ours:
  ll-merge: warn about inability to merge binary files only when we can't
  attr: "binary" attribute should choose built-in "binary" merge driver
  merge: teach -Xours/-Xtheirs to binary ll-merge driver
2012-09-14 21:39:56 -07:00
Junio C Hamano
8e950dab86 attr: failure to open a .gitattributes file is OK with ENOTDIR
Often we consult an in-tree .gitattributes file that exists per
directory.  Majority of directories do not usually have such a file,
and it is perfectly fine if we cannot open it because there is no
such file, but we do want to know when there is an I/O or permission
error.  Earlier, we made the codepath warn when we fail to open it
for reasons other than ENOENT for that reason.

We however sometimes have to attempt to open the .gitattributes file
from a directory that does not exist in the commit that is currently
checked out.  "git pack-objects" wants to know if a path is marked
with "-delta" attributes, and "git archive" wants to know about
export-ignore and export-subst attributes.  Both commands may and do
need to ask the attributes system about paths in an arbitrary
commit.  "git diff", after removing an entire directory, may want to
know textconv on paths that used to be in that directory.

Make sure we also ignore a failure to open per-directory attributes
file due to ENOTDIR.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-13 22:15:10 -07:00
Junio C Hamano
155a4b712e attr: "binary" attribute should choose built-in "binary" merge driver
The built-in "binary" attribute macro expands to "-diff -text", so
that textual diff is not produced, and the contents will not go
through any CR/LF conversion ever.  During a merge, it should also
choose the "binary" low-level merge driver, but it didn't.

Make it expand to "-diff -merge -text".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-08 21:28:55 -07:00
Junio C Hamano
55b38a48e2 warn_on_inaccessible(): a helper to warn on inaccessible paths
The previous series introduced warnings to multiple places, but it
could become tiring to see the warning on the same path over and
over again during a single run of Git.  Making just one function
responsible for issuing this warning, we could later choose to keep
track of which paths we issued a warning (it would involve a hash
table of paths after running them through real_path() or something)
in order to reduce noise.

Right now we do not know if the noise reduction is necessary, but it
still would be a good code reduction/sharing anyway.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-21 14:52:07 -07:00
Jeff King
11e50b2736 attr: warn on inaccessible attribute files
Just like config and gitignore files, we silently ignore
missing or inaccessible attribute files. An existent but
inaccessible file is probably a configuration error, so
let's warn the user.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-21 14:47:07 -07:00
Jeff King
f0c1c15c41 attr: make sure we have an xdg path before using it
If we don't have a core.attributesfile configured, we fall
back to checking XDG config, which is usually
$HOME/.config/git/attributes.

However, if $HOME is unset, then home_config_paths will return
NULL, and we end up calling fopen(NULL).

Depending on your system, this may or may not cause the
accompanying test to fail (e.g., on Linux and glibc, the
address will go straight to open, which will return EFAULT).
However, valgrind will reliably notice the error.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-24 08:59:07 -07:00
Huynh Khoi Nguyen Nguyen
684e40f657 Let core.attributesfile default to $XDG_CONFIG_HOME/git/attributes
This gives the default value for the core.attributesfile variable
following the exact same logic of the previous change for the
core.excludesfile setting.

Signed-off-by: Huynh Khoi Nguyen Nguyen <Huynh-Khoi-Nguyen.Nguyen@ensimag.imag.fr>
Signed-off-by: Valentin Duperray <Valentin.Duperray@ensimag.imag.fr>
Signed-off-by: Franck Jonas <Franck.Jonas@ensimag.imag.fr>
Signed-off-by: Lucien Kong <Lucien.Kong@ensimag.imag.fr>
Signed-off-by: Thomas Nguy <Thomas.Nguy@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-06-25 09:06:15 -07:00
Junio C Hamano
afb6b561e3 Merge branch 'maint-1.7.6' into maint-1.7.7
* maint-1.7.6:
  attr: fix leak in free_attr_elem
  t2203: fix wrong commit command
2012-01-11 19:11:00 -08:00
Jeff King
37475f97d1 attr: fix leak in free_attr_elem
This function frees the individual "struct match_attr"s we
have allocated, but forgot to free the array holding their
pointers, leading to a minor memory leak (but it can add up
after checking attributes for paths in many directories).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-11 19:07:23 -08:00
Junio C Hamano
6c65b5ea43 Merge the attributes fix in from maint-1.6.6 branch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-10 14:14:26 -08:00
Junio C Hamano
c432ef996e attr.c: clarify the logic to pop attr_stack
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-10 12:28:38 -08:00
Junio C Hamano
909ca7b9ac attr.c: make bootstrap_attr_stack() leave early
Thas would de-dent the body of a function that has grown rather large over
time, making it a bit easier to read.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-10 12:27:37 -08:00
Jeff King
77f7f82288 attr: drop misguided defensive coding
In prepare_attr_stack, we pop the old elements of the stack
(which were left from a previous lookup and may or may not
be useful to us). Our loop to do so checks that we never
reach the top of the stack. However, the code immediately
afterwards will segfault if we did actually reach the top of
the stack.

Fortunately, this is not an actual bug, since we will never
pop all of the stack elements (we will always keep the root
gitattributes, as well as the builtin ones). So the extra
check in the loop condition simply clutters the code and
makes the intent less clear. Let's get rid of it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-10 11:55:27 -08:00
Jeff King
1afca44495 attr: don't confuse prefixes with leading directories
When we prepare the attribute stack for a lookup on a path,
we start with the cached stack from the previous lookup
(because it is common to do several lookups in the same
directory hierarchy). So the first thing we must do in
preparing the stack is to pop any entries that point to
directories we are no longer interested in.

For example, if our stack contains gitattributes for:

  foo/bar/baz
  foo/bar
  foo

but we want to do a lookup in "foo/bar/bleep", then we want
to pop the top element, but retain the others.

To do this we walk down the stack from the top, popping
elements that do not match our lookup directory. However,
the test do this simply checked strncmp, meaning we would
mistake "foo/bar/baz" as a leading directory of
"foo/bar/baz_plus". We must also check that the character
after our match is '/', meaning we matched the whole path
component.

There are two special cases to consider:

  1. The top of our attr stack has the empty path. So we
     must not check for '/', but rather special-case the
     empty path, which always matches.

  2. Typically when matching paths in this way, you would
     also need to check for a full string match (i.e., the
     character after is '\0'). We don't need to do so in
     this case, though, because our path string is actually
     just the directory component of the path to a file
     (i.e., we know that it terminates with "/", because the
     filename comes after that).

Helped-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-10 11:25:40 -08:00
Brandon Casey
6eba6210d9 attr.c: respect core.ignorecase when matching attribute patterns
When core.ignorecase is true, the file globs configured in the
.gitattributes file should be matched case-insensitively against the paths
in the working directory.  Let's do so.

Plus, add some tests.

The last set of tests is performed only on a case-insensitive filesystem.
Those tests make sure that git handles the case where the .gitignore file
resides in a subdirectory and the user supplies a path that does not match
the case in the filesystem.  In that case^H^H^H^Hsituation, part of the
path supplied by the user is effectively interpreted case-insensitively,
and part of it is dependent on the setting of core.ignorecase.  git will
currently only match the portion of the path below the directory holding
the .gitignore file according to the setting of core.ignorecase.

This is also partly future-proofing.  Currently, git builds the attr stack
based on the path supplied by the user, so we don't have to do anything
special (like use strcmp_icase) to handle the parts of that path that don't
match the filesystem with respect to case.  If git instead built the attr
stack by scanning the repository, then the paths in the origin field would
not necessarily match the paths supplied by the user.  If someone makes a
change like that in the future, these tests will notice.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-11 09:43:05 -07:00
Junio C Hamano
64589a03a8 attr: read core.attributesfile from git_default_core_config
This code calls git_config from a helper function to parse the config entry
it is interested in.  Calling git_config in this way may cause a problem if
the helper function can be called after a previous call to git_config by
another function since the second call to git_config may reset some
variable to the value in the config file which was previously overridden.

The above is not a problem in this case since the function passed to
git_config only parses one config entry and the variable it sets is not
assigned outside of the parsing function.  But a programmer who desires
all of the standard config options to be parsed may be tempted to modify
git_attr_config() so that it falls back to git_default_config() and then it
_would_ be vulnerable to the above described behavior.

So, move the call to git_config up into the top-level cmd_* function and
move the responsibility for parsing core.attributesfile into the main
config file parser.

Which is only the logical thing to do ;-)

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-06 13:54:32 -07:00
Brandon Casey
040a655116 cleanup: use internal memory allocation wrapper functions everywhere
The "x"-prefixed versions of strdup, malloc, etc. will check whether the
allocation was successful and terminate the process otherwise.

A few uses of malloc were left alone since they already implemented a
graceful path of failure or were in a quasi external library like xdiff.

Additionally, the call to malloc in compat/win32/syslog.c was not modified
since the syslog() implemented there is a die handler and a call to the
x-wrappers within a die handler could result in recursion should memory
allocation fail.  This will have to be addressed separately.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-06 13:54:32 -07:00
Brandon Casey
97410b27e9 attr.c: avoid inappropriate access to strbuf "buf" member
This code sequence performs a strcpy into the buf member of a strbuf
struct.  The strcpy may move the position of the terminating nul of the
string and effectively change the length of string so that it does not
match the len member of the strbuf struct.

Currently, this sequence works since the strbuf was given a hint when it
was initialized to allocate enough space to accomodate the string that will
be strcpy'ed, but this is an implementation detail of strbufs, not a
guarantee.

So, lets rework this sequence so that the strbuf is only manipulated by
strbuf functions, and direct modification of its "buf" member is not
necessary.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-06 13:54:31 -07:00
Junio C Hamano
e5cfcb04e0 Merge branch 'mh/attr'
* mh/attr:
  Unroll the loop over passes
  Change while loop into for loop
  Determine the start of the states outside of the pass loop
  Change parse_attr() to take a pointer to struct attr_state
  Increment num_attr in parse_attr_line(), not parse_attr()
  Document struct match_attr
  Add a file comment
2011-08-28 21:19:12 -07:00
Michael Haggerty
d68e1c183c Unroll the loop over passes
The passes no longer share much code, and the unrolled code is easier
to understand.

Use a new index variable instead of num_attr for the second loop, as
we are no longer counting attributes but rather indexing through them.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-14 15:02:01 -07:00
Michael Haggerty
e0a5f9aaae Change while loop into for loop
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-14 15:02:00 -07:00
Michael Haggerty
85c4a0d048 Determine the start of the states outside of the pass loop
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-14 15:02:00 -07:00
Michael Haggerty
d175129857 Change parse_attr() to take a pointer to struct attr_state
parse_attr() only needs access to the attr_state to which it should
store its results, not to the whole match_attr structure.  This change
also removes the need for it to know num_attr.  Change its signature
accordingly and add a comment.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-14 15:01:59 -07:00
Michael Haggerty
4c7517c9cc Increment num_attr in parse_attr_line(), not parse_attr()
num_attr is incremented iff parse_attr() returns non-NULL.  So do the
counting in parse_attr_line() instead of within parse_attr().  This
allows an integer rather than a pointer to an integer to be passed to
parse_attr().

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-14 15:01:58 -07:00
Michael Haggerty
ba845b7550 Document struct match_attr
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-14 15:01:58 -07:00
Michael Haggerty
86ab7f0cca Add a file comment
Consolidate here a few general comments plus links to other
documentation.  Delete a comment with an out-of-date description of
the .gitattributes file format.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-14 15:01:57 -07:00
Michael Haggerty
d932f4eb9f Rename git_checkattr() to git_check_attr()
Suggested by: Junio Hamano <gitster@pobox.com>

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 15:53:21 -07:00
Michael Haggerty
ee548df300 Allow querying all attributes on a file
Add a function, git_all_attrs(), that reports on all attributes that
are set on a path.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 15:53:18 -07:00
Michael Haggerty
7373eab48e Remove redundant check
bootstrap_attr_stack() also checks whether attr_stack is already set.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 15:53:17 -07:00
Michael Haggerty
cd93bffb91 Remove redundant call to bootstrap_attr_stack()
prepare_attr_stack() does the same thing.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 15:53:17 -07:00
Michael Haggerty
2d72174492 Extract a function collect_all_attrs()
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 15:53:17 -07:00
Michael Haggerty
a872701755 Teach prepare_attr_stack() to figure out dirlen itself
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 15:53:17 -07:00
Michael Haggerty
352404ac4c Provide access to the name attribute of git_attr
It will be present in any likely future reimplementation, and its
availability simplifies other code.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 15:53:16 -07:00
Michael Haggerty
c0b13b21b8 Disallow the empty string as an attribute name
Previously, it was possible to have a line like "file.txt =foo" in a
.gitattribute file, after which an invocation like "git check-attr ''
-- file.txt" would succeed.  This patch disallows both constructs.

Please note that any existing .gitattributes file that tries to set an
empty attribute will now trigger the error message "error: : not a
valid attribute name" whereas previously the nonsense was allowed
through.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 15:53:15 -07:00
Michael Haggerty
d42453ab1a Remove anachronism from comment
Setting attributes to arbitrary values ("attribute=value") is now
supported, so it is no longer necessary for this comment to justify
prohibiting '=' in an attribute name.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 15:53:15 -07:00
Ramsay Jones
c51477229e sparse: Fix some "symbol not declared" warnings
In particular, sparse issues the "symbol 'a_symbol' was not declared.
Should it be static?" warnings for the following symbols:

    attr.c:468:12: 'git_etc_gitattributes'
    attr.c:476:5:  'git_attr_system'
    vcs-svn/svndump.c:282:6: 'svndump_read'
    vcs-svn/svndump.c:417:5: 'svndump_init'
    vcs-svn/svndump.c:432:6: 'svndump_deinit'
    vcs-svn/svndump.c:445:6: 'svndump_reset'

The symbols in attr.c only require file scope, so we add the static
modifier to their declaration.

The symbols in vcs-svn/svndump.c are external symbols, and they
already have extern declarations in the "svndump.h" header file,
so we simply include the header in svndump.c.

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-22 10:04:27 -07:00
Jonathan Nieder
e91b6c5049 gitattributes: drop support for GIT_ATTR_NOGLOBAL
test-lib sets $HOME to protect against pollution from user settings,
so setting GIT_ATTR_NOGLOBAL would be redundant.  Simplify by
eliminating support for that environment variable altogether.

GIT_ATTR_NOGLOBAL was introduced in v1.7.4-rc0~208^2 (Add global and
system-wide gitattributes, 2010-09-01) as an undocumented feature for
use by the test suite.  It never ended up being used (neither within
git.git nor in other projects).

This patch does not affect GIT_ATTR_NOSYSTEM, which should still be
useful.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-15 12:23:29 -07:00
Petr Onderka
6df42ab984 Add global and system-wide gitattributes
Allow gitattributes to be set globally and system wide. This way, settings
for particular file types can be set in one place and apply for all user's
repositories.

The location of system-wide attributes file is $(prefix)/etc/gitattributes.
The location of the global file can be configured by setting
core.attributesfile.

Some parts of the code were copied from the implementation of the same
functionality in config.c.

Signed-off-by: Petr Onderka <gsvick@gmail.com>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-01 12:19:36 -07:00
Junio C Hamano
d5cff17eda Merge branch 'eb/core-eol'
* eb/core-eol:
  Add "core.eol" config variable
  Rename the "crlf" attribute "text"
  Add per-repository eol normalization
  Add tests for per-repository eol normalization

Conflicts:
	Documentation/config.txt
	Makefile
2010-06-21 06:02:49 -07:00
Eyvind Bernhardsen
5ec3e67052 Rename the "crlf" attribute "text"
As discussed on the list, "crlf" is not an optimal name.  Linus
suggested "text", which is much better.

Signed-off-by: Eyvind Bernhardsen <eyvind.bernhardsen@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-19 20:42:34 -07:00
Henrik Grubbström
ec775c41dc attr: Expand macros immediately when encountered.
When using macros it is otherwise hard to know whether an
attribute set by the macro should override an already set
attribute. Consider the following .gitattributes file:

[attr]mybinary	binary -ident
*		ident
foo.bin		mybinary
bar.bin		mybinary ident

Without this patch both foo.bin and bar.bin will have
the ident attribute set, which is probably not what
the user expects. With this patch foo.bin will have an
unset ident attribute, while bar.bin will have it set.

Signed-off-by: Henrik Grubbström <grubba@grubba.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-10 18:36:00 -07:00
Henrik Grubbström
969f9d7322 attr: Allow multiple changes to an attribute on the same line.
When using macros it isn't inconceivable to have an attribute
being set by a macro, and then being reset explicitly.

Signed-off-by: Henrik Grubbström <grubba@grubba.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-10 18:36:00 -07:00
Henrik Grubbström
426c27b7c0 attr: Fixed debug output for macro expansion.
When debug_set() was called during macro expansion, it
received a pointer to a struct git_attr rather than a
string.

Signed-off-by: Henrik Grubbström <grubba@grubba.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-10 18:35:59 -07:00
Junio C Hamano
7fb0eaa289 git_attr(): fix function signature
The function took (name, namelen) as its arguments, but all the public
callers wanted to pass a full string.

Demote the counted-string interface to an internal API status, and allow
public callers to just pass the string to the function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-16 20:39:59 -08:00
René Scharfe
d4c985653a attr: plug minor memory leak
Free the memory allocated for struct strbuf pathbuf when we're done.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-30 16:12:24 -07:00
Linus Torvalds
48fb7deb5b Fix big left-shifts of unsigned char
Shifting 'unsigned char' or 'unsigned short' left can result in sign
extension errors, since the C integer promotion rules means that the
unsigned char/short will get implicitly promoted to a signed 'int' due to
the shift (or due to other operations).

This normally doesn't matter, but if you shift things up sufficiently, it
will now set the sign bit in 'int', and a subsequent cast to a bigger type
(eg 'long' or 'unsigned long') will now sign-extend the value despite the
original expression being unsigned.

One example of this would be something like

	unsigned long size;
	unsigned char c;

	size += c << 24;

where despite all the variables being unsigned, 'c << 24' ends up being a
signed entity, and will get sign-extended when then doing the addition in
an 'unsigned long' type.

Since git uses 'unsigned char' pointers extensively, we actually have this
bug in a couple of places.

I may have missed some, but this is the result of looking at

	git grep '[^0-9 	][ 	]*<<[ 	][a-z]' -- '*.c' '*.h'
	git grep '<<[   ]*24'

which catches at least the common byte cases (shifting variables by a
variable amount, and shifting by 24 bits).

I also grepped for just 'unsigned char' variables in general, and
converted the ones that most obviously ended up getting implicitly cast
immediately anyway (eg hash_name(), encode_85()).

In addition to just avoiding 'unsigned char', this patch also tries to use
a common idiom for the delta header size thing. We had three different
variations on it: "& 0x7fUL" in one place (getting the sign extension
right), and "& ~0x80" and "& 0x7f" in two other places (not getting it
right). Apart from making them all just avoid using "unsigned char" at
all, I also unified them to then use a simple "& 0x7f".

I considered making a sparse extension which warns about doing implicit
casts from unsigned types to signed types, but it gets rather complex very
quickly, so this is just a hack.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-18 09:22:46 -07:00
Felipe Contreras
4b25d091ba Fix a bunch of pointer declarations (codestyle)
Essentially; s/type* /type */ as per the coding guidelines.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-01 15:17:31 -07:00
Nguyễn Thái Ngọc Duy
4191e80a3e attr: add GIT_ATTR_INDEX "direction"
This instructs attr mechanism, not to look into working .gitattributes
at all. Needed by tools that does not handle working directory, such
as "git archive".

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-17 21:05:49 -07:00
Junio C Hamano
06f33c1735 Read attributes from the index that is being checked out
Traditionally we used .gitattributes file from the work tree if exists,
and otherwise read from the index as a fallback.  When switching to a
branch that has an updated .gitattributes file, and entries in it give
different attributes to other paths being checked out, we should instead
read from the .gitattributes in the index.

This breaks a use case of fixing incorrect entries in the .gitattributes
in the work tree (without adding it to the index) and checking other paths
out, though.

    $ edit .gitattributes ;# mark foo.dat as binary
    $ rm foo.dat
    $ git checkout foo.dat

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-13 22:51:43 -07:00
Dmitry Potapov
f66cf96d7c Fix buffer overflow in prepare_attr_stack
If PATH_MAX on your system is smaller than a path stored in the git repo,
it may cause the buffer overflow in prepare_attr_stack.

Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-16 14:05:50 -07:00
René Scharfe
2d35d556e2 Ignore .gitattributes in bare repositories
Attributes can be specified at three different places: the internal
table of default values, the file $GIT_DIR/info/attributes and files
named .gitattributes in the work tree.  Since bare repositories don't
have a work tree, git should ignore any .gitattributes files there.

This patch makes git do that, so the only way left for a user to specify
attributes in a bare repository is the file info/attributes (in addition
to changing the defaults and recompiling).

In addition, git-check-attr is now allowed to run without a work tree.
Like any user of the code in attr.c, it ignores the .gitattributes files
when run in a bare repository.  It can still read from info/attributes.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-09 15:08:26 -07:00
Matthew Ogilvie
82881b3823 gitattributes: Fix subdirectory attributes specified from root directory
Signed-off-by: Matthew Ogilvie <mmogilvi_git@miniinfo.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-22 21:12:37 -07:00
Junio C Hamano
cf94ccda35 gitattributes: fix relative path matching
There was an embarrassing pair of off-by-one miscounting that
failed to match path "a/b/c" when "a/.gitattributes" tried to
name it with relative path "b/c".

This fixes it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-07 00:04:50 -08:00
Shawn O. Pearce
f5bf6feb05 Merge branch 'maint'
* maint:
  Further 1.5.3.5 fixes described in release notes
  Avoid invoking diff drivers during git-stash
  attr: fix segfault in gitattributes parsing code
  Define NI_MAXSERV if not defined by operating system
  Ensure we add directories in the correct order
  Avoid scary errors about tagged trees/blobs during git-fetch
2007-10-19 01:18:55 -04:00
Steffen Prohaska
d7b0a09316 attr: fix segfault in gitattributes parsing code
git may segfault if gitattributes contains an invalid
entry. A test is added to t0020 that triggers the segfault.
The parsing code is fixed to avoid the crash.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2007-10-18 21:11:27 -04:00
Pierre Habouzit
182af8343c Use xmemdupz() in many places.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-09-18 17:42:17 -07:00
Junio C Hamano
1a9d7e9b48 attr.c: read .gitattributes from index as well.
This makes .gitattributes files to be read from the index when
they are not checked out to the work tree.  This is in line with
the way we always allowed low-level tools to operate in sparsely
checked out work tree in a reasonable way.

It swaps the order of new file creation and converting the blob
to work tree representation; otherwise when we are in the middle
of checking out .gitattributes we would notice an empty but
unwritten .gitattributes file in the work tree and will ignore
the copy in the index.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-14 23:19:10 -07:00
Junio C Hamano
a44131181a attr.c: refactoring
This splits out a common routine that parses a single line of
attribute file and adds it to the attr_stack.  It should not
change any behaviour, other than attrs array in the attr_stack
structure is now grown with alloc_nr() macro, instead of one by
one, which relied on xrealloc() to give enough slack to be
efficient enough.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-14 23:19:06 -07:00
Alex Riesen
4629795816 Fix crash in t0020 (crlf conversion)
Reallocated wrong size.
Noticed on Ubuntu 7.04 probably because it has some malloc diagnostics in libc:
"git-read-tree --reset -u HEAD" aborted in the test. Valgrind sped up the
debugging greatly: took me 10 minutes.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-22 10:44:56 -07:00
Junio C Hamano
a5e92abde6 Fix funny types used in attribute value representation
It was bothering me a lot that I abused small integer values
casted to (void *) to represent non string values in
gitattributes.  This corrects it by making the type of attribute
values (const char *), and using the address of a few statically
allocated character buffer to denote true/false.  Unset attributes
are represented as having NULLs as their values.

Added in-header documentation to explain how git_checkattr()
routine should be called.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-18 16:17:13 -07:00
Junio C Hamano
515106fa13 Allow more than true/false to attributes.
This allows you to define three values (and possibly more) to
each attribute: true, false, and unset.

Typically the handlers that notice and act on attribute values
treat "unset" attribute to mean "do your default thing"
(e.g. crlf that is unset would trigger "guess from contents"),
so being able to override a setting to an unset state is
actually useful.

 - If you want to set the attribute value to true, have an entry
   in .gitattributes file that mentions the attribute name; e.g.

	*.o	binary

 - If you want to set the attribute value explicitly to false,
   use '-'; e.g.

	*.a	-diff

 - If you want to make the attribute value _unset_, perhaps to
   override an earlier entry, use '!'; e.g.

	*.a	-diff
	c.i.a	!diff

This also allows string values to attributes, with the natural
syntax:

	attrname=attrvalue

but you cannot use it, as nobody takes notice and acts on
it yet.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-17 01:04:59 -07:00
Junio C Hamano
e4aee10a2e Change attribute negation marker from '!' to '-'.
At the same time, we do not want to allow arbitrary strings for
attribute names, as we are likely to want to extend the syntax
later.  Allow only alnum, dash, underscore and dot for now.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-15 15:49:41 -07:00
Junio C Hamano
fc2d07b05f Define a built-in attribute macro "binary".
For binary files we would want to disable textual diff
generation and automatic crlf conversion.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-15 15:49:41 -07:00
Junio C Hamano
f48fd68887 attribute macro support
This adds "attribute macros" (for lack of better name).  So far,
we have low-level attributes such as crlf and diff, which are
defined in operational terms --- setting or unsetting them on a
particular path directly affects what is done to the path.  For
example, in order to decline diffs or crlf conversions on a
binary blob, no diffs on PostScript files, and treat all other
files normally, you would have something like these:

	*		diff crlf
	*.ps		!diff
	proprietary.o	!diff !crlf

That is fine as the operation goes, but gets unwieldy rather
rapidly, when we start adding more low-level attributes that are
defined in operational terms.  A near-term example of such an
attribute would be 'merge-3way' which would control if git
should attempt the usual 3-way file-level merge internally, or
leave merging to a specialized external program of user's
choice.  When it is added, we do _not_ want to force the users
to update the above to:

	*		diff crlf merge-3way
	*.ps		!diff
	proprietary.o	!diff !crlf !merge-3way

The way this patch solves this issue is to realize that the
attributes the user is assigning to paths are not defined in
terms of operations but in terms of what they are.

All of the three low-level attributes usually make sense for
most of the files that sane SCM users have git operate on (these
files are typically called "text').  Only a few cases, such as
binary blob, need exception to decline the "usual treatment
given to text files" -- and people mark them as "binary".

So this allows the $GIT_DIR/info/alternates and .gitattributes
at the toplevel of the project to also specify attributes that
assigns other attributes.  The syntax is '[attr]' followed by an
attribute name followed by a list of attribute names:

	[attr] binary	!diff !crlf !merge-3way

When "binary" attribute is set to a path, if the path has not
got diff/crlf/merge-3way attribute set or unset by other rules,
this rule unsets the three low-level attributes.

It is expected that the user level .gitattributes will be
expressed mostly in terms of attributes based on what the files
are, and the above sample would become like this:

	(built-in attribute configuration)
	[attr] binary	!diff !crlf !merge-3way
	*		diff crlf merge-3way

	(project specific .gitattributes)
	proprietary.o	binary

	(user preference $GIT_DIR/info/attributes)
	*.ps		!diff

There are a few caveats.

 * As described above, you can define these macros only in
   $GIT_DIR/info/attributes and toplevel .gitattributes.

 * There is no attempt to detect circular definition of macro
   attributes, and definitions are evaluated from bottom to top
   as usual to fill in other attributes that have not yet got
   values.  The following would work as expected:

	[attr] text	diff crlf
	[attr] ps	text !diff
	*.ps	ps

   while this would most likely not (I haven't tried):

	[attr] ps	text !diff
	[attr] text	diff crlf
	*.ps	ps

 * When a macro says "[attr] A B !C", saying that a path does
   not have attribute A does not let you tell anything about
   attributes B or C.  That is, given this:

	[attr] text	diff crlf
	[attr] ps	text !diff
	*.txt !ps

  path hello.txt, which would match "*.txt" pattern, would have
  "ps" attribute set to zero, but that does not make text
  attribute of hello.txt set to false (nor diff attribute set to
  true).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-15 15:43:06 -07:00
Junio C Hamano
8c701249d2 Teach 'diff' about 'diff' attribute.
This makes paths that explicitly unset 'diff' attribute not to
produce "textual" diffs from 'git-diff' family.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-14 08:57:06 -07:00
Junio C Hamano
35ebfd6a0c Define 'crlf' attribute.
This defines the semantics of 'crlf' attribute as an example.
When a path has this attribute unset (i.e. '!crlf'), autocrlf
line-end conversion is not applied.

Eventually we would want to let users to build a pipeline of
processing to munge blob data to filesystem format (and in the
other direction) based on combination of attributes, and at that
point the mechanism in convert_to_{git,working_tree}() that
looks at 'crlf' attribute needs to be enhanced.  Perhaps the
existing 'crlf' would become the first step in the input chain,
and the last step in the output chain.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-14 08:57:06 -07:00
Junio C Hamano
d0bfd026a8 Add basic infrastructure to assign attributes to paths
This adds the basic infrastructure to assign attributes to
paths, in a way similar to what the exclusion mechanism does
based on $GIT_DIR/info/exclude and .gitignore files.

An attribute is just a simple string that does not contain any
whitespace.  They can be specified in $GIT_DIR/info/attributes
file, and .gitattributes file in each directory.

Each line in these files defines a pattern matching rule.
Similar to the exclusion mechanism, a later match overrides an
earlier match in the same file, and entries from .gitattributes
file in the same directory takes precedence over the ones from
parent directories.  Lines in $GIT_DIR/info/attributes file are
used as the lowest precedence default rules.

A line is either a comment (an empty line, or a line that begins
with a '#'), or a rule, which is a whitespace separated list of
tokens.  The first token on the line is a shell glob pattern.
The rest are names of attributes, each of which can optionally
be prefixed with '!'.  Such a line means "if a path matches this
glob, this attribute is set (or unset -- if the attribute name
is prefixed with '!').  For glob matching, the same "if the
pattern does not have a slash in it, the basename of the path is
matched with fnmatch(3) against the pattern, otherwise, the path
is matched with the pattern with FNM_PATHNAME" rule as the
exclusion mechanism is used.

This does not define what an attribute means.  Tying an
attribute to various effects it has on git operation for paths
that have it will be specified separately.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-14 08:57:06 -07:00