Commit Graph

859 Commits

Author SHA1 Message Date
Junio C Hamano
a35d78c0f4 Merge branch 'jc/zlib-wrap' into maint
* jc/zlib-wrap:
  zlib: allow feeding more than 4GB in one go
  zlib: zlib can only process 4GB at a time
  zlib: wrap deflateBound() too
  zlib: wrap deflate side of the API
  zlib: wrap inflateInit2 used to accept only for gzip format
  zlib: wrap remaining calls to direct inflate/inflateEnd
  zlib wrapper: refactor error message formatter
2011-08-16 11:23:26 -07:00
Junio C Hamano
ef49a7a012 zlib: zlib can only process 4GB at a time
The size of objects we read from the repository and data we try to put
into the repository are represented in "unsigned long", so that on larger
architectures we can handle objects that weigh more than 4GB.

But the interface defined in zlib.h to communicate with inflate/deflate
limits avail_in (how many bytes of input are we calling zlib with) and
avail_out (how many bytes of output from zlib are we ready to accept)
fields effectively to 4GB by defining their type to be uInt.

In many places in our code, we allocate a large buffer (e.g. mmap'ing a
large loose object file) and tell zlib its size by assigning the size to
avail_in field of the stream, but that will truncate the high octets of
the real size. The worst part of this story is that we often pass around
z_stream (the state object used by zlib) to keep track of the number of
used bytes in input/output buffer by inspecting these two fields, which
practically limits our callchain to the same 4GB limit.

Wrap z_stream in another structure git_zstream that can express avail_in
and avail_out in unsigned long. For now, just die() when the caller gives
a size that cannot be given to a single zlib call. In later patches in the
series, we would make git_inflate() and git_deflate() internally loop to
give callers an illusion that our "improved" version of zlib interface can
operate on a buffer larger than 4GB in one go.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10 11:52:15 -07:00
Junio C Hamano
225a6f1068 zlib: wrap deflateBound() too
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10 11:18:17 -07:00
Junio C Hamano
55bb5c9147 zlib: wrap deflate side of the API
Wrap deflateInit, deflate, and deflateEnd for everybody, and the sole use
of deflateInit2 in remote-curl.c to tell the library to use gzip header
and trailer in git_deflate_init_gzip().

There is only one caller that cares about the status from deflateEnd().
Introduce git_deflate_end_gently() to let that sole caller retrieve the
status and act on it (i.e. die) for now, but we would probably want to
make inflate_end/deflate_end die when they ran out of memory and get
rid of the _gently() kind.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10 11:10:29 -07:00
Junio C Hamano
5e86c1fb86 zlib: wrap inflateInit2 used to accept only for gzip format
http-backend.c uses inflateInit2() to tell the library that it wants to
accept only gzip format. Wrap it in a helper function so that readers do
not have to wonder what the magic numbers 15 and 16 are for.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10 10:51:02 -07:00
Junio C Hamano
1c6e3514d0 Merge branch 'jk/maint-config-alias-fix' into maint
* jk/maint-config-alias-fix:
  handle_options(): do not miscount how many arguments were used
  config: always parse GIT_CONFIG_PARAMETERS during git_config
  git_config: don't peek at global config_parameters
  config: make environment parsing routines static
2011-06-01 14:05:22 -07:00
Junio C Hamano
1f9a980636 Merge branch 'jk/maint-config-alias-fix'
* jk/maint-config-alias-fix:
  handle_options(): do not miscount how many arguments were used
  config: always parse GIT_CONFIG_PARAMETERS during git_config
  git_config: don't peek at global config_parameters
  config: make environment parsing routines static

Conflicts:
	config.c
2011-05-30 20:19:14 -07:00
Junio C Hamano
5590fe762f Merge branch 'jk/git-connection-deadlock-fix' into maint
* jk/git-connection-deadlock-fix:
  test core.gitproxy configuration
  send-pack: avoid deadlock on git:// push with failed pack-objects
  connect: let callers know if connection is a socket
  connect: treat generic proxy processes like ssh processes

Conflicts:
	connect.c
2011-05-26 09:33:25 -07:00
Junio C Hamano
5cfe4256d9 Merge branch 'jc/bigfile'
* jc/bigfile:
  Bigfile: teach "git add" to send a large file straight to a pack
  index_fd(): split into two helper functions
  index_fd(): turn write_object and format_check arguments into one flag
2011-05-25 16:23:26 -07:00
Jeff King
3ddf0968c2 config: make environment parsing routines static
Nobody outside of git_config_from_parameters should need
to use the GIT_CONFIG_PARAMETERS parsing functions, so let's
make them private.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-24 16:20:48 -07:00
Junio C Hamano
6bb696c304 Merge branch 'mg/config-symbolic-constants'
* mg/config-symbolic-constants:
  config: Give error message when not changing a multivar
  config: define and document exit codes
2011-05-23 09:59:05 -07:00
Junio C Hamano
be5ab43566 Merge branch 'jc/magic-pathspec'
* jc/magic-pathspec:
  setup.c: Fix some "symbol not declared" sparse warnings
  t3703: Skip tests using directory name ":" on Windows
  revision.c: leave a note for "a lone :" enhancement
  t3703, t4208: add test cases for magic pathspec
  rev/path disambiguation: further restrict "misspelled index entry" diag
  fix overslow :/no-such-string-ever-existed diagnostics
  fix overstrict :<path> diagnosis
  grep: use get_pathspec() correctly
  pathspec: drop "lone : means no pathspec" from get_pathspec()
  Revert "magic pathspec: add ":(icase)path" to match case insensitively"
  magic pathspec: add ":(icase)path" to match case insensitively
  magic pathspec: futureproof shorthand form
  magic pathspec: add tentative ":/path/from/top/level" pathspec support
2011-05-23 09:58:35 -07:00
Junio C Hamano
61d7503da1 Merge branch 'jc/replacing'
* jc/replacing:
  read_sha1_file(): allow selective bypassing of replacement mechanism
  inline lookup_replace_object() calls
  read_sha1_file(): get rid of read_sha1_file_repl() madness
  t6050: make sure we test not just commit replacement
  Declare lookup_replace_object() in cache.h, not in commit.h

Conflicts:
	environment.c
2011-05-19 20:37:21 -07:00
Junio C Hamano
a66fae3827 Merge branch 'jk/git-connection-deadlock-fix'
* jk/git-connection-deadlock-fix:
  test core.gitproxy configuration
  send-pack: avoid deadlock on git:// push with failed pack-objects
  connect: let callers know if connection is a socket
  connect: treat generic proxy processes like ssh processes

Conflicts:
	connect.c
2011-05-19 20:37:20 -07:00
Michael J Gruber
7a39741999 config: define and document exit codes
The return codes of git_config_set() and friends are magic numbers right
in the source. #define them in cache.h where the functions are declared,
and use the constants in the source.

Also, mention the resulting exit codes of "git config" in its man page
(and complete the list).

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-17 21:01:17 -07:00
Jeff King
7ffe853b10 connect: let callers know if connection is a socket
They might care because they want to do a half-duplex close.
With pipes, that means simply closing the output descriptor;
with a socket, you must actually call shutdown.

Instead of exposing the magic no_fork child_process struct,
let's encapsulate the test in a function.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-16 16:20:01 -07:00
Junio C Hamano
5bf29b9500 read_sha1_file(): allow selective bypassing of replacement mechanism
The way "object replacement" mechanism was tucked to the read_sha1_file()
interface was suboptimal in a couple of ways:

 - Callers that want it to die with useful diagnosis upon seeing a corrupt
   object does not have a way to say that they do not want any object
   replacement.

 - Callers who do not want it to die but want to handle the errors
   themselves are told to arrange to call read_object(), but the function
   does not use the replacement mechanism, and also it is a file scope
   static function that not many callers can call to begin with.

This adds a read_sha1_file_extended() that takes a set of flags; the
callers of read_sha1_file() passes a flag READ_SHA1_FILE_REPLACE to ask
for object replacement mechanism to kick in.

Later, we could add another flag bit to tell the function to return an
error instead of dying and then remove the misguided "call read_object()
yourself".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-15 15:23:34 -07:00
Junio C Hamano
e1111cef23 inline lookup_replace_object() calls
In a repository without object replacement, lookup_replace_object() should
be a no-op. Check the flag "read_replace_refs" on the side of the caller,
and bypess a function call when we know we are not dealing with replacement.

Also, even when we are set up to replace objects, if we do not find any
replacement defined, flip that flag off to avoid function call overhead
for all the later object accesses.

As this change the semantics of the flag from "do we need read the
replacement definition?" to "do we need to check with the lookup table?"
the flag needs to be renamed later to something saner, e.g. "use_replace",
when the codebase is calmer, but not now.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-15 15:23:33 -07:00
Junio C Hamano
4bbf5a2615 read_sha1_file(): get rid of read_sha1_file_repl() madness
Most callers want to silently get a replacement object, and they do not
care what the real name of the replacement object is.  Worse yet, no sane
interface to return the underlying object without replacement is provided.

Remove the function and make only the few callers that want the name of
the replacement object find it themselves.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-15 15:23:33 -07:00
Junio C Hamano
fea33a1ef3 Declare lookup_replace_object() in cache.h, not in commit.h
The declaration is misplaced as the replace API is supposed to affect
not just commits, but all types of objects.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-15 15:23:31 -07:00
Junio C Hamano
2e83b66c32 fix overslow :/no-such-string-ever-existed diagnostics
"git cmd :/no-such-string-ever-existed" runs an extra round of get_sha1()
since 009fee4 (Detailed diagnosis when parsing an object name fails.,
2009-12-07).  Once without error diagnosis to see there is no commit with
such a string in the log message (hence "it cannot be a ref"), and after
seeing that :/no-such-string-ever-existed is not a filename (hence "it
cannot be a path, either"), another time to give "better diagnosis".

The thing is, the second time it runs, we already know that traversing the
history all the way down to the root will _not_ find any matching commit.

Rename misguided "gently" parameter, which is turned off _only_ when the
"detailed diagnosis" codepath knows that it cannot be a ref and making the
call only for the caller to die with a message.  Flip its meaning (and
adjust the callers) and call it "only_to_die", which is not a great name,
but it describes far more clearly what the codepaths that switches their
behaviour based on this variable do.

On my box, the command spends ~1.8 seconds without the patch to make the
report; with the patch it spends ~1.12 seconds.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-10 12:37:54 -07:00
Junio C Hamano
ec70f52f6f convert: rename the "eol" global variable to "core_eol"
Yes, it is clear that "eol" wants to mean some sort of end-of-line thing,
but as the name of a global variable, it is way too short to describe what
kind of end-of-line thing it wants to represent. Besides, there are many
codepaths that want to use their own local "char *eol" variable to point
at the end of the current line they are processing.

This global variable holds what we read from core.eol configuration
variable. Name it as such.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-09 14:58:52 -07:00
Junio C Hamano
c4ce46fc7a index_fd(): turn write_object and format_check arguments into one flag
The "format_check" parameter tucked after the existing parameters is too
ugly an afterthought to live in any reasonable API.

Combine it with the other boolean parameter "write_object" into a single
"flags" parameter.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-09 11:58:19 -07:00
Junio C Hamano
efa67bfd16 Merge branch 'im/hashcmp-optim'
* im/hashcmp-optim:
  hashcmp(): inline memcmp() by hand to optimize
2011-05-06 11:00:36 -07:00
Junio C Hamano
1273738f05 Merge branch 'nd/struct-pathspec'
* nd/struct-pathspec:
  pathspec: rename per-item field has_wildcard to use_wildcard
  Improve tree_entry_interesting() handling code
  Convert read_tree{,_recursive} to support struct pathspec
  Reimplement read_tree_recursive() using tree_entry_interesting()
2011-05-06 10:50:06 -07:00
Junio C Hamano
f28d2e33c6 Merge branch 'jc/pack-objects-bigfile' into maint
* jc/pack-objects-bigfile:
  Teach core.bigfilethreashold to pack-objects
2011-05-04 14:57:38 -07:00
Ingo Molnar
1a812f3a70 hashcmp(): inline memcmp() by hand to optimize
This is reported to speed "git gc" by 18%.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-28 13:18:30 -07:00
Junio C Hamano
9cedd16c62 Merge branch 'jc/pack-objects-bigfile'
* jc/pack-objects-bigfile:
  Teach core.bigfilethreashold to pack-objects
2011-04-27 11:36:41 -07:00
Junio C Hamano
15366280c2 Teach core.bigfilethreashold to pack-objects
The pack-objects command should take notice of the object file and
refrain from attempting to delta large ones, to be consistent with
the fast-import command.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-05 20:25:49 -07:00
Junio C Hamano
33e0f62ba9 pathspec: rename per-item field has_wildcard to use_wildcard
As the point of the last change is to allow use of strings as
literals no matter what characters are in them, "has_wildcard"
does not match what we use this field for anymore.

It is used to decide if the wildcard matching should be used, so
rename it to match the usage better.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-05 09:30:36 -07:00
Junio C Hamano
c4b2ce6953 Merge branch 'nd/init-gitdir'
* nd/init-gitdir:
  init, clone: support --separate-git-dir for .git file
  git-init.txt: move description section up

Conflicts:
	builtin/clone.c
2011-04-01 17:57:37 -07:00
Junio C Hamano
149971badc Merge branch 'jc/index-update-if-able'
* jc/index-update-if-able:
  update $GIT_INDEX_FILE when there are racily clean entries
  diff/status: refactor opportunistic index update
2011-03-26 20:13:16 -07:00
Junio C Hamano
ad7bb2f68c Merge branch 'jc/maint-rerere-in-workdir'
* jc/maint-rerere-in-workdir:
  rerere: make sure it works even in a workdir attached to a young repository
2011-03-26 20:13:16 -07:00
Junio C Hamano
90a6464b4a rerere: make sure it works even in a workdir attached to a young repository
The git-new-workdir script in contrib/ makes a new work tree by sharing
many subdirectories of the .git directory with the original repository.
When rerere.enabled is set in the original repository, but the user has
not encountered any conflicts yet, the original repository may not yet
have .git/rr-cache directory.

When rerere wants to run in a new work tree created from such a young
original repository, it fails to mkdir(2) .git/rr-cache that is a symlink
to a yet-to-be-created directory.

There are three possible approaches to this:

 - A naive solution is not to create a symlink in the git-new-workdir
   script to a directory the original does not have (yet).  This is not a
   solution, as we tend to lazily create subdirectories of .git/, and
   having rerere.enabled configuration set is a strong indication that the
   user _wants_ to have this lazy creation to happen;

 - We could always create .git/rr-cache upon repository creation.  This is
   tempting but will not help people with existing repositories.

 - Detect this case by seeing that mkdir(2) failed with EEXIST, checking
   that the path is a symlink, and try running mkdir(2) on the link
   target.

This patch solves the issue by doing the third one.

Strictly speaking, this is incomplete.  It does not attempt to handle
relative symbolic link that points into the original repository, but this
is good enough to help people who use contrib/workdir/git-new-workdir
script.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-23 16:05:44 -07:00
Junio C Hamano
da2584243e Merge branch 'lt/default-abbrev'
* lt/default-abbrev:
  Rename core.abbrevlength back to core.abbrev
  Make the default abbrev length configurable
2011-03-23 14:55:40 -07:00
Junio C Hamano
50aaeca008 Merge branch 'jn/test-sanitize-git-env'
* jn/test-sanitize-git-env:
  tests: scrub environment of GIT_* variables
  config: drop support for GIT_CONFIG_NOGLOBAL
  gitattributes: drop support for GIT_ATTR_NOGLOBAL
  tests: suppress system gitattributes
  tests: stop worrying about obsolete environment variables
2011-03-22 21:38:12 -07:00
Junio C Hamano
ccdc4ec304 diff/status: refactor opportunistic index update
When we had to refresh the index internally before running diff or status,
we opportunistically updated the $GIT_INDEX_FILE so that later invocation
of git can use the lstat(2) we already did in this invocation.

Make them share a helper function to do so.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-21 12:43:10 -07:00
Junio C Hamano
1e239079f7 Merge branch 'ab/i18n-basic'
* ab/i18n-basic:
  i18n: "make distclean" should clean up after "make pot"
  i18n: Makefile: "pot" target to extract messages marked for translation
  i18n: add stub Q_() wrapper for ngettext
  i18n: do not poison translations unless GIT_GETTEXT_POISON envvar is set
  i18n: add GETTEXT_POISON to simulate unfriendly translator
  i18n: add no-op _() and N_() wrappers
  commit, status: use status_printf{,_ln,_more} helpers
  commit: refer to commit template as s->fp
  wt-status: add helpers for printing wt-status lines

Conflicts:
	builtin/commit.c
2011-03-19 23:24:42 -07:00
Junio C Hamano
0d7f242110 Merge branch 'jk/trace-sifter'
* jk/trace-sifter:
  trace: give repo_setup trace its own key
  add packet tracing debug code
  trace: add trace_strbuf
  trace: factor out "do we want to trace" logic
  trace: refactor to support multiple env variables
  trace: add trace_vprintf
2011-03-19 23:24:12 -07:00
Nguyễn Thái Ngọc Duy
b57fb80a7d init, clone: support --separate-git-dir for .git file
--separate-git-dir tells git to create git dir at the specified
location, instead of where it is supposed to be. A .git file that
points to that location will be put in place so that it appears normal
to repo discovery process.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-19 21:48:19 -07:00
Carlos Martín Nieto
e2a57aac8a Name make_*_path functions more accurately
Rename the make_*_path functions so it's clearer what they do, in
particlar make clear what the differnce between make_absolute_path and
make_nonrelative_path is by renaming them real_path and absolute_path
respectively. make_relative_path has an understandable name and is
renamed to relative_path to maintain the name convention.

The function calls have been replaced 1-to-1 in their usage.

Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-17 16:08:30 -07:00
Junio C Hamano
b2f6eab402 Merge branch 'maint'
* maint:
  Prepare draft release notes to 1.7.4.2
  gitweb: highlight: replace tabs with spaces
  make_absolute_path: return the input path if it points to our buffer
  valgrind: ignore SSE-based strlen invalid reads
  diff --submodule: split into bite-sized pieces
  cherry: split off function to print output lines
  branch: split off function that writes tracking info and commit subject
  standardize brace placement in struct definitions
  compat: make gcc bswap an inline function
  enums: omit trailing comma for portability

Conflicts:
	RelNotes
2011-03-16 16:59:30 -07:00
Junio C Hamano
61a6f1faec Merge branch 'jh/push-default-upstream-configname' into maint
* jh/push-default-upstream-configname:
  push.default: Rename 'tracking' to 'upstream'
2011-03-16 16:47:26 -07:00
Jonathan Nieder
c9b6782a08 enums: omit trailing comma for portability
Since v1.7.2-rc0~23^2~2 (Add per-repository eol normalization,
2010-05-19), building with gcc -std=gnu89 -pedantic produces warnings
like the following:

 convert.c:21:11: warning: comma at end of enumerator list [-pedantic]

gcc is right to complain --- these commas are not permitted in C89.
In the spirit of v1.7.2-rc0~32^2~16 (2010-05-14), remove them.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-16 12:31:32 -07:00
Junio C Hamano
276e017f2f Merge branch 'nd/struct-pathspec'
* nd/struct-pathspec:
  declare 1-bit bitfields to be unsigned
2011-03-16 00:17:05 -07:00
Jonathan Nieder
9ddf17268c declare 1-bit bitfields to be unsigned
As "gcc -pedantic" notices, a two's complement 1-bit signed integer
cannot represent the value '1'.

 dir.c: In function 'init_pathspec':
 dir.c:1291:4: warning: overflow in implicit constant conversion [-Woverflow]

In the spirit of v1.7.1-rc1~10 (2010-04-06), 'unsigned' is what was
intended, so let's make the flags unsigned.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-15 22:24:29 -07:00
Junio C Hamano
674ef90904 Merge branch 'sp/maint-fd-limit'
* sp/maint-fd-limit:
  sha1_file.c: Don't retain open fds on small packs
  mingw: add minimum getrlimit() compatibility stub
  Limit file descriptors used by packs
2011-03-15 14:22:23 -07:00
Jonathan Nieder
8f323c00dd config: drop support for GIT_CONFIG_NOGLOBAL
Now that test-lib sets $HOME to protect against pollution from user
settings, GIT_CONFIG_NOGLOBAL is not needed for use by the test
suite any more.  And as luck would have it, a quick code search
reveals no other users in the wild.

This patch does not affect GIT_CONFIG_NOSYSTEM, which is still
needed.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-15 12:23:30 -07:00
Linus Torvalds
dce9648916 Make the default abbrev length configurable
The default of 7 comes from fairly early in git development, when
seven hex digits was a lot (it covers about 250+ million hash
values). Back then I thought that 65k revisions was a lot (it was what
we were about to hit in BK), and each revision tends to be about 5-10
new objects or so, so a million objects was a big number.

These days, the kernel isn't even the largest git project, and even
the kernel has about 220k revisions (_much_ bigger than the BK tree
ever was) and we are approaching two million objects. At that point,
seven hex digits is still unique for a lot of them, but when we're
talking about just two orders of magnitude difference between number
of objects and the hash size, there _will_ be collisions in truncated
hash values. It's no longer even close to unrealistic - it happens all
the time.

We should both increase the default abbrev that was unrealistically
small, _and_ add a way for people to set their own default per-project
in the git config file.

This is the first step to first make it configurable; the default of 7
is not raised yet.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-11 14:42:54 -08:00
Junio C Hamano
adfe4e1ff2 Merge branch 'maint'
* maint:
  Revert "core.abbrevguard: Ensure short object names stay unique a bit longer"
2011-03-10 22:45:49 -08:00