In a workload other than "git log" (without pathspec nor any option that
causes us to inspect trees and blobs), the recency pack order is said to
cause the access jump around quite a bit. Add a hook to allow us observe
how bad it is.
"git config core.logpackaccess /var/tmp/pal.txt" will give you the log
in the specified file.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add support for dividing the refs of a single repository into multiple
namespaces, each of which can have its own branches, tags, and HEAD.
Git can expose each namespace as an independent repository to pull from
and push to, while sharing the object store, and exposing all the refs
to operations such as git-gc.
Storing multiple repositories as namespaces of a single repository
avoids storing duplicate copies of the same objects, such as when
storing multiple branches of the same source. The alternates mechanism
provides similar support for avoiding duplicates, but alternates do not
prevent duplication between new objects added to the repositories
without ongoing maintenance, while namespaces do.
To specify a namespace, set the GIT_NAMESPACE environment variable to
the namespace. For each ref namespace, git stores the corresponding
refs in a directory under refs/namespaces/. For example,
GIT_NAMESPACE=foo will store refs under refs/namespaces/foo/. You can
also specify namespaces via the --namespace option to git.
Note that namespaces which include a / will expand to a hierarchy of
namespaces; for example, GIT_NAMESPACE=foo/bar will store refs under
refs/namespaces/foo/refs/namespaces/bar/. This makes paths in
GIT_NAMESPACE behave hierarchically, so that cloning with
GIT_NAMESPACE=foo/bar produces the same result as cloning with
GIT_NAMESPACE=foo and cloning from that repo with GIT_NAMESPACE=bar. It
also avoids ambiguity with strange namespace paths such as
foo/refs/heads/, which could otherwise generate directory/file conflicts
within the refs directory.
Add the infrastructure for ref namespaces: handle the GIT_NAMESPACE
environment variable and --namespace option, and support iterating over
refs in a namespace.
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Jamey Sharp <jamey@minilop.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/replacing:
read_sha1_file(): allow selective bypassing of replacement mechanism
inline lookup_replace_object() calls
read_sha1_file(): get rid of read_sha1_file_repl() madness
t6050: make sure we test not just commit replacement
Declare lookup_replace_object() in cache.h, not in commit.h
Conflicts:
environment.c
In a repository without object replacement, lookup_replace_object() should
be a no-op. Check the flag "read_replace_refs" on the side of the caller,
and bypess a function call when we know we are not dealing with replacement.
Also, even when we are set up to replace objects, if we do not find any
replacement defined, flip that flag off to avoid function call overhead
for all the later object accesses.
As this change the semantics of the flag from "do we need read the
replacement definition?" to "do we need to check with the lookup table?"
the flag needs to be renamed later to something saner, e.g. "use_replace",
when the codebase is calmer, but not now.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Yes, it is clear that "eol" wants to mean some sort of end-of-line thing,
but as the name of a global variable, it is way too short to describe what
kind of end-of-line thing it wants to represent. Besides, there are many
codepaths that want to use their own local "char *eol" variable to point
at the end of the current line they are processing.
This global variable holds what we read from core.eol configuration
variable. Name it as such.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The pack-objects command should take notice of the object file and
refrain from attempting to delta large ones, to be consistent with
the fast-import command.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the make_*_path functions so it's clearer what they do, in
particlar make clear what the differnce between make_absolute_path and
make_nonrelative_path is by renaming them real_path and absolute_path
respectively. make_relative_path has an understandable name and is
renamed to relative_path to maintain the name convention.
The function calls have been replaced 1-to-1 in their usage.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The default of 7 comes from fairly early in git development, when
seven hex digits was a lot (it covers about 250+ million hash
values). Back then I thought that 65k revisions was a lot (it was what
we were about to hit in BK), and each revision tends to be about 5-10
new objects or so, so a million objects was a big number.
These days, the kernel isn't even the largest git project, and even
the kernel has about 220k revisions (_much_ bigger than the BK tree
ever was) and we are approaching two million objects. At that point,
seven hex digits is still unique for a lot of them, but when we're
talking about just two orders of magnitude difference between number
of objects and the hash size, there _will_ be collisions in truncated
hash values. It's no longer even close to unrealistic - it happens all
the time.
We should both increase the default abbrev that was unrealistically
small, _and_ add a way for people to set their own default per-project
in the git config file.
This is the first step to first make it configurable; the default of 7
is not raised yet.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 72a5b561fc, as adding
fixed number of hexdigits more than necessary to make one object name
locally unique does not help in futureproofing the uniqueness of names
we generate today.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* nd/setup: (47 commits)
setup_work_tree: adjust relative $GIT_WORK_TREE after moving cwd
git.txt: correct where --work-tree path is relative to
Revert "Documentation: always respect core.worktree if set"
t0001: test git init when run via an alias
Remove all logic from get_git_work_tree()
setup: rework setup_explicit_git_dir()
setup: clean up setup_discovered_git_dir()
t1020-subdirectory: test alias expansion in a subdirectory
setup: clean up setup_bare_git_dir()
setup: limit get_git_work_tree()'s to explicit setup case only
Use git_config_early() instead of git_config() during repo setup
Add git_config_early()
git-rev-parse.txt: clarify --git-dir
t1510: setup case #31
t1510: setup case #30
t1510: setup case #29
t1510: setup case #28
t1510: setup case #27
t1510: setup case #26
t1510: setup case #25
...
This logic is now only used by cmd_init_db(). setup_* functions do not
rely on it any more. Move all the logic to cmd_init_db() and turn
get_git_work_tree() into a simple function.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_git_work_tree() takes input as core.worktree, core.bare,
GIT_WORK_TREE and decides correct worktree setting.
Unfortunately it does not do its job well. core.worktree and
GIT_WORK_TREE should only be taken into account, if GIT_DIR is set
(which is handled by setup_explicit_git_dir). For other setup cases,
only core.bare matters.
Add a temporary variable setup_explicit to adjust get_git_work_tree()
behavior as such. This variable will be gone once setup_* rework is
done.
Also remove is_bare_repository_cfg check in set_git_work_tree() to
ease the rework. We are going to check for core.bare and core.worktree
early before setting worktree. For example, if core.bare is true, no
need to set worktree.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jn/thinner-wrapper:
Remove pack file handling dependency from wrapper.o
pack-objects: mark file-local variable static
wrapper: give zlib wrappers their own translation unit
strbuf: move strbuf_branchname to sha1_name.c
path helpers: move git_mkstemp* to wrapper.c
wrapper: move odb_* to environment.c
wrapper: move xmmap() to sha1_file.c
* pn/commit-autosquash:
add tests of commit --squash
commit: --squash option for use with rebase --autosquash
add tests of commit --fixup
commit: --fixup option for use with rebase --autosquash
pretty.c: teach format_commit_message() to reencode the output
commit: helper methods to reduce redundant blocks of code
Conflicts:
Documentation/git-commit.txt
t/t3415-rebase-autosquash.sh
getenv(3) returns not-permanent buffer which may be changed by e.g.
putenv(3) call (*).
In practice I've noticed this when trying to do `git commit -m abc`
inside msysgit under wine, getting
$ git commit -m abc
fatal: could not open 'DIR=.git/COMMIT_EDITMSG': No such file or directory
^^^^
(notice introduced 'DIR=' artifact.)
The problem was showing itself only with -m option, and actually, as
debugging showed, originally
git_dir = getenv("GIT_DIR")
returned pointer to
"GIT_DIR=.git\0"
^
git_dir
, we stored it in git_dir, than, after processing -m git-commit option,
we did setenv("GIT_EDITOR", ":") which as (*) says changed environment
variables memory layout - something like this
"...\0GIT_DIR=.git\0"
^
git_dir
and oops - we got wrong git_dir.
Avoid that by strdupping getenv("GIT_DIR") result like we did in 06f354
(setup: make sure git dir path is in a permanent buffer). Unfortunately
this also shows that other getenv usage inside git needs auditing...
(*) from man 3 getenv:
The implementation of getenv() is not required to be reentrant. The
string pointed to by the return value of getenv() may be statically
allocated, and can be modified by a subsequent call to getenv(),
putenv(3), setenv(3), or unsetenv(3).
Cc: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The odb_mkstemp and odb_pack_keep functions open files under the
$GIT_OBJECT_DIRECTORY directory. This requires access to the git
configuration which very simple programs do not need.
Move these functions to environment.o, closer to their dependencies.
This should make it easier for programs to link to wrapper.o without
linking to environment.o.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* builtin/commit.c: Replace block of code with a one-liner call to
logmsg_reencode().
* commit.c: new function for looking up a comit by name
* pretty.c: helper methods for getting output encodings
Add helpers get_log_output_encoding() and
get_commit_output_encoding() that eliminate some messy and duplicate
if-blocks.
Signed-off-by: Pat Notz <patnotz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even though git makes sure that it uses enough hexdigits to show an
abbreviated object name unambiguously, as more objects are added to the
repository over time, a short name that used to be unique will stop being
unique. Git uses this many extra hexdigits that are more than necessary
to make the object name currently unique, in the hope that its output will
stay unique a bit longer.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If setup_git_env() is run before the usual repository discovery
sequence and .git is a file with the text
gitdir: <path>
(with <path> any string) then the in-core git_dir variable is set to
the result of converting <path> to an absolute path using
make_absolute_path().
Unfortunately make_absolute_path() returns its result in a static
buffer that is overwritten by later calls. Such a call could cause
later accesses to git_dir (from git_pathdup(), for example) to read
the wrong path, leaving git very confused.
It is not obvious whether any existing code in git will trigger the
problem, but in any case, it is worth a few dozen bytes to copy the
return value from make_absolute_path() for some added peace of mind.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After v1.6.0-rc0~230^2^ (environment.c: remove unused function,
2008-06-19), git_refs_dir is not used any more.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* kf/askpass-config:
Extend documentation of core.askpass and GIT_ASKPASS.
Allow core.askpass to override SSH_ASKPASS.
Add a new option 'core.askpass'.
Setting this option has the same effect as setting the environment variable
'GIT_ASKPASS'.
Signed-off-by: Knut Franke <k.franke@science-computing.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Like $GIT_CONFIG, $GIT_CONFIG_PARAMETERS needs to be suppressed by
"git push" and its cousins when running local transport helpers to
imitate remote transport well.
Noticed-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The startup_info struct will collect information managed by the git
setup code, such as the prefix for relative paths passed on the
command line (i.e., path to the starting cwd from the toplevel of
the work tree) and whether a git repository has been found.
In other words, startup_info is intended to be a collection of global
variables with results that were previously returned from setup
functions. This state is global anyway (since the cwd is), even
if it is not currently tracked that way. Letting these values persist
means there is more flexibility in deciding when to run setup.
For now, the struct is empty.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new configuration variable, "core.eol", that allows the user
to set which line endings to use for end-of-line-normalized files in the
working directory. It defaults to "native", which means CRLF on Windows
and LF everywhere else.
Note that "core.autocrlf" overrides core.eol. This means that
[core]
autocrlf = true
puts CRLFs in the working directory even if core.eol is set to "lf".
Signed-off-by: Eyvind Bernhardsen <eyvind.bernhardsen@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the semantics of the "crlf" attribute so that it enables
end-of-line normalization when it is set, regardless of "core.autocrlf".
Add a new setting for "crlf": "auto", which enables end-of-line
conversion but does not override the automatic text file detection.
Add a new attribute "eol" with possible values "crlf" and "lf". When
set, this attribute enables normalization and forces git to use CRLF or
LF line endings in the working directory, respectively.
The line ending style to be used for normalized text files in the
working directory is set using "core.autocrlf". When it is set to
"true", CRLFs are used in the working directory; when set to "input" or
"false", LFs are used.
Signed-off-by: Eyvind Bernhardsen <eyvind.bernhardsen@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the list of GIT_* environment variables that are local to a
repository into a static list in environment.c, as it is also
useful elsewhere. Also add the missing GIT_CONFIG variable to the
list.
Make it easy to use the list both by NULL-termination and by size;
the latter (excluding the terminating NULL) is stored in the
local_repo_env_size define.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* nd/sparse: (25 commits)
t7002: test for not using external grep on skip-worktree paths
t7002: set test prerequisite "external-grep" if supported
grep: do not do external grep on skip-worktree entries
commit: correctly respect skip-worktree bit
ie_match_stat(): do not ignore skip-worktree bit with CE_MATCH_IGNORE_VALID
tests: rename duplicate t1009
sparse checkout: inhibit empty worktree
Add tests for sparse checkout
read-tree: add --no-sparse-checkout to disable sparse checkout support
unpack-trees(): ignore worktree check outside checkout area
unpack_trees(): apply $GIT_DIR/info/sparse-checkout to the final index
unpack-trees(): "enable" sparse checkout and load $GIT_DIR/info/sparse-checkout
unpack-trees.c: generalize verify_* functions
unpack-trees(): add CE_WT_REMOVE to remove on worktree alone
Introduce "sparse checkout"
dir.c: export excluded_1() and add_excludes_from_file_1()
excluded_1(): support exclude files in index
unpack-trees(): carry skip-worktree bit over in merged_entry()
Read .gitignore from index if it is skip-worktree
Avoid writing to buffer in add_excludes_from_file_1()
...
Conflicts:
.gitignore
Documentation/config.txt
Documentation/git-update-index.txt
Makefile
entry.c
t/t7002-grep.sh
* cc/replace:
Documentation: talk a little bit about GIT_NO_REPLACE_OBJECTS
Documentation: fix typos and spelling in replace documentation
replace: use a GIT_NO_REPLACE_OBJECTS env variable
This has the same effect as --no-replace-objects option; git ignores the
replace refs. When --no-replace-objects option is passed to git, this
environment variable is set to "1" and exported to subprocesses in order
to propagate the same setting.
It is useful for example for scripts, as the git commands used in them can
now be aware that they must not read replace refs.
Tested-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit notes are blobs which are shown together with the commit
message. These blobs are taken from the notes ref, which you can
configure by the config variable core.notesRef, which in turn can
be overridden by the environment variable GIT_NOTES_REF.
The notes ref is a branch which contains "files" whose names are
the names of the corresponding commits (i.e. the SHA-1).
The rationale for putting this information into a ref is this: we
want to be able to fetch and possibly union-merge the notes,
maybe even look at the date when a note was introduced, and we
want to store them efficiently together with the other objects.
This patch has been improved by the following contributions:
- Thomas Rast: fix core.notesRef documentation
- Tor Arne Vestbø: fix printing of multi-line notes
- Alex Riesen: Using char array instead of char pointer costs less BSS
- Johan Herland: Plug leak when msg is good, but msglen or type causes return
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Tor Arne Vestbø <tavestbo@trolltech.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_commit_notes(): Plug memory leak when 'if' triggers, but not because of read_sha1_file() failure
This patch introduces core.sparseCheckout, which will control whether
sparse checkout support is enabled in unpack_trees()
It also loads sparse-checkout file that will be used in the next patch.
I split it out so the next patch will be shorter, easier to read.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* cc/replace:
t6050: check pushing something based on a replaced commit
Documentation: add documentation for "git replace"
Add git-replace to .gitignore
builtin-replace: use "usage_msg_opt" to give better error messages
parse-options: add new function "usage_msg_opt"
builtin-replace: teach "git replace" to actually replace
Add new "git replace" command
environment: add global variable to disable replacement
mktag: call "check_sha1_signature" with the replacement sha1
replace_object: add a test case
object: call "check_sha1_signature" with the replacement sha1
sha1_file: add a "read_sha1_file_repl" function
replace_object: add mechanism to replace objects found in "refs/replace/"
refs: add a "for_each_replace_ref" function
Introduce --ignore-whitespace option and corresponding config bool to
ignore whitespace differences while applying patches, akin to the
'patch' program.
'git am', 'git rebase' and the bash git completion are made aware of
this option.
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* js/maint-graft-unhide-true-parents:
git repack: keep commits hidden by a graft
Add a test showing that 'git repack' throws away grafted-away parents
Conflicts:
git-repack.sh
When you have grafts that pretend that a given commit has different
parents than the ones recorded in the commit object, it is dangerous
to let 'git repack' remove those hidden parents, as you can easily
remove the graft and end up with a broken repository.
So let's play it safe and keep those parent objects and everything
that is reachable by them, in addition to the grafted parents.
As this behavior can only be triggered by git pack-objects, and as that
command handles duplicate parents gracefully, we do not bother to cull
duplicated parents that may result by using both true and grafted
parents.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the message said "we will be changing the default in the future, so
this is to warn people who want to keep the current default what to do",
it would have made some sense, but as it stands, the message is merely an
unsolicited advertisement for a new feature, which it is not helpful at
all. Squelch it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This new "read_replace_refs" global variable is set to 1 by
default, so that replace refs are used by default. But
reachability traversal and packing commands ("cmd_fsck",
"cmd_prune", "cmd_pack_objects", "upload_pack",
"cmd_unpack_objects") set it to 0, as they must work with the
original DAG.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"Unreliable hardlinks" is a misleading description for what is happening.
So rename it to something less misleading.
Suggested by Linus Torvalds.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It seems that accessing NTFS partitions with ufsd (at least on my EeePC)
has an unnerving bug: if you link() a file and unlink() it right away,
the target of the link() will have the correct size, but consist of NULs.
It seems as if the calls are simply not serialized correctly, as single-stepping
through the function move_temp_to_file() works flawlessly.
As ufsd is "Commertial software" (sic!), I cannot fix it, and have to work
around it in Git.
At the same time, it seems that this fixes msysGit issues 222 and 229 to
assume that Windows cannot handle link() && unlink().
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git push" is not told what refspecs to push, it pushes all matching
branches to the current remote. For some workflows this default is not
useful, and surprises new users. Some have even found that this default
behaviour is too easy to trigger by accident with unwanted consequences.
Introduce a new configuration variable "push.default" that decides what
action git push should take if no refspecs are given or implied by the
command line arguments or the current remote configuration.
Possible values are:
'nothing' : Push nothing;
'matching' : Current default behaviour, push all branches that already
exist in the current remote;
'tracking' : Push the current branch to whatever it is tracking;
'current' : Push the current branch to a branch of the same name,
i.e. HEAD.
Signed-off-by: Finn Arne Gangstad <finnag@pvv.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit notes are blobs which are shown together with the commit
message. These blobs are taken from the notes ref, which you can
configure by the config variable core.notesRef, which in turn can
be overridden by the environment variable GIT_NOTES_REF.
The notes ref is a branch which contains "files" whose names are
the names of the corresponding commits (i.e. the SHA-1).
The rationale for putting this information into a ref is this: we
want to be able to fetch and possibly union-merge the notes,
maybe even look at the date when a note was introduced, and we
want to store them efficiently together with the other objects.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This can do the lstat() storm in parallel, giving potentially much
improved performance for cold-cache cases or things like NFS that have
weak metadata caching.
Just use "read_cache_preload()" instead of "read_cache()" to force an
optimistic preload of the index stat data. The function takes a
pathspec as its argument, allowing us to preload only the relevant
portion of the index.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Fix non-literal format in printf-style calls
git-submodule: Avoid printing a spurious message.
git ls-remote: make usage string match manpage
Makefile: help people who run 'make check' by mistake
These were found using gcc 4.3.2-1ubuntu11 with the warning:
warning: format not a string literal and no format arguments
Incorporated suggestions from Brandon Casey <casey@nrlssc.navy.mil>.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ar/maint-mksnpath:
Use git_pathdup instead of xstrdup(git_path(...))
git_pathdup: returns xstrdup-ed copy of the formatted path
Fix potentially dangerous use of git_path in ref.c
Add git_snpath: a .git path formatting routine with output buffer
Conflicts:
builtin-revert.c
refs.c
rerere.c
This function is used to learn whether git_dir is already set up or not.
It is necessary, because we want to read configuration in compat/cygwin.c
Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
A new configuration variable 'core.trustctime' is introduced to
allow ignoring st_ctime information when checking if paths
in the working tree has changed, because there are situations where
it produces too much false positives. Like when file system crawlers
keep changing it when scanning and using the ctime for marking scanned
files.
The default is to notice ctime changes.
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A lot of modules that have nothing to do with git-shell functionality
were linked in, bloating git-shell more than 8 times.
This patch cuts off redundant dependencies by:
1. providing stubs for three functions that make no sense for git-shell;
2. moving quote_path_fully from environment.c to quote.c to make the
later self sufficient;
3. moving make_absolute_path into a new separate file.
The following numbers have been received with the default optimization
settings on master using GCC 4.1.2:
Before:
text data bss dec hex filename
143915 1348 93168 238431 3a35f git-shell
After:
text data bss dec hex filename
17670 788 8232 26690 6842 git-shell
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* lt/config-fsync:
Add config option to enable 'fsync()' of object files
Split up default "i18n" and "branch" config parsing into helper routines
Split up default "user" config parsing into helper routine
Split up default "core" config parsing into helper routine
As explained in the documentation[*] this is totally useless on
filesystems that do ordered/journalled data writes, but it can be a
useful safety feature on filesystems like HFS+ that only journal the
metadata, not the actual file contents.
It defaults to off, although we could presumably in theory some day
auto-enable it on a per-filesystem basis.
[*] Yes, I updated the docs for the thing. Hell really _has_ frozen
over, and the four horsemen are probably just beyond the horizon.
EVERYBODY PANIC!
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* db/clone-in-c:
Add test for cloning with "--reference" repo being a subset of source repo
Add a test for another combination of --reference
Test that --reference actually suppresses fetching referenced objects
clone: fall back to copying if hardlinking fails
builtin-clone.c: Need to closedir() in copy_or_link_directory()
builtin-clone: fix initial checkout
Build in clone
Provide API access to init_db()
Add a function to set a non-default work tree
Allow for having for_each_ref() list extra refs
Have a constant extern refspec for "--tags"
Add a library function to add an alternate to the alternates file
Add a lockfile function to append to a file
Mark the list of refs to fetch as const
Conflicts:
cache.h
t/t5700-clone-reference.sh
* sb/committer:
commit: Show committer if automatic
commit: Show author if different from committer
Preparation to call determine_author_info from prepare_to_commit
* lt/core-optim:
Optimize symlink/directory detection
Avoid some unnecessary lstat() calls
is_racy_timestamp(): do not check timestamp for gitlinks
diff-lib.c: rename check_work_tree_entity()
diff: a submodule not checked out is not modified
Add t7506 to test submodule related functions for git-status
t4027: test diff for submodule with empty directory
Make git-add behave more sensibly in a case-insensitive environment
When adding files to the index, add support for case-independent matches
Make unpack-tree update removed files before any updated files
Make branch merging aware of underlying case-insensitive filsystems
Add 'core.ignorecase' option
Make hash_name_lookup able to do case-independent lookups
Make "index_name_exists()" return the cache_entry it found
Move name hashing functions into a file of its own
Make unpack_trees_options bit flags actual bitfields
Change cd67e4d4 introduced a new configuration parameter that told
pull to automatically perform a rebase instead of a merge. This
change provides a configuration option to enable this feature
automatically when creating a new branch.
If the variable branch.autosetuprebase applies for a branch that's
being created, that branch will have branch.<name>.rebase set to true.
Signed-off-by: Dustin Sallings <dustin@spy.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* lt/case-insensitive:
Make git-add behave more sensibly in a case-insensitive environment
When adding files to the index, add support for case-independent matches
Make unpack-tree update removed files before any updated files
Make branch merging aware of underlying case-insensitive filsystems
Add 'core.ignorecase' option
Make hash_name_lookup able to do case-independent lookups
Make "index_name_exists()" return the cache_entry it found
Move name hashing functions into a file of its own
Make unpack_trees_options bit flags actual bitfields
To warn the user in case he/she might be using an unintended
committer identity.
Signed-off-by: Santi Béjar <sbejar@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function may only be used before the work tree is used.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch allows .git to be a regular textfile containing the path of
the real git directory (prefixed with "gitdir: "), which can be useful on
platforms lacking support for real symlinks.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
..and start using it for directory entry traversal (ie "git status" will
not consider entries that match an existing entry case-insensitively to
be a new file)
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git branch" and "git checkout -b" now honor --track option even when
the upstream branch is local. Previously --track was silently ignored
when forking from a local branch. Also the command did not error out
when --track was explicitly asked for but the forked point specified
was not an existing branch (i.e. when there is no way to set up the
tracking configuration), but now it correctly does.
The configuration setting branch.autosetupmerge can now be set to
"always", which is equivalent to using --track from the command line.
Setting branch.autosetupmerge to "true" will retain the former behavior
of only setting up branch.*.merge for remote upstream branches.
Includes test cases for the new functionality.
Signed-off-by: Jay Soffian <jaysoffian@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also use "git_config_string" to simplify "config.c" code
where "excludes_file" is set.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also use "git_config_string" to simplify "config.c" code
where "editor_program" is set.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also use "git_config_string" to simplify "config.c" code
where "pager_program" is set.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
CRLF conversion bears a slight chance of corrupting data.
autocrlf=true will convert CRLF to LF during commit and LF to
CRLF during checkout. A file that contains a mixture of LF and
CRLF before the commit cannot be recreated by git. For text
files this is the right thing to do: it corrects line endings
such that we have only LF line endings in the repository.
But for binary files that are accidentally classified as text the
conversion can corrupt data.
If you recognize such corruption early you can easily fix it by
setting the conversion type explicitly in .gitattributes. Right
after committing you still have the original file in your work
tree and this file is not yet corrupted. You can explicitly tell
git that this file is binary and git will handle the file
appropriately.
Unfortunately, the desired effect of cleaning up text files with
mixed line endings and the undesired effect of corrupting binary
files cannot be distinguished. In both cases CRLFs are removed
in an irreversible way. For text files this is the right thing
to do because CRLFs are line endings, while for binary files
converting CRLFs corrupts data.
This patch adds a mechanism that can either warn the user about
an irreversible conversion or can even refuse to convert. The
mechanism is controlled by the variable core.safecrlf, with the
following values:
- false: disable safecrlf mechanism
- warn: warn about irreversible conversions
- true: refuse irreversible conversions
The default is to warn. Users are only affected by this default
if core.autocrlf is set. But the current default of git is to
leave core.autocrlf unset, so users will not see warnings unless
they deliberately chose to activate the autocrlf mechanism.
The safecrlf mechanism's details depend on the git command. The
general principles when safecrlf is active (not false) are:
- we warn/error out if files in the work tree can modified in an
irreversible way without giving the user a chance to backup the
original file.
- for read-only operations that do not modify files in the work tree
we do not not print annoying warnings.
There are exceptions. Even though...
- "git add" itself does not touch the files in the work tree, the
next checkout would, so the safety triggers;
- "git apply" to update a text file with a patch does touch the files
in the work tree, but the operation is about text files and CRLF
conversion is about fixing the line ending inconsistencies, so the
safety does not trigger;
- "git diff" itself does not touch the files in the work tree, it is
often run to inspect the changes you intend to next "git add". To
catch potential problems early, safety triggers.
The concept of a safety check was originally proposed in a similar
way by Linus Torvalds. Thanks to Dimitry Potapov for insisting
on getting the naked LF/autocrlf=true case right.
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
When deciding whether or not to turn on automatic color
support, git_config_colorbool checks whether stdout is a
tty. However, because we run a pager, if stdout is not a
tty, we must check whether it is because we started the
pager. This used to be done by checking the pager_in_use
variable.
This variable was set only when the git program being run
started the pager; there was no way for an external program
running git indicate that it had already started a pager.
This patch allows a program to set GIT_PAGER_IN_USE to a
true value to indicate that even though stdout is not a tty,
it is because a pager is being used.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/spht:
Use gitattributes to define per-path whitespace rule
core.whitespace: documentation updates.
builtin-apply: teach whitespace_rules
builtin-apply: rename "whitespace" variables and fix styles
core.whitespace: add test for diff whitespace error highlighting
git-diff: complain about >=8 consecutive spaces in initial indent
War on whitespace: first, a bit of retreat.
Conflicts:
cache.h
config.c
diff.c
The `core.whitespace` configuration variable allows you to define what
`diff` and `apply` should consider whitespace errors for all paths in
the project (See gitlink:git-config[1]). This attribute gives you finer
control per path.
For example, if you have these in the .gitattributes:
frotz whitespace
nitfol -whitespace
xyzzy whitespace=-trailing
all types of whitespace problems known to git are noticed in path 'frotz'
(i.e. diff shows them in diff.whitespace color, and apply warns about
them), no whitespace problem is noticed in path 'nitfol', and the
default types of whitespace problems except "trailing whitespace" are
noticed for path 'xyzzy'. A project with mixed Python and C might want
to have:
*.c whitespace
*.py whitespace=-indent-with-non-tab
in its toplevel .gitattributes file.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are inconsistencies in the way commands currently handle
the core.excludesfile configuration variable. The problem is
the variable is too new to be noticed by anything other than
git-add and git-status.
* git-ls-files does not notice any of the "ignore" files by
default, as it predates the standardized set of ignore files.
The calling scripts established the convention to use
.git/info/exclude, .gitignore, and later core.excludesfile.
* git-add and git-status know about it because they call
add_excludes_from_file() directly with their own notion of
which standard set of ignore files to use. This is just a
stupid duplication of code that need to be updated every time
the definition of the standard set of ignore files is
changed.
* git-read-tree takes --exclude-per-directory=<gitignore>,
not because the flexibility was needed. Again, this was
because the option predates the standardization of the ignore
files.
* git-merge-recursive uses hardcoded per-directory .gitignore
and nothing else. git-clean (scripted version) does not
honor core.* because its call to underlying ls-files does not
know about it. git-clean in C (parked in 'pu') doesn't either.
We probably could change git-ls-files to use the standard set
when no excludes are specified on the command line and ignore
processing was asked, or something like that, but that will be a
change in semantics and might break people's scripts in a subtle
way. I am somewhat reluctant to make such a change.
On the other hand, I think it makes perfect sense to fix
git-read-tree, git-merge-recursive and git-clean to follow the
same rule as other commands. I do not think of a valid use case
to give an exclude-per-directory that is nonstandard to
read-tree command, outside a "negative" test in the t1004 test
script.
This patch is the first step to untangle this mess.
The next step would be to teach read-tree, merge-recursive and
clean (in C) to use setup_standard_excludes().
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This introduces core.whitespace configuration variable that lets
you specify the definition of "whitespace error".
Currently there are two kinds of whitespace errors defined:
* trailing-space: trailing whitespaces at the end of the line.
* space-before-tab: a SP appears immediately before HT in the
indent part of the line.
You can specify the desired types of errors to be detected by
listing their names (unique abbreviations are accepted)
separated by comma. By default, these two errors are always
detected, as that is the traditional behaviour. You can disable
detection of a particular type of error by prefixing a '-' in
front of the name of the error, like this:
[core]
whitespace = -trailing-space
This patch teaches the code to output colored diff with
DIFF_WHITESPACE color to highlight the detected whitespace
errors to honor the new configuration.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* cr/tag:
Teach "git stripspace" the --strip-comments option
Make verify-tag a builtin.
builtin-tag.c: Fix two memory leaks and minor notation changes.
launch_editor(): Heed GIT_EDITOR and core.editor settings
Make git tag a builtin.
The old version of work-tree support was an unholy mess, barely readable,
and not to the point.
For example, why do you have to provide a worktree, when it is not used?
As in "git status". Now it works.
Another riddle was: if you can have work trees inside the git dir, why
are some programs complaining that they need a work tree?
IOW it is allowed to call
$ git --git-dir=../ --work-tree=. bla
when you really want to. In this case, you are both in the git directory
and in the working tree. So, programs have to actually test for the right
thing, namely if they are inside a working tree, and not if they are
inside a git directory.
Also, GIT_DIR=../.git should behave the same as if no GIT_DIR was
specified, unless there is a repository in the current working directory.
It does now.
The logic to determine if a repository is bare, or has a work tree
(tertium non datur), is this:
--work-tree=bla overrides GIT_WORK_TREE, which overrides core.bare = true,
which overrides core.worktree, which overrides GIT_DIR/.. when GIT_DIR
ends in /.git, which overrides the directory in which .git/ was found.
In related news, a long standing bug was fixed: when in .git/bla/x.git/,
which is a bare repository, git formerly assumed ../.. to be the
appropriate git dir. This problem was reported by Shawn Pearce to have
caused much pain, where a colleague mistakenly ran "git init" in "/" a
long time ago, and bare repositories just would not work.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the function set_git_dir() you can reset the path that will
be used for git_path(), git_dir() and friends.
The responsibility to close files and throw away information from the
old git_dir lies with the caller.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the commit 'Add GIT_EDITOR environment and core.editor
configuration variables', this was done for the shell scripts.
Port it over to builtin-tag's version of launch_editor(), which
is just about to be refactored into editor.c.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds a configuration variable that performs the same function as,
but is overridden by, GIT_PAGER.
Signed-off-by: Brian Gernhardt <benji@silverinsanity.com>
Acked-by: Johannes E. Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We always quote "unusual" byte values in a pathname using
C-string style, to make it safer for parsing scripts that do not
handle NUL separated records well (or just too lazy to bother).
The absolute minimum bytes that need to be quoted for this
purpose are TAB, LF (and other control characters), double quote
and backslash.
However, we have also always quoted the bytes in high 8-bit
range; this was partly because we were lazy and partly because
we were being cautious.
This introduces an internal "quote_path_fully" variable, and
core.quotepath configuration variable to control it. When set
to false, it does not quote bytes in high 8-bit range anymore
but passes them intact.
The variable defaults to "true" to retain the traditional
behaviour for now.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept in to our source files over time. There are a few files that need
to have trailing whitespaces (most notably, test vectors). The results
still passes the test, and build result in Documentation/ area is unchanged.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add config variables pack.compression and core.loosecompression ,
and switch --compression=level to pack-objects.
Loose objects will be compressed using core.loosecompression if set,
else core.compression if set, else Z_BEST_SPEED.
Packed objects will be compressed using --compression=level if seen,
else pack.compression if set, else core.compression if set,
else Z_DEFAULT_COMPRESSION. This is the "pack compression level".
Loose objects added to a pack undeltified will be recompressed
to the pack compression level if it is unequal to the current
loose compression level by the preceding rules, or if the loose
object was written while core.legacyheaders = true. Newly
deltified loose objects are always compressed to the current
pack compression level.
Previously packed objects added to a pack are recompressed
to the current pack compression level exactly when their
deltification status changes, since the previous pack data
cannot be reused.
In either case, the --no-reuse-object switch from the first
patch below will always force recompression to the current pack
compression level, instead of assuming the pack compression level
hasn't changed and pack data can be reused when possible.
This applies on top of the following patches from Nicolas Pitre:
[PATCH] allow for undeltified objects not to be reused
[PATCH] make "repack -f" imply "pack-objects --no-reuse-object"
Signed-off-by: Dana L. How <danahow@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Now that we encourage and actively preserve objects in a packed form
more agressively than we did at the time the new loose object format and
core.legacyheaders were introduced, that extra loose object format
doesn't appear to be worth it anymore.
Because the packing of loose objects has to go through the delta match
loop anyway, and since most of them should end up being deltified in
most cases, there is really little advantage to have this parallel loose
object format as the CPU savings it might provide is rather lost in the
noise in the end.
This patch gets rid of core.legacyheaders, preserve the legacy format as
the only writable loose object format and deprecate the other one to
keep things simpler.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The new configuration variable core.deltaBaseCacheLimit allows the
user to control how much memory they are willing to give to Git for
caching base objects of deltas. This is not normally meant to be
a user tweakable knob; the "out of the box" settings are meant to
be suitable for almost all workloads.
We default to 16 MiB under the assumption that the cache is not
meant to consume all of the user's available memory, and that the
cache's main purpose was to cache trees, for faster path limiters
during revision traversal. Since trees tend to be relatively small
objects, this relatively small limit should still allow a large
number of objects.
On the other hand we don't want the cache to start storing 200
different versions of a 200 MiB blob, as this could easily blow
the entire address space of a 32 bit process.
We evict OBJ_BLOB from the cache first (credit goes to Junio) as
we want to favor OBJ_TREE within the cache. These are the objects
that have the highest inflate() startup penalty, as they tend to
be small and thus don't have that much of a chance to ammortize
that penalty over the entire data.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The new builtin-revert code introduces a few new compiler errors
when I'm building with my stricter set of checks enabled in CFLAGS.
These all just stem from trying to store a constant string into
a non-const char*. Simple fix, make the variables const char*.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
We shouldn't attempt to assign constant strings into char*, as the
string is not writable at runtime. Likewise we should always be
treating unsigned values as unsigned values, not as signed values.
Most of these are very straightforward. The only exception is the
(unnecessary) xstrdup/free in builtin-branch.c for the detached
head case. Since this is a user-level interactive type program
and that particular code path is executed no more than once, I feel
that the extra xstrdup call is well worth the easy elimination of
this warning.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Some file systems that can host git repositories and their working copies
do not support symbolic links. But then if the repository contains a symbolic
link, it is impossible to check out the working copy.
This patch enables partial support of symbolic links so that it is possible
to check out a working copy on such a file system. A new flag
core.symlinks (which is true by default) can be set to false to indicate
that the filesystem does not support symbolic links. In this case, symbolic
links that exist in the trees are checked out as small plain files, and
checking in modifications of these files preserve the symlink property in
the database (as long as an entry exists in the index).
Of course, this does not magically make symbolic links work on such defective
file systems; hence, this solution does not help if the working copy relies
on that an entry is a real symbolic link.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This allows you to do:
[core]
AutoCRLF = input
and it should do only the CRLF->LF translation (ie it simplifies CRLF only
when reading working tree files, but when checking out files, it leaves
the LF alone, and doesn't turn it into a CRLF).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
It currently does NOT know about file attributes, so it does its
conversion purely based on content. Maybe that is more in the "git
philosophy" anyway, since content is king, but I think we should try to do
the file attributes to turn it off on demand.
Anyway, BY DEFAULT it is off regardless, because it requires a
[core]
AutoCRLF = true
in your config file to be enabled. We could make that the default for
Windows, of course, the same way we do some other things (filemode etc).
But you can actually enable it on UNIX, and it will cause:
- "git update-index" will write blobs without CRLF
- "git diff" will diff working tree files without CRLF
- "git checkout" will write files to the working tree _with_ CRLF
and things work fine.
Funnily, it actually shows an odd file in git itself:
git clone -n git test-crlf
cd test-crlf
git config core.autocrlf true
git checkout
git diff
shows a diff for "Documentation/docbook-xsl.css". Why? Because we have
actually checked in that file *with* CRLF! So when "core.autocrlf" is
true, we'll always generate a *different* hash for it in the index,
because the index hash will be for the content _without_ CRLF.
Is this complete? I dunno. It seems to work for me. It doesn't use the
filename at all right now, and that's probably a deficiency (we could
certainly make the "is_binary()" heuristics also take standard filename
heuristics into account).
I don't pass in the filename at all for the "index_fd()" case
(git-update-index), so that would need to be passed around, but this
actually works fine.
NOTE NOTE NOTE! The "is_binary()" heuristics are totally made-up by yours
truly. I will not guarantee that they work at all reasonable. Caveat
emptor. But it _is_ simple, and it _is_ safe, since it's all off by
default.
The patch is pretty simple - the biggest part is the new "convert.c" file,
but even that is really just basic stuff that anybody can write in
"Teaching C 101" as a final project for their first class in programming.
Not to say that it's bug-free, of course - but at least we're not talking
about rocket surgery here.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This removes the old is_bare_git_dir(const char *) to ask if a
directory, if it is a GIT_DIR, is a bare repository, and
replaces it with is_bare_repository(void *). The function looks
at core.bare configuration variable if exists but uses the old
heuristics: if it is ".git" or ends with "/.git", then it does
not look like a bare repository, otherwise it does.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The patches to prevent Porcelainish that require working tree
from doing any damage in a bare repository make a lot of sense,
and I want to make the is_bare_git_dir() function more reliable.
In order to allow the repository owner override the heuristic
implemented in is_bare_git_dir() if/when it misidentifies a
particular repository, it would make sense to introduce a new
configuration variable "[core] bare = true/false", and make
is_bare_git_dir() notice it.
The scripts would do a 'repo-config --bool --get core.bare' and
iff the command fails (i.e. there is no such variable in the
configuration file), it would use the heuristic implemented at
the script level [*1*].
However, setup_git_env() which is called a lot earlier than we
even read from the repository configuration currently makes a
call to is_bare_git_dir(), in order to change the default
setting for log_all_ref_updates. It somehow feels that this is
a hack.
By the way, [*1*] is another thing I hate about the current
config mechanism. "git-repo-config --get" does not know what
the possible configuration variables are, let alone what the
default values for them are. It allows us not to maintain a
centralized configuration table, which makes it easy to
introduce ad-hoc variables and gives a warm fuzzy feeling of
being modular, but my feeling is that it is turning out to be a
rather high price to pay for scripts.
Signed-off-by: Junio C Hamano <junkio@cox.net>
If the compiler has asked us to disable use of mmap() on their
platform then we are forced to use git_mmap and its emulation via
pread. In this case large (e.g. 32 MiB) windows for pack access
are simply too big as a command will wind up reading a lot more
data than it will ever need, significantly reducing response time.
To prevent a high latency when NO_MMAP has been selected we now
use a default of 1 MiB for core.packedGitWindowSize. Credit goes
to Linus and Junio for recommending this more reasonable setting.
[jc: upcased the name of the symbolic constant, and made another
hardcoded constant into a symbolic constant while at it. ]
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This finally turns on the sliding window behavior for packfile data
access by mapping limited size windows and chaining them under the
packed_git->windows list.
We consider a given byte offset to be within the window only if there
would be at least 20 bytes (one hash worth of data) accessible after
the requested offset. This range selection relates to the contract
that use_pack() makes with its callers, allowing them to access
one hash or one object header without needing to call use_pack()
for every byte of data obtained.
In the worst case scenario we will map the same page of data twice
into memory: once at the end of one window and once again at the
start of the next window. This duplicate page mapping will happen
only when an object header or a delta base reference is spanned
over the end of a window and is always limited to just one page of
duplication, as no sane operating system will ever have a page size
smaller than a hash.
I am assuming that the possible wasted page of virtual address
space is going to perform faster than the alternatives, which
would be to copy the object header or ref delta into a temporary
buffer prior to parsing, or to check the window range on every byte
during header parsing. We may decide to revisit this decision in
the future since this is just a gut instinct decision and has not
actually been proven out by experimental testing.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Rather than hardcoding the maximum number of bytes which can be
mmapped from pack files we should make this value configurable,
allowing the end user to increase or decrease this limit on a
per-repository basis depending on the size of the repository
and the capabilities of their operating system.
In general users should not need to manually tune such a low-level
setting within the core code, but being able to artifically limit
the number of bytes which we can mmap at once from pack files will
make it easier to craft test cases for the new mmap sliding window
implementation.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
It is plausible for somebody to want to view the commit log in a
different encoding from i18n.commitencoding -- the project's
policy may be UTF-8 and the user may be using a commit message
hook to run iconv to conform to that policy (and either not have
i18n.commitencoding to default to UTF-8 or have it explicitly
set to UTF-8). Even then, Latin-1 may be more convenient for
the usual pager and the terminal the user uses.
The new variable i18n.logoutputencoding is used in preference to
i18n.commitencoding to decide what encoding to recode the log
output in when git-log and friends formats the commit log message.
Signed-off-by: Junio C Hamano <junkio@cox.net>
New and experienced Git users alike are finding out too late that
they forgot to enable reflogs in the current repository, and cannot
use the information stored within it to recover from an incorrectly
entered command such as `git reset --hard HEAD^^^` when they really
meant HEAD^^ (aka HEAD~2).
So enable reflogs by default in all future versions of Git, unless
the user specifically disables it with:
[core]
logAllRefUpdates = false
in their .git/config or ~/.gitconfig.
We only enable reflogs in repositories that have a working directory
associated with them, as shared/bare repositories do not have
an easy means to prune away old log entries, or may fail logging
entirely if the user's gecos information is not valid during a push.
This heuristic was suggested on the mailing list by Junio.
Documentation was also updated to indicate the new default behavior.
We probably should start to teach usuing the reflog to recover
from mistakes in some of the tutorial material, as new users are
likely to make a few along the way and will feel better knowing
they can recover from them quickly and easily, without fsck-objects'
lost+found features.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The 'receive.denynonfastforwards' option has nothing to do with
the repository format version. Since receive-pack already uses
git_config to initialize itself before executing any updates we
can use the normal configuration strategy and isolate the receive
specific variables away from the core variables.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
If receive.denyNonFastforwards is set to true, git-receive-pack will deny
non fast-forwards, i.e. forced updates. Most notably, a push to a repository
which has that flag set will fail.
As a first user, 'git-init-db --shared' sets this flag, since in a shared
setup, you are most unlikely to want forced pushes to succeed.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Like xmalloc and xrealloc xstrdup dies with a useful message if
the native strdup() implementation returns NULL rather than a
valid pointer.
I just tried to use xstrdup in new code and found it to be missing.
However I expected it to be present as xmalloc and xrealloc are
already commonly used throughout the code.
[jc: removed the part that deals with last_XXX, which I am
finding more and more dubious these days.]
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
[jc: I needed to hand merge the changes to the updated codebase,
so the result needs to be checked.]
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
enable/disable colored output when the pager is in use
Signed-off-by: Matthias Lederhofer <matled@gmx.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The pack-file format is slightly different from the traditional git
object format, in that it has a much denser binary header encoding.
The traditional format uses an ASCII string with type and length
information, which is somewhat wasteful.
A new object format starts with uncompressed binary header
followed by compressed payload -- this will allow us later to
copy the payload straight to packfiles.
Obviously they cannot be read by older versions of git, so for
now new object files are created with the traditional format.
core.legacyheaders configuration item, when set to false makes
the code write in new format for people to experiment with.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
On one of my systems, the linker is not intelligent enough to link with
pager.o (in libgit.a) when only the variable pager_in_use is needed. The
consequence is that the linker complains about an undefined variable. So,
put the variable into environment.o, where it is linked always.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
With the change in default, "git add ." on kernel dir is about
twice as fast as before, with only minimal (0.5%) change in
object size. The speed difference is even more noticeable
when committing large files, which is now up to 8 times faster.
The configurability is through setting core.compression = [-1..9]
which maps to the zlib constants; -1 is the default, 0 is no
compression, and 1..9 are various speed/size tradeoffs, 9
being slowest.
Signed-off-by: Joachim B Haga (cjhaga@fys.uio.no)
Acked-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This enhances core.sharedrepository to have additionally
specify that read and exec permissions to be given to others as
well. It is useful when serving a repository via gitweb and
git-daemon that runs as a user outside the project group.
The configuration item can take the following values:
[core]
sharedrepository ; the same as "group"
sharedrepository = true ; ditto
sharedrepository = 1 ; ditto
sharedrepository = group ; allow rwx to group
sharedrepository = all ; allow rwx to group, allow rx to other
sharedrepository = umask ; not shared - use umask
It also extends "git init-db" to take "--shared=all" and friends
from the command line.
Signed-off-by: Junio C Hamano <junkio@cox.net>
If config parameter core.logAllRefUpdates is true or the log
file already exists then append a line to ".git/logs/refs/<ref>"
whenever git-update-ref <ref> is executed. Each log line contains
the following information:
oldsha1 <SP> newsha1 <SP> committer <LF>
where committer is the current user, date, time and timezone in
the standard GIT ident format. If the caller is unable to append
to the log file then git-update-ref will fail without updating <ref>.
An optional message may be included in the log line with the -m flag.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When inspecting a project whose build infrastructure used to
assume that .git/HEAD is a symlink ref, core.prefersymlinkrefs
in the config file of such a project would help to bisect its
history.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The new configuration option apply.whitespace can take one of
"warn", "error", "error-all", or "strip". When git-apply is run
to apply the patch to the index, they are used as the default
value if there is no command line --whitespace option.
Andrew can now tell people who feed him git trees to update to
this version and say:
git repo-config apply.whitespace error
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds "assume unchanged" logic, started by this message in the list
discussion recently:
<Pine.LNX.4.64.0601311807470.7301@g5.osdl.org>
This is a workaround for filesystems that do not have lstat()
that is quick enough for the index mechanism to take advantage
of. On the paths marked as "assumed to be unchanged", the user
needs to explicitly use update-index to register the object name
to be in the next commit.
You can use two new options to update-index to set and reset the
CE_VALID bit:
git-update-index --assume-unchanged path...
git-update-index --no-assume-unchanged path...
These forms manipulate only the CE_VALID bit; it does not change
the object name recorded in the index file. Nor they add a new
entry to the index.
When the configuration variable "core.ignorestat = true" is set,
the index entries are marked with CE_VALID bit automatically
after:
- update-index to explicitly register the current object name to the
index file.
- when update-index --refresh finds the path to be up-to-date.
- when tools like read-tree -u and apply --index update the working
tree file and register the current object name to the index file.
The flag is dropped upon read-tree that does not check out the index
entry. This happens regardless of the core.ignorestat settings.
Index entries marked with CE_VALID bit are assumed to be
unchanged most of the time. However, there are cases that
CE_VALID bit is ignored for the sake of safety and usability:
- while "git-read-tree -m" or git-apply need to make sure
that the paths involved in the merge do not have local
modifications. This sacrifices performance for safety.
- when git-checkout-index -f -q -u -a tries to see if it needs
to checkout the paths. Otherwise you can never check
anything out ;-).
- when git-update-index --really-refresh (a new flag) tries to
see if the index entry is up to date. You can start with
everything marked as CE_VALID and run this once to drop
CE_VALID bit for paths that are modified.
Most notably, "update-index --refresh" honours CE_VALID and does
not actively stat, so after you modified a file in the working
tree, update-index --refresh would not notice until you tell the
index about it with "git-update-index path" or "git-update-index
--no-assume-unchanged path".
This version is not expected to be perfect. I think diff
between index and/or tree and working files may need some
adjustment, and there probably needs other cases we should
automatically unmark paths that are marked to be CE_VALID.
But the basics seem to work, and ready to be tested by people
who asked for this feature.
Signed-off-by: Junio C Hamano <junkio@cox.net>
If the config variable 'core.sharedrepository' is set, the directories
$GIT_DIR/objects/
$GIT_DIR/objects/??
$GIT_DIR/objects/pack
$GIT_DIR/refs
$GIT_DIR/refs/heads
$GIT_DIR/refs/heads/tags
are set group writable (and g+s, since the git group may be not the primary
group of all users).
Since all files are written as lock files first, and then moved to
their destination, they do not have to be group writable. Indeed, if
this leads to problems you found a bug.
Note that -- as in my first attempt -- the config variable is set in the
function which checks the repository format. If this were done in
git_default_config instead, a lot of programs would need to be modified
to call git_config(git_default_config) first.
[jc: git variables should be in environment.c unless there is a
compelling reason to do otherwise.]
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is to hold what the project-local rule as to the
charset/encoding for the commit log message is. Lack of it
defaults to utf-8.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This allows you to force git to avoid symlinks for refs. Just add
something like
[core]
symrefsonly = true
to .git/config.
Don´t forget to "git checkout your_branch", or it does not do anything...
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This starts using the "user.name" and "user.email" config variables if
they exist as the default name and email when committing. This means
that you don't have to use the GIT_COMMITTER_EMAIL environment variable
to override your email - you can just edit the config file instead.
The patch looks bigger than it is because it makes the default name and
email information non-static and renames it appropriately. And it moves
the common git environment variables into a new library file, so that
you can link against libgit.a and get the git environment without having
to link in zlib and libcrypt.
In short, most of it is renaming and moving, the real change core is
just a few new lines in "git_default_config()" that copies the user
config values to the new base.
It also changes "git-var -l" to list the config variables.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>