This finally turns on the sliding window behavior for packfile data
access by mapping limited size windows and chaining them under the
packed_git->windows list.
We consider a given byte offset to be within the window only if there
would be at least 20 bytes (one hash worth of data) accessible after
the requested offset. This range selection relates to the contract
that use_pack() makes with its callers, allowing them to access
one hash or one object header without needing to call use_pack()
for every byte of data obtained.
In the worst case scenario we will map the same page of data twice
into memory: once at the end of one window and once again at the
start of the next window. This duplicate page mapping will happen
only when an object header or a delta base reference is spanned
over the end of a window and is always limited to just one page of
duplication, as no sane operating system will ever have a page size
smaller than a hash.
I am assuming that the possible wasted page of virtual address
space is going to perform faster than the alternatives, which
would be to copy the object header or ref delta into a temporary
buffer prior to parsing, or to check the window range on every byte
during header parsing. We may decide to revisit this decision in
the future since this is just a gut instinct decision and has not
actually been proven out by experimental testing.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Rather than hardcoding the maximum number of bytes which can be
mmapped from pack files we should make this value configurable,
allowing the end user to increase or decrease this limit on a
per-repository basis depending on the size of the repository
and the capabilities of their operating system.
In general users should not need to manually tune such a low-level
setting within the core code, but being able to artifically limit
the number of bytes which we can mmap at once from pack files will
make it easier to craft test cases for the new mmap sliding window
implementation.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* jc/utf8:
t3900: test conversion to non UTF-8 as well
Rename t3900 test vector file
UTF-8: introduce i18n.logoutputencoding.
Teach log family --encoding
i18n.logToUTF8: convert commit log message to UTF-8
Move encoding conversion routine out of mailinfo to utf8.c
Conflicts:
commit.c
Junio rightly pointed out that the --reflog-action parameter
was starting to get out of control, as most porcelain code
needed to hand it to other porcelain and plumbing alike to
ensure the reflog contained the top-level user action and
not the lower-level actions it invoked.
At Junio's suggestion we are introducing the new set_reflog_action
function to all shell scripts, allowing them to declare early on
what their default reflog name should be, but this setting only
takes effect if the caller has not already set the GIT_REFLOG_ACTION
environment variable.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
It is plausible for somebody to want to view the commit log in a
different encoding from i18n.commitencoding -- the project's
policy may be UTF-8 and the user may be using a commit message
hook to run iconv to conform to that policy (and either not have
i18n.commitencoding to default to UTF-8 or have it explicitly
set to UTF-8). Even then, Latin-1 may be more convenient for
the usual pager and the terminal the user uses.
The new variable i18n.logoutputencoding is used in preference to
i18n.commitencoding to decide what encoding to recode the log
output in when git-log and friends formats the commit log message.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Junio asked for a 'git gc' utility which users can execute on a
regular basis to perform basic repository actions such as:
* pack-refs --prune
* reflog expire
* repack -a -d
* prune
* rerere gc
So here is a command which does exactly that. The parameters fed
to reflog's expire subcommand can be chosen by the user by setting
configuration options in .git/config (or ~/.gitconfig), as users may
want different expiration windows for each repository but shouldn't
be bothered to remember what they are all of the time.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Recent "git push" keeps transferred objects packed much more aggressively
than before. Monitoring output from git-count-objects -v for number of
loose objects is not enough to decide when to repack -- having too many
small packs is also a good cue for repacking.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Fix minor mark-up mistakes and adjust to v1.5.0 BCP, namely:
- use "git add" instead of "git update-index";
- use "git merge" instead of "git pull .";
- use separate remote layout;
- use config instead of remotes/origin file;
Also updates "My typical git day" example since now I have
'next' branch these days.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Instead of just warning, refuse to add otherwise ignored files
by default, and allow it with an -f option.
Signed-off-by: Junio C Hamano <junkio@cox.net>
One thing many people found confusing about git-add was that a
file whose name matches an ignored pattern could not be added to
the index. With this, such a file can be added by explicitly
spelling its name to git-add.
Fileglobs and recursive behaviour do not add ignored files to
the index. That is, if a pattern '*.o' is in .gitignore, and
two files foo.o, bar/baz.o are in the working tree:
$ git add foo.o
$ git add '*.o'
$ git add bar
Only the first form adds foo.o to the index.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds and uses the install-doc-quick.sh file to
Documentation/, which is usable for people who track either the
'html' or 'man' heads in Junio's repository (prefixed with
'origin/' if cloned locally). You may override this by
specifying DOC_REF in the make environment or in config.mak.
GZ may also be set in the environment (or config.mak) if you
wish to gzip the documentation after installing it.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Branch has "-r" for remote branches and "-a" for local and remote.
It seems logical to mirror that in show-branch. Also removes the
dubiously useful "--tags" option (as part of changing the meaning
for "--all").
Signed-off-by: Brian Gernhardt <benji@silverinsanity.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This imitates the behaviour of git-commit.
Noticed by Han-Wen Nienhuys.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This corrects minor remaining bits that still talked about <tree-ish>;
the Porcelain users (as opposed to plumbers) are mostly interested in
commits so use <commit> consistently and keep a sentence that mentions
that <tree-ish> can be used in place of them.
This adds --skip=<n> option to revision traversal machinery.
Documentation and test were added by Robert Fitzsimons.
Signed-off-by: Robert Fitzsimons <robfitz@273k.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* jc/test-clone: (35 commits)
Introduce GIT_TEMPLATE_DIR
Revert "fix testsuite: make sure they use templates freshly built from the source"
fix testsuite: make sure they use templates freshly built from the source
rerere: fix breakage of resolving.
Add config example with respect to branch
Add documentation for show-branch --topics
make git a bit less cryptic on fetch errors
make patch_delta() error cases a bit more verbose
racy-git: documentation updates.
show-ref: fix --exclude-existing
parse-remote::expand_refs_wildcard()
vim syntax: follow recent changes to commit template
show-ref: fix --verify --hash=length
show-ref: fix --quiet --verify
avoid accessing _all_ loose refs in git-show-ref --verify
git-fetch: Avoid reading packed refs over and over again
Teach show-branch how to show ref-log data.
markup fix in svnimport documentation.
Documentation: new option -P for git-svnimport
Fix mis-mark-up in git-merge-file.txt documentation
...
Update config.txt with example with respect to branch
config variable. This give a better idea regarding
how branch names are expected.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Add a quick paragraph explaining the --topics option for show-branch.
The explanation is an abbreviated version of the commit message from
d320a5437f.
Signed-off-by: Brian Gernhardt <benji@silverinsanity.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
We've removed the workaround for runtime penalty that did not
exist in practice some time ago, but the technical paper that
proposed that change still said "we probably should do so".
Signed-off-by: Junio C Hamano <junkio@cox.net>
Finally.
The separate-remote layout is so much more organized than
traditional and easier to work with especially when you need to
deal with remote repositories with multiple branches and/or you
need to deal with more than one remote repositories, and using
traditional layout for new repositories simply does not make
much sense.
Internally we still have code for 1:1 mappings to create a bare
clone; that is a good thing and will not go away.
Signed-off-by: Junio C Hamano <junkio@cox.net>
'set-tree' probably accurately describes what the command
formerly known as 'commit' does.
I'm not entirely sure that 'dcommit' should be renamed to 'commit'
just yet... Perhaps 'push' or 'push-changes'?
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Most of this is derived from the documentation of RCS merge.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When --use-separate-remote is used on git-clone, the remote
heads are saved under $GIT_DIR/refs/remotes/origin/, not
"$GIT_DIR/remotes/origin/"
Signed-off-by: Junio C Hamano <junkio@cox.net>
Now that 'git add' is considered a first-class UI for 'update-index'
and that the 'git add' documentation states "Even modified files
must be added to the set of changes about to be committed" we should
make the output of 'git status' align with that documentation and
common usage.
So now we see a status output such as:
# Added but not yet committed:
# (will commit)
#
# new file: x
#
# Changed but not added:
# (use "git add file1 file2" to include for commit)
#
# modified: x
#
# Untracked files:
# (use "git add" on files to include for commit)
#
# y
which just reads better in the context of using 'git add' to
manipulate a commit (and not a checkin, whatever the heck that is).
We also now support 'color.status.added' as an alias for the existing
'color.status.updated', as this alias more closely aligns with the
current output and documentation.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Two of the cases has "[--] [<path>...]" and two had "-- [<path>...]".
Not terribly consistent and potentially confusing. Also add "[--]" to
the synopsis so that it's obvious you can use it from the very
beginning.
Signed-off-by: Brian Gernhardt <benji@silverinsanity.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
For multivars, the "git-repo-config name value ^$" is useful but
nonintuitive and troublesome to do repeatedly (since the value is not
at the end of the command line). This commit simply adds an --add
option that adds a new value to a multivar. Particularly useful for
tracking a new branch on a remote:
git-repo-config --add remote.origin.fetch +next:origin/next
Includes documentation and test.
Signed-off-by: Brian Gernhardt <benji@silverinsanity.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
New and experienced Git users alike are finding out too late that
they forgot to enable reflogs in the current repository, and cannot
use the information stored within it to recover from an incorrectly
entered command such as `git reset --hard HEAD^^^` when they really
meant HEAD^^ (aka HEAD~2).
So enable reflogs by default in all future versions of Git, unless
the user specifically disables it with:
[core]
logAllRefUpdates = false
in their .git/config or ~/.gitconfig.
We only enable reflogs in repositories that have a working directory
associated with them, as shared/bare repositories do not have
an easy means to prune away old log entries, or may fail logging
entirely if the user's gecos information is not valid during a push.
This heuristic was suggested on the mailing list by Junio.
Documentation was also updated to indicate the new default behavior.
We probably should start to teach usuing the reflog to recover
from mistakes in some of the tutorial material, as new users are
likely to make a few along the way and will feel better knowing
they can recover from them quickly and easily, without fsck-objects'
lost+found features.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Back in the old days of Git when people messed around with their
GIT_DIR environment variable more often it was nice to know whether
or not git-init-db created a .git directory or used GIT_DIR.
As most users at that time were rather technical UNIXy folk the
message "defaulting to local storage area" made sense to some and
seemed reasonable.
But it doesn't really convey any meaning to the new Git user,
as they don't know what a 'local storage area is' nor do they
know enough about Git to care. It also really doesn't tell the
experienced Git user a whole lot about the command they just ran,
especially if they might be reinitializing an existing repository
(e.g. to update hooks).
So now we print out what we did ("Initialized empty" or
"Reinitialized existing"), what type of repository ("" or "shared"),
and what location the repository will be in ("$GIT_DIR").
Suggested in part by Andy Parkins in his Git 'niggles' list
(<200612132237.10051.andyparkins@gmail.com>).
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
It is nicer to let the user know when a commit succeeded all the time,
not only the first time. Also the commit sha1 is much more useful than
the tree sha1 in this case.
This patch also introduces a -q switch to supress this message as well
as the summary of created/deleted files.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Since git-show is pure Porcelain, it is the ideal candidate to
pretty print other things than commits, too.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Porcelain documentation should talk in terms of end-user workflow, not
in terms of implementation details. Do not suggest update-index, but
git-add instead. Explain differences among 0-, 1- and 2-tree cases
not as differences of number of trees given to the command, but say
why user would want to give these number of trees to the command in
what situation.
Signed-off-by: Junio C Hamano <junkio@cox.net>
* jc/read-tree-ignore:
read-tree: document --exclude-per-directory
Loosen "working file will be lost" check in Porcelain-ish
read-tree: further loosen "working file will be lost" check.