This should make these functions easier to find and cache.h less
overwhelming to read.
In particular, this moves:
- read_object_file
- oid_object_info
- write_object_file
As a result, most of the codebase needs to #include object-store.h.
In this patch the #include is only added to files that would fail to
compile otherwise. It would be better to #include wherever
identifiers from the header are used. That can happen later
when we have better tooling for it.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Placing `struct lock_file`s on the stack used to be a bad idea, because
the temp- and lockfile-machinery would keep a pointer into the struct.
But after 076aa2cbd (tempfile: auto-allocate tempfiles on heap,
2017-09-05), we can safely have lockfiles on the stack. (This applies
even if a user returns early, leaving a locked lock behind.)
These `struct lock_file`s are local to their respective functions and we
can drop their staticness.
For good measure, I have inspected these sites and come to believe that
they always release the lock, with the possible exception of bailing out
using `die()` or `exit()` or by returning from a `cmd_foo()`.
As pointed out by Jeff King, it would be bad if someone held on to a
`struct lock_file *` for some reason. After some grepping, I agree with
his findings: no-one appears to be doing that.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of oid_object_info
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is more consistent with the project style. The majority of Git's
source files use dashes in preference to underscores in their file names.
Signed-off-by: Stefan Beller <sbeller@google.com>
Convert sha1_object_info and sha1_object_info_extended to take pointers
to struct object_id and rename them to use "oid" instead of "sha1" in
their names. Update the declaration and definition and apply the
following semantic patch, plus the standard object_id transforms:
@@
expression E1, E2;
@@
- sha1_object_info(E1.hash, E2)
+ oid_object_info(&E1, E2)
@@
expression E1, E2;
@@
- sha1_object_info(E1->hash, E2)
+ oid_object_info(E1, E2)
@@
expression E1, E2, E3;
@@
- sha1_object_info_extended(E1.hash, E2, E3)
+ oid_object_info_extended(&E1, E2, E3)
@@
expression E1, E2, E3;
@@
- sha1_object_info_extended(E1->hash, E2, E3)
+ oid_object_info_extended(E1, E2, E3)
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert find_unique_abbrev and find_unique_abbrev_r to each take a
pointer to struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git describe $garbage" stopped giving any errors when the garbage
happens to be a string with 40 hexadecimal letters.
* sb/describe-blob:
describe: confirm that blobs actually exist
Prior to 644eb60bd0 (builtin/describe.c: describe a blob,
2017-11-15), we noticed and complained about missing
objects, since they were not valid commits:
$ git describe 0000000000000000000000000000000000000000
fatal: 0000000000000000000000000000000000000000 is not a valid 'commit' object
After that commit, we feed any non-commit to lookup_blob(),
and complain only if it returns NULL. But the lookup_*
functions do not actually look at the on-disk object
database at all. They return an entry from the in-memory
object hash if present (and if it matches the requested
type), and otherwise auto-create a "struct object" of the
requested type.
A missing object would hit that latter case: we create a
bogus blob struct, walk all of history looking for it, and
then exit successfully having produced no output.
One reason nobody may have noticed this is that some related
cases do still work OK:
1. If we ask for a tree by sha1, then the call to
lookup_commit_referecne_gently() would have parsed it,
and we would have its true type in the in-memory object
hash.
2. If we ask for a name that doesn't exist but isn't a
40-hex sha1, then get_oid() would complain before we
even look at the objects at all.
We can fix this by replacing the lookup_blob() call with a
check of the true type via sha1_object_info(). This is not
quite as efficient as we could possibly make this check. We
know in most cases that the object was already parsed in the
earlier commit lookup, so we could call lookup_object(),
which does auto-create, and check the resulting struct's
type (or NULL). However it's not worth the fragility nor
code complexity to save a single object lookup.
The new tests cover this case, as well as that of a
tree-by-sha1 (which does work as described above, but was
not explicitly tested).
Noticed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An old regression in "git describe --all $annotated_tag^0" has been
fixed.
* dk/describe-all-output-fix:
describe: prepend "tags/" when describing tags with embedded name
Call strbuf_add_unique_abbrev() to add an abbreviated hash to a strbuf
instead of taking a detour through find_unique_abbrev() and its static
buffer. This is shorter and a bit more efficient.
Patch generated by Coccinelle (and contrib/coccinelle/strbuf.cocci).
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git describe" was taught to dig trees deeper to find a
<commit-ish>:<path> that refers to a given blob object.
* sb/describe-blob:
builtin/describe.c: describe a blob
builtin/describe.c: factor out describe_commit
builtin/describe.c: print debug statements earlier
builtin/describe.c: rename `oid` to avoid variable shadowing
revision.h: introduce blob/tree walking in order of the commits
list-objects.c: factor out traverse_trees_and_blobs
t6120: fix typo in test name
The man page of the "git describe" command explains the expected
output when using the --all option, i.e. the full reference path is
shown, including heads/ or tags/ prefix.
When 212945d4a8 ("Teach git-describe
to verify annotated tag names before output") made Git favor the
embedded name of annotated tags, it accidentally changed the output
format when the --all flag is given, only printing the tag's name
without the prefix.
Check if --all was specified and re-add the "tags/" prefix for this
special case to fix the regresssion.
Signed-off-by: Daniel Knittl-Frank <knittl89+git@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes users are given a hash of an object and they want to
identify it further (ex.: Use verify-pack to find the largest blobs,
but what are these? or [1])
When describing commits, we try to anchor them to tags or refs, as these
are conceptually on a higher level than the commit. And if there is no ref
or tag that matches exactly, we're out of luck. So we employ a heuristic
to make up a name for the commit. These names are ambiguous, there might
be different tags or refs to anchor to, and there might be different
path in the DAG to travel to arrive at the commit precisely.
When describing a blob, we want to describe the blob from a higher layer
as well, which is a tuple of (commit, deep/path) as the tree objects
involved are rather uninteresting. The same blob can be referenced by
multiple commits, so how we decide which commit to use? This patch
implements a rather naive approach on this: As there are no back pointers
from blobs to commits in which the blob occurs, we'll start walking from
any tips available, listing the blobs in-order of the commit and once we
found the blob, we'll take the first commit that listed the blob. For
example
git describe --tags v0.99:Makefile
conversion-901-g7672db20c2:Makefile
tells us the Makefile as it was in v0.99 was introduced in commit 7672db20.
The walking is performed in reverse order to show the introduction of a
blob rather than its last occurrence.
[1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Factor out describing commits into its own function `describe_commit`,
which will put any output to stdout into a strbuf, to be printed
afterwards.
As the next patch will teach Git to describe blobs using a commit and path,
this refactor will make it easy to reuse the code describing commits.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When debugging, print the received argument at the start of the
function instead of in the middle. This ensures that the received
argument is printed in all code paths, and also allows a subsequent
refactoring to not need to move the "arg" parameter.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function `describe` has already a variable named `oid` declared at
the beginning of the function for an object id. Do not shadow that
variable with a pointer to an object id.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Conversion from uchar[20] to struct object_id continues.
* bc/object-id: (25 commits)
refs/files-backend: convert static functions to object_id
refs: convert read_raw_ref backends to struct object_id
refs: convert peel_object to struct object_id
refs: convert resolve_ref_unsafe to struct object_id
worktree: convert struct worktree to object_id
refs: convert resolve_gitlink_ref to struct object_id
Convert remaining callers of resolve_gitlink_ref to object_id
sha1_file: convert index_path and index_fd to struct object_id
refs: convert reflog_expire parameter to struct object_id
refs: convert read_ref_at to struct object_id
refs: convert peel_ref to struct object_id
builtin/pack-objects: convert to struct object_id
pack-bitmap: convert traverse_bitmap_commit_list to object_id
refs: convert dwim_log to struct object_id
builtin/reflog: convert remaining unsigned char uses to object_id
refs: convert dwim_ref and expand_ref to struct object_id
refs: convert read_ref and read_ref_full to object_id
refs: convert resolve_refdup and refs_resolve_refdup to struct object_id
Convert check_connected to use struct object_id
refs: update ref transactions to use struct object_id
...
Calling cmd_foo() as if it is a general purpose helper function is
a no-no. Correct two instances of such to set an example.
* jc/no-cmd-as-subroutine:
merge-ours: do not use cmd_*() as a subroutine
describe: do not use cmd_*() as a subroutine
Convert peel_ref (and its corresponding backend) to struct object_id.
This transformation was done with an update to the declaration,
definition, comments, and test helper and the following semantic patch:
@@
expression E1, E2;
@@
- peel_ref(E1, E2.hash)
+ peel_ref(E1, &E2)
@@
expression E1, E2;
@@
- peel_ref(E1, E2->hash)
+ peel_ref(E1, E2)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The cmd_foo() function is a moral equivalent of 'main' for a Git
subcommand 'git foo', and as such, it is allowed to do many things
that make it unsuitable to be called as a subroutine, including
- call exit(3) to terminate the process;
- allocate resource held and used throughout its lifetime, without
releasing it upon return/exit;
- rely on global variables being initialized at program startup,
and update them as needed, making another clean invocation of the
function impossible.
The call to cmd_diff_index() "git describe" makes has been working
by accident that the function did not call exit(3); it sets a bad
precedent for people to cut and paste.
We could invoke it via the run_command() interface, but the diff
family of commands have helper functions in diff-lib.c that are
meant to be usable as subroutines, and using the latter does not
make the resulting code all that longer. Use it.
Note that there is also an invocation of cmd_name_rev() at the end;
"git describe --contains" massages its command line arguments to be
suitable for "git name-rev" invocation and jumps to it, never to
regain control. This call is left as-is as an exception to the
rule. When we start to allow calling name-rev repeatedly as a
helper function, we would be able to remove this call as well.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `git describe` uses `--match`, it matches only tags, basically
ignoring the `--all` argument even when it is specified.
Fix it by also matching branch name and $remote_name/$remote_branch_name,
for remote-tracking references, with the specified patterns. Update
documentation accordingly and add tests.
Signed-off-by: Max Kirillov <max@max630.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git describe --match` with multiple patterns matches only first pattern.
If it fails, next patterns are not tried.
Fix it, add test cases and update existing test which has wrong
expectation.
Signed-off-by: Max Kirillov <max@max630.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is to address concerns raised by ThreadSanitizer on the mailing list
about threaded unprotected R/W access to map.size with my previous "disallow
rehash" change (0607e10009).
See:
https://public-inbox.org/git/adb37b70139fd1e2bac18bfd22c8b96683ae18eb.1502780344.git.martin.agren@gmail.com/
Add API to hashmap to disable item counting and thus automatic rehashing.
Also include API to later re-enable them.
When item counting is disabled, the map.size field is invalid. So to
prevent accidents, the field has been renamed and an accessor function
hashmap_get_size() has been added. All direct references to this
field have been been updated. And the name of the field changed
to map.private_size to communicate this.
Here is the relevant output from ThreadSanitizer showing the problem:
WARNING: ThreadSanitizer: data race (pid=10554)
Read of size 4 at 0x00000082d488 by thread T2 (mutexes: write M16):
#0 hashmap_add hashmap.c:209
#1 hash_dir_entry_with_parent_and_prefix name-hash.c:302
#2 handle_range_dir name-hash.c:347
#3 handle_range_1 name-hash.c:415
#4 lazy_dir_thread_proc name-hash.c:471
#5 <null> <null>
Previous write of size 4 at 0x00000082d488 by thread T1 (mutexes: write M31):
#0 hashmap_add hashmap.c:209
#1 hash_dir_entry_with_parent_and_prefix name-hash.c:302
#2 handle_range_dir name-hash.c:347
#3 handle_range_1 name-hash.c:415
#4 handle_range_dir name-hash.c:380
#5 handle_range_1 name-hash.c:415
#6 lazy_dir_thread_proc name-hash.c:471
#7 <null> <null>
Martin gives instructions for running TSan on test t3008 in this post:
https://public-inbox.org/git/CAN0heSoJDL9pWELD6ciLTmWf-a=oyxe4EXXOmCKvsG5MSuzxsA@mail.gmail.com/
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many uses of comparision callback function the hashmap API uses
cast the callback function type when registering it to
hashmap_init(), which defeats the compile time type checking when
the callback interface changes (e.g. gaining more parameters).
The callback implementations have been updated to take "void *"
pointers and cast them to the type they expect instead.
* sb/hashmap-cleanup:
t/helper/test-hashmap: use custom data instead of duplicate cmp functions
name-hash.c: drop hashmap_cmp_fn cast
submodule-config.c: drop hashmap_cmp_fn cast
remote.c: drop hashmap_cmp_fn cast
patch-ids.c: drop hashmap_cmp_fn cast
convert/sub-process: drop cast to hashmap_cmp_fn
config.c: drop hashmap_cmp_fn cast
builtin/describe: drop hashmap_cmp_fn cast
builtin/difftool.c: drop hashmap_cmp_fn cast
attr.c: drop hashmap_cmp_fn cast
Update the hashmap API so that data to customize the behaviour of
the comparison function can be specified at the time a hashmap is
initialized.
* sb/hashmap-customize-comparison:
hashmap: migrate documentation from Documentation/technical into header
patch-ids.c: use hashmap correctly
hashmap.h: compare function has access to a data field
When using the hashmap a common need is to have access to caller provided
data in the compare function. A couple of times we abuse the keydata field
to pass in the data needed. This happens for example in patch-ids.c.
This patch changes the function signature of the compare function
to have one more void pointer available. The pointer given for each
invocation of the compare function must be defined in the init function
of the hashmap and is just passed through.
Documentation of this new feature is deferred to a later patch.
This is a rather mechanical conversion, just adding the new pass-through
parameter. However while at it improve the naming of the fields of all
compare functions used by hashmaps by ensuring unused parameters are
prefixed with 'unused_' and naming the parameters what they are (instead
of 'unused' make it 'unused_keydata').
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.
* bw/config-h:
config: don't implicitly use gitdir or commondir
config: respect commondir
setup: teach discover_git_directory to respect the commondir
config: don't include config.h by default
config: remove git_config_iter
config: create config.h
Remove the unused wildopts placeholder struct from being passed to all
wildmatch() invocations, or rather remove all the boilerplate NULL
parameters.
This parameter was added back in commit 9b3497cab9 ("wildmatch: rename
constants and update prototype", 2013-01-01) as a placeholder for
future use. Over 4 years later nothing has made use of it, let's just
remove it. It can be added in the future if we find some reason to
start using such a parameter.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop including config.h by default in cache.h. Instead only include
config.h in those files which require use of the config system.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert lookup_tag to take a pointer to struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert lookup_commit, lookup_commit_or_die,
lookup_commit_reference, and lookup_commit_reference_gently to take
struct object_id arguments.
Introduce a temporary in parse_object buffer in order to convert this
function. This is required since in order to convert parse_object and
parse_object_buffer, lookup_commit_reference_gently and
lookup_commit_or_die would need to be converted. Not introducing a
temporary would therefore require that lookup_commit_or_die take a
struct object_id *, but lookup_commit would take unsigned char *,
leaving a confusing and hard-to-use interface.
parse_object_buffer will lose this temporary in a later patch.
This commit was created with manual changes to commit.c, commit.h, and
object.c, plus the following semantic patch:
@@
expression E1, E2;
@@
- lookup_commit_reference_gently(E1.hash, E2)
+ lookup_commit_reference_gently(&E1, E2)
@@
expression E1, E2;
@@
- lookup_commit_reference_gently(E1->hash, E2)
+ lookup_commit_reference_gently(E1, E2)
@@
expression E1;
@@
- lookup_commit_reference(E1.hash)
+ lookup_commit_reference(&E1)
@@
expression E1;
@@
- lookup_commit_reference(E1->hash)
+ lookup_commit_reference(E1)
@@
expression E1;
@@
- lookup_commit(E1.hash)
+ lookup_commit(&E1)
@@
expression E1;
@@
- lookup_commit(E1->hash)
+ lookup_commit(E1)
@@
expression E1, E2;
@@
- lookup_commit_or_die(E1.hash, E2)
+ lookup_commit_or_die(&E1, E2)
@@
expression E1, E2;
@@
- lookup_commit_or_die(E1->hash, E2)
+ lookup_commit_or_die(E1, E2)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git describe --debug localizes all debug messages but not the terms
head, lightweight, annotated that it outputs for the candidates.
Localize them, too.
Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-describe tells you the version number you're at, or errors out, e.g.
when you run it outside of a repository, which may happen when downloading
a tar ball instead of using git to obtain the source code.
To keep this property of only erroring out, when not in a repository,
severe (submodule) errors must be downgraded to reporting them gently
instead of having git-describe error out completely.
To achieve that a flag '--broken' is introduced, which is in the same
vein as '--dirty' but uses an actual child process to check for dirtiness.
When that child dies unexpectedly, we'll append '-broken' instead of
'-dirty'.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the functions in this file and struct commit_name to struct
object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach git-describe the `--exclude` option which will allow specifying
a glob pattern of tags to ignore. This can be combined with the
`--match` patterns to enable more flexibility in determining which tags
to consider.
For example, suppose you wish to find the first official release tag
that contains a certain commit. If we assume that official release tags
are of the form "v*" and pre-release candidates include "*rc*" in their
name, we can now find the first release tag that introduces the commit
abcdef:
git describe --contains --match="v*" --exclude="*rc*" abcdef
Add documentation, tests, and completion for this change.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach `--match` to be accepted multiple times, accumulating a list of
patterns to match into a string list. Each pattern is inclusive, such
that a tag need only match one of the provided patterns to be
considered for matching.
This extension is useful as it enables more flexibility in what tags
match, and may avoid the need to run the describe command multiple
times to get the same result.
Add tests and update the documentation for this change.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply the semantic patch contrib/coccinelle/qsort.cocci to the code
base, replacing calls of qsort(3) with QSORT. The resulting code is
shorter and supports empty arrays with NULL pointers.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert all instances of get_object_hash to use an appropriate reference
to the hash member of the oid member of struct object. This provides no
functional change, as it is essentially a macro substitution.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
struct object is one of the major data structures dealing with object
IDs. Convert it to use struct object_id instead of an unsigned char
array. Convert get_object_hash to refer to the new member as well.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
Convert most instances where the sha1 member of struct object is
dereferenced to use get_object_hash. Most instances that are passed to
functions that have versions taking struct object_id, such as
get_sha1_hex/get_oid_hex, or instances that can be trivially converted
to use struct object_id instead, are not converted.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
"git describe" without argument defaulted to describe the HEAD
commit, but "git describe --contains" didn't. Arguably, in a
repository used for active development, such defaulting would not
be very useful as the tip of branch is typically not tagged, but it
is better to be consistent.
* sg/describe-contains:
describe --contains: default to HEAD when no commit-ish is given
'git describe --contains' doesn't default to HEAD when no commit is
given, and it doesn't produce any output, not even an error:
~/src/git ((v2.5.0))$ ./git describe --contains
~/src/git ((v2.5.0))$ ./git describe --contains HEAD
v2.5.0^0
Unlike other 'git describe' options, the '--contains' code path is
implemented by calling 'name-rev' with a bunch of options plus all the
commit-ishes that were passed to 'git describe'. If no commit-ish was
present, then 'name-rev' got invoked with none, which then leads to the
behavior illustrated above.
Porcelain commands usually default to HEAD when no commit-ish is given,
and 'git describe' already does so in all other cases, so it should do
so with '--contains' as well.
Pass HEAD to 'name-rev' when no commit-ish is given on the command line
to make '--contains' behave consistently with other 'git describe'
options. While at it, use argv_array_pushv() instead of the loop to
pass commit-ishes to 'git name-rev'.
'git describe's short help already indicates that the commit-ish is
optional, but the synopsis in the man page doesn't, so update it
accordingly as well.
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rewrite to take an object_id argument and convert the local variable
"peeled" object_id.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change typedef each_ref_fn to take a "const struct object_id *oid"
parameter instead of "const unsigned char *sha1".
To aid this transition, implement an adapter that can be used to wrap
old-style functions matching the old typedef, which is now called
"each_ref_sha1_fn"), and make such functions callable via the new
interface. This requires the old function and its cb_data to be
wrapped in a "struct each_ref_fn_sha1_adapter", and that object to be
used as the cb_data for an adapter function, each_ref_fn_adapter().
This is an enormous diff, but most of it consists of simple,
mechanical changes to the sites that call any of the "for_each_ref"
family of functions. Subsequent to this change, the call sites can be
rewritten one by one to use the new interface.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch puts the usage info strings that were not already in docopt-
like format into docopt-like format, which will be a litle easier for
end users and a lot easier for translators. Changes include:
- Placing angle brackets around fill-in-the-blank parameters
- Putting dashes in multiword parameter names
- Adding spaces to [-f|--foobar] to make [-f | --foobar]
- Replacing <foobar>* with [<foobar>...]
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the interface declaration for the functions in lockfile.c from
cache.h to a new file, lockfile.h. Add #includes where necessary (and
remove some redundant includes of cache.h by files that already
include builtin.h).
Move the documentation of the lock_file state diagram from lockfile.c
to the new header file.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Hashmap entries are typically looked up by just a key. The hashmap_get()
API expects an initialized entry structure instead, to support compound
keys. This flexibility is currently only needed by find_dir_entry() in
name-hash.c (and compat/win32/fscache.c in the msysgit fork). All other
(currently five) call sites of hashmap_get() have to set up a near emtpy
entry structure, resulting in duplicate code like this:
struct hashmap_entry keyentry;
hashmap_entry_init(&keyentry, hash(key));
return hashmap_get(map, &keyentry, key);
Add a hashmap_get_from_hash() API that allows hashmap lookups by just
specifying the key and its hash code, i.e.:
return hashmap_get_from_hash(map, hash(key), key);
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Copying the first bytes of a SHA1 is duplicated in six places,
however, the implications (the actual value would depend on the
endianness of the platform) is documented only once.
Add a properly documented API for this.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We started using wildmatch() in place of fnmatch(3); complete the
process and stop using fnmatch(3).
* nd/no-more-fnmatch:
actually remove compat fnmatch source code
stop using fnmatch (either native or compat)
Revert "test-wildmatch: add "perf" command to compare wildmatch and fnmatch"
use wildmatch() directly without fnmatch() wrapper
Improvements to our hash table to get it to meet the needs of the
msysgit fscache project, with some nice performance improvements.
* kb/fast-hashmap:
name-hash: retire unused index_name_exists()
hashmap.h: use 'unsigned int' for hash-codes everywhere
test-hashmap.c: drop unnecessary #includes
.gitignore: test-hashmap is a generated file
read-cache.c: fix memory leaks caused by removed cache entries
builtin/update-index.c: cleanup update_one
fix 'git update-index --verbose --again' output
remove old hash.[ch] implementation
name-hash.c: remove cache entries instead of marking them CE_UNHASHED
name-hash.c: use new hash map implementation for cache entries
name-hash.c: remove unreferenced directory entries
name-hash.c: use new hash map implementation for directories
diffcore-rename.c: use new hash map implementation
diffcore-rename.c: simplify finding exact renames
diffcore-rename.c: move code around to prepare for the next patch
buitin/describe.c: use new hash map implementation
add a hashtable implementation that supports O(1) removal
submodule: don't access the .gitmodules cache entry after removing it
Make it clear that we don't use fnmatch() anymore.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We liberally use "committish" and "commit-ish" (and "treeish" and
"tree-ish"); as these are non-words, let's unify these terms to
their dashed form. More importantly, clarify the documentation on
object peeling using these terms.
* rh/ishes-doc:
glossary: fix and clarify the definition of 'ref'
revisions.txt: fix and clarify <rev>^{<type>}
glossary: more precise definition of tree-ish (a.k.a. treeish)
use 'commit-ish' instead of 'committish'
use 'tree-ish' instead of 'treeish'
glossary: define commit-ish (a.k.a. committish)
glossary: mention 'treeish' as an alternative to 'tree-ish'
Replace 'committish' in documentation and comments with 'commit-ish'
to match gitglossary(7) and to be consistent with 'tree-ish'.
The only remaining instances of 'committish' are:
* variable, function, and macro names
* "(also committish)" in the definition of commit-ish in
gitglossary[7]
Signed-off-by: Richard Hansen <rhansen@bbn.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This task emerged from b04ba2bb (parse-options: deprecate OPT_BOOLEAN,
2011-09-27). All occurrences of the respective variables have
been reviewed and none of them relied on the counting up mechanism,
but all of them were using the variable as a true boolean.
This patch does not change semantics of any command intentionally.
Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git describe" takes a commit and gives it a name based on tags in
its neighbourhood. The command does take a commit-ish but when
given a tag that points at a commit, it should dereference the tag
before computing the name for the commit.
As the whole processing is internally delegated to name-rev, if we
unwrap tags down to the underlying commit when invoking name-rev, it
will make the name-rev issue an error message based on the unwrapped
object name (i.e. either 40-hex object name, or "$tag^0") that is
different from what the end-user gave to the command when the commit
cannot be described. Introduce an internal option --peel-tag to the
name-rev to tell it to unwrap a tag in its input from the command
line.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using a hand allocated args[] array, use argv-array API
to manage the dynamically created list of arguments when invoking
name-rev.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Define memory ownership and lifetime rules for what for-each-ref
feeds to its callbacks (in short, "you do not own it, so make a
copy if you want to keep it").
* mh/reflife: (25 commits)
refs: document the lifetime of the args passed to each_ref_fn
register_ref(): make a copy of the bad reference SHA-1
exclude_existing(): set existing_refs.strdup_strings
string_list_add_refs_by_glob(): add a comment about memory management
string_list_add_one_ref(): rename first parameter to "refname"
show_head_ref(): rename first parameter to "refname"
show_head_ref(): do not shadow name of argument
add_existing(): do not retain a reference to sha1
do_fetch(): clean up existing_refs before exiting
do_fetch(): reduce scope of peer_item
object_array_entry: fix memory handling of the name field
find_first_merges(): remove unnecessary code
find_first_merges(): initialize merges variable using initializer
fsck: don't put a void*-shaped peg in a char*-shaped hole
object_array_remove_duplicates(): rewrite to reduce copying
revision: use object_array_filter() in implementation of gc_boundary()
object_array: add function object_array_filter()
revision: split some overly-long lines
cmd_diff(): make it obvious which cases are exclusive of each other
cmd_diff(): rename local variable "list" -> "entry"
...
Do not retain a reference to the refname passed to the each_ref_fn
callback get_name(), because there is no guarantee of the lifetimes of
these names. Instead, make a local copy when needed.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Only consider the first parent commit when walking the commit history. This
is useful if you only wish to match tags on your branch after a merge.
Signed-off-by: Mike Crowe <mac@mcrowe.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "--match=<pattern>" option of "git describe", when used with
"--all" to allow refs that are not annotated tags to be used as a
base of description, did not restrict the output from the command
to those that match the given pattern.
We may want to have a looser matching that does not restrict to tags,
but that can be done as a follow-up topic; this step is purely a bugfix.
* jc/describe:
describe: --match=<pattern> must limit the refs even when used with --all
The logic to limit the refs used for describing with a matching pattern
with --match=<pattern> parameter was implemented incorrectly when --all
is in effect. It just demoted a ref that did not match the pattern to
lower priority---if there aren't other refs with higher priority
that describe the given commit, such an unmatching ref was still used.
When --match is used, reject refs that do not match the given
criteria, so that with or without --all, the output will only use
refs that match the pattern.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A couple of references still survive to .git/refs as a tree
of all refs. Fix one in docs, one in a -h message, one in
a -h message quoted in docs.
Signed-off-by: Greg Price <price@mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Speeds up "git upload-pack" (what is invoked by "git fetch" on the
other side of the connection) by reducing the cost to advertise the
branches and tags that are available in the repository.
* jk/peel-ref:
upload-pack: use peel_ref for ref advertisements
peel_ref: check object type before loading
peel_ref: do not return a null sha1
peel_ref: use faster deref_tag_noverify
The idea of the peel_ref function is to dereference tag
objects recursively until we hit a non-tag, and return the
sha1. Conceptually, it should return 0 if it is successful
(and fill in the sha1), or -1 if there was nothing to peel.
However, the current behavior is much more confusing. For a
regular loose ref, the behavior is as described above. But
there is an optimization to reuse the peeled-ref value for a
ref that came from a packed-refs file. If we have such a
ref, we return its peeled value, even if that peeled value
is null (indicating that we know the ref definitely does
_not_ peel).
It might seem like such information is useful to the caller,
who would then know not to bother loading and trying to peel
the object. Except that they should not bother loading and
trying to peel the object _anyway_, because that fallback is
already handled by peel_ref. In other words, the whole point
of calling this function is that it handles those details
internally, and you either get a sha1, or you know that it
is not peel-able.
This patch catches the null sha1 case internally and
converts it into a -1 return value (i.e., there is nothing
to peel). This simplifies callers, which do not need to
bother checking themselves.
Two callers are worth noting:
- in pack-objects, a comment indicates that there is a
difference between non-peelable tags and unannotated
tags. But that is not the case (before or after this
patch). Whether you get a null sha1 has to do with
internal details of how peel_ref operated.
- in show-ref, if peel_ref returns a failure, the caller
tries to decide whether to try peeling manually based on
whether the REF_ISPACKED flag is set. But this doesn't
make any sense. If the flag is set, that does not
necessarily mean the ref came from a packed-refs file
with the "peeled" extension. But it doesn't matter,
because even if it didn't, there's no point in trying to
peel it ourselves, as peel_ref would already have done
so. In other words, the fallback peeling is guaranteed
to fail.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running git describe --dirty the index should be refreshed. Previously
the cached index would cause describe to think that the index was dirty when,
in reality, it was just stale.
The issue was exposed by python setuptools which hardlinks files into another
directory when building a distribution.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The default of 7 comes from fairly early in git development, when
seven hex digits was a lot (it covers about 250+ million hash
values). Back then I thought that 65k revisions was a lot (it was what
we were about to hit in BK), and each revision tends to be about 5-10
new objects or so, so a million objects was a big number.
These days, the kernel isn't even the largest git project, and even
the kernel has about 220k revisions (_much_ bigger than the BK tree
ever was) and we are approaching two million objects. At that point,
seven hex digits is still unique for a lot of them, but when we're
talking about just two orders of magnitude difference between number
of objects and the hash size, there _will_ be collisions in truncated
hash values. It's no longer even close to unrealistic - it happens all
the time.
We should both increase the default abbrev that was unrealistically
small, _and_ add a way for people to set their own default per-project
in the git config file.
This is the first step to first make it configurable; the default of 7
is not raised yet.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For the find_exact_renames() function, this allows us to pass the
diff_options structure pointer to the low-level routines. We will use
that to distinguish between the "rename" and "copy" cases.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that struct commit.util is not used until after we've checked that
the argument doesn't exactly match a tag, we can wait until then to
look up the commits for each tag.
This avoids a lot of I/O on --exact-match queries in repositories with
many tags. For example, 'git describe --exact-match HEAD' becomes
about 12 times faster on a cold cache (3.2s instead of 39s) in a
linux-2.6 repository with 2000 packed tags. That is a huge win for the
interactivity of the __git_ps1 shell prompt helper when on a detached
HEAD.
Signed-off-by: Anders Kaseorg <andersk@ksplice.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
describe is currently forced to look up the commit at each tag in
order to store the struct commit_name pointers in struct commit.util.
For --exact-match queries, those lookups are wasteful. In preparation
for removing them, put the commit_names into a hash table, indexed by
commit SHA1, that can be used to quickly check for exact matches.
Signed-off-by: Anders Kaseorg <andersk@ksplice.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now add_to_known_names overwrites commit_names in place when multiple
tags point to the same commit. This will make it easier to store
commit_names in a hash table.
Signed-off-by: Anders Kaseorg <andersk@ksplice.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Don't waste time checking for dangling refs; they wouldn't affect the
output of 'git describe' anyway. Although this does not gain much
performance by itself, it does in conjunction with the next commits.
Signed-off-by: Anders Kaseorg <andersk@ksplice.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add commit_list prefix to insert_by_date function and to sort_by_date,
so it's clear that these functions refer to commit_list structure.
Signed-off-by: Thiago Farina <tfransosi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* sp/maint-describe-tiebreak-with-tagger-date:
describe: Break annotated tag ties by tagger date
tag.c: Parse tagger date (if present)
tag.c: Refactor parse_tag_buffer to be saner to program
tag.h: Remove unused signature field
tag.c: Correct indentation
This shrinks the top-level directory a bit, and makes it much more
pleasant to use auto-completion on the thing. Instead of
[torvalds@nehalem git]$ em buil<tab>
Display all 180 possibilities? (y or n)
[torvalds@nehalem git]$ em builtin-sh
builtin-shortlog.c builtin-show-branch.c builtin-show-ref.c
builtin-shortlog.o builtin-show-branch.o builtin-show-ref.o
[torvalds@nehalem git]$ em builtin-shor<tab>
builtin-shortlog.c builtin-shortlog.o
[torvalds@nehalem git]$ em builtin-shortlog.c
you get
[torvalds@nehalem git]$ em buil<tab> [type]
builtin/ builtin.h
[torvalds@nehalem git]$ em builtin [auto-completes to]
[torvalds@nehalem git]$ em builtin/sh<tab> [type]
shortlog.c shortlog.o show-branch.c show-branch.o show-ref.c show-ref.o
[torvalds@nehalem git]$ em builtin/sho [auto-completes to]
[torvalds@nehalem git]$ em builtin/shor<tab> [type]
shortlog.c shortlog.o
[torvalds@nehalem git]$ em builtin/shortlog.c
which doesn't seem all that different, but not having that annoying
break in "Display all 180 possibilities?" is quite a relief.
NOTE! If you do this in a clean tree (no object files etc), or using an
editor that has auto-completion rules that ignores '*.o' files, you
won't see that annoying 'Display all 180 possibilities?' message - it
will just show the choices instead. I think bash has some cut-off
around 100 choices or something.
So the reason I see this is that I'm using an odd editory, and thus
don't have the rules to cut down on auto-completion. But you can
simulate that by using 'ls' instead, or something similar.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>