A handful of code paths had to read the commit object more than
once when showing header fields that are usually not parsed. The
internal data structure to keep track of the contents of the commit
object has been updated to reduce the need for this double-reading,
and to allow the caller find the length of the object.
* jk/commit-buffer-length:
reuse cached commit buffer when parsing signatures
commit: record buffer length in cache
commit: convert commit->buffer to a slab
commit-slab: provide a static initializer
use get_commit_buffer everywhere
convert logmsg_reencode to get_commit_buffer
use get_commit_buffer to avoid duplicate code
use get_cached_commit_buffer where appropriate
provide helpers to access the commit buffer
provide a helper to set the commit buffer
provide a helper to free commit buffer
sequencer: use logmsg_reencode in get_message
logmsg_reencode: return const buffer
do not create "struct commit" with xcalloc
commit: push commit_index update into alloc_commit_node
alloc: include any-object allocations in alloc_report
replace dangerous uses of strbuf_attach
commit_tree: take a pointer/len pair rather than a const strbuf
Most callsites which use the commit buffer try to use the
cached version attached to the commit, rather than
re-reading from disk. Unfortunately, that interface provides
only a pointer to the NUL-terminated buffer, with no
indication of the original length.
For the most part, this doesn't matter. People do not put
NULs in their commit messages, and the log code is happy to
treat it all as a NUL-terminated string. However, some code
paths do care. For example, when checking signatures, we
want to be very careful that we verify all the bytes to
avoid malicious trickery.
This patch just adds an optional "size" out-pointer to
get_commit_buffer and friends. The existing callers all pass
NULL (there did not seem to be any obvious sites where we
could avoid an immediate strlen() call, though perhaps with
some further refactoring we could).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each of these sites assumes that commit->buffer is valid.
Since they would segfault if this was not the case, they are
likely to be correct in practice. However, we can
future-proof them by using get_commit_buffer.
And as a side effect, we abstract away the final bare uses
of commit->buffer.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In both blame and merge-recursive, we sometimes create a
"fake" commit struct for convenience (e.g., to represent the
HEAD state as if we would commit it). By allocating
ourselves rather than using alloc_commit_node, we do not
properly set the "index" field of the commit. This can
produce subtle bugs if we then use commit-slab on the
resulting commit, as we will share the "0" index with
another commit.
We can fix this by using alloc_commit_node() to allocate.
Note that we cannot free the result, as it is part of our
commit allocator. However, both cases were already leaking
the allocated commit anyway, so there's nothing to fix up.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On a case-insensitive filesystem, when merging, a file would be
wrongly deleted from the working tree if an incoming commit had
renamed it changing only its case. When merging a rename, the file
with the old name would be deleted -- but since the filesystem
considers the old name to be the same as the new name, the new
file would in fact be deleted.
We avoid this by not deleting files that have a case-clone in the
index at stage 0.
Signed-off-by: David Turner <dturner@twitter.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"merge-recursive" was broken in 1.7.7 era and stopped working in an
empty (temporary) working tree, when there are renames involved.
This has been corrected.
* bk/refresh-missing-ok-in-merge-recursive:
merge-recursive.c: tolerate missing files while refreshing index
read-cache.c: extend make_cache_entry refresh flag with options
read-cache.c: refactor --ignore-missing implementation
t3030-merge-recursive: test known breakage with empty work tree
Allow "merge-recursive" to work in an empty (temporary) working
tree again when there are renames involved, correcting an old
regression in 1.7.7 era.
* bk/refresh-missing-ok-in-merge-recursive:
merge-recursive.c: tolerate missing files while refreshing index
read-cache.c: extend make_cache_entry refresh flag with options
read-cache.c: refactor --ignore-missing implementation
t3030-merge-recursive: test known breakage with empty work tree
Teach add_cacheinfo to tell make_cache_entry to skip refreshing stat
information when a file is missing from the work tree. We do not want
the index to be stat-dirty after the merge but also do not want to fail
when a file happens to be missing.
This fixes the 'merge-recursive w/ empty work tree - ours has rename'
case in t3030-merge-recursive.
Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Brad King <brad.king@kitware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the make_cache_entry boolean 'refresh' argument to a more
general 'refresh_options' argument. Pass the value through to the
underlying refresh_cache_ent call. Add option CE_MATCH_REFRESH to
enable stat refresh. Update call sites to use the new signature.
Signed-off-by: Brad King <brad.king@kitware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up and protection against concurrent write access to the
ref namespace.
* mh/safe-create-leading-directories:
rename_tmp_log(): on SCLD_VANISHED, retry
rename_tmp_log(): limit the number of remote_empty_directories() attempts
rename_tmp_log(): handle a possible mkdir/rmdir race
rename_ref(): extract function rename_tmp_log()
remove_dir_recurse(): handle disappearing files and directories
remove_dir_recurse(): tighten condition for removing unreadable dir
lock_ref_sha1_basic(): if locking fails with ENOENT, retry
lock_ref_sha1_basic(): on SCLD_VANISHED, retry
safe_create_leading_directories(): add new error value SCLD_VANISHED
cmd_init_db(): when creating directories, handle errors conservatively
safe_create_leading_directories(): introduce enum for return values
safe_create_leading_directories(): always restore slash at end of loop
safe_create_leading_directories(): split on first of multiple slashes
safe_create_leading_directories(): rename local variable
safe_create_leading_directories(): add explicit "slash" pointer
safe_create_leading_directories(): reduce scope of local variable
safe_create_leading_directories(): fix format of "if" chaining
Instead of returning magic integer values (which a couple of callers
go to the trouble of distinguishing), return values from an enum. Add
a docstring.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git merge-recursive" did not parse its "--diff-algorithm=" command
line option correctly.
* jk/diff-algo:
merge-recursive: fix parsing of "diff-algorithm" option
The "diff-algorithm" option to the recursive merge strategy takes the
name of the algorithm as an option, but it uses strcmp on the option
string to check if it starts with "diff-algorithm=", meaning that this
options cannot actually be used.
Fix this by switching to prefixcmp. At the same time, clarify the
following line by using strlen instead of a hard-coded length, which
also makes it consistent with nearby code.
Reported-by: Luke Noel-Storr <luke.noel-storr@integrate.co.uk>
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
"git mv A B" when moving a submodule A does "the right thing",
inclusing relocating its working tree and adjusting the paths in
the .gitmodules file.
* jl/submodule-mv: (53 commits)
rm: delete .gitmodules entry of submodules removed from the work tree
mv: update the path entry in .gitmodules for moved submodules
submodule.c: add .gitmodules staging helper functions
mv: move submodules using a gitfile
mv: move submodules together with their work trees
rm: do not set a variable twice without intermediate reading.
t6131 - skip tests if on case-insensitive file system
parse_pathspec: accept :(icase)path syntax
pathspec: support :(glob) syntax
pathspec: make --literal-pathspecs disable pathspec magic
pathspec: support :(literal) syntax for noglob pathspec
kill limit_pathspec_to_literal() as it's only used by parse_pathspec()
parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN
parse_pathspec: make sure the prefix part is wildcard-free
rename field "raw" to "_raw" in struct pathspec
tree-diff: remove the use of pathspec's raw[] in follow-rename codepath
remove match_pathspec() in favor of match_pathspec_depth()
remove init_pathspec() in favor of parse_pathspec()
remove diff_tree_{setup,release}_paths
convert common_prefix() to use struct pathspec
...
While at there, move free_pathspec() to pathspec.c
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I attempted to make index_state->cache[] a "const struct cache_entry **"
to find out how existing entries in index are modified and where. The
question I have is what do we do if we really need to keep track of on-disk
changes in the index. The result is
- diff-lib.c: setting CE_UPTODATE
- name-hash.c: setting CE_HASHED
- preload-index.c, read-cache.c, unpack-trees.c and
builtin/update-index: obvious
- entry.c: write_entry() may refresh the checked out entry via
fill_stat_cache_info(). This causes "non-const struct cache_entry
*" in builtin/apply.c, builtin/checkout-index.c and
builtin/checkout.c
- builtin/ls-files.c: --with-tree changes stagemask and may set
CE_UPDATE
Of these, write_entry() and its call sites are probably most
interesting because it modifies on-disk info. But this is stat info
and can be retrieved via refresh, at least for porcelain
commands. Other just uses ce_flags for local purposes.
So, keeping track of "dirty" entries is just a matter of setting a
flag in index modification functions exposed by read-cache.c. Except
unpack-trees, the rest of the code base does not do anything funny
behind read-cache's back.
The actual patch is less valueable than the summary above. But if
anyone wants to re-identify the above sites. Applying this patch, then
this:
diff --git a/cache.h b/cache.h
index 430d021..1692891 100644
--- a/cache.h
+++ b/cache.h
@@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode)
#define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)
struct index_state {
- struct cache_entry **cache;
+ const struct cache_entry **cache;
unsigned int version;
unsigned int cache_nr, cache_alloc, cache_changed;
struct string_list *resolve_undo;
will help quickly identify them without bogus warnings.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since command line options have higher priority than config file
variables and taking previous commit into account, we need a way
how to specify myers algorithm on command line. However,
inventing `--myers` is not the right answer. We need far more
general option, and that is `--diff-algorithm`.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two different static functions and one global function,
all of them called "merge_file()", with different signatures and
purposes. Rename them all to reduce confusion in "git grep" output:
* Rename the static one in merge-index to "merge_one_path(const char
*path)" as that function is about asking an external command to
resolve conflicts in one path.
* Rename the global one in merge-file.c that is only used by
merge-tree to "merge_blobs()", as the function takes three blobs and
returns the merged result only in-core, without doing anything to
the filesystem.
* Rename the one in merge-recursive to "merge_one_file()", just to be
fair.
Also rename merge-file.[ch] to merge-blobs.[ch].
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* rj/path-cleanup:
Call mkpathdup() rather than xstrdup(mkpath(...))
Call git_pathdup() rather than xstrdup(git_path("..."))
path.c: Use vsnpath() in the implementation of git_path()
path.c: Don't discard the return value of vsnpath()
path.c: Remove the 'git_' prefix from a file scope function
In addition to updating the xstrdup(mkpath(...)) call sites with
mkpathdup(), we also fix a memory leak (in merge_3way()) caused by
neglecting to free the memory allocated to the 'base_name' variable.
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function "merge_recursive" prints the count of common ancestors
as "found %u common ancestor(s):". We should use a singular and a
plural form of this message to help translators.
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
flush_buffer() is a thin wrapper around write_in_full() with two very
confusing properties:
* It runs a loop to handle short reads, ensuring that we write
everything. But that is precisely what write_in_full() does!
* It checks for a return value of 0 from write_in_full(), which cannot
happen: it returns this value only if count=0, but flush_buffer()
will never call write_in_full() in this case.
Remove it.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff_setup_done() has historically returned an error code, but lost
the last nonzero return in 943d5b7 (allow diff.renamelimit to be set
regardless of -M/-C, 2006-08-09). The callers were in a pretty
confused state: some actually checked for the return code, and some
did not.
Let it return void, and patch all callers to take this into account.
This conveniently also gets rid of a handful of different(!) error
messages that could never be triggered anyway.
Note that the function can still die().
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark strings in merge-recursive for translation.
Some tests would start to fail with GETTEXT_POISON turned on after
this update. Use test_i18ncmp and test_i18ngrep where appropriate
to mark strings that should only be checked in the C locale output
to avoid such issues.
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Reviewed-by: Stefano Lattarini <stefano.lattarini@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Forbids rename detection logic from matching two empty files as renames
during merge-recursive to prevent mismerges.
By Jeff King
* jk/diff-no-rename-empty:
merge-recursive: don't detect renames of empty files
teach diffcore-rename to optionally ignore empty content
make is_empty_blob_sha1 available everywhere
drop casts from users EMPTY_TREE_SHA1_BIN
Resurrects the preparatory clean-up patches from another topic that was
discarded, as this would give a saner foundation to build on diff.algo
configuration option series.
* jc/diff-algo-cleanup:
xdiff: PATIENCE/HISTOGRAM are not independent option bits
xdiff: remove XDL_PATCH_* macros
Merge-recursive detects renames so that if one side modifies
"foo" and the other side moves it to "bar", the modification
is applied to "bar". However, our rename detection is based
on content analysis, it can be wrong (i.e., two files were
not intended as a rename, but just happen to have the same
or similar content).
This is quite rare if the files actually contain content,
since two unrelated files are unlikely to have exactly the
same content. However, empty files present a problem, in
that there is nothing to analyze. An uninteresting
placeholder file with zero bytes may or may not be related
to a placeholder file with another name.
The result is that adding content to an empty file may cause
confusion if the other side of a merge removed it; your
content may end up in another random placeholder file that
was added.
Let's err on the side of caution and not consider empty
files as renames. This will cause a modify/delete conflict
on the merge, which will let the user sort it out
themselves.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This macro already evaluates to the correct type, as it
casts the string literal to "unsigned char *" itself
(and callers who want the literal can use the _LITERAL
form).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because the default Myers, patience and histogram algorithms cannot be in
effect at the same time, XDL_PATIENCE_DIFF and XDL_HISTOGRAM_DIFF are not
independent bits. Instead of wasting one bit per algorithm, define a few
macros to access the few bits they occupy and update the code that access
them.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tr/cache-tree:
reset: update cache-tree data when appropriate
commit: write cache-tree data when writing index anyway
Refactor cache_tree_update idiom from commit
Test the current state of the cache-tree optimization
Add test-scrap-cache-tree
We'll need to safely create or update the cache-tree data of the_index
from other places. While at it, give it an argument that lets us
silence the messages produced by unmerged entries (which prevent it
from working).
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The merge-recursive code uses the commit->util field directly to annotate
the commit objects given from the command line, i.e. the remote heads to
be merged, with a single string to be used to describe it in its trace
messages and conflict markers.
Correct this short-signtedness by redefining the field to be a pointer to
a structure "struct merge_remote_desc" that later enhancements can add
more information. Store the original objects we were told to merge in a
field "obj" in this struct, so that we can recover the tag we were told to
merge.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The submodule merge search is not useful during virtual merges because
the results cannot be used automatically. Furthermore any suggestions
made by the search may apply to commits different than HEAD:sub and
MERGE_HEAD:sub, thus confusing the user. Skip searching for submodule
merges during a virtual merge such as that between B and C while merging
the heads of:
B---BC
/ \ /
A X
\ / \
C---CB
Run the search only when the recursion level is zero (!o->call_depth).
This fixes known breakage tested in t7405-submodule-merge.
Signed-off-by: Brad King <brad.king@kitware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix another instance of a recursive merge incorrectly paying attention to
the working tree file during a virtual ancestor merge, that resulted in
spurious and useless "addinfo_cache failed" error message.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git term is 'working tree', so replace the most public references
to 'working copy'.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* en/merge-recursive-2: (57 commits)
merge-recursive: Don't re-sort a list whose order we depend upon
merge-recursive: Fix virtual merge base for rename/rename(1to2)/add-dest
t6036: criss-cross + rename/rename(1to2)/add-dest + simple modify
merge-recursive: Avoid unnecessary file rewrites
t6022: Additional tests checking for unnecessary updates of files
merge-recursive: Fix spurious 'refusing to lose untracked file...' messages
t6022: Add testcase for spurious "refusing to lose untracked" messages
t3030: fix accidental success in symlink rename
merge-recursive: Fix working copy handling for rename/rename/add/add
merge-recursive: add handling for rename/rename/add-dest/add-dest
merge-recursive: Have conflict_rename_delete reuse modify/delete code
merge-recursive: Make modify/delete handling code reusable
merge-recursive: Consider modifications in rename/rename(2to1) conflicts
merge-recursive: Create function for merging with branchname:file markers
merge-recursive: Record more data needed for merging with dual renames
merge-recursive: Defer rename/rename(2to1) handling until process_entry
merge-recursive: Small cleanups for conflict_rename_rename_1to2
merge-recursive: Fix rename/rename(1to2) resolution for virtual merge base
merge-recursive: Introduce a merge_file convenience function
merge-recursive: Fix modify/delete resolution in the recursive case
...
When this code was first written (v1.4.3-rc1~174^2~4, merge-recur: if
there is no common ancestor, fake empty one, 2006-08-09), everyone
needing a fake empty tree had to make her own, but ever since
v1.5.5-rc0~180^2~1 (2008-02-13), the object lookup machinery provides
a ready-made one. Use it.
This is just a simplification, though it also fixes a small leak
(since the tree in the virtual common ancestor commit is never freed).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In record_df_conflict_files() we would resort the entries list using
df_name_compare to get a convenient ordering. Unfortunately, this broke
assumptions of the get_renames() code (via string_list_lookup() calls)
which needed the list to be in the standard ordering. When those lookups
would fail, duplicate stage_data entries could be inserted, causing the
process_renames and process_entry code to fail (in particular, a path that
that process_renames had marked as processed would still be processed
anyway in process_entry due to the duplicate entry).
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Earlier in this series, the patch "merge-recursive: add handling for
rename/rename/add-dest/add-dest" added code to handle the rename on each
side of history also being involved in a rename/add conflict, but only
did so in the non-recursive case. Add code for the recursive case,
ensuring that the "added" files are not simply deleted.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Often times, a potential conflict at a path is resolved by merge-recursive
by using the content that was already present at that location. In such
cases, we do not want to overwrite the content that is already present, as
that could trigger unnecessary recompilations. One of the patches earlier
in this series ("merge-recursive: When we detect we can skip an update,
actually skip it") fixed the cases that involved content merges, but there
were a few other cases as well.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Calling update_stages() before update_file() can sometimes result in git
thinking the file being updated is untracked (whenever update_stages
moves it to stage 3). Reverse the call order, and add a big comment to
update_stages to hopefully prevent others from making the same mistake.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If either side of a rename/rename(1to2) conflict is itself also involved
in a rename/add-dest conflict, then we need to make sure both the rename
and the added file appear in the working copy.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each side of the rename in rename/rename(1to2) could potentially also be
involved in a rename/add conflict. Ensure stages for such conflicts are
also recorded.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>