Commit Graph

126 Commits

Author SHA1 Message Date
Jeff King
06f46f237a avoid "write_in_full(fd, buf, len) != len" pattern
The return value of write_in_full() is either "-1", or the
requested number of bytes[1]. If we make a partial write
before seeing an error, we still return -1, not a partial
value. This goes back to f6aa66cb95 (write_in_full: really
write in full or return error on disk full., 2007-01-11).

So checking anything except "was the return value negative"
is pointless. And there are a couple of reasons not to do
so:

  1. It can do a funny signed/unsigned comparison. If your
     "len" is signed (e.g., a size_t) then the compiler will
     promote the "-1" to its unsigned variant.

     This works out for "!= len" (unless you really were
     trying to write the maximum size_t bytes), but is a
     bug if you check "< len" (an example of which was fixed
     recently in config.c).

     We should avoid promoting the mental model that you
     need to check the length at all, so that new sites are
     not tempted to copy us.

  2. Checking for a negative value is shorter to type,
     especially when the length is an expression.

  3. Linus says so. In d34cf19b89 (Clean up write_in_full()
     users, 2007-01-11), right after the write_in_full()
     semantics were changed, he wrote:

       I really wish every "write_in_full()" user would just
       check against "<0" now, but this fixes the nasty and
       stupid ones.

     Appeals to authority aside, this makes it clear that
     writing it this way does not have an intentional
     benefit. It's a historical curiosity that we never
     bothered to clean up (and which was undoubtedly
     cargo-culted into new sites).

So let's convert these obviously-correct cases (this
includes write_str_in_full(), which is just a wrapper for
write_in_full()).

[1] A careful reader may notice there is one way that
    write_in_full() can return a different value. If we ask
    write() to write N bytes and get a return value that is
    _larger_ than N, we could return a larger total. But
    besides the fact that this would imply a totally broken
    version of write(), it would already invoke undefined
    behavior. Our internal remaining counter is an unsigned
    size_t, which means that subtracting too many byte will
    wrap it around to a very large number. So we'll instantly
    begin reading off the end of the buffer, trying to write
    gigabytes (or petabytes) of data.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:17:59 +09:00
Junio C Hamano
50f03c6676 Merge branch 'ab/free-and-null'
A common pattern to free a piece of memory and assign NULL to the
pointer that used to point at it has been replaced with a new
FREE_AND_NULL() macro.

* ab/free-and-null:
  *.[ch] refactoring: make use of the FREE_AND_NULL() macro
  coccinelle: make use of the "expression" FREE_AND_NULL() rule
  coccinelle: add a rule to make "expression" code use FREE_AND_NULL()
  coccinelle: make use of the "type" FREE_AND_NULL() rule
  coccinelle: add a rule to make "type" code use FREE_AND_NULL()
  git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
2017-06-24 14:28:41 -07:00
Junio C Hamano
f31d23a399 Merge branch 'bw/config-h'
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.

* bw/config-h:
  config: don't implicitly use gitdir or commondir
  config: respect commondir
  setup: teach discover_git_directory to respect the commondir
  config: don't include config.h by default
  config: remove git_config_iter
  config: create config.h
2017-06-24 14:28:41 -07:00
Ævar Arnfjörð Bjarmason
88ce3ef636 *.[ch] refactoring: make use of the FREE_AND_NULL() macro
Replace occurrences of `free(ptr); ptr = NULL` which weren't caught by
the coccinelle rule. These fall into two categories:

 - free/NULL assignments one after the other which coccinelle all put
   on one line, which is functionally equivalent code, but very ugly.

 - manually spotted occurrences where the NULL assignment isn't right
   after the free() call.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-16 12:44:09 -07:00
Brandon Williams
b2141fc1d2 config: don't include config.h by default
Stop including config.h by default in cache.h.  Instead only include
config.h in those files which require use of the config system.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15 12:56:22 -07:00
Nguyễn Thái Ngọc Duy
f7566f073f rerere.c: move error_errno() closer to the source system call
We are supposed to report errno from fopen(). fclose() between fopen()
and the report function could either change errno or reset it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26 12:33:56 +09:00
Nguyễn Thái Ngọc Duy
5118d7f4e6 print errno when reporting a system call error
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26 12:33:56 +09:00
Nguyễn Thái Ngọc Duy
e9d983f116 wrapper.c: add and use fopen_or_warn()
When fopen() returns NULL, it could be because the given path does not
exist, but it could also be some other errors and the caller has to
check. Add a wrapper so we don't have to repeat the same error check
everywhere.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26 12:33:56 +09:00
Junio C Hamano
b3e83cc752 hold_locked_index(): align error handling with hold_lockfile_for_update()
Callers of the hold_locked_index() function pass 0 when they want to
prepare to write a new version of the index file without wishing to
die or emit an error message when the request fails (e.g. somebody
else already held the lock), and pass 1 when they want the call to
die upon failure.

This option is called LOCK_DIE_ON_ERROR by the underlying lockfile
API, and the hold_locked_index() function translates the paramter to
LOCK_DIE_ON_ERROR when calling the hold_lock_file_for_update().

Replace these hardcoded '1' with LOCK_DIE_ON_ERROR and stop
translating.  Callers other than the ones that are replaced with
this change pass '0' to the function; no behaviour change is
intended with this patch.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

Among the callers of hold_locked_index() that passes 0:

 - diff.c::refresh_index_quietly() at the end of "git diff" is an
   opportunistic update; it leaks the lockfile structure but it is
   just before the program exits and nobody should care.

 - builtin/describe.c::cmd_describe(),
   builtin/commit.c::cmd_status(),
   sequencer.c::read_and_refresh_cache() are all opportunistic
   updates and they are OK.

 - builtin/update-index.c::cmd_update_index() takes a lock upfront
   but we may end up not needing to update the index (i.e. the
   entries may be fully up-to-date), in which case we do not need to
   issue an error upon failure to acquire the lock.  We do diagnose
   and die if we indeed need to update, so it is OK.

 - wt-status.c::require_clean_work_tree() IS BUGGY.  It asks
   silence, does not check the returned value.  Compare with
   callsites like cmd_describe() and cmd_status() to notice that it
   is wrong to call update_index_if_able() unconditionally.
2016-12-07 11:31:59 -08:00
brian m. carlson
99d1a9861a cache: convert struct cache_entry to use struct object_id
Convert struct cache_entry to use struct object_id by applying the
following semantic patch and the object_id transforms from contrib, plus
the actual change to the struct:

@@
struct cache_entry E1;
@@
- E1.sha1
+ E1.oid.hash

@@
struct cache_entry *E1;
@@
- E1->sha1
+ E1->oid.hash

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 12:59:42 -07:00
Junio C Hamano
ec34a8b135 Merge branch 'jc/rerere-multi'
* jc/rerere-multi:
  rerere: remove an null statement
  rerere: plug memory leaks upon "rerere forget" failure
2016-05-23 14:54:38 -07:00
Junio C Hamano
d9d501b068 rerere: remove an null statement
J6t spotted that previous commit added an empty statement by
mistake.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-19 12:51:22 -07:00
Junio C Hamano
40cfc95856 Merge branch 'nd/error-errno'
The code for warning_errno/die_errno has been refactored and a new
error_errno() reporting helper is introduced.

* nd/error-errno: (41 commits)
  wrapper.c: use warning_errno()
  vcs-svn: use error_errno()
  upload-pack.c: use error_errno()
  unpack-trees.c: use error_errno()
  transport-helper.c: use error_errno()
  sha1_file.c: use {error,die,warning}_errno()
  server-info.c: use error_errno()
  sequencer.c: use error_errno()
  run-command.c: use error_errno()
  rerere.c: use error_errno() and warning_errno()
  reachable.c: use error_errno()
  mailmap.c: use error_errno()
  ident.c: use warning_errno()
  http.c: use error_errno() and warning_errno()
  grep.c: use error_errno()
  gpg-interface.c: use error_errno()
  fast-import.c: use error_errno()
  entry.c: use error_errno()
  editor.c: use error_errno()
  diff-no-index.c: use error_errno()
  ...
2016-05-17 14:38:28 -07:00
Junio C Hamano
8f4496148b rerere: plug memory leaks upon "rerere forget" failure
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-11 16:25:32 -07:00
Nguyễn Thái Ngọc Duy
033e011e64 rerere.c: use error_errno() and warning_errno()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-09 12:29:08 -07:00
Junio C Hamano
5b715ec48f Merge branch 'jc/rerere-multi'
"git rerere" can encounter two or more files with the same conflict
signature that have to be resolved in different ways, but there was
no way to record these separate resolutions.

* jc/rerere-multi:
  rerere: adjust 'forget' to multi-variant world order
  rerere: split code to call ll_merge() further
  rerere: move code related to "forget" together
  rerere: gc and clear
  rerere: do use multiple variants
  t4200: rerere a merge with two identical conflicts
  rerere: allow multiple variants to exist
  rerere: delay the recording of preimage
  rerere: handle leftover rr-cache/$ID directory and postimage files
  rerere: scan $GIT_DIR/rr-cache/$ID when instantiating a rerere_id
  rerere: split conflict ID further
2016-04-25 15:17:15 -07:00
Junio C Hamano
890fca84be rerere: adjust 'forget' to multi-variant world order
Because conflicts with the same contents inside conflict blocks
enclosed by "<<<<<<<" and ">>>>>>>" can now have multiple variants
to help three-way merge to adjust to the differences outside the
conflict blocks, "rerere forget $path" needs to be taught that there
may be multiple recorded resolutions that share the same conflict
hash (which groups the conflicts with "the same contents inside
conflict blocks"), among which there are some that would not be
relevant to the conflict we are looking at.  These "other variants"
that happen to share the same conflict hash should not be cleared,
and the variant that would apply to the current conflict may not be
the zero-th one (which is the only one that is cleared by the
current code).

After finding the conflict hash, iterate over the existing variants
and try to resolve the conflict using each of them to find the one
that "cleanly" resolves the current conflict.  That is the one we
want to forget and record the preimage for, so that the user can
record the corrected resolution.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-06 15:54:41 -07:00
Junio C Hamano
0ce02b3620 rerere: split code to call ll_merge() further
The merge() helper function is given an existing rerere ID (i.e. the
name of the .git/rr-cache/* subdirectory, and the variant number)
that identifies one <preimage, postimage> pair, try to see if the
conflicted state in the given path can be resolved by using the pair,
and if this succeeds, then update the conflicted path with the
result in the working tree.

To implement rerere_forget() in the multiple variant world, we'd
need a helper to do the "see if a <preimage, postimage> pair cleanly
resolves a conflicted state we have in-core" part, without actually
touching any file in the working tree, in order to identify which
variant(s) to remove.  Split the logic to do so into a separate
helper function try_merge() out of merge().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-06 15:52:40 -07:00
Junio C Hamano
3d730ed9b2 rerere: move code related to "forget" together
"rerere forget" is the only user of handle_cache() helper, which in
turn is the only user of rerere_io that reads from an in-core buffer
whose getline method is implemented as rerere_mem_getline().  Gather
them together.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-06 15:52:40 -07:00
Junio C Hamano
1be1e85115 rerere: gc and clear
Adjust "git rerere gc" and "git rerere clear" to the new world order
with rerere database with multiple variants for the same shape of
conflicts.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-06 15:52:22 -07:00
Junio C Hamano
629716d256 rerere: do use multiple variants
This enables the multiple-variant support for real.  Multiple
conflicts of the same shape can have differences in contexts where
they appear, interfering the replaying of recorded resolution of one
conflict to another, and in such a case, their resolutions are
recorded as different variants under the same conflict ID.

We still need to adjust garbage collection codepaths for this
change, but the basic "replay" functionality is functional with
this change.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-15 15:32:40 -07:00
Junio C Hamano
a13d13700b rerere: allow multiple variants to exist
The shape of the conflict in a path determines the conflict ID.  The
preimage and postimage pair that was recorded for the conflict ID
previously may or may not replay well for the conflict we just saw.

Currently, we punt when the previous resolution does not cleanly
replay, but ideally we should then be able to record the currently
conflicted path by assigning a new 'variant', and then record the
resolution the user is going to make.

Introduce a mechanism to have more than one variant for a given
conflict ID; we do not actually assign any variant other than 0th
variant yet at this step.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-15 15:30:58 -07:00
Junio C Hamano
c0a5423b6f rerere: delay the recording of preimage
We record the preimage only when there is no directory to record the
conflict we encountered, i.e. when $GIT_DIR/rr-cache/$ID does not
exist.  As the plan is to allow multiple <preimage,postimage> pairs
as variants for the same conflict ID eventually, this logic needs to
go.

As the first step in that direction, stop the "did we create the
directory?  Then we record the preimage" logic.  Instead, we record
if a preimage does not exist when we saw a conflict in a path.  Also
make sure that we remove a stale postimage, which most likely is
totally unrelated to the resolution of this new conflict, when we
create a new preimage under $ID when $GIT_DIR/rr-cache/$ID already
exists.

In later patches, we will further update this logic to be "do we
have <preimage,postimage> pair that cleanly resolve the current
conflicts?  If not, record a new preimage as a new variant", but
that does not happen at this stage yet.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-15 15:29:54 -07:00
Junio C Hamano
05dd9f139d rerere: handle leftover rr-cache/$ID directory and postimage files
If by some accident there is only $GIT_DIR/rr-cache/$ID directory
existed, we wouldn't have recorded a preimage for a conflict that
is newly encountered, which would mean after a manual resolution,
we wouldn't have recorded it by storing the postimage, because the
logic used to be "if there is no rr-cache/$ID directory, then we are
the first so record the preimage".  Instead, record preimage if we
do not have one.

In addition, if there is only $GIT_DIR/rr-cache/$ID/postimage
without corresponding preimage, we would have tried to call into
merge() and punted.

These would have been a situation frustratingly hard to recover
from.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-15 15:29:30 -07:00
Junio C Hamano
b1a90b68cf Merge branch 'jk/rerere-xsnprintf'
Some calls to strcpy(3) triggers a false warning from static
analysers that are less intelligent than humans, and reducing the
number of these false hits helps us notice real issues.  A few
calls to strcpy(3) in "git rerere" that are already safe has been
rewritten to avoid false wanings.

* jk/rerere-xsnprintf:
  rerere: replace strcpy with xsnprintf
2016-02-17 10:13:33 -08:00
Junio C Hamano
2c7929b133 rerere: scan $GIT_DIR/rr-cache/$ID when instantiating a rerere_id
This will help fixing bootstrap corner-case issues, e.g. having an
empty $GIT_DIR/rr-cache/$ID directory would fail to record a
preimage, in later changes in this series.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-08 15:01:19 -08:00
Junio C Hamano
1869bbe1ce rerere: split conflict ID further
The plan is to keep assigning the backward compatible conflict ID
based on the hash of the (normalized) text of conflicts, keep using
that conflict ID as the directory name under $GIT_DIR/rr-cache/, but
allow each conflicted path to use a separate "variant" to record
resolutions, i.e. having more than one <preimage,postimage> pairs
under $GIT_DIR/rr-cache/$ID/ directory.  As the first step in that
direction, separate the shared "conflict ID" out of the rerere_id
structure.

The plan is to keep information per $ID in rerere_dir, that can be
shared among rerere_id that is per conflicted path.

When we are done with rerere(), which can be directly called from
other programs like "git apply", "git commit" and "git merge", the
shared rerere_dir structures can be freed entirely, so they are not
reference-counted and they are not freed when we release rerere_id's
that reference them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-08 15:01:15 -08:00
Jeff King
f58316db0e rerere: replace strcpy with xsnprintf
This shouldn't overflow, as we are copying a sha1 hex into a
41-byte buffer. But it does not hurt to use a bound-checking
function, which protects us and makes auditing for overflows
easier.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-08 14:55:28 -08:00
Junio C Hamano
dc5400e11d Merge branch 'jc/rerere'
Code clean-up and minor fixes.

* jc/rerere: (21 commits)
  rerere: un-nest merge() further
  rerere: use "struct rerere_id" instead of "char *" for conflict ID
  rerere: call conflict-ids IDs
  rerere: further clarify do_rerere_one_path()
  rerere: further de-dent do_plain_rerere()
  rerere: refactor "replay" part of do_plain_rerere()
  rerere: explain the remainder
  rerere: explain "rerere forget" codepath
  rerere: explain the primary codepath
  rerere: explain MERGE_RR management helpers
  rerere: fix benign off-by-one non-bug and clarify code
  rerere: explain the rerere I/O abstraction
  rerere: do not leak mmfile[] for a path with multiple stage #1 entries
  rerere: stop looping unnecessarily
  rerere: drop want_sp parameter from is_cmarker()
  rerere: report autoupdated paths only after actually updating them
  rerere: write out each record of MERGE_RR in one go
  rerere: lift PATH_MAX limitation
  rerere: plug conflict ID leaks
  rerere: handle conflicts with multiple stage #1 entries
  ...
2015-10-05 12:30:05 -07:00
Jeff King
9dd330e6ca rerere: release lockfile in non-writing functions
There's a bug in builtin/am.c in which we take a lock on
MERGE_RR recursively. But rather than fix am.c, this patch
fixes the confusing interface from rerere.c that caused the
bug. Read on for the gory details.

The setup_rerere() function both reads the existing MERGE_RR
file, and takes MERGE_RR.lock. In the rerere() and
rerere_forget() functions, we end up in write_rr(), which
will then commit the lock file.

But for functions like rerere_clear() that do not write to
MERGE_RR, we expect the caller to have handled
setup_rerere(). That caller would then need to release the
lockfile, but it can't; the lock struct is local to
rerere.c.

For builtin/rerere.c, this is OK. We run a single rerere
operation and then exit immediately, which has the side
effect of rolling back the lockfile.

But in builtin/am.c, this is actively wrong. If we run "git
am -3 --skip", we call setup-rerere twice without releasing
the lock:

  1. The "--skip" causes us to call am_rerere_clear(), which
     calls setup_rerere(), but never drops the lock.

  2. We then proceed to the next patch.

  3. The "--3way" may cause us to call rerere() to handle
     conflicts in that patch, but we are already holding the
     lock. The lockfile code dies with:

     BUG: prepare_tempfile_object called for active object

We could fix this by having rerere_clear() call
rollback_lock_file(). But it feels a bit odd for it to roll
back a lockfile that it did not itself take. So let's
simplify the interface further, and handle setup_rerere in
the function itself, taking away the question from the
caller over whether they need to do so.

We can give rerere_gc() the same treatment, as well (even
though it doesn't have any callers besides builtin/rerere.c
at this point). Note that these functions don't take flags
from their callers to pass along to setup_rerere; that's OK,
because the flags would not be meaningful for what they are
doing.

Both of those functions need to hold the lock because even
though they do not write to MERGE_RR, they are still writing
and should be protected from a simultaneous "rerere" run.
But rerere_remaining(), "rerere diff", and "rerere status"
are all read-only operations. They want to setup_rerere(),
but do not care about taking the lock in the first place.
Since our update of MERGE_RR is the usual atomic rename done
by commit_lock_file, they can just do a lockless read. For
that, we teach setup_rerere a READONLY flag to avoid the
lock.

As a bonus, this pushes builtin/rerere.c's setup_rerere call
closer to the functions that use it. Which means that "git
rerere totally-bogus-command" will no longer silently
exit(0) in a repository without rerere enabled.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-01 15:52:54 -07:00
Jeff King
f932729cc7 memoize common git-path "constant" files
One of the most common uses of git_path() is to pass a
constant, like git_path("MERGE_MSG"). This has two
drawbacks:

  1. The return value is a static buffer, and the lifetime
     is dependent on other calls to git_path, etc.

  2. There's no compile-time checking of the pathname. This
     is OK for a one-off (after all, we have to spell it
     correctly at least once), but many of these constant
     strings appear throughout the code.

This patch introduces a series of functions to "memoize"
these strings, which are essentially globals for the
lifetime of the program. We compute the value once, take
ownership of the buffer, and return the cached value for
subsequent calls.  cache.h provides a helper macro for
defining these functions as one-liners, and defines a few
common ones for global use.

Using a macro is a little bit gross, but it does nicely
document the purpose of the functions. If we need to touch
them all later (e.g., because we learned how to change the
git_dir variable at runtime, and need to invalidate all of
the stored values), it will be much easier to have the
complete list.

Note that the shared-global functions have separate, manual
declarations. We could do something clever with the macros
(e.g., expand it to a declaration in some places, and a
declaration _and_ a definition in path.c). But there aren't
that many, and it's probably better to stay away from
too-magical macros.

Likewise, if we abandon the C preprocessor in favor of
generating these with a script, we could get much fancier.
E.g., normalizing "FOO/BAR-BAZ" into "git_path_foo_bar_baz".
But the small amount of saved typing is probably not worth
the resulting confusion to readers who want to grep for the
function's definition.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10 15:37:14 -07:00
Junio C Hamano
15ed07d532 rerere: un-nest merge() further
By consistently using "upon failure, set 'ret' and jump to out"
pattern, flatten the function further.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 16:05:40 -07:00
Junio C Hamano
1d51eced10 rerere: use "struct rerere_id" instead of "char *" for conflict ID
This gives a thin abstraction between the conflict ID that is a hash
value obtained by inspecting the conflicts and the name of the
directory under $GIT_DIR/rr-cache/, in which the previous resolution
is recorded to be replayed.  The plan is to make sure that the
presence of the directory does not imply the presense of a previous
resolution and vice-versa, and later allow us to have more than one
pair of <preimage, postimage> for a given conflict ID.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 16:05:26 -07:00
Junio C Hamano
18bb99342f rerere: call conflict-ids IDs
Most places we call conflict IDs "name" and some others we call them
"hex"; update all of them to "id".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 16:04:41 -07:00
Junio C Hamano
925d73c421 rerere: further clarify do_rerere_one_path()
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 16:03:56 -07:00
Junio C Hamano
c7a25d3790 rerere: further de-dent do_plain_rerere()
It's just easier to follow this way.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 16:02:47 -07:00
Junio C Hamano
8e7768b2de rerere: refactor "replay" part of do_plain_rerere()
Extract the body of a loop that attempts to replay recorded
resolution for each conflicted path into a helper function, not
because I want to call it from multiple places later, but because
the logic has become too deeply nested and hard to read.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 16:02:37 -07:00
Junio C Hamano
e828de826b rerere: explain the remainder
Explain the internals of rerere as in-code comments, while
sprinkling "NEEDSWORK" comment to highlight iffy bits and
questionable assumptions.

This covers the codepath that implements "rerere gc" and "rerere
clear".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 16:02:31 -07:00
Junio C Hamano
963ec00356 rerere: explain "rerere forget" codepath
Explain the internals of rerere as in-code comments, while
sprinkling "NEEDSWORK" comment to highlight iffy bits and
questionable assumptions.

This covers the codepath that implements "rerere forget".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 16:02:18 -07:00
Junio C Hamano
cc899eca55 rerere: explain the primary codepath
Explain the internals of rerere as in-code comments, while
sprinkling "NEEDSWORK" comment to highlight iffy bits and
questionable assumptions.

This one covers the codepath reached from rerere(), the primary
interface to the subsystem.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 16:02:17 -07:00
Junio C Hamano
4b68c2a087 rerere: explain MERGE_RR management helpers
Explain the internals of rerere as in-code comments, while
sprinkling "NEEDSWORK" comment to highlight iffy bits and
questionable assumptions.

This one covers the "$GIT_DIR/MERGE_RR" file and in-core merge_rr
that are used to keep track of the status of "rerere" session in
progress.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 16:02:06 -07:00
Junio C Hamano
d3c2749def rerere: fix benign off-by-one non-bug and clarify code
rerere_io_putconflict() wants to use a limited fixed-sized buf[] on
stack repeatedly to formulate a longer string, but its implementation
is doubly confusing:

 * When it knows that the whole thing fits in buf[], it wants to
   fill early part of buf[] with conflict marker characters,
   followed by a LF and a NUL.  It miscounts the size of the buffer
   by 1 and does not use the last byte of buf[].

 * When it needs to show only the early part of a long conflict
   marker string (because the whole thing does not fit in buf[]), it
   adjusts the number of bytes shown in the current round in a
   strange-looking way.  It makes sure that this round does not emit
   all bytes and leaves at least one byte to the next round, so that
   "it all fits" case will pick up the rest and show the terminating
   LF.  While this is correct, one needs to stop and think for a
   while to realize why it is correct without an explanation.

Fix the benign off-by-one, and add comments to explain the
strange-looking size adjustment.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 16:02:02 -07:00
Junio C Hamano
a96847cc16 rerere: explain the rerere I/O abstraction
Explain the internals of rerere as in-code comments.

This one covers our thin I/O abstraction to read from either
a file or a memory while optionally writing out to a file.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 15:11:05 -07:00
Junio C Hamano
7d4053b69b rerere: do not leak mmfile[] for a path with multiple stage #1 entries
A conflicted index can have multiple stage #1 entries when dealing
with a criss-cross merge and using the "resolve" merge strategy.

Plug the leak by reading only the first one of the same stage
entries.

Strictly speaking, this fix does change the semantics, in that we
used to use the last stage #1 entry as the common ancestor when
doing the plain-vanilla three-way merge, but with the leak fix, we
will use the first stage #1 entry.  But it is not a grave backward
compatibility breakage.  Either way, we are arbitrarily picking one
of multiple stage #1 entries and using it, ignoring others, and
there is no meaning in the ordering of these stage #1 entries.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 15:11:02 -07:00
Junio C Hamano
74444d4ec4 rerere: stop looping unnecessarily
handle_cache() loops 3 times starting from an index entry that is
unmerged, while ignoring an entry for a path that is different from
what we are looking for.

As the index is sorted, once we see a different path, we know we saw
all stages for the path we are interested in.  Just loop while we
see the same path and then break, instead of continuing for 3 times.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 15:09:32 -07:00
Junio C Hamano
67711cdc39 rerere: drop want_sp parameter from is_cmarker()
As the nature of the conflict marker line determines if there should
be a SP and label after it, the caller shouldn't have to pass the
parameter redundantly.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 15:09:07 -07:00
Junio C Hamano
a14c7ab8f5 rerere: report autoupdated paths only after actually updating them
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 15:08:44 -07:00
Junio C Hamano
e2cb6a950b rerere: write out each record of MERGE_RR in one go
Instead of writing the hash for a conflict, a HT, and the path
with three separate write_in_full() calls, format them into a
single record into a strbuf and write it out in one go.

As a more recent "rerere remaining" codepath abuses the .util field
of the merge_rr data to store a sentinel token, make sure that
codepath does not call into this function (of course, "remaining" is
a read-only operation and currently does not call it).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 15:08:27 -07:00
Junio C Hamano
f5800f6ad8 rerere: lift PATH_MAX limitation
The MERGE_RR file records a collection of NUL-terminated entries,
each of which consists of

 - a hash that identifies the conflict
 - a HT
 - the pathname

We used to read this piece-by-piece, and worse yet, read the
pathname part a byte at a time into a fixed buffer of size PATH_MAX.

Instead, read a whole entry using strbuf_getwholeline() and parse
out the fields.  This way, we issue fewer read(2) calls and more
importantly we do not have to limit the pathname to PATH_MAX.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 15:08:23 -07:00
Junio C Hamano
8d9b5a4ada rerere: plug conflict ID leaks
The merge_rr string list stores the conflict ID (a hexadecimal
string that is used to index into $GIT_DIR/rr-cache) in the .util
field of its elements, and when do_plain_rerere() resolves a
conflict, the field is cleared.  Also, when rerere_forget()
recomputes the conflict ID to updates the preimage file, the
conflict ID for the path is updated.

We forgot to free the existing conflict ID when we did these two
operations.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-24 15:08:22 -07:00