In 224c7d70fa (mingw: only test index entries for backslashes, not tree
entries, 2019-12-31), we relaxed the check for backslashes in tree
entries to check only index entries.
However, the code change was incorrect: it was added to
`add_index_entry_with_check()`, not to `add_index_entry()`, so under
certain circumstances it was possible to side-step the protection.
Besides, the description of that commit purported that all index entries
would be checked when in fact they were only checked when being added to
the index (there are code paths that do not do that, constructing
"transient" index entries).
In any case, it was pointed out in one insightful review at
https://github.com/git-for-windows/git/pull/2437#issuecomment-566771835
that it would be a much better idea to teach `verify_path()` to perform
the check for a backslash. This is safer, even if it comes with two
notable drawbacks:
- `verify_path()` cannot say _what_ is wrong with the path, therefore
the user will no longer be told that there was a backslash in the
path, only that the path was invalid.
- The `git apply` command also calls the `verify_path()` function, and
might have been able to handle Windows-style paths (i.e. with
backslashes instead of forward slashes). This will no longer be
possible unless the user (temporarily) sets `core.protectNTFS=false`.
Note that `git add <windows-path>` will _still_ work because
`normalize_path_copy_len()` will convert the backslashes to forward
slashes before hitting the code path that creates an index entry.
The clear advantage is that `verify_path()`'s purpose is to check the
validity of the file name, therefore we naturally tap into all the code
paths that need safeguarding, also implicitly into future code paths.
The benefits of that approach outweigh the downsides, so let's move the
check from `add_index_entry_with_check()` to `verify_path()`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The clear_ce_flags_dir() method processes the cache entries within
a common directory. The returned int is the number of cache entries
processed by that directory. When using the sparse-checkout feature
in cone mode, we can skip the pattern matching for entries in the
directories that are entirely included or entirely excluded.
eb42feca (unpack-trees: hash less in cone mode, 2019-11-21)
introduced this performance feature. The old mechanism relied on
the counts returned by calling clear_ce_flags_1(), but the new
mechanism calculated the number of rows by subtracting "cache_end"
from "cache" to find the size of the range. However, the equation
is wrong because it divides by sizeof(struct cache_entry *). This
is not how pointer arithmetic works!
A coverity build of Git for Windows in preparation for the 2.25.0
release found this issue with the warning, "Pointer differences,
such as cache_end - cache, are automatically scaled down by the
size (8 bytes) of the pointed-to type (struct cache_entry *).
Most likely, the division by sizeof(struct cache_entry *) is
extraneous and should be eliminated." This warning is correct.
This leaves us with the question "how did this even work?" The
problem that occurs with this incorrect pointer arithmetic is
a performance-only bug, and a very slight one at that. Since
the entry count returned by clear_ce_flags_dir() is reduced by
a factor of 8, the loop in clear_ce_flags_1() will re-process
entries from those directories.
By inserting global counters into unpack-tree.c and tracing
them with trace2_data_intmax() (in a private change, for
testing), I was able to see count how many times the loop inside
clear_ce_flags_1() processed an entry and how many times
clear_ce_flags_dir() was called. Each of these are reduced by at
least a factor of 8 with the current change. A factor larger
than 8 happens when multiple levels of directories are repeated.
Specifically, in the Linux kernel repo, the command
git sparse-checkout set LICENSES
restricts the working directory to only the files at root and
in the LICENSES directory. Here are the measured counts:
clear_ce_flags_1 loop blocks:
Before: 11,520
After: 1,621
clear_ce_flags_dir calls:
Before: 7,048
After: 606
While these are dramatic counts, the time spent in
clear_ce_flags_1() is under one millisecond in each case, so
the improvement is not measurable as an end-to-end time.
Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The english term generation is here not used in the sense of "to
generate" but in the sense of "generations of beings".
This corrects the initial translation from cf4c0c25 (l10n: update German
translation, 2018-12-06).
Fixed-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
The whole submoduleAlternateErrorStrategyDie item is interpreted as
being part of the supporting content of the preceding item. This is
because we don't give a double-colon "::" for the separator, but just a
single colon, ":". Let's fix that.
There are a few other matches for [^:]:\s*$ in Documentation/config, but
I didn't spot any similar bugs among them.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since recent updates to the log graph rendering code, drawing
certain merges started triggering an assert on a condition that
would no longer hold true, which has been corrected.
* ds/graph-assert-fix:
graph: fix lack of color in horizontal lines
graph: drop assert() for merge with two collapsing parents
* https://github.com/prati0100/git-gui:
git-gui: allow opening currently selected file in default app
git-gui: allow closing console window with Escape
git gui: fix branch name encoding error
git-gui: revert untracked files by deleting them
git-gui: update status bar to track operations
git-gui: consolidate naming conventions
In some cases, horizontal lines in rendered graphs can lose their
coloring. This is due to a use of graph_line_addch() instead of
graph_line_write_column(). Using a ternary operator to pick the
character is nice for compact code, but we actually need a column to
provide the color.
Add a test to t4215-log-skewed-merges.sh to prevent regression.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git log --graph" shows a merge commit that has two collapsing
lines, like:
| | | | *
| |_|_|/|
|/| | |/
| | |/|
| |/| |
| * | |
* | | |
we trigger an assert():
graph.c:1228: graph_output_collapsing_line: Assertion
`graph->mapping[i - 3] == target' failed.
The assert was introduced by eaf158f8 ("graph API: Use horizontal
lines for more compact graphs", 2009-04-21), which is quite old.
This assert is trying to say that when we complete a horizontal
line with a single slash, it is because we have reached our target.
It is actually the _second_ collapsing line that hits this assert.
The reason we are in this code path is because we are collapsing
the first line, and in that case we are hitting our target now
that the horizontal line is complete. However, the second line
cannot be a horizontal line, so it will collapse without horizontal
lines. In this case, it is inappropriate to assert that we have
reached our target, as we need to continue for another column
before reaching the target. Dropping the assert is safe here.
The new behavior in 0f0f389f12 (graph: tidy up display of
left-skewed merges, 2019-10-15) caused the behavior change that
made this assertion failure possible. In addition to making the
assert possible, it also changed how multiple edges collapse.
In a larger example, the current code will output a collapse
as follows:
| | | | | | *
| |_|_|_|_|/|\
|/| | | | |/ /
| | | | |/| /
| | | |/| |/
| | |/| |/|
| |/| |/| |
| | |/| | |
| | * | | |
However, the intended collapse should allow multiple horizontal lines
as follows:
| | | | | | *
| |_|_|_|_|/|\
|/| | | | |/ /
| | |_|_|/| /
| |/| | | |/
| | | |_|/|
| | |/| | |
| | * | | |
This behavior is not corrected by this change, but is noted for a later
update.
Helped-by: Jeff King <peff@peff.net>
Reported-by: Bradley Smith <brad@brad-smith.co.uk>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since ba227857d2 (Reduce the number of connects when fetching,
2008-02-04), when we disconnect a git transport, we send a final flush
packet. This cleanly tells the other side that we're done, and avoids
the other side complaining "the remote end hung up unexpectedly" (though
we'd only see that for transports that pass along the server stderr,
like ssh or local-host).
But when we've initiated a v2 stateless-connect session over a transport
helper, there's no point in sending this flush packet. Each operation
we've performed is self-contained, and the other side is fine with us
hanging up between operations.
But much worse, by sending the flush packet we may cause the helper to
issue an entirely new request _just_ to send the flush packet. So we can
incur an extra network request just to say "by the way, we have nothing
more to send".
Let's drop this extra flush packet. As the test shows, this reduces the
number of POSTs required for a v2 ls-remote over http from 2 to 1.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's possible in a case where the index file contains a tree extension
but no blobs within that tree exist for index_pos_by_traverse_info() to
segfault. If the name_entry passed into index_pos_by_traverse_info() has
no blobs inside, AND is alphabetically later than all blobs currently in
the index file, index_pos_by_traverse_info() will segfault. For example,
an index file which looks something like this:
aaa#0
bbb/aaa#0
[Extensions]
TREE: zzz
In this example, 'index_name_pos(..., "zzz/", ...)' will return '-4',
indicating that "zzz/" could be inserted at position 3. However, when
the checks which ensure that the insertion position of "zzz/" look for a
blob at that position beginning with "zzz/", the index cache is accessed
out of range, causing a segfault.
This kind of index state is not typically generated during user
operations, and is in fact an edge case of the state being checked for
in the conditional where it was added. However, since the entry for the
BUG() line is ambiguous, tell some additional context to help Git
developers debug the failure later. When we know the name of the dir we
were trying to look up, it becomes possible to examine the index file
in a hex util to determine what went wrong; the position gives a hint
about where to start looking.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git restore --staged <path>" removes a path that's in the index,
it marks the entry with CE_REMOVE, but we don't do anything to
invalidate the cache-tree. In the non-staged case, we end up in
checkout_worktree(), which calls remove_marked_cache_entries(). That
actually drops the entries from the index, as well as invalidating the
cache-tree and untracked-cache.
But with --staged, we never call checkout_worktree(), and the CE_REMOVE
entries remain. Interestingly, they are dropped when we write out the
index, but that means the resulting index is inconsistent: its
cache-tree will not match the actual entries, and running "git commit"
immediately after will create the wrong tree.
We can solve this by calling remove_marked_cache_entries() ourselves
before writing out the index. Note that we can't just hoist it out of
checkout_worktree(); that function needs to iterate over the CE_REMOVE
entries (to drop their matching worktree files) before removing them.
One curiosity about the test: without this patch, it actually triggers a
BUG() when running git-restore:
BUG: cache-tree.c:810: new1 with flags 0x4420000 should not be in cache-tree
But in the original problem report, which used a similar recipe,
git-restore actually creates the bogus index (and the commit is created
with the wrong tree). I'm not sure why the test here behaves differently
than my out-of-suite reproduction, but what's here should catch either
symptom (and the fix corrects both cases).
Reported-by: Torsten Krah <krah.tm@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 328c6cb853 (doc: promote "git switch", 2019-03-29), an example
was changed to use "git switch" rather than "git checkout" but an
instance of "git checkout" in the explanatory text preceding the
example was overlooked. Fix this oversight.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In this function, we free the pointer we get from locate_in_PATH and
then check whether it's NULL. However, this is undefined behavior if
the pointer is non-NULL, since the C standard no longer permits us to
use a valid pointer after freeing it.
The only case in which the C standard would permit this to be defined
behavior is if r were NULL, since it states that in such a case "no
action occurs" as a result of calling free.
It's easy to suggest that this is not likely to be a problem, but we
know that GCC does aggressively exploit the fact that undefined
behavior can never occur to optimize and rewrite code, even when that's
contrary to the expectations of the programmer. It is, in fact, very
common for it to omit NULL pointer checks, just as we have here.
Since it's easy to fix, let's do so, and avoid a potential headache in
the future.
Noticed-by: Miriam R. <mirucam@gmail.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit 1959bf6430 (string_list API: document what "sorted" means,
2012-09-17), Documentation/technical/api-string-list.txt was updated to
specify that strcmp() was used for sorting. In commit 8dd5afc926
(string-list: allow case-insensitive string list, 2013-01-07), a cmp
member was added to struct string_list to allow callers to specify an
alternative comparison function, but api-string-list.txt was not
updated. In commit 4f665f2cf3 (string-list.h: move documentation from
Documentation/api/ into header, 2017-09-26), the now out-dated
api-string-list.txt documentation was moved into string-list.h. Update
the docs to reflect the configurability of sorting.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to write split commit-graph file(s) upon fetching computed
bogus value for the parameter used in splitting the resulting
files, which has been corrected.
* ds/commit-graph-set-size-mult:
commit-graph: prefer default size_mult when given zero
"git sparse-checkout list" subcommand learned to give its output in
a more concise form when the "cone" mode is in effect.
* ds/sparse-list-in-cone-mode:
sparse-checkout: document interactions with submodules
sparse-checkout: list directories in cone mode
An earlier update to Git for Windows declared that a tree object is
invalid if it has a path component with backslash in it, which was
overly strict, which has been corrected. The only protection the
Windows users need is to prevent such path (or any path that their
filesystem cannot check out) from entering the index.
* js/mingw-loosen-overstrict-tree-entry-checks:
mingw: only test index entries for backslashes, not tree entries
The sentence wants to talk about the superproject's possesive, not plural form.
Signed-off-by: Thomas Menzel <dev@tomsit.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to "From:" and "Subject:" already mentioned in the
documentation, "Date:" can also appear as an in-body header
to override the value in the e-mail headers. Document it.
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's core.multiPackIndex, not pack.multiIndex.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This typo was introduced in 94c0956b60 (sparse-checkout: create builtin
with 'list' subcommand, 2019-11-21).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow opening the currently selected file in its default app by clicking
on its name.
* zs/open-current-file:
git-gui: allow opening currently selected file in default app
In 50f26bd ("fetch: add fetch.writeCommitGraph config setting",
2019-09-02), the fetch builtin added the capability to write a
commit-graph using the "--split" feature. This feature creates
multiple commit-graph files, and those can merge based on a set
of "split options" including a size multiple. The default size
multiple is 2, which intends to provide a log_2 N depth of the
commit-graph chain where N is the number of commits.
However, I noticed during dogfooding that my commit-graph chains
were becoming quite large when left only to builds by 'git fetch'.
It turns out that in split_graph_merge_strategy(), we default the
size_mult variable to 2 except we override it with the context's
split_opts if they exist. In builtin/fetch.c, we create such a
split_opts, but do not populate it with values.
This problem is due to two failures:
1. It is unclear that we can add the flag COMMIT_GRAPH_WRITE_SPLIT
with a NULL split_opts.
2. If we have a non-NULL split_opts, then we override the default
values even if a zero value is given.
Correct both of these issues. First, do not override size_mult when
the options provide a zero value. Second, stop creating a split_opts
in the fetch builtin.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During a clone of a repository that contained a file with a backslash in
its name in the past, as of v2.24.1(2), Git for Windows prints errors
like this:
error: filename in tree entry contains backslash: '\'
The idea is to prevent Git from even trying to write files with
backslashes in their file names: while these characters are valid in
file names on other platforms, on Windows it is interpreted as directory
separator (which would obviously lead to ambiguities, e.g. when there is
a file `a\b` and there is also a file `a/b`).
Arguably, this is the wrong layer for that error: As long as the user
never checks out the files whose names contain backslashes, there should
not be any problem in the first place.
So let's loosen the requirements: we now leave tree entries with
backslashes in their file names alone, but we do require any entries
that are added to the Git index to contain no backslashes on Windows.
Note: just as before, the check is guarded by `core.protectNTFS` (to
allow overriding the check by toggling that config setting), and it
is _only_ performed on Windows, as the backslash is not a directory
separator elsewhere, even when writing to NTFS-formatted volumes.
An alternative approach would be to try to prevent creating files with
backslashes in their file names. However, that comes with its own set of
problems. For example, `git config -f C:\ProgramData\Git\config ...` is
a very valid way to specify a custom config location, and we obviously
do _not_ want to prevent that. Therefore, the approach chosen in this
patch would appear to be better.
This addresses https://github.com/git-for-windows/git/issues/2435
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a partial clone, if a user provides the hash of the empty tree ("git
mktree </dev/null" - for SHA-1, this is 4b825d...) to a command which
requires that that object be parsed, for example:
git diff-tree 4b825d <a non-empty tree>
then Git will lazily fetch the empty tree, unnecessarily, because
parsing of that object invokes repo_has_object_file(), which does not
special-case the empty tree.
Instead, teach repo_has_object_file() to consult find_cached_object()
(which handles the empty tree), thus bringing it in line with the rest
of the object-store-accessing functions. A cost is that
repo_has_object_file() will now need to oideq upon each invocation, but
that is trivial compared to the filesystem lookup or the pack index
search required anyway. (And if find_cached_object() needs to do more
because of previous invocations to pretend_object_file(), all the more
reason to be consistent in whether we present cached objects.)
As a historical note, the function now known as repo_read_object_file()
was taught the empty tree in 346245a1bb ("hard-code the empty tree
object", 2008-02-13), and the function now known as oid_object_info()
was taught the empty tree in c4d9986f5f ("sha1_object_info: examine
cached_object store too", 2011-02-07). repo_has_object_file() was never
updated, perhaps due to oversight. The flag OBJECT_INFO_SKIP_CACHED,
introduced later in dfdd4afcf9 ("sha1_file: teach
sha1_object_info_extended more flags", 2017-06-26) and used in
e83e71c5e1 ("sha1_file: refactor has_sha1_file_with_flags", 2017-06-26),
was introduced to preserve this difference in empty-tree handling, but
now it can be removed.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Forbid pathnames that the platform's filesystem cannot represent on
MinGW.
* js/mingw-reserved-filenames:
mingw: refuse paths containing reserved names
mingw: short-circuit the conversion of `/dev/null` to UTF-16
"git rebase --signoff" stopped working when the command was written
in C, which has been corrected.
* en/rebase-signoff-fix:
rebase: fix saving of --signoff state for am-based rebases