When serializing UTF-16 (and UTF-32), there are three possible ways to
write the stream. One can write the data with a BOM in either big-endian
or little-endian format, or one can write the data without a BOM in
big-endian format.
Most systems' iconv implementations choose to write it with a BOM in
some endianness, since this is the most foolproof, and it is resistant
to misinterpretation on Windows, where UTF-16 and the little-endian
serialization are very common. For compatibility with Windows and to
avoid accidental misuse there, Git always wants to write UTF-16 with a
BOM, and will refuse to read UTF-16 without it.
However, musl's iconv implementation writes UTF-16 without a BOM,
relying on the user to interpret it as big-endian. This causes t0028 and
the related functionality to fail, since Git won't read the file without
a BOM.
Add a Makefile and #define knob, ICONV_OMITS_BOM, that can be set if the
iconv implementation has this behavior. When set, Git will write a BOM
manually for UTF-16 and UTF-32 and then force the data to be written in
UTF-16BE or UTF-32BE. We choose big-endian behavior here because the
tests use the raw "UTF-16" encoding, which will be big-endian when the
implementation requires this knob to be set.
Update the tests to detect this case and write test data with an added
BOM if necessary. Always write the BOM in the tests in big-endian
format, since all iconv implementations that omit a BOM must use
big-endian serialization according to the Unicode standard.
Preserve the existing behavior for systems which do not have this knob
enabled, since they may use optimized implementations, including
defaulting to the native endianness, which may improve performance.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The setup code uses octal values with printf to generate a BOM for
UTF-16/32 BE/LE. It specifically uses '\777' to emit a 0xff byte. This
relies on the fact that most shells truncate the value above 0o377.
Ash however interprets '\777' as '\77' + a literal '7', resulting in an
invalid BOM.
Fix this by using the proper value of 0xff: '\377'.
Signed-off-by: Kevin Daudt <me@ikke.info>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For "rebase -i --reschedule-failed-exec", we do not want the "-y"
shortcut after all.
* js/rebase-i-redo-exec-fix:
Revert "rebase: introduce a shortcut for --reschedule-failed-exec"
Some errors from the other side coming over smart HTTP transport
were not noticed, which has been corrected.
* js/smart-http-detect-remote-error:
t5551: test server-side ERR packet
remote-curl: tighten "version 2" check for smart-http
remote-curl: refactor smart-http discovery
The embedded blanks in the full path of the test git repository cased bash
to generate an ambugious redirect error.
Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 53fc999306 ("gpg-interface t: extend the existing GPG tests with
GPGSM", 2018-07-20), the gpgconf call which kills gpg-agent was copied
from the existing gpg setup code.
The reason for killing gpg-agent is given in 29ff1f8f74 ("t: lib-gpg:
flush gpg agent on startup", 2017-07-20):
When running gpg-relevant tests, a gpg-daemon is spawned for each
GNUPGHOME used. This daemon may stay running after the test and cache
file descriptors for the trash directories, even after the trash
directory is removed. This leads to ENOENT errors when attempting to
create files if tests are run multiple times.
Add a cleanup script to force flushing the gpg-agent for that GNUPGHOME
(if any) before setting up the GPG relevant-environment.
Killing gpg-agent once per test is sufficient.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When gpgsm is installed, lib-gpg.sh attempts to update trustlist.txt to
relax the checking of some root certificate requirements. The path to
"${GNUPGHOME}" contains spaces which cause an "ambiguous redirect"
warning when bash is used to run the tests:
$ bash t7030-verify-tag.sh
/git/t/lib-gpg.sh: line 66: ${GNUPGHOME}/trustlist.txt: ambiguous redirect
ok 1 - create signed tags
ok 2 # skip create signed tags x509 (missing GPGSM)
...
No warning is issued when using bash called as /bin/sh, dash, or mksh.
Quote the path to ensure the redirect works as intended and sets the
GPGSM prereq. While we're here, drop the space after ">>".
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git --work-tree=$there --git-dir=$here describe --dirty" did not
work correctly as it did not pay attention to the location of the
worktree specified by the user by mistake, which has been
corrected.
* ss/describe-dirty-in-the-right-directory:
t6120: test for describe with a bare repository
describe: setup working tree for --dirty
"git rebase -x $cmd" did not reject multi-line command, even though
the command is incapable of handling such a command. It now is
rejected upfront.
* pw/rebase-x-sanity-check:
rebase -x: sanity check command
Prepare to run test suite on Azure Pipeline.
* js/vsts-ci: (22 commits)
test-date: drop unused parameter to getnanos()
ci: parallelize testing on Windows
ci: speed up Windows phase
tests: optionally skip bin-wrappers/
t0061: workaround issues with --with-dashes and RUNTIME_PREFIX
tests: add t/helper/ to the PATH with --with-dashes
mingw: try to work around issues with the test cleanup
tests: include detailed trace logs with --write-junit-xml upon failure
tests: avoid calling Perl just to determine file sizes
README: add a build badge (status of the Azure Pipelines build)
mingw: be more generous when wrapping up the setitimer() emulation
ci: use git-sdk-64-minimal build artifact
ci: add a Windows job to the Azure Pipelines definition
Add a build definition for Azure DevOps
ci/lib.sh: add support for Azure Pipelines
tests: optionally write results as JUnit-style .xml
test-date: add a subcommand to measure times in shell scripts
ci: use a junction on Windows instead of a symlink
ci: inherit --jobs via MAKEFLAGS in run-build-and-tests
ci/lib.sh: encapsulate Travis-specific things
...
The documentation of "git commit-tree" said that the command
understands "--gpg-sign" in addition to "-S", but the command line
parser did not know about the longhand, which has been corrected.
* br/commit-tree-fully-spelled-gpg-sign-option:
commit-tree: add missing --gpg-sign flag
t7510: invoke git as part of &&-chain
"git pack-objects" learned another algorithm to compute the set of
objects to send, that trades the resulting packfile off to save
traversal cost to favor small pushes.
* ds/push-sparse-tree-walk:
pack-objects: create GIT_TEST_PACK_SPARSE
pack-objects: create pack.useSparse setting
revision: implement sparse algorithm
list-objects: consume sparse tree walk
revision: add mark_tree_uninteresting_sparse
The test lint learned to catch non-portable "sed" options.
* tb/test-lint-sed-options:
test-lint: only use only sed [-n] [-e command] [-f command_file]
A new date format "--date=human" that morphs its output depending
on how far the time is from the current time has been introduced.
"--date=auto" can be used to use this new format when the output is
going to the pager or to the terminal and otherwise the default
format.
* lt/date-human:
Add `human` date format tests.
Add `human` format to test-tool
Add 'human' date format documentation
Replace the proposed 'auto' mode with 'auto:'
Add 'human' date format
Code cleanup.
* jk/unused-parameter-cleanup:
convert: drop path parameter from actual conversion functions
convert: drop len parameter from conversion checks
config: drop unused parameter from maybe_remove_section()
show_date_relative(): drop unused "tz" parameter
column: drop unused "opts" parameter in item_length()
create_bundle(): drop unused "header" parameter
apply: drop unused "def" parameter from find_name_gnu()
match-trees: drop unused path parameter from score functions
The assumption to work on the single "in-core index" instance has
been reduced from the library-ish part of the codebase.
* nd/the-index-final:
cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch
read-cache.c: remove the_* from index_has_changes()
merge-recursive.c: remove implicit dependency on the_repository
merge-recursive.c: remove implicit dependency on the_index
sha1-name.c: remove implicit dependency on the_index
read-cache.c: replace update_index_if_able with repo_&
read-cache.c: kill read_index()
checkout: avoid the_index when possible
repository.c: replace hold_locked_index() with repo_hold_locked_index()
notes-utils.c: remove the_repository references
grep: use grep_opt->repo instead of explict repo argument
More code in "git bisect" has been rewritten in C.
* tt/bisect-in-c:
bisect--helper: `bisect_start` shell function partially in C
bisect--helper: `get_terms` & `bisect_terms` shell function in C
bisect--helper: `bisect_next_check` shell function in C
bisect--helper: `check_and_set_terms` shell function in C
wrapper: move is_empty_file() and rename it as is_empty_or_missing_file()
bisect--helper: `bisect_write` shell function in C
bisect--helper: `bisect_reset` shell function in C
A new encoding UTF-16LE-BOM has been invented to force encoding to
UTF-16 with BOM in little endian byte order, which cannot be directly
generated by using iconv.
* tb/utf-16-le-with-explicit-bom:
Support working-tree-encoding "UTF-16LE-BOM"
"git cat-file --batch" reported a dangling symbolic link by
mistake, when it wanted to report that a given name is ambiguous.
* dt/cat-file-batch-ambiguous:
t1512: test ambiguous cat-file --batch and --batch-output
Do not print 'dangling' for cat-file in case of ambiguity
"git rebase --merge" as been reimplemented by reusing the internal
machinery used for "git rebase -i".
* en/rebase-merge-on-sequencer:
rebase: implement --merge via the interactive machinery
rebase: define linearization ordering and enforce it
git-legacy-rebase: simplify unnecessary triply-nested if
git-rebase, sequencer: extend --quiet option for the interactive machinery
am, rebase--merge: do not overlook --skip'ed commits with post-rewrite
t5407: add a test demonstrating how interactive handles --skip differently
rebase: fix incompatible options error message
rebase: make builtin and legacy script error messages the same
The git-p4 login ticket expiry test causes unreliable test
runs. Since the handling of ticket expiry in git-p4 is far
from polished anyway, let's remove it for now.
A better way to actually run the test is to create a python
"fake" version of "p4" which returns whatever expiry results
the test requires.
Ideally git-p4 would look at the expiry time before starting
any long operations, and cleanup gracefully if there is not
enough time left. But that's quite hard to do.
Signed-off-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a smart HTTP server sends an error message via pkt-line, we detect
the error due to using PACKET_READ_DIE_ON_ERR_PACKET. This case was
added by 2d103c31c2 (pack-protocol.txt: accept error packets in any
context, 2018-12-29), but not covered by tests.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The getnanos() helper always gets the current time from our
getnanotime() facility. The caller cannot override it via TEST_DATE_NOW,
and hence we simply ignore the "now" parameter to the function. Let's
remove it, as it may mislead callers into thinking it does something.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch was contributed only as a tentative "we could introduce a
convenient short option if we do not want to change the default behavior
in the long run" patch, opening the discussion whether other people
agree with deprecating the current behavior in favor of the rescheduling
behavior.
But the consensus on the Git mailing list was that it would make sense
to show a warning in the near future, and flip the default
rebase.rescheduleFailedExec to reschedule failed `exec` commands by
default. See e.g.
<CAGZ79kZL5CRqCDRb6B-EedUm8Z_i4JuSF2=UtwwdRXMitrrOBw@mail.gmail.com>
So let's back out that patch that added the `-y` short option that we
agreed was not necessary or desirable.
This reverts commit 81ef8ee75d.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When GIT_SEQUENCE_EDITOR is set, the command was incorrectly
started when modes of "git rebase" that implicitly uses the
machinery for the interactive rebase are run, which has been
corrected.
* pw/no-editor-in-rebase-i-implicit:
implicit interactive rebase: don't run sequence editor
"git diff --color-moved --cc --stat -p" did not work well due to
funny interaction between a bug in color-moved and the rest, which
has been fixed.
* jk/diff-cc-stat-fixes:
combine-diff: treat --dirstat like --stat
combine-diff: treat --summary like --stat
combine-diff: treat --shortstat like --stat
combine-diff: factor out stat-format mask
diff: clear emitted_symbols flag after use
t4006: resurrect commented-out tests
"git checkout -b <new> [HEAD]" to create a new branch from the
current commit and check it out ought to be a no-op in the index
and the working tree in normal cases, but there are corner cases
that do require updates to the index and the working tree. Running
it immediately after "git clone --no-checkout" is one of these
cases that an earlier optimization kicked in incorrectly, which has
been fixed.
* bp/checkout-new-branch-optim:
checkout: fix regression in checkout -b on intitial checkout
checkout: add test demonstrating regression with checkout -b on initial commit
Asking "git check-attr" about a macro (e.g. "binary") on a specific
path did not work correctly, even though "git check-attr -a" listed
such a macro correctly. This has been corrected.
* jk/attr-macro-fix:
attr: do not mark queried macros as unset
On a case-insensitive filesystem, we failed to compare the part of
the path that is above the worktree directory in an absolute
pathname, which has been corrected.
* js/abspath-part-inside-repo:
abspath_part_inside_repo: respect core.ignoreCase
The codepath to show progress meter while writing out commit-graph
file has been improved.
* ab/commit-graph-write-progress:
commit-graph write: emit a percentage for all progress
commit-graph write: add itermediate progress
commit-graph write: remove empty line for readability
commit-graph write: add more descriptive progress output
commit-graph write: show progress for object search
commit-graph write: more descriptive "writing out" output
commit-graph write: add "Writing out" progress output
commit-graph: don't call write_graph_chunk_extra_edges() unnecessarily
commit-graph: rename "large edges" to "extra edges"
"git add --ignore-errors" did not work as advertised and instead
worked as an unintended synonym for "git add --renormalize", which
has been fixed.
* jk/add-ignore-errors-bit-assignment-fix:
add: use separate ADD_CACHE_RENORMALIZE flag
In Git for Windows, "git clone \\server\share\path" etc. that uses
UNC paths from command line had bad interaction with its shell
emulation.
* js/mingw-unc-path-w-backslashes:
mingw: special-case arguments to `sh`
mingw (t5580): document bug when cloning from backslashed UNC paths
"git fetch" and "git upload-pack" learned to send all exchange over
the sideband channel while talking the v2 protocol.
* jt/fetch-v2-sideband:
tests: define GIT_TEST_SIDEBAND_ALL
{fetch,upload}-pack: sideband v2 fetch response
sideband: reverse its dependency on pkt-line
pkt-line: introduce struct packet_writer
pack-protocol.txt: accept error packets in any context
Use packet_reader instead of packet_read_line
The codepath to read from the commit-graph file attempted to read
past the end of it when the file's table-of-contents was corrupt.
* js/commit-graph-chunk-table-fix:
Makefile: correct example fuzz build
commit-graph: fix buffer read-overflow
commit-graph, fuzz: add fuzzer for commit-graph
"git p4" failed to update a shelved change when there were moved
files, which has been corrected.
* ld/git-p4-shelve-update-fix:
git-p4: handle update of moved/copied files when updating a shelve
git-p4: add failing test for shelved CL update involving move/copy
Update the protocol message specification to allow only the limited
use of scaled quantities. This is ensure potential compatibility
issues will not go out of hand.
* js/filter-options-should-use-plain-int:
filter-options: expand scaled numbers
tree:<depth>: skip some trees even when collecting omits
list-objects-filter: teach tree:# how to handle >0
The in-core repository instances are passed through more codepaths.
* sb/more-repo-in-api: (23 commits)
t/helper/test-repository: celebrate independence from the_repository
path.h: make REPO_GIT_PATH_FUNC repository agnostic
commit: prepare free_commit_buffer and release_commit_memory for any repo
commit-graph: convert remaining functions to handle any repo
submodule: don't add submodule as odb for push
submodule: use submodule repos for object lookup
pretty: prepare format_commit_message to handle arbitrary repositories
commit: prepare logmsg_reencode to handle arbitrary repositories
commit: prepare repo_unuse_commit_buffer to handle any repo
commit: prepare get_commit_buffer to handle any repo
commit-reach: prepare in_merge_bases[_many] to handle any repo
commit-reach: prepare get_merge_bases to handle any repo
commit-reach.c: allow get_merge_bases_many_0 to handle any repo
commit-reach.c: allow remove_redundant to handle any repo
commit-reach.c: allow merge_bases_many to handle any repo
commit-reach.c: allow paint_down_to_common to handle any repo
commit: allow parse_commit* to handle any repo
object: parse_object to honor its repository argument
object-store: prepare has_{sha1, object}_file to handle any repo
object-store: prepare read_object_file to deal with any repo
...
This ensures that nothing breaks the basic functionality of describe for
bare repositories. Please note that --broken and --dirty need a working
tree.
Signed-off-by: Sebastian Staudt <koraktor@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't use NEED_WORK_TREE when running the git-describe builtin,
since you should be able to describe a commit even in a bare repository.
However, the --dirty flag does need a working tree. Since we don't call
setup_work_tree(), it uses whatever directory we happen to be in. That's
unlikely to match our index, meaning we'd say "dirty" even when the real
working tree is clean.
We can fix that by calling setup_work_tree() once we know that the user
has asked for --dirty.
The --broken option also needs a working tree. But because its
implementation calls git-diff-index we don‘t have to setup the working
tree in the git-describe process.
Signed-off-by: Sebastian Staudt <koraktor@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test the new "ambiguous" result from cat-file --batch and
--batch-check. This is in t1512 instead of t1006 since
we need a repo with ambiguous object_id names.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users who want UTF-16 files in the working tree set the .gitattributes
like this:
test.txt working-tree-encoding=UTF-16
The unicode standard itself defines 3 allowed ways how to encode UTF-16.
The following 3 versions convert all back to 'g' 'i' 't' in UTF-8:
a) UTF-16, without BOM, big endian:
$ printf "\000g\000i\000t" | iconv -f UTF-16 -t UTF-8 | od -c
0000000 g i t
b) UTF-16, with BOM, little endian:
$ printf "\377\376g\000i\000t\000" | iconv -f UTF-16 -t UTF-8 | od -c
0000000 g i t
c) UTF-16, with BOM, big endian:
$ printf "\376\377\000g\000i\000t" | iconv -f UTF-16 -t UTF-8 | od -c
0000000 g i t
Git uses libiconv to convert from UTF-8 in the index into ITF-16 in the
working tree.
After a checkout, the resulting file has a BOM and is encoded in "UTF-16",
in the version (c) above.
This is what iconv generates, more details follow below.
iconv (and libiconv) can generate UTF-16, UTF-16LE or UTF-16BE:
d) UTF-16
$ printf 'git' | iconv -f UTF-8 -t UTF-16 | od -c
0000000 376 377 \0 g \0 i \0 t
e) UTF-16LE
$ printf 'git' | iconv -f UTF-8 -t UTF-16LE | od -c
0000000 g \0 i \0 t \0
f) UTF-16BE
$ printf 'git' | iconv -f UTF-8 -t UTF-16BE | od -c
0000000 \0 g \0 i \0 t
There is no way to generate version (b) from above in a Git working tree,
but that is what some applications need.
(All fully unicode aware applications should be able to read all 3 variants,
but in practise we are not there yet).
When producing UTF-16 as an output, iconv generates the big endian version
with a BOM. (big endian is probably chosen for historical reasons).
iconv can produce UTF-16 files with little endianess by using "UTF-16LE"
as encoding, and that file does not have a BOM.
Not all users (especially under Windows) are happy with this.
Some tools are not fully unicode aware and can only handle version (b).
Today there is no way to produce version (b) with iconv (or libiconv).
Looking into the history of iconv, it seems as if version (c) will
be used in all future iconv versions (for compatibility reasons).
Solve this dilemma and introduce a Git-specific "UTF-16LE-BOM".
libiconv can not handle the encoding, so Git pick it up, handles the BOM
and uses libiconv to convert the rest of the stream.
(UTF-16BE-BOM is added for consistency)
Rported-by: Adrián Gimeno Balaguer <adrigibal@gmail.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>