When executing a buffered block or builtin, the usual approach is to
execute, collect output in a string, and then output that string to
stdout or whatever the redirections say. Similarly for stderr.
If we get no output, then we can elide the outputting which means
skipping the background thread. In this case we just mark the process as
finished immediately.
We do this in multiple locations which is confusing. Factor them all
together into a new function run_internal_process_or_short_circuit.
for-loops that were not inside a function could overwrite global
and universal variables with the loop variable. Avoid this by making
for-loop-variables local variables in their enclosing scope.
This means that if someone does:
set a global
for a in local; end
echo $a
The local $a will shadow the global one (but not be visible in child
scopes). Which is surprising, but less dangerous than the previous
behavior.
The detection whether the loop is running inside a function was failing
inside command substitutions. Remove this special handling of functions
alltogether, it's not needed anymore.
Fixes#6480
'fish_test_helper print_pid_then_sleep' tried to sleep for .5 seconds,
but instead it divided by .5 so it actually slept for 2 seconds.
This exceeds the maximum value on NetBSD so it wasn't sleeping at all
there.
Fixes#6476
Empty items are used as sentinels to indicate that we've reached the end of
history, so they should not be added as actual items. Enforce this.
Fixes#6032
This reduces the syscall count for `fish -c exit` from 651 to 566.
We don't attempt to *cache* the pgrp or anything, we just call it once
when we're about to execute the job to see if we are in foreground and
to assign it to the job, instead of once for checking foreground and
once to give it to the job.
Caching it with a simple `static` would get the count down to 480, but
it's possible for fish to have its pgroup changed.
Store the entire function declaration, not just its job list.
This allows us to extract the body of the function complete with any
leading comments and indents.
Fixes#5285
In particular, this allows `true && time true`, or `true; and time true`,
and both `time not true` as well as `not time true` (like bash).
time is valid only as job _prefix_, so `true | time true` could call
`/bin/time` (same in bash)
See discussion in #6442
Extend the commit 8e17d29e04 to block processes, for example:
begin ; stuff ; end
or if/while blocks as well.
Note there's an existing optimization where we do not create a job for a
block if it has no redirections.
job_promote attempts to bring the most recently "touched" job to the front
of the job list. It did this via:
std::rotate(begin, job, end)
However this has the effect of pushing job-1 to the end. That is,
promoting '2' in [1, 2, 3] would result in [2, 3, 1].
Correct this by replacing it with:
std::rotate(begin, job, job+1);
now we get the desired [2, 1, 3].
Also add a test.
This PR is aimed at improving how job ids are assigned. In particular,
previous to this commit, a job id would be consumed by functions (and
thus aliases). Since it's usual to use functions as command wrappers
this results in awkward job id assignments.
For example if the user is like me and just made the jump from vim -> neovim
then the user might create the following alias:
```
alias vim=nvim
```
Previous to this commit if the user ran `vim` after setting up this
alias, backgrounded (^Z) and ran `jobs` then the output might be:
```
Job Group State Command
2 60267 stopped nvim $argv
```
If the user subsequently opened another vim (nvim) session, backgrounded
and ran jobs then they might see what follows:
```
Job Group State Command
4 70542 stopped nvim $argv
2 60267 stopped nvim $argv
```
These job ids feel unnatural, especially when transitioning away from
e.g. bash where job ids are sequentially incremented (and aliases/functions
don't consume a job id).
See #6053 for more details.
As @ridiculousfish pointed out in
https://github.com/fish-shell/fish-shell/issues/6053#issuecomment-559899400,
we want to elide a job's job id if it corresponds to a single function in the
foreground. This translates to the following prerequisites:
- A job must correspond to a single process (i.e. the job continuation
must be empty)
- A job must be in the foreground (i.e. `&` wasn't appended)
- The job's single process must resolve to a function invocation
If all of these conditions are true then we should mark a job as
"internal" and somehow remove it from consideration when any
infrastructure tries to interact with jobs / job ids.
I saw two paths to implement these requirements:
- At the time of job creation calculate whether or not a job is
"internal" and use a separate list of job ids to track their ids.
Additionally introduce a new flag denoting that a job is internal so
that e.g. `jobs` doesn't list internal jobs
- I started implementing this route but quickly realized I was
computing the same information that would be computed later on (e.g.
"is this job a single process" and "is this jobs statement a
function"). Specifically I was computing data that populate_job_process
would end up computing later anyway. Additionally this added some
weird complexities to the job system (after the change there were two
job id lists AND an additional flag that had to be taken into
consideration)
- Once a function is about to be executed we release the current jobs
job id if the prerequisites are satisfied (which at this point have
been fully computed).
- I opted for this solution since it seems cleaner. In this
implementation "releasing a job id" is done by both calling
`release_job_id` and by marking the internal job_id member variable to
-1. The former operation allows subsequent child jobs to reuse that
same job id (so e.g. the situation described in Motivation doesn't
occur), and the latter ensures that no other job / job id
infrastructure will interact with these jobs because valid jobs have
positive job ids. The second operation causes job_id to become
non-const which leads to the list of code changes outside of `exec.c`
(i.e. a codemod from `job_t::job_id` -> `job_t::job_id()` and moving the
old member variable to a non-const private `job_t::job_id_`)
Note: Its very possible I missed something and setting the job id to -1
will break some other infrastructure, please let me know if so!
I tried to run `make/ninja lint`, but a bunch of non-relevant issues
appeared (e.g. `fatal error: 'config.h' file not found`). I did
successfully clang-format (`git clang-format -f`) and run tests, though.
This PR closes#6053.
It looks like the last status already contains the signal that cancelled
execution.
Also make `fish -c something` always return the last exit status of
"something", instead of hardcoded 127 if exited or signalled.
Fixes#6444
Previously, the block stack was a true stack. However in most cases, you
want to traverse the stack from the topmost frame down. This is awkward
to do with range-based for loops.
Switch it to pushing new blocks to the front of the block list.
This simplifies some traversals.
This was previously required so that, if there was a redirection to a
file, we would fork a process to create the file even if there was no
output. For example `echo -n >/tmp/file.txt` would have to create
file.txt even though it would be empty.
However now we open the file before fork, so we no longer need special
logic around this.
Do this only when splitting on IFS characters which usually contains
whitespace characters --- read --delimiter is unchanged; it still
consumes no more than one delimiter per variable. This seems better,
because it allows arbitrary delimiters in the last field.
Fixes#6406
user_supplied was used to distinguish IO redirections which were
explicit, vs those that came about through "transmogrphication." But
transmogrification is no more. Remove the flag.
As of GCC 7.4 (at least under macOS 10.10), the previous workaround of
casting a must-use result to `(void)` to avoid warnings about unused
code no longer works.
This workaround is uglier but it quiets these warnings.
The C++ spec (as of C++17/n4713) does not specify the sign of `wchar_t`,
saying only (in section 6.7.1: Fundamental Types)
> Type wchar_t shall have the same size, signedness, and alignment
> requirements (6.6.5) as one of the other integral types, called its
> underlying type.
On most *nix platforms on AMD64 architecture, `wchar_t` is a signed type
and can be compared with `int32_t` without incident, but on at least
some platforms (tested: clang under FreeBSD 12.1 on AARCH64), `wchar_t`
appears to be unsigned leading to sign comparison warnings:
```
../src/widecharwidth/widechar_width.h:512:48: warning: comparison of
integers of different signs: 'const wchar_t' and 'int32_t' (aka 'int')
[-Wsign-compare]
return where != std::end(arr) && where->lo <= c;
```
This patch forces the use of wchar_t for the range start/end values in
`widechar_range` and the associated comparison values.
Previously, if the user control-C'd out of a process, we would set a
bogus exit status in the process, but it was difficult to observe this
because we would be cancelling anyways. But set it properly.
If a Control-C is received during expanding a command substitution, we
may execute the job anyways, because we do not check for cancellation
after the expansion. Ensure that does not happen.
This should fix sporadic test failures in the cancellation unit test.
parser_t::eval indicates whether there was a parse error. It can be
easily confused with the status of the execution. Use a real type to
make it more clear.
Looking up a variable by a string literal implicitly constructs a wcstring.
By avoiding that, we get a noticeable reduction of temporary allocations.
$ HOME=. heaptrack ./fish -c true
heaptrack stats: # baseline
allocations: 7635
leaked allocations: 3277
temporary allocations: 602
heaptrack stats: # new
allocations: 7565
leaked allocations: 3267
temporary allocations: 530
This adds a test for the obscure case where an fd is redirected to
itself. This is tricky because the dup2 will not clear the CLO_EXEC bit.
So do it manually; also posix_spawn can't be used in this case.
The IO cleanup left file redirections open in the child. For example,
/bin/cmd < file.txt would redirect stdin but also leave the file open.
Ensure these get closed properly.
Prior to this fix, a file redirection was turned into an io_file_t. This is
annoying because every place where we want to apply the redirection, we
might fail due to open() failing. Switch to opening the file at the point
we resolve the redirection spec. This will simplify a lot of code.
Prior to this change, a process after it has been constructed by
parse_execution, but before it is executed, was given a list of
io_data_t redirections. The problem is that redirections have a
sensitive ownership policy because they hold onto fds. This made it
rather hard to reason about fd lifetime.
Change these to redirection_spec_t. This is a textual description
of a redirection after expansion. It does not represent an open file and
so its lifetime is no longer important.
This enables files to be held only on the stack, and are no longer owned
by a process of indeterminate lifetime.
fish has to ensure that the pipes it creates do not conflict with any
explicit fds named in redirections. Switch this code to using
autoclose_fd_t to make the ownership logic more explicit, and also
introduce fd_set_t to reduce the dependence on io_chain_t.
Prior to this fix, a job would hold onto any IO redirections from its
parent. For example:
begin
echo a
end < file.txt
The "echo a" job would hold a reference to the I/O redirection.
The problem is that jobs then extend the life of pipes until the job is
cleaned up. This can prevent pipes from closing, leading to hangs.
Fix this by not storing the block IO; this ensures that jobs do not
prolong the life of pipes.
Fixes#6397
Currently a job needs to know three things about its "parents:"
1. Any IO redirections for the block or function containing this job
2. The pgid for the parent job
3. Whether the parent job has been fully constructed (to defer self-disown)
These are all tracked in somewhat separate awkward ways. Collapse them
into a single new type job_lineage_t.
In preparation for concurrent execution, invert the control of function and
block execution. Allow a process to return an std::function that performs the
the execution. This can be run on either the main or a background thread
(eventually).
This splits a string into variables according to the shell's
tokenization rules, considering quoting, escaping etc.
This runs an automatic `unescape` on the string so it's presented like
it would be passed to the command. E.g.
printf '%s\n' a\ b
returns the tokens
printf
%s\n
a b
It might be useful to add another mode "--tokenize-raw" that doesn't
do that, but this seems to be the more useful of the two.
Fixes#3823.
Background fillthreads are used when we want to populate a buffer from an
external command. The most common is command substitution.
Prior to this commit, fish would spin up a fillthread whenever required.
This ended up being quite expensive.
Switch to using the iothread pool instead. This enables reusing the same
thread(s), which prevents needing to spawn new threads. This shows a big
perf win on the alias benchmark (766 -> 378 ms).
This reintroduces commits 22230a1a0d
and 9d7d70c204, now with the bug fixed.
The problem was when there was one thread waiting in the pool. We enqueue
an item onto the pool and attempt to wake up the thread. But before the
thread runs, we enqueue another item - this second enqueue will see the
thread waiting and attempt to wake it up as well. If the two work items
were dependent (reader/writer) then we would have a deadlock.
The fix is to check if the number of waiting threads is at least as large
as the queue. If the number of enqueued items exceeds the number of waiting
threads, then spawn a new thread always.
This added the function offset *again*, but it's already included in
the line for the current file.
And yes, I have explicitly tested a function file with a function
defined at a later line.
Fixes#6350
Since #6287, bare variable assignments do not parse, which broke
the "Unsupported use of '='" error message.
This commit catches parse errors that occur on bare variable assignments.
When a statement node fails to parse, then we check if there is at least one
prefixing variable assignment. If so, we emit the old error message.
See also #6347
This adds initial support for statements with prefixed variable assignments.
Statments like this are supported:
a=1 b=$a echo $b # outputs 1
Just like in other shells, the left-hand side of each assignment must
be a valid variable identifier (no quoting/escaping). Array indexing
(PATH[1]=/bin ls $PATH) is *not* yet supported, but can be added fairly
easily.
The right hand side may be any valid string token, like a command
substitution, or a brace expansion.
Since `a=* foo` is equivalent to `begin set -lx a *; foo; end`,
the assignment, like `set`, uses nullglob behavior, e.g. below command
can safely be used to check if a directory is empty.
x=/nothing/{,.}* test (count $x) -eq 0
Generic file completion is done after the equal sign, so for example
pressing tab after something like `HOME=/` completes files in the
root directory
Subcommand completion works, so something like
`GIT_DIR=repo.git and command git ` correctly calls git completions
(but the git completion does not use the variable as of now).
The variable assignment is highlighted like an argument.
Closes#6048
uClibc-ng does not expose C++11 math
functions to the std namespace, breaking
compilation. This is fine as the argument
type is double.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Improve the iothread behavior by enabling an iothread to stick around for
a while waiting for work. This reduces the amount of iothread churn, which
is useful on platforms where threads are expensive.
Also do other modernization like clean up the locking discipline and use
FLOG.
fish will react to certain variable modifications, such as "TZ." Only do
this if the main stack is modified. This has no effect now because there
is always a single stack, but will become important when concurrent
execution is supported.
This adds string-x.rst for each subcommand x of string. The main page
(string.rst) is not changed, except that examples are shown directly after
each subcommand. The subcommand sections in string.rst are created by
textual inclusion of parts of the string-x.rst files.
Subcommand man pages can be viewed with either of:
```
man string collect
man string-collect
string collect <press F1 or Alt-h>
string collect -h
```
While `string -h ...` still prints the full help.
Closes#5968
Presently the completion engine ignores builtins that are part of the
fish syntax. This can be a problem when completing a string that was
based on the output of `commandline -p`. This changes completions to
treat these builtins like any other command.
This also disables generic (filename) completion inside comments and
after strings that do not tokenize.
Additionally, comments are stripped off the output of `commandline -p`.
Fixes#5415Fixes#2705
The history search logic had a not very useful "fast path" which was also
buggy because it neglected to dedup. Switch the "fast path" to just a
history search type which always matches.
Fixes#6278
PATH and CDPATH have special behavior around empty elements. Express this
directly in env_stack_t::set rather than via variable dispatch; this is
cleaner.
This adds support for `fish_trace`, a new variable intended to serve the
same purpose as `set -x` as in bash. Setting this variable to anything
non-empty causes execution to be traced. In the future we may give more
specific meaning to the value of the variable.
The user's prompt is not traced unless you run it explicitly. Events are
also not traced because it is noisy; however autoloading is.
Fixes#3427
When considering an autosuggestion from history, we attempt to validate the
command to ensure that we don't suggest invalid (e.g. path-dependent)
commands. Prior to this fix, we would validate the last command in the
command line (e.g. in `cd /bin && ./stuff` we would validate "./stuff".
This doesn't really make sense; we should be validating the first command
because it has the potential to change the PWD. Switch to validating the
first command.
Also remove some helper functions that became dead through this change.
Until now, something like
`math '7 = 2'`
would complain about a "missing" operator.
Now we print an error about logical operators not being supported and
point the user towards `test`.
Fixes#6096
Every builtin or function shipped with fish supports flag -h or --help to
print a slightly condensed version of its manpage.
Some of those help messages are longer than a typical screen;
this commit pipes the help to a pager to make it easier to read.
As in other places in fish we assume that either $PAGER or "less" is a
valid pager and use that.
In three places (error messages for bg, break and continue) the help is
printed to stderr instead of stdout. To make sure the error message is
visible in the pager, we pass it to builtin_print_help, every call of which
needs to be updated.
Fixes#6227
The `--entire` would enable output even though the `--quiet` should
have silenced it. These two don't make any sense together so print an
error, because the user could have just left off the `-q`.
sys/sysctl.h is deprecated on glibc, so it leads to warnings.
According to fa4ec55c96, it was included for KERN_PROCARGS2 for
process expansion, but process expansion is gone, so it's unused now.
(there is another use of it in common.cpp, but that's only on FreeBSD)
Also 1f06e5f0b9 only included
tokenizer.h (present since the initial commit) if KERN_PROCARGS2
wasn't available, so it can't have been important.
This builds and passes the tests on:
- Archlinux, with glibc 2.30
- Alpine, with musl
- FreeBSD
- NetBSD
Universal exported variables (created by `set -xU`) used to show up
both as universal and global variable in child instances of fish.
As a result, when changing an exported universal variable, the
new value would only be visible after a new login (or deleting the
variable from global scope in each fish instance).
Additionally, something like `set -xU EDITOR vim -g` would be imported
into the global scope as a single word resulting in failures to
execute $EDITOR in fish.
We cannot simply give precedence to universal variables, because
another process might have exported the same variable. Instead, we
only skip importing a variable when it is equivalent to an exported
universal variable with the same name. We compare their values after
joining with spaces, hence skipping those imports does not change the
environment fish passes to its children. Only the representation in
fish is changed from `"vim -g"` to `vim -g`.
Closes#5258.
This eliminates the issue #5348 for universal variables.
Consider a group of short options, like -xzPARAM, where x and z are options and z takes an argument.
This commit enables completion of the argument to the last option (z), both within the same
token (-xzP) or in the next one (-xz P).
complete -C'-xz' will complete only parameters to z.
complete -C'-xz ' will complete only parameters to z if z requires a parameter
otherwise, it will also complete non-option parameters
To do so this implements a heuristic to differentiate such strings from single long options. To
detect whether our token contains some short options, we only require the first character after the
dash (here x) to be an option. Previously, all characters had to be short options. The last option
in our example is z. Everything after the last option is assumed to be a parameter to the last
option.
Assume there is also a single long option -x-foo, then complete -C'-x' will suggest both -x-foo and
-xy. However, when the single option x requires an argument, this will not suggest -x-foo.
However, I assume this will almost never happen in practise since completions very rarely mix
short and single long options.
Fixes#332
In e167714899 we allowed recursive calls
to complete. However, some completions use infinite recursion in their
completions and rely on `complete` to silently stop as soon as it is
called recursively twice without parameter (thus completing the
current commandline). For example:
complete -c su -s -xa "(complete -C(commandline -ct))"
su -c <TAB>
Infinite recursion happens because (commandline -ct) is an empty list,
which would print an error message. This commmit explicitly detects
such recursive calls where `complete` has no parameter and silently
terminates. This enables above completion (like before raising the
recursion limit) while still allowing legitimate cases with limited
recursion.
Closes#6171
This stops reading argument names after another option appears. It does not break any previous uses and in fact fixes uses like
```fish
function foo --argument-names bar --description baz
```
* `function` command handles options after argument names (Fixes#6186)
* Removed unneccesary test
Prior to this fix, self-insert would always wait for a new character.
Track in char_event what sequence generated the readline event, and then
if the sequence is not empty, insert that sequence.
This will support implementing the space binding via pure readline
functions.
Previously, tab-completion would move the cursor to the end of the current token, even
if no completion is inserted. This commit defers moving the cursor until we insert a completion.
Fixes#4124
If interactive, `complete` commands are highlighted like they would
be if typed. Adds a little fun contrast and it's easier to read.
Moved a function out of fish_indent to highlight.h
we now print --long options for ones I arbitrarily decided
are less likely to be remembered.
Also fixed the `--wraps` items at the end not being escaped
Most of our completion scripts are written using the short options
anyhow, and this makes it less likely the output will span several
lines per command
This sequence can be generatd by control-spacebar. Allow it to be bound
properly.
To do this we must be sure that we never round-trip the key sequence
through a C string.
Fish completes parts of words split by the separators, so things like
`dd if=/dev/sd<TAB>` work.
This commit improves interactive completion if completion strings legitimately
contain '=' or ':'. Consider this example where completion will suggest
a🅰️1 and other files in the cwd in addition to a:1
touch a:1; complete -C'ls a:'
This behavior remains unchanged, but this commit allows to quote or escape
separators, so that e.g. `ls "a:<TAB>` and `ls a\:<TAB>` successfully complete
the filename.
This also makes the completion insert those escapes automatically unless
already quoted.
So `ls a<TAB>` will give `ls a\:1`.
Both changes match bash's behavior.
Instead of warning (debug level 1), we now emit an error (debug level 0) if a known bad version of
WSL is detected. However, `FISH_NO_WSL_CHECK` can now be defined to skip both the check and the
startup message.
fish is designed to append to the history file in most cases. However
save_internal_via_appending was never returning success, so we were
always doing the slow rewrite path. Correctly return success.
Fixes#6042
It appears Gcc 4.8 doesn't get this particular expression, so we just
revert to the old `type foo = bar` style from the new `type foo{bar}`.
Fixes#6027.
Meaning empty variables, command substitutions that don't print
anything.
A switch without an argument
```fish
switch
case ...
end
```
is still a syntax error, and more than one argument is still a runtime
error.
The none-argument matches either an empty-string `case ''` or a
catch-all `case '*'`.
Fixes#5677.
Fixes#4943.
Previously when propagating explicitly separated output, we would early-out
if the buffer was empty, where empty meant contains no characters. However
it may contain one or more empty strings, in which case we should propagate
those strings.
Remove this footgun "empty" function and handle this properly.
Fixes#5987
Prior to this fix, fish would attempt to react if a local fish_complete_path
or fish_function_path were set. However this has never been very well tested
and will become impossible with concurrent execution. Always use the global
values.
Soon we will have more complicated logic around whether to call tcsetpgrp.
Prepare to centralize the logic by passing in the new term owner pgrp,
instead of having child_setup_process perform the decision.
This exitted if the cursor was at the end of the line as well (i.e. if
delete-char failed). That's a bit too eager.
Also documentation, which should have already been included.
We used to have a global notion of "is the shell interactive" but soon we
will want to have multiple independent execution threads, only some of
which may be interactive. Start tracking this data per-parser.
history now often writes to the history file asynchronously, but the history
test expects to find the text in the file immediately after running the
command. Hack a bit in history to make this test more reliable.
Prior to this diff, fish had different signal handling functions for
different signals. However it was hard to coordinate when a signal needed
to be the default handler, and when it was custom. In #5962 we overwrote
fish's custom WINCH handler with the default_handler when fish script asked
for WINCH to be handled.
Just have a single big signal handler function. That way it can never be
set to the wrong thing.
Fixes#5969
When executing a job, if the first process is fish internal, then have
fish claim the job's pgroup.
The idea here is that the terminal must be owned by a pgroup containing
the process reading from the terminal. If the first process is fish
internal (a function or builtin) then the pgroup must contain the fish
process.
This is a bit of a workaround of the behavior where the first process that
executes in a job becomes the process group leader. If there's a deferred
process, then we will execute processes out of order so the pgroup can be
wrong. Fix this by setting the process group leader explicitly as fish
when necessary.
Fixes#5855
Especially as, in this case, the documentation is quite massive.
Caught by porting string's test to littlecheck.
See #3404 - this was already supposed to be included.
25afc9b377 made this unnecessary by
having child processes wait for a signal after fork(), but this change
was later reverted. If we artificially slow down fish (e.g. with a sleep)
after the fork call, we see commands getting backgrounded by mistake.
Put back the tcsetgrp() call.
* Prevent not-yet-loaded functions from loaded when erased
Today, `functions --erase $function` does nothing if the function
hasn't been autoloaded yet.
E.g. run, in an interactive session
> functions --erase ls
> type ls
and be amazed that it still shows our default `ls --color=auto`
wrapper function.
This seems counter-intuitive - removing a function ought to remove it,
whether it had been executed before or not.
* doc/changelog
Instead of requiring a flag to enable newline trimming, invert it so the
flag (now `--no-trim-newlines`) disables newline trimming. This way our
default behavior matches that of sh's `"$(cmd)"`.
Also change newline trimming to trim all newlines instead of just one,
again to match sh's behavior.
The `string collect` subcommand behaves quite similarly in practice to
`string split0 -m 0` in that it doesn't split its output, but it also
takes an optional `--trim-newline` flag to trim a single trailing
newline off of the output.
See issue #159.
It's always a bit annoying that `*` requires quoting.
So we allow "x" as an alternative, only it needs to be followed by
whitespace to distinguish it from "0x" hexadecimal notation.
To support distinct parsers having different working directories, we need
to keep the working directory alive, and also retain a non-path reference
to it.
Because an exported universal variable must be exported in all variable
stacks, explicit invalidation is infeasible. Switch the universal variables
to a generation count.
Prior to this fix, fish would invalidate the exported variable list
whenever an exported variable changes. However we soon will not have a
single "exported variable list." If a global variable changes, it is
infeasible to find all exported variable lists and invalidate them.
Switch to a new model where we store a list of generation counts. Every
time an exported variable changes, the node gets a new generation. If the
current generation list does not match the cached one, then we know that
our exported variable list is stale.
This was undocumented, not all that useful and potentially unwanted.
In particular it means that things like
mysql -p(read)
will still keep the password in history.
Also it allows us to simply implement asking for the history deletion
term.
See #5791.
This makes the following changes:
1. Events in background threads are executed in those threads, instead of
being silently dropped
2. Blocked events are now per-parser instead of global
3. Events are posted in builtin_set instead of within the environment stack
The last one means that we no longer support event handlers for implicit
sets like (example) argv. Instead only the `set` builtin (and also `cd`)
post variable-change events.
Events from universal variable changes are still not fully rationalized.
This cuts down on the wcs2string here by ~25%.
The better solution would be to cache narrow versions of $PATH, since
we compute that over and over and over and over again, while it rarely changes.
Or we could add a full path-cache (where which command is), but that's
much harder to invalidate.
See #5905.
This sets the explicit path to the default one, which should be okay,
since the default path never changes (not even if $XDG_CONFIG_HOME
does).
Then it saves a narrow version of that, which saves most of the time
needed to `sync` in most cases.
Fixes#5905.
Note that this isn't technically *w*util, but the differences between
the functions are basically just whether they do the wcs2string
themselves or not.
This fixes a race condition in the topic monitor. A thread may decide to
enter the wait queue, but before it does the generation list changes, and
so our thread will wait forever, resulting in a hang.
It also simplifies the implementation of the topic monitor considerably;
on reflection the whole "metagen" thing isn't providing any value and we
should just compare generations directly.
In the new design, we have a lock-protected list of current generations,
along with a boolean as to whether someone is reading from the pipe. The
reader (only one at a time) is responsible for broadcasting notifications
via a condition variable.
The enum starts at 0 (defined to be!), so we can eliminate this one.
That allows us to remove a reliance on the position of
beginning_of_line, and it would trigger a "type-limits" warning.
Also leave a comment because I actually hit that.
Someone has hit the 10MiB limit (and of course it's the number of
javascript packages), and we don't handle it fantastically currently.
And even though you can't pass a variable of that size in one go, it's
plausible that someone might do it in multiple passes.
See #5267.
This widens the remaining ones that don't take a char
anywhere.
The rest either use a char _variable_ or __FUNCTION__, which from my
reading is narrow and needs to be widened manually. I've been unable
to test it, though.
See #5900.
This solves the main part of (careful linebreak)
issue #5900.
I'm betting all the errors that do use narrow IO are broken, including
a bunch of asserts.
This adds a new mechanism for logging, intended to replace debug().
The entry points are FLOG and FLOGF. FLOG can be used to log a sequence of
arguments, FLOGF is for printf-style formatted strings.
Each call to FLOG and FLOGF requires a category. If logging for a category
is not enabled, there is no effect (and arguments are not evaluated).
Categories may be enabled on the command line via the -d option.
Now that our interactive signal handlers are a strict superset of
non-interactive ones, there is no reason to "reset" signals or take action
when becoming non-interactive. Clean up how signal handlers get installed.
The signal handlers for interactive and non-interactive SIGINT were distinct
and talked to the reader. This wasn't really justified and will complicate
having multiple threads. Unify these into a single signal handler.
is_interactive_read is a suspicious flag which prevents a call to
parser_t::skip_all_blocks from a ^C signal handler. However we end
up skipping the blocks later when we exit the read loop.
This flag seems unnecessary. Bravely remove it.
When setting a variable without a specified scope, we should give priority
to an existing local or global above an existing universal variable with
the same name.
In 16fd780484 there was a regression that
made universal variables have priority.
Fixes#5883
Otherwise we'd undo the history search when you press e.g. execute,
which means you'd execute the search term.
Only `cancel` should walk it back, like it previously did hardcoded to
escape.
Fixes#5891.
These were inadvertently disabled by a bug which was introduced in
cd7e8f4103 . Fix the bug so the tests run
again.
They don't all pass yet; they regressed during the period they were
disabled.
This mainly is conceptually a bit simpler. The comment about making it
cheaper is entirely misplaced since this is quite far away from being
important.
Even expanding 1000 abbrs, it doesn't show up in the profile.
This was, under some circumstances, apparently off by one.
If a suggestion was really long, like
```fish
infocmp | string split , | string trim | string match -re . | while read -d = -l key val; test -z "$val"; and continue; string match -q '*%*' -- $val; and continue; test (string replace -ra '\e([\[\]]|\(B).*[\comJKsu]' '' -- a(tput $key)b) = ab; or echo $key $val; end > xterm
```
(I'm assuming longer than $COLUMNS), it would staircase like with a wrong wcwidth.
This reverts commit 15a5c0ed5f.
I noticed my debug output for 24bit color mode was garbled due to
this being wrong. I spent a little time trying to get the compiler
to tell us about these, but -Wformat doesn't do anything for wchar
printf functions, and __attribute__((format(printf, n, m))) will
cause an error with wchar_t's, so I gave up and decided to manually
check out every '%s' in the entire project. I found (only) one
more.
debug(0, "%s", wchars) will report warnings for incorrect
specifiers but debug(0, L"%s", wchars) is unable. Thus there may
be reason to prefer not using L"..." as an argument if all else
is equal and it's not necessary.
Allows `fish_indent -w **.fish` to restyle all fish files under the
current directory.
(This also has the sideeffect of reducing style.fish time by ~10s, as
we only need to invoke `fish_indent` once, instead of once per-file)
Blocks will soon need to be shared across parsers. Migrate the loop status
(like break or continue) from the block into the libdata. It turns out we
only ever need one, we don't need to track this per-block.
Make it an enum class.
Brace expansion with single words in it is quite useless - `HEAD@{0}`
expanding to `HEAD@0` breaks git.
So we complicate the rule slightly - if there is no variable expansion
or "," inside of braces, they are just treated as literal braces.
Note that this is technically backwards-incompatible, because
echo foo{0}
will now print `foo{0}` instead of `foo0`. However that's a
technicality because the braces were literally useless in that case.
Our tests needed to be adjusted, but that's because they are meant to
exercise this in weird ways.
I don't believe this will break any code in practice.
Fixes#5869.
Prior to this fix, a function_block stored a process_t, which was only used
when printing backtraces. Switch this to an array of arguments, and make
various other cleanups around null terminated argument arrays.
We previously checked if fish_mode_prompt existed as a function, but
that's a bad change for those who already set it to an empty function
to have a mode display elsewhere.
Updated widechar_width takes care of it.
Technically, this does ~3 comparisons more per-character (because it
checks variation selectors and such), but that shouldn't really matter.
get_current_winsize() is intended to be lazy. It does the following:
1. Gets the termsize from the kernel
2. Compares it against the current value
3. If changed, sets COLUMNS and LINES variables
Upon setting these variables, we notice that the termsize has changed
and invalidate the termsize. Thus we were doing this work multiple times
on every screen repaint.
Put back an old hack that just marked the termsize as valid at the end
of get_current_winsize().
This just sets some special characters that we use in the reader, so
it only needs to be done before the reader is set up.
Which, as it stands, is in env_init().
This stops trying to see if the previous line is wider if it is a
prefix of the current one.
Which turns out to be true often enough that it's a net benefit.
This passes character width as an argument for a few functions.
In particular, it hardcodes a width of "1" for a space literal.
There's no reason to compute wcwidth for the length of the prompt.
This measured *all* the characters on the commandline, and saved all
of them in another wcstring_list_t, just to then do... nothing with
that info.
Also, it did wcslen for something that we already have as wcstring,
reserved a vector and did a bunch of work for autosuggestions that
isn't necessary if we have more than one line.
Instead, we do what we need, which is to figure out if we are
multiline and how wide the first line is.
Fixes#5866.
line_shared_prefix explains in its comment that
> If the prefix ends on a combining character, do not include the
previous character in the prefix.
But that's not what it does.
Instead, what it appears to do is to return idx for *every* combining
mark. This seems wrong to begin with, and it also requires checking
wcwidth for *every* character.
So instead we don't do that. If we find the mismatch, we check if it's
a combining mark, and then go back to the previous character (i.e. the
one before the one that the combining mark is for).
My tests found no issues with this, other than a 20% reduction in
pasting time.
The old commit #3f820f0 "Disable ONLCR mapping of NL output to CR-NL"
incorrectly used c_iflag instead of c_oflag, and I copied that error
in my patch. Fixed that. However, there seems to be other problems
trying to use "\x1B[A", which I have not tried to debug, so comment that out.
(However, #3f820f0 seems to mostly work if we fix it to use c_oflag.)
This read something like `o=!_validate_int`, and the flag modifier
reading kept the pointer after the `!`, so it created a long flag
called `_validate_int`, which meant it would not only error out form
```fish
argparse 'i=!_validate_int' 'o=!_validate_int' -- $argv
```
with "Long flag '_validate_int' already defined", but also set
$_flag_validate_int.
Fixes#5864.
As mentioned in #2900, something like
```fish
test -n "$var"; and set -l foo $var
```
is sufficiently idiomatic that it should be allowable.
Also fixes some additional weirdness with semicolons.
This runs build_tools/style.fish, which runs clang-format on C++, fish_indent on fish and (new) black on python.
If anything is wrong with the formatting, we should fix the tools, but automated formatting is worth it.
This removes semicolons at the end of the line and collapses
consecutive ones, while replacing meaningful semicolons with newlines.
I.e.
```fish
echo;
```
becomes
```fish
echo
```
but
```fish
echo; echo
```
becomes
```fish
echo
echo
```
Fixes#5859.
This was a sort of side channel that was only used to propagate redraws
after universal variable changes. We can eliminate it and handle these
more directly.
tsan does funny things to signals, preventing signals from being delivered
in a blocking read. Switch the topic monitor to non-blocking reads under
tsan.
This keeps all unknown options in $argv, so
```fish
argparse -i a/alpha -- -a banana -o val -w
```
results in $_flag_a set to banana, and $argv set to `-o val -w`.
This allows users to use multiple argparse passes, or to simply avoid
specifying all options e.g. in completions - `systemctl` has 46 of
them, most not having any effect on the completions.
Fixes#5367.
This cleans up how functions are stored and autoloaded. It eliminates the
recursive lock. Instead there is a single normal owning_lock that protects
the entirety of the function data. Autoloading is re-implemented via the
new autoloader_t.
autoloader_t will be the reimplementation of autoloading. Crucically it no
longer manages any locking or loading itself; instead all locking and loading
is performed by clients. This makes it easier to test and helps limit its
responsibilities.
autoloading has a "feature" where functions are removed in an LRU-fashion.
But there's hardly any benefit in removing autoloaded functions. Just stop
doing it.
This is a long-standing issue with how `complete --do-complete` does
its argument parsing: It takes an optional argument, so it has to be
attached to the token like `complete --do-complete=foo` or (worse)
`complete -Cfoo`.
But since `complete` doesn't take any bare arguments otherwise (it
would error with "too many arguments" if you did `complete -C foo`) we
can just take one free argument as the argument to `--do-complete`.
It's more of a command than an option anyway, since it entirely
changes what the `complete` call _does_.
* Some comment fixes and renaming of is_iterm2_escape_seq.
The comment for is_iterm2_escape_seq incorrectly says "CSI followed by ]".
This is wrong, because CSI is ESC followed by [ (or the seldom-used 0x9b).
The procedure actually matches Operating System Command (OSC) escape codes.
Since there is nothing iterm2-specific about OSC, is_osc_escape_seq
would be a better name.
Also s_desired_append_char documents a non-existent parameter.
* Update broken iterm2 url in comment.
This was added in 04a96f6 but not strictly required to fix#5803
(verified), with the intention of hiding invisible background jobs
(created by invoking a function within a pipeline) from the user, but
that also broke intentionally created jobs from displaying as well.
I'm thinking it can't be done without keeping track of caller context vs
job context.
Closes#5824.
env_scoped_t lives between environment_t and env_stack_t.
It represents the read-only logic of env_stack_t and will be used to back
the new environment snapshot implementation.
These tests used raw, unescaped parentheses to perform `test` logical
grouping, but the test failures weren't caught because the parser
evaluation errors were not being propagated (fixed in bdbd173e).
It was unconditionally returning `parse_execution_success`. This was
causing certain parser errors to incorrectly return after evaluation
with `$status` equal to `0`, as reported after `eval`, `source`, or
sub-`fish` execution.
Prior to this change, fish used a global flag to decide if we should check
for changes to universal variables. This flag was then checked at arbitrary
locations, potentially triggering variable updates and event handlers for
those updates; this was very hard to reason about.
Switch to triggering a universal variable update at a fixed location,
after running an external command. The common case is that the variable
file has not changed, which we can identify with just a stat() call, so
this is pretty cheap.
This reverts commit cdce8511a1.
This change was unsafe. The prior version (now restored) took the lock and
then copied the data. By returning a reference, the caller holds a
reference to data outside of the lock.
This function isn't worth optimizing. Hardly any functions use this
facility, and for those that do, they typically just capture one or two
variables.
* Convert `function_get_inherit_vars()` to return a reference to the
(possibly) existing map, rather than a copy;
* Preallocate and reuse a static (read-only) map for the (very) common
case of no inherited vars;
* Pass references to the inherit vars map around thereafter, never
triggering the map copy (or even move) constructor.
NB: If it turns out the reference is unsafe, we can switch the inherit vars
to be a shared_ptr and return that instead.
I did not realize builtins could safely call into the parser and inject
jobs during execution. This is much cleaner than hacking around the
required shape of a plain_statement.
- fix the carat position expanding e.g. `command $,`
- improve the error reporting for not-allowed command subtitutions
by figuring out where the expansion failed instead of using
SOURCE_LOCATION_UNKNOWN
- allow nullptr for parse_util_licate_brackets_range() out_string
argument if we don't need it to do any work.
Fixes#5812
`eval` has always been implemented as a function, which was always a bit
of a hack that caused some issues such as triggering the creation of a
new scope. This turns `eval` into a decorator.
The scoping issues with eval prevented it from being usable to actually
implement other shell components in fish script, such as the problems
described in #4442, which should now no longer be the case.
Closes#4443.
While `eval` is still a function, this paves the way for changing that
in the future, and lets the proc/exec functions detect when an eval is
used to allow/disallow certain behaviors and optimizations.
This adds an option --print-rusage-self to the fish executable. When set,
this option prints some getrusage stats to the console in a human-readable
way. This will be used by upcoming benchmarking support.
Followup to 394623b.
Doing it in the parser meant only top-level jobs would be reaped after
being `disown`ed, as subjobs aren't directly handled by the parser.
This is also much cleaner, as now job removal is centralized in
`process_clean_after_marking()`.
Closes#5803.
This prevents the `disown` builtin from directly removing jobs out of
the jobs list to prevent sanity issues, as `disown` may be called within
the context of a subjob (e.g. in a function or block) in which case the
parent job might not yet be done with the reference to the child job.
Instead, a flag is set and the parser removes the job from the list only
after the entire execution chain has completed.
Closes#5720.
When popping a scope from the environment stack, we currently do a lot of
nonsense like looking for changed curses variables. We want to centralize
this in env_stack_t so that it can be migrated to the env_dispatch logic.
Move this logic up one level in preparation for doing that.
This new file is supposed to encapsulate all of the logic around
reacting to variable changes, as opposed to the environment core.
This is to help break up the env.cpp monolith.
Prior to this fix, a job would only inherit a pgrp from its parent if the
first command were external. There seems to be no reason for this
restriction and this causes tcsetgrp() churn, potentially cuasing SIGTTIN.
Switch to unconditionally inheriting a pgrp from parents.
This should fix most of #5765, the only remaining question is
tcsetpgrp from builtins.
Prior to this fix, in every call to job_continue, fish would reclaim the
foreground pgrp. This would cause other jobs in the pipeline (which may
have another pgrp) to receive SIGTTIN / SIGTTOU.
Only reclaim the foreground pgrp if it was held at the point of job_continue.
This partially addresses #5765
In tests we would like to arrange for an executable to invoke certain
system calls, e.g. to claim or relinquish control of the terminal. This is
annoying to do portably via e.g. perl. fish_test_helper is a little
program where we can add custom commands to make it act in certain ways.
This set the term modes to the shell-modes, including disabling
ICRNL (translating \cm to \cj) and echo.
The rationale given was that `reader_interactive_init()` would only be
called >= 250ms later, which I _highly_ doubt considering fish's total
startup time is 8ms for me.
The main idea was that this would stop programs like tmuxinator that
send shortcuts early from failing _iff_ the shortcut was \cj, which
also seems quite unusual.
This works both with `rm -i` and `read` in config.fish, because `read`
explicitly calls `reader_push`, which then initializes the shell modes.
The real fix would involve reordering our init so we set up the
modesetting first, but that's quite involved and the remaining issue
should barely happen, while it's fairly common to have issues with a
prompt in config.fish, and the workaround for the former is simpler, so let's leave it for now.
Partially reverts #2578.
Fixes#2980.
Putting larger members before smaller ones will reduce structure
sizes. bools are 1 byte. on 64bit systems I think they reduced:
wgetopt.h:46: 64 to 56 bytes
builtin_history.cpp:30: 48 to 32 bytes
builtin_status.cpp:91: 32 to 24 bytes
tinyexpr.cpp:69: 40 to 32 bytes
The data stored in these containers is small enough that it is worth
creating distinct sets for each lookup.
In a microbenchmark of these changes, the single-lookup version of the
function with lookups gated on the length of input (bypassed entirely if
the input is longer than the longest key in the container) provided a
1.5x-3.5x speedup over the previous implementation.
Additionally, as the collections are static and their contents are never
modified after startup, it makes no sense to continously calculate the
location of and allocate an iterator for the `!= foo.end()` comparison;
the end iterator is now statically cached.
I'm not expecting massive speed gains out of this change, but the parser
does perform enough of these to make it worth optimizing in this way.
This reverts commit 7a74198aa3.
Believe it or not this commit actually increased copying. When accepting
a value you know you're going to take ownership of, just accept it by
value; then temporaries can invoke the move ctor and blah blah blah.
We really need a lightweight refcounted pass-by-value string to make this
less error prone.
If we switch the bind mode, we add a "force-repaint" there just to
redraw the mode indicator.
That's quite wasteful and annoying, considering that sometimes the prompt can take
half a second.
So we add a "repaint-mode" function that just reexecutes the
mode-prompt and uses the cached values for the others.
Fixes#5783.
As it turns out it didn't work much better, and it fell behind in
support when it comes to things that wcwidth traditionally can't
express like variation selectors and hangul combining characters, but
also simply $fish_*_width.
I've had to tell a few people now to rebuild with widecharwidth after
sending them on a fool's errand to set X variable.
So keeping this option is doing our users a disservice.
* Add "expand-abbr" bind function
This can be used to explictly allow expanding abbreviations.
* Make expanding abbr explicit
NOTE: This accepts them for space only, we currently also do it for \n
and \r.
* Remove now dead code
We no longer trigger an abbr implicitly, so we can remove the code
that does it.
* Fix comment
[ci skip]
Directly access the job list without the intermediate job_iterator_t,
and remove functions that are ripe for abuse by modifying a local
enumeration of the same list instead of operating on the iterators
directly (e.g. proc.cpp iterates jobs, and mid-iteration calls
parser::job_remove(j) with the job (and not the iterator to the job),
causing an invisible invalidation of the pre-existing local iterators.
This printed weird things like
```fish
$ functions -x
functions: Unknown option '-x'
(Type 'help functions' for related documentation)
```
Instead, let's make it
```fish
$ functions -x
functions: Unknown option '-x'
(Type 'help functions' for related documentation)
```
This was printed basically everywhere.
The user knows what they executed on standard input.
A good example:
```fish
set c (subme 513)
```
used to print
```
fish: Too much data emitted by command substitution so it was discarded
set -l x (string repeat -n $argv x)
^
in function 'subme'
called on standard input
with parameter list '513'
in command substitution
called on standard input
```
and now it is
```
fish: Too much data emitted by command substitution so it was discarded
set -l x (string repeat -n $argv x)
^
in function 'subme' with arguments '513'
in command substitution
```
See #5434.
Now:
```
cd: Unknown option '-r'
~/dev/fish-shell/share/functions/cd.fish (line 40):
builtin cd $argv
^
in function 'cd' with arguments '-r'
in function 'f'
in function 'd'
in function 'b' with arguments '-1q --wurst'
in function 'a'
called on standard input
```
See #5434.
This printed things like
```
in function 'f'
called on standard input
in function 'd'
called on standard input
in function 'b'
called on standard input
in function 'a'
called on standard input
```
As a first step, it removes the empty lines so it's now
```
in function 'f'
called on standard input
in function 'd'
called on standard input
in function 'b'
called on standard input
in function 'a'
called on standard input
```
See #5434.
This switches env_var_t to be an immutable value type, and stores its
contents via a shared_ptr. This eliminates string copying when fetching
env_var_t values.
If a function process is deferred, allow it to be unbuffered.
This permits certain simple cases where functions are piped to external
commands to execute without buffering.
This is a somewhat-hacky stopgap measure that can't really be extended
to more general concurrent processes. However it is overall an improvement
in user experience that might help flush out some bugs too.
In a job, a deferred process is the last fish internal process which pipes
to an external command. Execute the deferred process last; this will allow
for streaming its output.
I believe this was selected to be artificially low for the sake
of it displaying well in prompts. But people should expect to get
the same output as can be gotten from `hostname`.
Fixes#5758
The code already allowed for variable width (multicell) *display* of the
newline omitted character, but there was no way to define it as being
more than one `wchar_t`.
This lets us use a string on console sessions (^J aka newline feed)
instead of an ambiguous character like `@` (used in some versions of
vim for ^M) or `~` (what we were using).
The system version of `wcwidth()` reflects the capabilities of the
system's own virtual terminal's view of the width of the character in
question, while fish's enhanced version (`widechar_wcwidth`) is much too
smart for most login terminals, which generally barely support anything
beyond ASCII text.
If, at startup, it is detected that we are running under a physical
console rather than within a terminal emulator running in a desktop
environment, take that as a hint to use the system-provided `wcwidth`.
The commit began passing the length of the wide string rather than the
length of the narrowed string after conversion via `wcstombs`. We *do*
have the actual length, but it's not (necessarily) the same as the
original value. We need to pass the result of `wcstombs` instead.
POSIX dictates here that incomplete conversions, like in
printf %d\n 15.2
or
printf %d 14g
are still printed along with any error.
This seems alright, as it allows users to silence stderr to accept incomplete conversions.
This commit implements it, but what's a bit weird is the ordering between stdout and stderr,
causing the error to be printed _after_, like
15
14
15.1: value not completely converted
14,2: value not completely converted
but that seems like a general issue with how we buffer the streams.
(I know that nonfatal_error is a copy of most of fatal_error - I tried
differently, and va_* is weird)
Fixes#5532.
Before this change, - was sorted with other punctuation before
A-Z. Now, it sorts above the rest of the characters.
This has a practical effect on completions, where when there are
both -s and --long with the same description, the short option
is now before the long option in the pager, which is what is now
selected when navigating `foo -<TAB>`. The long options can be
picked out with `foo --<TAB>`. Before, short options which
duplicated a long option literally could not be selected by
any means from the pager.
Fixes#5634
This tweaks wcsfilecmp such that certain punctuation characters will
come after A-Z.
A big win with `set <TAB>` - the __prefixed fish junk now comes
after the stuff users should care about.
This disables an extra round of escaping in the `string replace -r`
replacement string.
Currently, to add a backslash to an a or b (to "escape" it):
string replace -ra '([ab])' '\\\\\\\$1' a
7 backslashes!
This removes one of the layers, so now 3 or 4 works (each one escaped
for the single-quotes, so pcre receives two, which it reads as one literal):
string replace -ra '([ab])' '\\\\$1' a
This is backwards-incompatible as replacement strings will change
meaning, so we put it behind a feature flag.
The name is kinda crappy, though.
Fixes#5474.
As a simple replacement for `wc -l`.
This counts both lines on stdin _and_ arguments.
So if "file" has three lines, then `count a b c < file` will print 6.
And since it counts newlines, like wc, `echo -n foo | count` prints 0.
Mostly related to usage _(L"foo"), keeping in mind the _
macro does a wcstring().c_str() already.
And a smattering of other trivial micro-optimizations certain
to not help tangibly.
C++11 provides std::min/std::max which we're using all over,
obviating the need for our own templates for this.
util.h now only provides two things: get_time and wcsfilecmp.
This commit removes everything that includes it which doesn't
use either; most because they no longer need mini or maxi from
it but some others were #including it unnecessarily.
Hangul uses three codepoints to combine to one glyph. The first has a
width of 2 (like the final glyph), but the second and third were
assigned a width of 1, which seems to match EastAsianWidth.txt:
> 1160..11FF;N # Lo [160] HANGUL JUNGSEONG FILLER..HANGUL JONGSEONG SSANGNIEUN
Instead, we override that and treat the middle and end codepoint as combiners,
always, because there's no way to figure out what the terminal will
think and that's the way it's supposed to work.
If they stand by themselves or in another combination, they'll indeed
show up with a width of 1 so we'll get it wrong, but that's less
likely and not expressible with wcwidth().
Fixes#5729.