This doesn't pull its weight. Block size is not a particularly big
problem,
and this both complicates the code a bit and would arbitrarily cause issues
if a fish script exceeded 65k lines.
This reverts commit edd6533a14.
This doesn't have any effect on the size of the struct (due to alignment
requirements and padding) but reduces the complexity by turning
Block::wants_pop_env into an emergent property dependent on the type rather than
something we have to manually manage.
We only increment it and check if it's non-zero, we never decrement or check the
actual count. As such, change it to a bool and bring the size of `Block` down
from 32 to 24 bytes.
We almost never access any of this and having it stored directly in the `Block`
struct increases its size (reducing how many we can fit in L1 and L2, and
increasing memory copy traffic).
Gets rid of BlockData::None so we can avoid allocating a Box at all when we have
no data (at the cost of yet-another-wrapper-type), which is the usual case.
This has a few advantages,
* We now statically assert that all fields used by a particular block type are
correctly initialized (i.e. you can't assign the function name but forget to
assign its arguments),
* Conversely, we can match directly on `BlockData` and be guaranteed that the
fields we want to access are initialized and present,
* We reduce the number of assertions, effectively "unwrapping" only once based
off the block type instead of each time we try to access a conditional field,
* We reduce the size of the `Block` struct by coalescing fields that cannot
co-exist, bringing it down from 104 bytes to 88 bytes.
It would be nice to make all of `Block` itself an enum, but it currently
requires `Copy` and we take advantage of that to copy it around everywhere.
Putting these fields directly in `Block` directly would mean a lot more memory
traffic just checking block types.
There's no need for two separate block types when one is merely a variant of the
other. This may have been required under C++ but thanks to sum types (rust's
enums) we don't need to do that any more.
If the backgrounded/stopped job was using the tty, sending it SIGCONT first
might cause it to immediately wake and try to use the tty (which fish still has
control over), causing it to immediately stop again after receiving a SIGTTOU.
We are supposed to send SIGHUP first so that when the process resumes it sees
the queued SIGHUP and executes its registered handler!
There's no guarantee that a condition variable is stateful. The docs for
`Condvar::notify_one()` actually say the opposite:
> If there is a blocked thread on this condition variable, then it will be woken
> up from its call to wait or wait_timeout. Calls to notify_one are not buffered
> in any way.
This test was relying on the main loop obtaining the lock and entering the
condition variable sleep before the thread was scheduled and got around to
notifying the condition variable. If this non-deterministic behavior was not
upheld, the test would time out since it would obtain the lock (either before or
after the variable were updated) then call `condvar.wait()` *after* the variable
had been updated and the condvar signalled, but without (atomically or even at
all) checking to see if the desired wake precondition was fulfilled. As the
child thread had already run and the wake notification was NOT buffered, there
was nothing to wake the running thread.
There really wasn't any way to salvage the test as originally written, since the
write to `ctx.val` was not in any way linked to the acquire/release of the mutex
so regardless of whether or not the main thread obtained the mutex and checked
the value precondition before calling `condvar.wait()`, the child thread's write
could have happened after the check but before the wait() call. As such, the
test has been rewritten to use `wait_while()` but then also updated to bail in
case of a timeout instead of hanging indefinitely (since neither the `ctest`
runner nor the `cargo test` harness was timing out; `cargo test` would only
report that the test had exceeded 60 seconds but as long as it was not executed
with `cargo test -- -Z --ensure-time` (which is only available under nightly),
the test would not halt.
If this test were *intentionally* written to test the scenario that was timing
out, it should be written deterministically in such a way that the main loop
did not run until after it was guaranteed that the variable had been updated
(i.e. by looping until val became 5 or waiting for an AtomicBool indicating the
update had completed to be set), but I'm not sure what the benefit in that would
be since the docs actually guarantee the opposite behavior (the notified state
is explicitly not cached/buffered).
If we have fish code written with the assumption that condvar notifications
prior to *any* call to `Condvar::wait()` *are* buffered, then that code should
of course be revisited in light of this.
Commit 8a7c3ce (Don't abandon line after writing control sequences, 2024-04-06)
was broken by 29f2da8 (Toggle terminal protocols lazily, 2024-05-16), fix that.
Fixes#10529
This makes `path basename` a more useful replacement for the stock `basename`
command, which can be used with `-s .ext` to trim `.ext` from the base name.
Previously, this would have required the equivalent of
path change-extension "" (path basename $path)
but now it can be just
path basename -E $path
Mostly replacing std::<type>::MAX with <type>::MAX.
Surprising here is replacing
.expect(format!(...))
with
.unwrap_or_else(|_| panic!(...))
It explains that this is because the "format!" would always be called.
This enabled the profile in fish_setlocale, which caused startup
profile to always be on, so
```fish
fish --profile file -c 'foo'
```
would show the entire startup as well
Hex float parsing may come about through wcstod, for example:
printf "%f" '0x8p2'
should output 32.0.
Currently we use a not-great fork of hexponent. Hexponent has been dormant for
years, and has some issues: doesn't round properly, allocates unnecessarily,
doesn't handle denormals, is more complicated than necessary.
Just rewrite hex float parsing, fixing those problems and getting us off of this
weird fork.
ThreadId is way slower than it should be for the sense that we use it in; it
doesn't cache the id and allocates an Arc internally.
We don't care about the thread id used in crate::threads correlating with any
other thread id the code uses anywhere (not that it does) because it's only used
for our own bookkeeping. Change to something much simpler instead.
Verified that std::sync::OnceLock<T> compiles to the same assembly at the
*access* site as the Option<T> we were using. The additional overhead upon init
is fine. No need for extra Box<T> indirection for IO_THREAD_POOL.
While obtaining an uncontested mutex from the same thread (without reentrance)
is basically ~free, the use of `MainThread<RefCell<T>>` instead of `Mutex<T>`
makes it clear that there is no actual synchronization taking place, hopefully
making the code easier to understand.
The C++ version of this code simply copied the entire uvar table.
Today we take a reference. It's not clear which one is better.
Removal of locale variables like LC_ALL triggers variable change handlers
which call EnvStackImpl::get. This deadlocks because we still hold the lock
to protect the reference to all uvars. Work around this.
Closes#10513
It is short and simple enough to write yourself if you need it and it encourages
bad behavior by a) always returning owned strings, b) always allocating them in
a vector. If/where possible, it is better to a) use &wstr, b) use an iterator.
In rust, it's an anti-pattern to unnecessarily abstract over allocating
operations. Some of the call sites even called split_string(..).into_iter().
This updates is_windows_subsystem_for_linux() to take a WSL version to test for
(any, v1, or v2) and returns the boolean result depending on the system. I've
benchmarked and when running on regular Linux, this is still just as fast as the
previous binary check; it's only when it's WSL that this takes about 20ns
longer to figure out which variant.
Note that older WSLv2 kernels had a `-microsoft-standard` suffix while newer
ones appear to have a `-microsoft-standard-WSL2` suffix, so we make sure to test
for the least common denominator. (It doesn't matter to us, but note that newer
WSLv2 kernels have four dots in the version string!)
WSL workarounds pertaining to the default Windows terminal or executable
behavior of win32 binaries under a WSL shell are extended to WSLv2 while those
specific to oddities in kernel behavior are confined to WSLv1 only. (It
technically wouldn't hurt to extend them to WSLv2 but there's no good reason to
do so, either.)
A common complaint has been the massive amount of directories Windows appends to
$PATH slowing down fish when it attempts to find a non-existent binary (which it
does a lot more often than someone not in the know might think). The typical
workaround suggested is to trim unneeded entries from $PATH, but this a) has
considerable friction, b) breaks resolution of Windows binaries (you can no
longer use `clip.exe`, `cmd.exe`, etc).
This patch introduces a two-PATH workaround. If the cmd we are executing does
not contain a period (i.e. has no extension) it by definition cannot be a
Windows executable. In this case, we skip searching for it in any of the
auto-mounted, auto-PATH-appended directories like `/mnt/c/Windows/` or
`/mnt/c/Program Files`, but we *do* include those directories if what we're
searching for could be a Windows executable. (For now, instead of hard-coding a
list of known Windows executable extensions like .bat, .cmd, .exe, etc, we just
depend on the presence of an extension at all).
e.g. this is what starting up fish prints with logging enabled (that has been
removed):
bypassing 100 dirs for lookup of kill
bypassing 100 dirs for lookup of zoxide
bypassing 100 dirs for lookup of zoxide
bypassing 100 dirs for lookup of fd
not bypassing dirs for lookup of open.exe
not bypassing dirs for lookup of git.exe
This has resulted in a massive speedup of common fish functions, especially
anywhere we internally use or perform the equivalent of `if command -q foo`.
Note that the `is_windows_subsystem_for_linux()` check will need to be patched to
extend this workaround to WSLv2, but I'll do that separately.
Under WSL:
* Benchmark `external_cmds` improves by 10%
* Benchmark `load_completions` improves by an incredible 77%
On this binding we fail to disable CSI u
bind c-t '
begin
set -lx FZF_DEFAULT_OPTS --height 40% --bind=ctrl-z:ignore
eval fzf | while read -l r; echo read $r; end
end
'
because for "fzf", ParseExecutionContext::setup_group() returns early with the
parent process group (which should be fish's own) , hence "wants_terminal"
is false. This seems questionable, I don't think the eval should make a
difference here.
For now, don't touch it; use the more accurate way of detecting whether
a process may read keyboard input. In many of such cases "wants_terminal"
is false, like
echo (echo 1\n2\n3 | fzf)
Fixes#10504
This hot function dominates the flamegraphs for the completions thread, and any
optimizations are worthwhile.
A variety of different approaches were tested and benchmarked against real-world
fish-history file inputs and this is the one that won out across all rustc
target-cpu variations tried.
Benchmarks and code at https://github.com/mqudsi/fish-yaml-unescape-benchmark
We don't forward this variable for storage in any structs, so there's no reason
to go through an Arc instead of returning the `&'static EnvStack` directly.
NB: This particular change was safe, and passes all tests on its own.
We don't forward this variable for storage in any structs, so there's no reason
to go through an Arc instead of returning the `&'static EnvStack` directly.
These have clearer sync/unsync semantics and now ship with rust itself.
They don't paper over any possible cross-thread issues, and we can specifically
choose which we want for the purpose.
`Parser` is a single-threaded `!Send`, `!Sync` type and does not need to use
`Arc` for anything. We were using it because that's all we had for the parser's
`EnvStack`, but though that is *technically* protected internally by a mutex
(shared with global EnvStack), there's nothing to say that other parsers with a
narrower scope/lifetime on other threads will be necessarily using the same
backing mutex.
We can safely marshal the existing `Arc<EnvStack>` we get from
`environment::principal()` into an `Rc<EnvStack>` since the underlying reference
is always valid. To prove this point, we could have PRINCIPAL_STACK be a static
`EnvStack` and have `environment::principal()` use `Arc::from_raw()` to turn
that into an `Arc<EnvStack>`, but there's no need to factorize this process.
By inverting the order of storage, we can use an `OnceCell`/`unsync::Lazy`
inside the Send/Sync `MainThread<T>` and remove the need for a lock altogether.
It's reasonable since this is only checking to see that the history file
contains the expected format and if it's corrupted but we at least got what we
expect to be the correct key/value pairs, then that's all we can do.
Of course the real motivation is to speed up this very hot function in any way
possible!
On the completions and history thread, the parent function
HistoryFileContents::decode_item() is responsible for ~60% of the CPU time, and
extract_prefix_and_unescape_yaml() alone comprising 14% (of the total).
This change removes allocations in the event that the history item is either
fully or partially plain yaml with no escapes to begin with, and brings down the
execution time of this function to only 7% of the total execution time.
The bulk of the remaining time is spent in wcs2string(), which is called
unconditionally and is naturally alloc-heavy.
This allows running `set` without triggering any event handlers.
That is useful, for example, if you want to set a variable in an event
handler for that variable - we could do it, for example, in the
fish_user_path or fish_key_bindings handlers.
This is something the `block` builtin was supposed to be for, but it
never really worked because it only allows suppressing the event for
the duration, they would fire later. See #9030.
Because it is possible to abuse this, we only have a long-option so
that people see what is up.
Commit a583fe723 ("commandline -f foo" to skip queue and execute immediately,
2024-04-08) fixed the execution order of some bindings but was partially
backed out in 5ba21cd29 (Send repaint requests through the input queue again,
2024-04-19) because repainting outside toplevel yields surprising results
(wrong $status etc).
Transient prompts wants to first repaint and then execute some more readline
commands, all within a single binding. This was broken by the second commit
because that one defers the repaint until after the binding has finished.
Work around this problem by deferring input events again while a readline
event was queued. This is closest to the historical behavior.
The implementation feels hacky; we might find odd situations.
For example,
commandline -f repaint end-of-line
set token (commandline -t)
sets the wrong token.
Probably not a very important case. We could throw an error or make it work
by letting "commandline -t" drain the input queue.
That seems too complicated, better change repaints to not use the input queue
(and fake $status etc). Let's try to do that in future.
Closes#10492
[w]open_dir does not pass O_CREAT, so the mode argument to open is never used.
also, O_CREAT | O_DIRECTORY could not be used (portably) to create a directory.
(on POSIX does not specify what should happen, on Linux it is EINVAL.)
rustc 1.80 now complains about features not declared in Cargo.toml and cfg
keys/values not declared by build.rs to protect against typos or misuse (you
think you're using the right condition but you're not). See
rust-lang/cargo#10554 and rust-lang/rust#82450.
(We're not actually using TSAN under CI at this time, but I do want to re-enable
it at some point — especially if we get multithreaded execution going — using
the rust-native TSAN configuration.)
I'll be updating the `rsconf` crate and patching `build.rs` accordingly to also
handle the warnings about unknown cfg values, but tsan is a feature and not a
cfg and these can be dealt with in `Cargo.toml` directly.
We were passing a slice (and not a vec) to `CString::new()`, meaning it would
allocate a new Vec internally to hold the bytes.
Also document that the resulting CString will be silently truncated at the first
interior NUL.
The function was repeatedly calling `s.char_at(n)` which is O(1) only for UTF-32
strings (so not a problem at the moment). But it was also calling `hex_digit(n)`
twice for each `n` in the 3-digit case, causing unnecessary repeated parsing of
individual characters into their radix-16 numeric equivalents, which could be
avoided just by reusing the already calculated result.
We will continue to use the "normal" fish base directory detection when using
the CMake test harness which properly sets up a sandboxed $HOME for fish to use,
but when running source code tests with a bare `cargo test` we don't want to
write to the actual user's profile.
This also works around test failures when running `cargo test` under CI with a
locked-down $HOME directory (see #10474).
The test_history_formats test was reading from build/tests/ which is an artifact
of the cmake test runner. The source code tests should not depend on the cmake
test harness at all, so this is changed to read from the original test source in
the ./tests/ directory instead.
As documented in #10474, there are issues with 64-bit floating point rounding
under x86 targets without SSE2 extensions, where x87 floating point math causes
imprecise results.
Document the shortcoming and provide some version of the test that passes
regardless of architecture.
FISH_BUILD_DIR (nominally, ./build) is created by cmake. If you only check out
the project via git and then run `cargo build`, this directory won't exist and
many of the tests will fail.
%ld expects a 4-byte parameter on 32-bit architectures and an 8-byte parameter
on 64-bit architectures, but we supplied are trying to supply a 64-bit parameter
that would overflow 32-bit storage.
Use %lld instead which expects a `long long` parameter, which should be 8-bytes
under both architectures.
See #10474
I think given a local terminal running fish on a remote system, we can't
assume that an input sequence like \ea is sent all in one packet. (If we
could that would be perfect.)
Let's readd the default escape delay, to avoid a potential regression, but
make it only apply to raw escape bindings like "bind \e123". Treat sequences
like "bind escape,1,2,3" like regular sequences, so they can be bound on
all terminals.
This partially reverts commit b815319607.
Given "abbr foo something", the input sequence
foo<space><ctrl-z><space>
would re-expand the abbreviation on the second space which is surprising
because the cursor is not at or inside the command token. This looks to be
a regression from 00432df42 (Trigger abbreviations after inserting process
separators, 2024-04-13)
Happily, 69583f303 (Allow restricting abbreviations to specific commands
(#10452), 2024-04-24) made some changes that mean the bad commit seems no
longer necessary. Not sure why it works but I'll take it.
As reported on gitter, commands like "rm (...)" sometimes want a previous
command inside the parentheses. Let's try that. If a user actually wants
to search for a command substitution they can move the cursor outside the
command substitution, or type the search string after pressing ctrl-r?
On a command with multiline quoted string like
begin
echo "line1
line2"
end
we actually indent line2 which seeems misleading because the indentation
changes the behavior when typed into a script.
This has become more prominent since commits
- a37629f86 (fish_clipboard_copy: indent multiline commands, 2024-04-13)
- 611a0572b (builtins type/functions: indent interactively-defined functions, 2024-04-12)
- 222673f33 (edit_command_buffer: send indented commandline to editor, 2024-04-12)
which add indentation to an exported commandline.
Never indent quoted strings, to make sure the rendering matches the semantics.
Note that we do need to indent the opening quote which is fine because
it's on the same line.
While at it, indent command substitutions recursively. That feature should
also be added to fish_indent's formatting mode (which is the default).
Fortunately the formatting mode already works fine with quoted strings;
it does not indent them. Not sure how that's done and whether indentation
can use the same logic.
Given "1(23)4", this function returns an inclusive range, from the opening
to the closing parenthesis. The subcommand is extracted by incrementing
the range start and interpreting the result as an exclusive range.
This is confusing, especially if we want to add multi-character quotes.
Change it to always return the full range (including parentheses) and provide
an easy way to access the command string.
While at it, switch to returning an enum.
This change is perhaps larger and more complex than necessary (sorry)
because it is mainly made with multi-character quotes in mind. Let's see
if that works out.
This removes IsOkAnd and the is_some_and method.
I cannot actually find is_none_or in the stdlib?
I've kept the trait name to avoid changing it now and then later, maybe this should
be moved elsewhere to avoid claiming it's an stdlib thing?
After abandoning a commandline (for example with ctrl-c) it's nice to be
able to restore it. There is little reason to discard the requisite undo
information, so keep it.
This allows making something like
```fish
abbr --add gc --position anywhere --command git back 'reset --hard
HEAD^'
```
to expand "gc" to "reset --hard HEAD^", but only if the command is
git (including "command git gc" or "and git gc").
Fixes#9411
Running
echo foo | vim -
gets us in a weird situation because we put the job in fish's process groups.
It causes us to not set a PGID for this job, so it can't be resumed among
other things.
Stopping the job with ctrl-z and try to exit the shell causes a crash in the
"There are still jobs active" warning because the PID for the job is still 0.
Let's remove the assertion to restore previous behavior, and hopefully fix
this later.
Remove the last non scoped place where we disable protocols (just before
exec(1)); it's not necessary with the current approach because we always
disable inside eval.
There is an edge case where we don't:
fish -ic "exec bash"
leaving bash with CSI u enabled. Disable that also in -ic mode where we
don't have a reader.
In future we should use the same approach for restore_term_mode() but I'm
not sure which one is better.
We enable terminal protocols once at startup, and disable them before exit.
Additionally, we disable them while evaluating commands (see 8164855b7 (Disable
terminal protocols throughout evaluation, 2024-04-02))..
Thirdly, we re-enable protocols inside builtin read (where it's disabled
because we are evaluating something). All of these three are scoped and
statically guaranteed to not leak into each others scopes.
There is another place where we enable protocols non-scoped: when we
receive a notification that a job is stopped. If this is ever hit, things
will be imbalanced and we'll fail to restore the right terminal state,
or (more likely) crash due the assertion in terminal_protocols_enable().
This code path used to be necessary when we disabled protocols only while
actually executing an external command but we changed that in 8164855b7,
so it should no longer be. Remove it.
I haven't been able to find a test case, I'll try to do that later.
The main reason we changed the scope of protocols was focus reporting (#10408).
We have given up on that for now (outside tmux where I can't get it to work)
so we might want to reconsider and go back to the "optimized" approach of
enabling it for as long as possible. But this is simpler, easier to verify.
This tries to open the given file to use as stdin, and if it fails,
for any reason, it uses /dev/null instead.
This is useful in cases where we would otherwise do either of these:
```fish
test -r /path/to/file
and string match foo < /path/to/file
cat /path/to/file 2>/dev/null | string match foo
```
This both makes it nicer and shorter, *and* helps with TOCTTOU - what if the file is removed/changed after the check?
The reason for reading /dev/null instead of a closed fd is that a closed fd will often cause an error.
In case opening /dev/null fails, it still skips the command.
That's really a last resort for when the operating system
has turned out to be a platypus and not a unix.
Fixes#4865
(cherry picked from commit df8b9b7095)
This introduces a feature flag, "test-require-arg", that removes builtin test's zero and one argument special modes.
That means:
- `test -n` returns false
- `test -z` returns true
- `test -x` with any other option errors out with "missing argument"
- `test foo` errors out as expecting an option
`test -n` returning true is a frequent source of confusion, and so we are breaking with posix in this regard.
As always the flag defaults to off and can be turned on. In future it will default to on and then eventually be made read-only.
There is a new FLOG category "deprecated-test", run `fish -d deprecated-test` and it will show any test call that would change in future.
This is similar to f7dac82ed (Escape separators (colon and equals) to improve
completion, 2019-08-23) except we only escape : and = if they are the result of
file completions. This way we avoid issues with custom completions like dd.
This also means that it won't work for things like __fish_complete_suffix
[*] but that can be fixed later, once we can serialize the DONT_ESCAPE flag.
By moving the escaping step earlier, this causes some unit test changes
which should not result in actual behavior change.
See also #6099
[*]: The new \: and \= does not leak from "complete -C" because that command
unescapes its output -- unless --escape is given.
Another consequence of a583fe723 ("commandline -f foo" to skip queue
and execute immediately, 2024-04-08) is that "commandline -f repaint"
will paint the prompt with the current value of $status which might be
set from a shell command in a the currently executing binding, instead of
waiting for the top-level status. This is wrong, at least historically. It
surfaces in bindings like alt-w which always paint a status value of [1]
when on single-lines commandlines.
Another regression is that a redundant repaint in a signal handler outputs
an extra prompt.
Fix both by making repaint commands go over the input queue again. This way,
they are always run with a good commandline state. There is no need to
repaint immediately because I don't think anyone has a data dependency on it
(we currently don't expose the prompt string), it's only for rendering.
We sometimes leak ^[[I and ^[[O focus reporting events when run from VSCode's
"Run python file" button in the top right corner. To reproduce I installed
the ms-python extension set the VSCode default shell to fish and repeatedly
ran a script that does "time.sleep(1)". I believe VSCode synthesizes keys
and triggers a race condition.
We can probably fix this but I'm not sure when I'll get to it (given how
relatively unimportant this feature is).
So let's go back to the old behavior of only enabling focus reporting in tmux.
I believe that tmux is affected by the same VSCode issue (also on 3.7.1 I
think) but I haven't been able to get tmux to emit focus reporting sequences
yet. Still, keep it to not regress cursor shape (#4788). So far this is
the only motivation for focus reporting and I believe it is only relevant
for terminals that can split windows (though there are a bunch that do).
Closes#10448
While it does need to store the string, we also need to use the string after
storing it, so we aren't getting any advantage from passing by value. Just pass
by reference to simplify the call sites.
This is another problem that has been bothering me for years: as mentioned
in 1dd901e52 (Maintain cursor in history prefix search, 2024-04-12), up-arrow
search highlights search matches but the contrast is really bad, especially in
command position, because the search matches --background=brblack is combined
with whatever foreground syntax highlighting the command has. The history
pager had a similar problem (for the selected history item) but circumented
it by disabling syntax highlighting altogether for the selected item.
fish_color_search_match's foreground component is ignored.
Let's use it instead of syntax highlighting.
This fixes the contrast on some default colorschemes but the bryellow
foreground looks weirdly like an error/warning on some terminals. Change it
to white. This needs a hack because we don't have a canonical way to tell
if a uvar has been set by the user. Fortunately the foreground component
hasn't been used at all so far, so we're not so much changing it as much as
initializing it.
On Konsole with
function my-bindings
bind --preset --erase escape
bind escape,i 'echo escape i'
end
set fish_key_bindings my-bindings
the "escape,i" binding doesn't trigger. This is because of our special
handling of the escape key prefix. Other multi-key bindings like "bind j,k"
wait indefinitely for the second character. But not "escape,i"; that one
has historically had a low timeout (fish_escape_delay_ms). The motivation
is probably that we have a "escape" binding as well that shouldn't wait
indefinitely.
We can distinguish between the case of raw escape sequence binding like "\e123"
and a binding that talks about the actual escape key like "escape,i". For the
latter we don't need the special treatment of having a low timeout, so make it
fall back to "fish_sequence_key_delay_ms" which waits indefinitely by default.
When we read bytes like \xfc that don't produce a Unicode code point,
we encode them in a Unicode private use area.
This encoding should be transparent to the user.
We accidentally add it to uvar files as \uf6fc in this case. When reading
it back, read_unquoted_escape() will fail at the "fish_reserved_codepoint(c)"
check. This check is to avoid external input being misinterpreted
as one of our in-band signalling characters like ANY_CHAR (for *).
For encoded raw bytes, this check probably doesn't really matter in terms of
security because the only thing we do with these bytes is convert them back
to raw. So we could allow unescaping them at this point, thus supporting
old uvar files.
However that seems like the wrong direction. PUA encoding should never leak.
So let's instead make sure to serialize it as \xfc instead of \f6fc going
forward.
Fixes#10313
Popular operating systems support shift-delete to delete the selected item
in an autocompletion widgets. We already support this in the history pager.
Let's do the same for up-arrow history search.
Related discussion: https://github.com/fish-shell/fish-shell/pull/9515
On
a;
we don't expand the abbreviation because the cursor is right of semicolon,
not on the command token. Fix this by making sure that we call expand-abbr
with the cursor on the semicolon which is the end of the command token.
(Now that our bind command execution order is less surprising, this is doable.)
This means that we need to fix the cursor after successfully expanding
an abbreviation. Do this by setting the position explicitly even when no
--set-position is in effect.
An earlier version of this patch used
bind space self-insert backward-char expand-abbr or forward-char
The problem with that (as a failing test shows) was that given "abbr m
myabbr", after typing "m space ctrl-z", the cursor would be after the "m",
not after the space. The second space removes the space, not changing the
cursor position, which is weird. I initially tried to fix this by adding
a hack to the undo group logic, to always restore the cursor position from
when begin-undo-group was used.
bind space self-insert begin-undo-group backward-char expand-abbr end-undo-group or forward-char
However this made test_torn_escapes.py fail for mysterious reasons.
I believe this is because that test registers and triggers a SIGUSR1 handler;
since the signal handler will rearrange char events, that probably messes
with the undo group guards.
I resorted to adding a tailor-made readline cmd. We could probably remove
it and give the new behavior to expand-abbr, not sure.
Fixes#9730
File names that have lots of spaces look quite ugly when inserted as
completions because every space will have a backslash.
Add an initial heuristic to decide when to use quotes instead of
backslash escapes.
Quote when
1. it's not an autosuggestion
2. we replace the token or insert a fresh one
3. we will add a space at the end
In future we could relax some of these requirements.
Requirement 2 means we don't quote when appending to an existing token.
Need to find a natural behavior here.
Re 3, if the completion adds no space, users will probably want to add more
characters, which looks a bit weird if the token has a trailing quote.
We could relax this requirement for directory completions, so «ls so»
completes to «ls 'some dir with spaces'/».
Closes#5433
On Konsole, given
bind escape,i 'echo escape i'
bind alt-i 'echo alt-i'
pressing alt-i triggers the wrong binding. This is because we treat "escape
followed by i" as "alt-i". This is to support raw sequences like "\ei"
which are probably meant as "alt-i" -- we match such inputs to both mappings.
This double matching is not necessary for new-style bindings which
unambiguously describe the key presses, so let's activate this sequence
matching only for bindings specified as raw sequences.
Conversely, we currently fail to match an XTerm raw binding for ctrl-enter:
echo 'XTerm.vt100.formatOtherKeys: 0' | xrdb
xterm -e fish
bind \e\[27\;5\;13~ execute
because we decode this to a single char; we match the leading CSI but not
the entire sequence. So this is a raw binding where we accidentally
match full, modified keys. Fix that too (two birds with one stone).
I think commit 8386088b3 (Update commandline state changes eagerly as well,
2024-04-11) broke the alt-s binding.
This is because we update the commandline state snapshot (which is consumed
by builtin commandline and others) only at key points. This seems like a
dubious optimization. With the new streamlined bind execution semantics,
this doesn't really work anymore; any shell command can run any number of
commands like "commandline -i foo" which should synchronize.
Do the simple thing of calculating the snapshot whenever needed.
The search term highlighting looks looks really bad on the default theme
because the command is highlighted as dark blue and the search term adds
a dark background. If this new feature motivates us to finally fix this,
that would be great.
Closes#10430
The new reader_execute_readline_cmd() runs apply_commandline_state_changes()
to make sure that given
bind x "commandline --insert foo; commandline -f backward-char"
the backward-char command knows about the insertion of "foo". This
causes problems when running "sleep 1&" and typing some characters -
the commandline will be cleared when the job finishes. This is because
apply_commandline_state_changes() works with stale information in this case.
Let's call it as soon as we know it's needed. This is less messy and fits
better with the new bind function semantics ("execute things in the order
they are written").
This makes them more convenient to use interactively, similar to the existing
\c and \a versions. The resulting bind output keeps using the canonical
ctrl/alt version.
Not sure about s- because that's somewhat ambiguous, it could be "super".
See the parent commit for some context. Turns out that 8bf8b10f6 (Extended &
human-friendly keys, 2024-03-30) broke this for terminals that speak CSI u.
This is pretty complex, probably not worth it.
When a terminal sends \x1ba, that could be either escape,a or alt-a.
Historically we've handled this with an escape delay that defaults to 30
milliseconds. If we read nothing for that time, it's escape. Otherwise it's
an alt modifier (or an escape sequence).
As a side effect of 8bf8b10f6 (Extended & human-friendly keys, 2024-03-30) we
added a new way of disambiguating escape: whenever we read the escape byte,
we immediately try another (nonblocking) read. If it succeeds, we treat it
as modifier, else it's escape. Before that commit, we didn't have a concept
of modifiers.
The new way works fine for disambiguating escape,a from alt-a (as pressed
by the user) because only for alt-a the data is sent in the same packet.
So we no longer need the escape delay to disambiguate the alt from the
escape key. Let's simplify things by not using it by default.
The escape delay as set by fish_escape_delay_ms also serves another purpose;
it allows to disambiguate "escape,a" from "escape (pause) a". For that use
case we want to keep it.
As mentioned in 8a7c3ceec (Don't abandon line after writing control sequences,
2024-04-06) we need to freshed stdout timestamps after writing to stdout
but before we might redraw, in particular when writing control sequences.
Commit a583fe723 ("commandline -f foo" to skip queue and execute immediately,
2024-04-08) made "commandline -f repaint" redraw immediately, while still
executing the bound shell command; at that time we have written "disabling"
sequences but not refreshed timestamps yet, so do that.
This is probably not needed for commands outside the repaint family.
Needless to say that this is messy, maybe we can simplify things in future.
Ref https://github.com/fish-shell/fish-shell/issues/10409#issuecomment-2044863817
If a key's codepoint is in the PUA1 range, it could
be either from our own named keys (like key::Space)
or from a CSI u key that we haven't assigned a name yet
https://sw.kovidgoyal.net/kitty/keyboard-protocol/#functional-key-definitions
(The latter can still be bound using the \u1234 or the equivalent \e[4660u
raw CSI u sequence.)
It doesn't make sense to insert a PUA character into the commandline when
the user presses PrintScreen; ignore them silently.
This partially reverts b77d1d0e2 (Stop crashing on invalid Unicode input,
2024-02-27). That commit did:
1. convert input byte sequences that map to a PUA codepoint into several
characters, using our on-char-per-byte PUA encoding.
2. do the same for inputs that are codepoints outside the valid Unicode range.
3. render them as replacement character (one per input byte)
In future, we should probably remove these features altogether, and simply
ignore invalid Unicode code points.
Commit c3cd68dda (Process shell commands from bindings like regular char
events, 2024-03-02) mentions a "weird ordering difference".
The issue is that "commandline -f foo" goes through the input
queue while other commands are executed directly.
For example
bind ctrl-g "commandline -f end-of-line; commandline -i x"
is executed in the wrong order. Fix that.
This doesn't yet work for "commandline -f exit" but that can be fixed easily.
It's hard to imagine anyone would rely on the existing behavior. "commandline
-f" in bindings is mostly used for repainting the commandline.
If a binding was input starting with "\e", it's usually a raw control sequence.
Today we display the canonical version like:
bind --preset alt-\[,1,\;,5,C foo
even if the input is
bind --preset \e\[1\;5C foo
Make it look like the input again. This looks more familiar and less
surprising (especially since we canonicalize CSI to "alt-[").
Except that we use the \x01 representation instead of \ca because the
"control" part can be confusing. We're inside an escape sequence so it seems
highly unlikely that an ASCII control character actually comes from the user
holding the control key.
The downside is that this hides the canonical version; it might be surprising
that a raw-escape-sequence binding can be erased using the new syntax and
vice versa.
We don't yet support all keys from
https://sw.kovidgoyal.net/kitty/keyboard-protocol/#functional-key-definitions
Instead of displaying a private-use character, show the character code;
this can be used to map the key even if we don't know a name for it.
bind \uE011 'echo print screen'
bind ctrl-\uE011 'echo do control + print screen'
Note that it's also possible to mape the raw CSI u sequence, like
bind \e\[57361u 'echo print screen'
but we should not encourage that syntax because it does not allow adding
the modifiers like ctrl.
Of course leaking the PUA character code is not ideal.
As implied by the changelog.
Unfortunately it's not obvious how to access the RefCell value in spite
of a potential (albeit unlikely) present mutable borrow. We need to use a
different type to make it work in such cases, hopefully doing that in future.
In future we could even use panic=abort and use this style of cleanup for
panics (instead of RAII).
For numpad 1 with nulock, Alacritty sends
escape,[,5,7,4,0,0,u
which is codepoint \x31, key "1". We have a terminfo mapping for "sright"
which translates to
escape,[,1,;,2,C
The first two characters, escape and [ match. Then we accidentally match the
"1" from the mapping against the entire sequence, because that sequence is
canonicalized to codepoint "1" . The most blatant problem is that we discard
the rest of the sequence. Fix that.
This allows us to re-enable raw CSI u mappings like "bind \e[1u ..."
which is what kitty uses for shell integration.
This allows terminals like foot and kitty to
* scroll to the previous/next prompt with ctrl-shift-{z,x}
* pipe the last command's output to a pager with ctrl-shift-g
Kitty has existing fish shell integration
shell-integration/fish/vendor_conf.d/kitty-shell-integration.fish which we
can simplify now. They keep a state variable to decide which of prompt start,
command start or command end to output. I think with our implementation
this is no longer necessary, at least I couldn't reproduce any difference.
We also don't need to hook into fish_cancel or fish_posterror like they do;
only in the one place where we actually draw the prompt.
As mentioned in the above shell integration script, kitty disables reflow
when it sees an OSC 133 marker, so we need to do it ourselves,
otherwise the prompt will go blank after a terminal resize.
Closes#10352
Commit 8164855b7 (Disable terminal protocols throughout evaluation, 2024-04-02)
changed where we output control sequences (to enable bracketed paste and CSI).
Likewise, f285e85b0 (Enable focus reporting only just before reading from
stdin, 2024-04-06) added control sequence output just before we read().
This output causes problems because it invalidates our stdout/stderr
timestamps, which causes us to think that a rogue background process wrote
to the terminal; we react by abandoning the current line and redrawing the
prompt below. Our fix was to refresh the TTY timestamps after we run a bind
command that might add stdout (#3481).
Since commit c3cd68dda (Process shell commands from bindings like regular
char events, 2024-03-02), this timestamp refresh logic is in the wrong place;
shell commands are run later now; we could move it but wait -
... we also need to make sure to refresh timestamps after outputting control
sequences. Since bracketed paste is enabled after CSI u, we can skip the
latter. Additionally, since we currently output control sequences before
every single top-level interactive command, we no longer need to separately
refresh timestamps in between commands.
Fixes#10409
Some terminals send the focus-in sequences ("^[I") whenever focus reporting is
enabled. We enable focus reporting whenever we are finished running a command.
If we run two commands without reading in between, the focus sequences
will show up on the terminal.
Fix this by enabling focus-reporting as late as possible.
This fixes the problem with `^[I` showing up when running "cat" in
gnome-terminal https://github.com/fish-shell/fish-shell/issues/10411.
This begs the question if we should do the same for CSI u and bracketed paste.
It's difficult to answer that; let's hope we find motivating test cases.
If we enable CSI u too late, we might misinterpret key presses, so for now
we still enable those as early as possible.
Also, since we now read immediately after enabling focus events, we can get
rid of the hack where we defer enabling them until after the first prompt.
When I start a fresh terminal, the ^[I no longer shows up.
It's not clear whether builtin read should be able to do everything
that the normal prompt does but I guess we haven't found a problem yet.
Given that read could be used to read a single character at a type,
it's a bit odd to toggle terminal protocols all the time.
But that's not the typical case (at least not for when stdin is a TTY),
and it seems fine.
Teste with
bind ctrl-4 'echo yay'
Regressed in 8164855b7 (Disable terminal protocols throughout evaluation,
2024-04-02).
Apparently VTE terminals send the "focus in" event whenever we re-enable
focus reporting. That's probably a sensible thing to do.
Anyway, our problem is simply that we accidentally end history search on these
focus events which are implemented as anonymous (unmappable) readline cmds.
Perhaps there should be a separate cmd category.
Focus events show up as key::Invalid which is a weird private use code point;
probably we can get rid of this key..
Fixes#10411
See the changelog additions for user-visible changes.
Since we enable/disable terminal protocols whenever we pass terminal ownership,
tests can no longer run in parallel on the same terminal.
For the same reason, readline shortcuts in the gdb REPL will not work anymore.
As a remedy, use gdbserver, or lobby for CSI u support in libreadline.
Add sleep to some tests, otherwise they fall (both in CI and locally).
There are two weird failures on FreeBSD remaining, disable them for now
https://github.com/fish-shell/fish-shell/pull/10359/checks?check_run_id=23330096362
Design and implementation borrows heavily from Kakoune.
In future, we should try to implement more of the kitty progressive
enhancements.
Closes#10359
To do so add an ad-hoc "commandline --search-field" to operate on pager
search field.
This is primarily motivated because a following commit reuses the
fish_clipboard_paste logic for bracketed paste. This avoids a regression.
Terminal titles are set with an OSC 0 sequence. I don't think we want to
support terminals that react badly to unknown OSC (or CSI) sequences.
So let's remove our feature detection.
This will fix future false negatives along the lines of
https://github.com/fish-shell/fish-shell/pull/10037
This binding is akin to ForwardSingleChar but it is "passive" in that is not
intended to affect the meta state of the shell: autocompletions are not accepted
if the cursor is at the end of input and it does not have any effect in the
completions pager.
Currently, we expand command-abbrs (those with `--position command`) after `if`, but not after `command` or `builtin` or `time`:
```fish
abbr --add gc "git checkout"
```
will expand as `if gc` but not as `command gc`.
This was explicitly tested, but I have no idea why it shouldn't be?
The incorrect order of operations was being used since && binds tighter than ||
in rust (as with most sane languages).
Under Linux, EAGAIN == EWOULDBLOCK so this would always succeed in the case of a
non-blocking fd without making the call to make_fd_nonblocking().
Comparing to the 3.7.0 C++ code, it looks like this was an oversight introduced
in the migration to rust.
In particular, this allows restoring the terminal on crashes, which is
feasible now that we have the panic handler. Since std::process::exit() skips
destructors, we need to reshuffle some code. The "exit_without_destructors"
semantics (which std::process::exit() als has) was mostly necessary for C++
since Rust leaks global variables by default.
When fish crashes due to a panic, the terminal window is closed. Some
terminals keep the window around when the crash is due to a fatal signal,
but today we don't exit via fatal signal on panic.
There is the option to set «panic = "abort"» in Cargo.toml, which
would give us coredumps but also worse stacktraces on stderr.
More importantly it means that we don't unwind, so destructors are skipped
I don't think we want that because we should use destructors to
restore the terminal state.
On crash in interactive fish, read one more line before exiting, so the
stack trace is always visible.
In future, we should move this "read one line before exiting" logic to where
we call "panic!", so I can attach a debugger and see the stacktrace.
Today fish_cursor_selection_mode controls whether selection mode includes
the cursor. Since it's by default only used for Vi mode, perhaps use it to
also decide whether it should be allowed to select one-past the last character.
Not allowing to select to select one-past the last character is much nicer
in Vi mode. Unfortunately Vi mode sometimes needs to temporarily select
past end (using forward-single-char and such), so reset fish_cursor_selection_mode
for the duration of the binding.
Also fix other things like cursor placement after yank/yank-pop.
Closes#10286Closes#3299
A long standing issue is that bindings cannot mix special input functions
and shell commands. For example,
bind x end-of-line "commandline -i x"
silently does nothing. Instead we have to do lift everything to shell commands
bind x "commandline -f end-of-line; commandline -i x"
for no good reason.
Additionally, there is a weird ordering difference between special input
functions and shell commands. Special input functions are pushed into the
the queue whereas shell commands are executed immediately.
This weird ordering means that the above "bind x" still doesn't work as
expected, because "commandline -i" is processed before "end-of-line".
Finally, this is all implemented via weird hack to allow recursive use of
a mutable reference to the reader state.
Fix all of this by processing shell commands the same as both special input
functions and regular chars. Hopefully this doesn't break anything.
Fixes#8186Fixes#10360Closes#9398
Multiline search strings are weirdly broken (inserting control characters
in the command line) and probably not very useful anyway.
On the other hand I often want to compose a multi-line command
from single-line commands I ran previously.
Let's support this case by limiting the initial search string to the current
line; and replace only that line.
Alternatively this could operate on jobs (that is, replace a surrounding
"foo | bar") instead of using line boundaries.
This is resistant to misuse by including O_DIRECTORY in the open flags and it is
a separate function from {w,}open_cloexec() in preparation for making that one
return a `File` instead of an `OwnedFd`.
`intersects()` is "any of" while `contains()` is "all of" and while it makes no
difference when testing a single bit, I believe `contains()` is less brittle
for future maintenance and updates as its meaning is clearer.
</pedantic>
More work in prep for having wopen_cloexec() return `File` directly.
This eliminates checking for an invalid fd and makes both ownership and
mutability clear (some more operations that involve changes to the underlying
state of the fd now require `&mut File` instead of just a `RawFd`).
Code that clearly does not use non-blocking IO is ported to use
`Write::write_all()` directly instead of our rusty port of the `write_loop()`
function (which handles EAGAIN/EWOULDBLOCK in addition to EINTR, while
`write_all()` only handles the latter).
This is a step towards converting `wopen_cloexec()` to return `File` instead of
`OwnedFd`/`AutocloseFd`.¹
In addition to letting us use native standard library functions instead of
unsafe libc calls, we gain additional semantic safety because `File` operations
that manipulate the state of the fd (e.g. `File::seek()`) require a `&mut`
reference to the `File`, whereas using `RawFd` or `OwnedFd` everywhere leaves us
in a position where it's not clear whether or not other references to the same
fd will manipulate its underlying state.
¹ We actually wouldn't even need `wopen_cloexec()` at all (just a widechar
wrapper) as Rust's native `File::open()`/`File::create()` functionality uses
`FD_CLOEXEC` internally.
* builtin/test: Split Token enum into 2-level hierarchy
* builtin/test: Rearrange the Token enum hierarchy
* builtin/test: Separate Token into Unary and Binary
* builtin/test: import IsOkAnd polyfill
* builtin/test: Rename enum variants one more time
I was able to trigger this by flipping around the history pager.
Since the only applicable caller here already stops if it gets None,
just don't assert.
I don't think the existing logic is correct, as the comment says, our internal
state is only matched if we *actually* wrote out the file. But if we ran into an
error, it doesn't match, does it?
The incompatible_msrv one is a false positive because we have polyfills for
is_some_and() and is_ok_or() which are Rust 1.74. I'm not yet sure how to
communicate that to Clippy.
Call fish_should_add_to_history to see if a command should be saved
If it returns 0, it will be saved, if it returns anything else, it
will be ephemeral.
It gets the right-trimmed text as the argument.
If it doesn't exist, we do the historical behavior of checking for a
leading space.
That means you can now turn that off by defining a
`fish_should_add_to_history` that just doesn't check it.
documentation based on #9298
The existing logic did not work because:
- Path::new("/foo/bar").ends_with("/bar") does not return true.
- PathBuf::shrink_to() only (potentially) reallocates the backing
storage, and won't have an effect on the stored value.
As mentioned in the comment the historical behavior is because pressing unknown
control characters like Ctrl+4 inserts confusing characters, so let's back
out that part of b77d1d0e2 (Stop crashing on invalid Unicode input, 2024-02-27).
We still have the code for rendering control characters, for pasted text,
or text recalled from history. It is unclear whether we should strip those.
Some terminals already strip control characters from pasted text -- but not
all of them: see https://codeberg.org/dnkl/foot/pulls/312 for example which
has a follow up called "Don't strip HT when pasting in non-bracketed mode".
Don't force the internal use of `RefCell<T>`, let the caller place that into
`MainThread<>` manually. This lets us remove the reference to `MainThread<>`
from the definition of `Screen` again and reduces the number of
`assert_is_main_thread()` calls.
Fairly straightforward, with the only unfortunate part of this being that
`Screen` isn't as pure and now encodes the facte that we use it with
main-thread-only stdout `Outputter`.
The regex struct is pretty large at 560 bytes, with the entire
Abbreviation being 664 bytes.
If it's an "Option<Regex>", any abbr gets to pay the price. Boxing it
means abbrs without a regex are over 500 bytes smaller.
IfStatement is 680 bytes, much larger than the other
variants (SwitchStatement is next at 232). An enum is as large as its
largest variant, so this saves a bunch, especially since
DecoratedStatement is much more likely than IfStatement.
This will speed up the no-execute benchmark by 1.07x.
Unlike C++, Rust requires "char" to be a valid Unicode code point. As a
workaround, we take the raw (probably UTF-8-encoded) input and convert each
input byte to a char representation from the private use area (see commit
3b15e995e (str2wcs: encode invalid Unicode characters in the private use
area, 2023-04-01)). We convert back whenever we output the string, which
is correct as long as the encoding didn't change since the data was input.
We also need to convert keyboard input; do that.
Quick testing shows that our reader drops PUA characters. Since this patch
converts both invalid Unicode input as well as PUA input into a safe PUA
representation, there's no longer a reason to not add PUA characters to
the commandline, so let's do that to restore traditional behavior.
Render them as � (REPLACEMENT CHARACTER); unfortunately we show one per
input byte instead of one per code point. To fix this we probably need our
own char type.
While at it, remove some special cases that try to prevent insertion of
control characters. I don't think they are necessary. Could be wrong..
This makes it so code like
```fish
echo foo
echo bar
```
is collapsed into
```fish
echo foo
echo bar
```
One empty line is allowed, more is overkill.
We could also allow more than one for e.g. function endings.
We don't need to know that it tried these five before finally getting
one, the list is *right there*.
It is also very unlikely that someone has "xterm" or "ansi" but not "xterm-256color"
For xterm-256color, we don't warn *at all* because we have that one hardcoded.
This allows us to get the terminfo information without linking against curses.
That means we can get by without a bunch of awkward C-API trickery.
There is no global "cur_term" kept by a library for us that we need to invalidate.
Note that it still requires a "unhashed terminfo database", and I don't know how well it handles termcap.
I am not actually sure if there are systems that *can't* have terminfo, everything I looked at
has the ncurses terminfo available to install at least.
We still don't support tabs but as of the parent commit, there are no more
weird glitches, so it should be fine to recall those lines?
This reverts commit cc0e366037.
Inserting Tab or Backspace characters causes weird glitches. Sometimes it's
useful to paste tabs as part of a code block.
Render tabs as "␉" and so on for other ASCII control characters, see
https://unicode-table.com/en/blocks/control-pictures/. This fixes the
width-related glitches.
You can see it in action by inserting some control characters into the
command line:
set chars
for x in (seq 1 0x1F)
set -a chars (printf "%02x\\\\x%02x" $x $x)
end
eval set chars $chars
commandline -i "echo '" $chars
Fixes#6923Fixes#5274Closes#7295
We could extend this approach to display a fallback symbol for every unknown
nonprintable character, not just ASCII control characters.
In future we might want to support tab properly.