Prior to this commit, there was a stack of ReaderDatas, each one has a
reference to a Parser (same Parser in each, for now). However, the current
ReaderData is globally accessible. Because it holds a Parser, effectively
anything can run fish script; this also prevents us from making the Parser
&mut.
Split these up. Create ReaderData, which holds the data portion of the
reader machinery, and then create Reader which holds a ReaderData and a
Parser. Now `reader_current_data()` can only return the data itself; it
cannot execute fish script.
This results in some other nice simplifications.
This is a start on untangling input. Prior to this, a ReaderData and an
Inputter would communicate with each other; this is natural in C++ but
difficult in Rust because the Reader would own an Inputter and therefore
the Inputter could not easily reference the Reader. This was previously
"resolved" via unsafe code.
Fix this by collapsing Inputter into Reader. Now they're the same object!
Migrate Inputter's logic into a trait, so we get some modularity, and then
directly implement the remaining input methods on ReaderData.
Add round options, but I think can also add floor, ceiling, etc. And
the default mode is trunc.
Closes#9117
Co-authored-by: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
The special input functions self-insert, self-insert-not-first, and
and or used to be handled by inputter_t::readch, but they aren't
anymore with `commandline -f`.
I am unsure if these *would* have worked, I can't come up with a use.
So, for now, do nothing instead of panicking.
This would crash if you ran `commandline -f backward-jump`.
The C++ version would read a char (but badly), this doesn't anymore.
So, at least instead of crashing, just do nothing.
This doesn't pull its weight. Block size is not a particularly big
problem,
and this both complicates the code a bit and would arbitrarily cause issues
if a fish script exceeded 65k lines.
This reverts commit edd6533a14.
This doesn't have any effect on the size of the struct (due to alignment
requirements and padding) but reduces the complexity by turning
Block::wants_pop_env into an emergent property dependent on the type rather than
something we have to manually manage.
We only increment it and check if it's non-zero, we never decrement or check the
actual count. As such, change it to a bool and bring the size of `Block` down
from 32 to 24 bytes.
We almost never access any of this and having it stored directly in the `Block`
struct increases its size (reducing how many we can fit in L1 and L2, and
increasing memory copy traffic).
Gets rid of BlockData::None so we can avoid allocating a Box at all when we have
no data (at the cost of yet-another-wrapper-type), which is the usual case.
This has a few advantages,
* We now statically assert that all fields used by a particular block type are
correctly initialized (i.e. you can't assign the function name but forget to
assign its arguments),
* Conversely, we can match directly on `BlockData` and be guaranteed that the
fields we want to access are initialized and present,
* We reduce the number of assertions, effectively "unwrapping" only once based
off the block type instead of each time we try to access a conditional field,
* We reduce the size of the `Block` struct by coalescing fields that cannot
co-exist, bringing it down from 104 bytes to 88 bytes.
It would be nice to make all of `Block` itself an enum, but it currently
requires `Copy` and we take advantage of that to copy it around everywhere.
Putting these fields directly in `Block` directly would mean a lot more memory
traffic just checking block types.
There's no need for two separate block types when one is merely a variant of the
other. This may have been required under C++ but thanks to sum types (rust's
enums) we don't need to do that any more.
If the backgrounded/stopped job was using the tty, sending it SIGCONT first
might cause it to immediately wake and try to use the tty (which fish still has
control over), causing it to immediately stop again after receiving a SIGTTOU.
We are supposed to send SIGHUP first so that when the process resumes it sees
the queued SIGHUP and executes its registered handler!
There's no guarantee that a condition variable is stateful. The docs for
`Condvar::notify_one()` actually say the opposite:
> If there is a blocked thread on this condition variable, then it will be woken
> up from its call to wait or wait_timeout. Calls to notify_one are not buffered
> in any way.
This test was relying on the main loop obtaining the lock and entering the
condition variable sleep before the thread was scheduled and got around to
notifying the condition variable. If this non-deterministic behavior was not
upheld, the test would time out since it would obtain the lock (either before or
after the variable were updated) then call `condvar.wait()` *after* the variable
had been updated and the condvar signalled, but without (atomically or even at
all) checking to see if the desired wake precondition was fulfilled. As the
child thread had already run and the wake notification was NOT buffered, there
was nothing to wake the running thread.
There really wasn't any way to salvage the test as originally written, since the
write to `ctx.val` was not in any way linked to the acquire/release of the mutex
so regardless of whether or not the main thread obtained the mutex and checked
the value precondition before calling `condvar.wait()`, the child thread's write
could have happened after the check but before the wait() call. As such, the
test has been rewritten to use `wait_while()` but then also updated to bail in
case of a timeout instead of hanging indefinitely (since neither the `ctest`
runner nor the `cargo test` harness was timing out; `cargo test` would only
report that the test had exceeded 60 seconds but as long as it was not executed
with `cargo test -- -Z --ensure-time` (which is only available under nightly),
the test would not halt.
If this test were *intentionally* written to test the scenario that was timing
out, it should be written deterministically in such a way that the main loop
did not run until after it was guaranteed that the variable had been updated
(i.e. by looping until val became 5 or waiting for an AtomicBool indicating the
update had completed to be set), but I'm not sure what the benefit in that would
be since the docs actually guarantee the opposite behavior (the notified state
is explicitly not cached/buffered).
If we have fish code written with the assumption that condvar notifications
prior to *any* call to `Condvar::wait()` *are* buffered, then that code should
of course be revisited in light of this.
Commit 8a7c3ce (Don't abandon line after writing control sequences, 2024-04-06)
was broken by 29f2da8 (Toggle terminal protocols lazily, 2024-05-16), fix that.
Fixes#10529
This makes `path basename` a more useful replacement for the stock `basename`
command, which can be used with `-s .ext` to trim `.ext` from the base name.
Previously, this would have required the equivalent of
path change-extension "" (path basename $path)
but now it can be just
path basename -E $path
Mostly replacing std::<type>::MAX with <type>::MAX.
Surprising here is replacing
.expect(format!(...))
with
.unwrap_or_else(|_| panic!(...))
It explains that this is because the "format!" would always be called.
This enabled the profile in fish_setlocale, which caused startup
profile to always be on, so
```fish
fish --profile file -c 'foo'
```
would show the entire startup as well
Hex float parsing may come about through wcstod, for example:
printf "%f" '0x8p2'
should output 32.0.
Currently we use a not-great fork of hexponent. Hexponent has been dormant for
years, and has some issues: doesn't round properly, allocates unnecessarily,
doesn't handle denormals, is more complicated than necessary.
Just rewrite hex float parsing, fixing those problems and getting us off of this
weird fork.
ThreadId is way slower than it should be for the sense that we use it in; it
doesn't cache the id and allocates an Arc internally.
We don't care about the thread id used in crate::threads correlating with any
other thread id the code uses anywhere (not that it does) because it's only used
for our own bookkeeping. Change to something much simpler instead.
Verified that std::sync::OnceLock<T> compiles to the same assembly at the
*access* site as the Option<T> we were using. The additional overhead upon init
is fine. No need for extra Box<T> indirection for IO_THREAD_POOL.
While obtaining an uncontested mutex from the same thread (without reentrance)
is basically ~free, the use of `MainThread<RefCell<T>>` instead of `Mutex<T>`
makes it clear that there is no actual synchronization taking place, hopefully
making the code easier to understand.
The C++ version of this code simply copied the entire uvar table.
Today we take a reference. It's not clear which one is better.
Removal of locale variables like LC_ALL triggers variable change handlers
which call EnvStackImpl::get. This deadlocks because we still hold the lock
to protect the reference to all uvars. Work around this.
Closes#10513
It is short and simple enough to write yourself if you need it and it encourages
bad behavior by a) always returning owned strings, b) always allocating them in
a vector. If/where possible, it is better to a) use &wstr, b) use an iterator.
In rust, it's an anti-pattern to unnecessarily abstract over allocating
operations. Some of the call sites even called split_string(..).into_iter().
This updates is_windows_subsystem_for_linux() to take a WSL version to test for
(any, v1, or v2) and returns the boolean result depending on the system. I've
benchmarked and when running on regular Linux, this is still just as fast as the
previous binary check; it's only when it's WSL that this takes about 20ns
longer to figure out which variant.
Note that older WSLv2 kernels had a `-microsoft-standard` suffix while newer
ones appear to have a `-microsoft-standard-WSL2` suffix, so we make sure to test
for the least common denominator. (It doesn't matter to us, but note that newer
WSLv2 kernels have four dots in the version string!)
WSL workarounds pertaining to the default Windows terminal or executable
behavior of win32 binaries under a WSL shell are extended to WSLv2 while those
specific to oddities in kernel behavior are confined to WSLv1 only. (It
technically wouldn't hurt to extend them to WSLv2 but there's no good reason to
do so, either.)
A common complaint has been the massive amount of directories Windows appends to
$PATH slowing down fish when it attempts to find a non-existent binary (which it
does a lot more often than someone not in the know might think). The typical
workaround suggested is to trim unneeded entries from $PATH, but this a) has
considerable friction, b) breaks resolution of Windows binaries (you can no
longer use `clip.exe`, `cmd.exe`, etc).
This patch introduces a two-PATH workaround. If the cmd we are executing does
not contain a period (i.e. has no extension) it by definition cannot be a
Windows executable. In this case, we skip searching for it in any of the
auto-mounted, auto-PATH-appended directories like `/mnt/c/Windows/` or
`/mnt/c/Program Files`, but we *do* include those directories if what we're
searching for could be a Windows executable. (For now, instead of hard-coding a
list of known Windows executable extensions like .bat, .cmd, .exe, etc, we just
depend on the presence of an extension at all).
e.g. this is what starting up fish prints with logging enabled (that has been
removed):
bypassing 100 dirs for lookup of kill
bypassing 100 dirs for lookup of zoxide
bypassing 100 dirs for lookup of zoxide
bypassing 100 dirs for lookup of fd
not bypassing dirs for lookup of open.exe
not bypassing dirs for lookup of git.exe
This has resulted in a massive speedup of common fish functions, especially
anywhere we internally use or perform the equivalent of `if command -q foo`.
Note that the `is_windows_subsystem_for_linux()` check will need to be patched to
extend this workaround to WSLv2, but I'll do that separately.
Under WSL:
* Benchmark `external_cmds` improves by 10%
* Benchmark `load_completions` improves by an incredible 77%
On this binding we fail to disable CSI u
bind c-t '
begin
set -lx FZF_DEFAULT_OPTS --height 40% --bind=ctrl-z:ignore
eval fzf | while read -l r; echo read $r; end
end
'
because for "fzf", ParseExecutionContext::setup_group() returns early with the
parent process group (which should be fish's own) , hence "wants_terminal"
is false. This seems questionable, I don't think the eval should make a
difference here.
For now, don't touch it; use the more accurate way of detecting whether
a process may read keyboard input. In many of such cases "wants_terminal"
is false, like
echo (echo 1\n2\n3 | fzf)
Fixes#10504
This hot function dominates the flamegraphs for the completions thread, and any
optimizations are worthwhile.
A variety of different approaches were tested and benchmarked against real-world
fish-history file inputs and this is the one that won out across all rustc
target-cpu variations tried.
Benchmarks and code at https://github.com/mqudsi/fish-yaml-unescape-benchmark
We don't forward this variable for storage in any structs, so there's no reason
to go through an Arc instead of returning the `&'static EnvStack` directly.
NB: This particular change was safe, and passes all tests on its own.
We don't forward this variable for storage in any structs, so there's no reason
to go through an Arc instead of returning the `&'static EnvStack` directly.
These have clearer sync/unsync semantics and now ship with rust itself.
They don't paper over any possible cross-thread issues, and we can specifically
choose which we want for the purpose.
`Parser` is a single-threaded `!Send`, `!Sync` type and does not need to use
`Arc` for anything. We were using it because that's all we had for the parser's
`EnvStack`, but though that is *technically* protected internally by a mutex
(shared with global EnvStack), there's nothing to say that other parsers with a
narrower scope/lifetime on other threads will be necessarily using the same
backing mutex.
We can safely marshal the existing `Arc<EnvStack>` we get from
`environment::principal()` into an `Rc<EnvStack>` since the underlying reference
is always valid. To prove this point, we could have PRINCIPAL_STACK be a static
`EnvStack` and have `environment::principal()` use `Arc::from_raw()` to turn
that into an `Arc<EnvStack>`, but there's no need to factorize this process.
By inverting the order of storage, we can use an `OnceCell`/`unsync::Lazy`
inside the Send/Sync `MainThread<T>` and remove the need for a lock altogether.
It's reasonable since this is only checking to see that the history file
contains the expected format and if it's corrupted but we at least got what we
expect to be the correct key/value pairs, then that's all we can do.
Of course the real motivation is to speed up this very hot function in any way
possible!
On the completions and history thread, the parent function
HistoryFileContents::decode_item() is responsible for ~60% of the CPU time, and
extract_prefix_and_unescape_yaml() alone comprising 14% (of the total).
This change removes allocations in the event that the history item is either
fully or partially plain yaml with no escapes to begin with, and brings down the
execution time of this function to only 7% of the total execution time.
The bulk of the remaining time is spent in wcs2string(), which is called
unconditionally and is naturally alloc-heavy.
This allows running `set` without triggering any event handlers.
That is useful, for example, if you want to set a variable in an event
handler for that variable - we could do it, for example, in the
fish_user_path or fish_key_bindings handlers.
This is something the `block` builtin was supposed to be for, but it
never really worked because it only allows suppressing the event for
the duration, they would fire later. See #9030.
Because it is possible to abuse this, we only have a long-option so
that people see what is up.
Commit a583fe723 ("commandline -f foo" to skip queue and execute immediately,
2024-04-08) fixed the execution order of some bindings but was partially
backed out in 5ba21cd29 (Send repaint requests through the input queue again,
2024-04-19) because repainting outside toplevel yields surprising results
(wrong $status etc).
Transient prompts wants to first repaint and then execute some more readline
commands, all within a single binding. This was broken by the second commit
because that one defers the repaint until after the binding has finished.
Work around this problem by deferring input events again while a readline
event was queued. This is closest to the historical behavior.
The implementation feels hacky; we might find odd situations.
For example,
commandline -f repaint end-of-line
set token (commandline -t)
sets the wrong token.
Probably not a very important case. We could throw an error or make it work
by letting "commandline -t" drain the input queue.
That seems too complicated, better change repaints to not use the input queue
(and fake $status etc). Let's try to do that in future.
Closes#10492