The translation is fairly direct though it adds some duplication, for example
there are multiple "match" statements that mimic function overloading.
Rust has no overloading, and we cannot have generic methods in the Node trait
(due to a Rust limitation, the error is like "cannot be made into an object")
so we include the type name in method names.
Give clients like "indent_visitor_t" a Rust companion ("IndentVisitor")
that takes care of the AST traversal while the AST consumption remains
in C++ for now. In future, "IndentVisitor" should absorb the entirety of
"indent_visitor_t". This pattern requires that "fish_indent" be exposed
includable header to the CXX bridge.
Alternatively, we could define FFI wrappers for recursive AST traversal.
Rust requires we separate the AST visitors for "mut" and "const"
scenarios. Take this opportunity to concretize both visitors:
The only client that requires mutable access is the populator. To match the
structure of the C++ populator which makes heavy use of function overloading,
we need to add a bunch of functions to the trait. Since there is no other
mutable visit, this seems acceptable.
The "const" visitors never use "will_visit_fields_of()" or
"did_visit_fields_of()", so remove them (though this is debatable).
Like in the C++ implementation, the AST nodes themselves are largely defined
via macros. Union fields like "Statement" and "ArgumentOrRedirection"
do currently not use macros but may in future.
This commit also introduces a precedent for a type that is defined in one
CXX bridge and used in another one - "ParseErrorList". To make this work
we need to manually define "ExternType".
There is one annoyance with CXX: functions that take explicit lifetime
parameters require to be marked as unsafe. This makes little sense
because functions that return `&Foo` with implicit lifetime can be
misused the same way on the C++ side.
One notable change is that we cannot directly port "find_block_open_keyword()"
(which is used to compute an error) because it relies on the stack of visited
nodes. We cannot modify a stack of node references while we do the "mut"
walk. Happily, an idiomatic solution is easy: we can tell the AST visitor
to backtrack to the parent node and create the error there.
Since "node_t::accept_base" is no longer a template we don't need the
"node_visitation_t" trampoline anymore.
The added copying at the FFI boundary makes things slower (memcpy dominates
the profile) but it's not unusable, which is good news:
$ hyperfine ./fish.{old,new}" -c 'source ../share/completions/git.fish'"
Benchmark 1: ./fish.old -c 'source ../share/completions/git.fish'
Time (mean ± σ): 195.5 ms ± 2.9 ms [User: 190.1 ms, System: 4.4 ms]
Range (min … max): 193.2 ms … 205.1 ms 15 runs
Benchmark 2: ./fish.new -c 'source ../share/completions/git.fish'
Time (mean ± σ): 677.5 ms ± 62.0 ms [User: 665.4 ms, System: 10.0 ms]
Range (min … max): 611.7 ms … 805.5 ms 10 runs
Summary
'./fish.old -c 'source ../share/completions/git.fish'' ran
3.47 ± 0.32 times faster than './fish.new -c 'source ../share/completions/git.fish''
Leftovers:
- Enum variants are still snakecase; I didn't get around to changing this yet.
- "ast_type_to_string()" still returns a snakecase name. This could be
changed since it's not user visible.
This is basically a subset of type, so we might as well.
To be clear this is `command -s` and friends, if you do `command grep` that's
handled as a keyword.
One issue here is that we can't get "one path or not" because I don't
know how to translate a maybe_t? Do we need to make it a shared_ptr instead?
Vi visual mode selection highlighting behaves unexpectedly when the selection
foreground and background in the highlight spec don't match. The following
unexpected behaviors are:
* The foreground color is not being applied when defined by the
`fish_color_selection` variable.
* `set_color` options (e.g., `--bold`) would not be applied under the cursor
when selection begins in the middle of the command line or when the cursor
moves forward after visually selecting text backward.
With this change, visual selection respects the foreground color and any
`set_color` options are applied consistently regardless of where visual
selection begins and the position of the cursor during selection.
Most of it is duplicated, hence untested.
Functions like mbrtowc are not exposed by the libc crate, so declare them
ourselves.
Since we don't know the definition of C macros, add two big hacks to make
this work:
1. Replace MB_LEN_MAX and mbstate_t with values (resp types) that should
be large enough for any implementation.
2. Detect the definition of MB_CUR_MAX in the build script. This requires
more changes for each new libc. We could also use this approach for 1.
Additionally, this commit brings a small behavior change to
read_unquoted_escape(): we cannot decode surrogate code points like \UDE01
into a Rust char, so use � (\UFFFD, replacement character) instead.
Previously, we added such code points to a wcstring; looks like they were
ignored when printed.
wcs2string converts a wide string to a narrow one. The result is
null-terminated and may also contain interior null-characters.
std::string allows this.
Rust's null-terminated string, CString, does not like interior null-characters.
This means we will need to use Vec<u8> or OsString for the places where we
use interior null-characters.
On the other hand, we want to use CString for places that require a
null-terminator, because other Rust types don't guarantee the null-terminator.
Turns out there is basically no overlap between the two use cases, so make
it two functions. Their equivalents in Rust will have the same name, so
we'll only need to adjust the type when porting.
This can be triggered on linux with:
```js
import { spawn } from 'child_process';
const shell = spawn('/home/alfa/dev/fish-shell/build-c++/fish', []);
```
Under node 19.8.1.
*No clue* how that happens, but since this is a workaround we shall
skip it.
Another from the "why are we asserting instead of doing something
sensible" department.
The alternative is to make exit() and return() compute their own exit
code, but tbh I don't want any *other* builtin to hit this either?
Fixes#9659
This shows some of the ugliness of the rust borrow checker when it comes to
safely implementing any sort of recursive access and the need to be overly
explicit about which types are actually used across threads and which aren't.
We're forced to use an `Arc` for `ItemMaker` (née `item_maker_t`) because
there's no other way to make it clear that its lifetime will last longer than
the FdMonitor's. But once we've created an `Arc<T>` we can't call
`Arc::get_mut()` to get an `&mut T` once we've created even a single weak
reference to the Arc (because that weak ref could be upgraded to a strong ref at
any time). This means we need to finish configuring any non-atomic properties
(such as `ItemMaker::always_exit`) before we initialize the callback (which
needs an `Arc<ItemMaker>` to do its thing).
Because rust doesn't like self-referential types and because of the fact that we
now need to create both the `ItemMaker` and the `FdMonitorItem` separately
before we set the callback (at which point it becomes impossible to get a
mutable reference to the `ItemMaker`), `ItemMaker::item` is dropped from the
struct and we instead have the "constructor" for `ItemMaker` take a reference to
an `FdMonitor` instance and directly add itself to the monitor's set, meaning we
don't need to move the item out of the `ItemMaker` in order to add it to the
`FdMonitor` set later.
CXX does not allow generic types like maybe_t. When porting a C++ function
that returns maybe_t to Rust, we return std::unique_ptr instead. Let's make
the transition more seamless by allowing to convert back to maybe_t implicitly.
* wutil: Rewrite `wrealpath` in Rust
* Reduce use of FFI types in `wrealpath`
* Addressed PR comments regarding allocation
* Replace let binding assignment with regular comparison
More ugliness with types that cxx bridge can't recognize as being POD. Using
pointers to get/set `termios` values with an assert to make sure we're using
identical definitions on both sides (in cpp from the system headers and in rust
from the libc crate as exported).
I don't know why cxx bridge doesn't allow `SharedPtr<OpaqueRustType>` but we can
work around it in C++ by converting a `Box<T>` to a `shared_ptr<T>` then convert
it back when it needs to be destructed. I can't find a clean way of doing it
from the cxx bridge wrapper so for now it needs to be done manually in the C++
code.
Types/values that are drop-in ready over ffi are renamed to match the old cpp
names but for types that now differ due to ffi difficulties I've left the `_ffi`
in the function names to indicate that this isn't the "correct" way of using the
types/methods.
The way cxx bridge works, it doesn't recognize any types from another module as
being shared cxx bridge types with generations native to both C++ and Rust,
meaning every module that was going to use function pointers would have to
define its own `c_void` type (because cxx bridge doesn't recognize any of
libc::c_void, std::ffi::c_void, or autocxx::c_void).
FFI on other platforms has long used the equivalent of `uint8_t *` as an
alternative to `void *` for code where `void` was not available or was
undesirable for some reason. We can join the club - this way we can always use
`* {const|mut} u8` in our rust code and `uint8_t *` in our C++ code to pass
around parameters or values over the C abi.
I needed to rename some types already ported to rust so they don't clash with
their still-extant cpp counterparts. Helper ffi functions added to avoid needing
to dynamically allocate an FdMonitorItem for every fd (we use dozens per basic
prompt).
I ported some functions from cpp to rust that are used only in the backend but
without removing their existing cpp counterparts so cpp code can continue to use
their version of them (`wperror` and `make_detached_pthread`).
I ran into issues porting line-by-line logic because rust inverts the behavior
of `std::remove_if(..)` by making it (basically) `Vec::retain_if(..)` so I
replaced bools with an explict enum to make everything clearer.
I'll port the cpp tests for this separately, for now they're using ffi.
Porting closures was ugly. It's nothing hard, but it's very ugly as now each
capturing lambda has been changed into an explicit struct that contains its
parameters (that needs to be dynamically allocated), a standalone callback
(member) function to replace the lambda contents, and a separate trampoline
function to call it from rust over the shared C abi (not really relevant to
x86_64 w/ its single calling convention but probably needed on other platforms).
I don't like that `fd_monitor.rs` has its own `c_void`. I couldn't find a way to
move that to `ffi.rs` but still get cxx bridge to consider it a shared POD.
Every time I moved it to a different module, it would consider it to be an
opaque rust type instead. I worry this means we're going to have multiple
`c_void1`, `c_void2`, etc. types as we continue to port code to use function
pointers.
Also, rust treats raw pointers as foreign so you can't do `impl Send for * const
Foo` even if `Foo` is from the same module. That necessitated a wrapper type
(`void_ptr`) that implements `Send` and `Sync` so we can move stuff between
threads.
The code in fd_monitor_t has been split into two objects, one that is used by
the caller and a separate one associated with the background thread (this is
made nice and clean by rust's ownership model). Objects not needed under the
lock (i.e. accessed by the background thread exclusively) were moved to the
separate `BackgroundFdMonitor` type.
Keeps the location of original function definition, and also stores
where it was copied. `functions` and `type` show both locations,
instead of none. It also retains the line numbers in the stack trace.
By default, fish does not complete files that have leading dots, unless the
wildcard itself has a leading dot. However this also affected completions;
for example `git add` would not offer `.gitlab-ci.yml` because it has a
leading dot.
Relax this for custom completions. Default file expansion still
suppresses leading dots, but now custom completions can create
leading-dot completions and they will be offered.
Fixes#3707.
When we draw the prompt, we move the cursor to the actual
position *we* think it is by issuing a carriage return (via
`move(0,0)`), and then going forward until we hit the spot.
This helps when the terminal and fish disagree on the width of the
prompt, because we are now definitely in the correct place, so we can
only overwrite a bit of the prompt (if it renders longer than we
expected) or leave space after the prompt. Both of these are benign in
comparison to staircase effects we would otherwise get.
Unfortunately, midnight commander ("mc") tries to extract the last
line of the prompt, and does so in a way that is overly naive - it
resets everything to 0 when it sees a `\r`, and doesn't account for
cursor movement. In effect it's playing a terminal, but not committing
to the bit.
Since this has been an open request in mc for quite a while, we hack
around it, by checking the $MC_SID environment variable.
If we see it, we skip the clearing. We end up most likely doing
relative movement from where we think we are, and in most cases it
should be *fine*.
This is early work but I guess there's no harm in pushing it?
Some thoughts on the conventions:
Types that live only inside Rust follow Rust naming convention
("FeatureMetadata").
Types that live on both sides of the language boundary follow the existing
naming ("feature_flag_t").
The alternative is to define a type alias ("using feature_flag_t =
rust::FeatureFlag") but that doesn't seem to be supported in "[cxx::bridge]"
blocks. We could put it in a header ("future_feature_flags.h").
"feature_metadata_t" is a variant of "FeatureMetadata" that can cross
the language boundary. This has the advantage that we can avoid tainting
"FeatureMetadata" with "CxxString" and such. This is an experimental approach,
probably not what we should do in general.
The initial port of feature flags requires a global initialization. Since
fish_indent accesses feature flags, let's make sure to initialize them here.
In future, we can stop initializing things fish_indent doesn't need (like
the topic monitor) but that's no big deal. Global initialization should
always be a benign addition.
The original implementation without the test took me 3 hours (first time
seriously looking into this)
The functions take "wcharz_t" for smooth integration with existing C++ callers.
This is at the expense of Rust callers, which would prefer "&wstr". Would be
nice to declare a function parameter that accepts both but I don't think
that really works since "wcharz_t" drops the lifetime annotation.
This works around an autocxx limitations where different types cannot
have the same name even if they live in different namespace.
ast::job_t conflicts with job_t.
This translated ctrl-k to "\v", which is a "vertical tab", and ctrl-l
to "\f" and ctrl-g to "\a".
There is no "vertical tab" or "alarm" or "\f" *key*, so these
shouldn't be translated. Just drop these and call them `\ck` and such.
(vertical tab specifically is utterly useless and I would be okay with
dropping it entirely, I have never seen it used anywhere)
Commit 3b30d92b6 (Commit transient edit when closing pager, 2022-08-31)
inadvertently introduced two regressions to history search:
1. It made Escape keeps the selected history entry,
instead of restoring the commandline before history search.
2. It made history search commands add undo entries.
Fix both of this issues.
Inadvertently broken in a2d816710f,
this made `cd .` no longer offer `cd ../` (same for general file completions
like `ls .`, which only offers dotfiles)
This meant we didn't actually do our weird en/decoding scheme for e.g.
a C locale, which meant that, when you then switch to a proper locale
the previous variables were broken.
I don't know how to test this automatically - none of my attempts seem
to ever *fail* with the old code, here's what you'd do manually:
- Run fish with an actual C locale (LC_ALL=C
fish_allow_singlebyte_locale=1 fish)
- `set -gx foo 💩`
- `set -e LC_ALL`
- `echo $foo` outputs "💩" if it works and "ð⏎" if it's broken.
Fixes#2613
This means cleaning out old universal variables is now just:
```fish
abbr --erase (abbr --list)
```
which makes upgrading much easier.
Note that this erases the currently defined variable and/or any
universal. It doesn't stop at the former because that makes it *easy*
to remove the universals (no running `abbr --erase` twice), and it
doesn't care about globals because, well, they would be gone on
restart anyway.
Fixes#9468.
Like I mentioned in #9089, 12 entries is a bit few.
So, instead, we do like we do for completions before disclosing and
pick half the screen (but at least X, in this case 12).
This avoids filling the entire screen, and will avoid an unsightly "X
more entries" (which requires scrolling down to fully disclose)
because it matches what the pager does.
Note: For multiline commands we can be pushed further upwards, and in
case of a multi-column layout we could fit more lines. That would
require asking the pager to fit as many as possible and give us back
the index of the last matching entry and rewinding the history search.
That's gonna be left as an exercise for later if it turns out to be necessary.
This now means `abbr --add` has two modes:
```fish
abbr --add name --function foo --regex regex
```
```fish
abbr --add name --regex regex replacement
```
This is because `--function` was seen to be confusing as a boolean flag.
Unfortunately print_hints was true *by default* - so for all builtins
that didn't pass it it would now be false instead.
This resulted in the trailer missing, which includes the line number
and context. So if you ran a script that includes `bind -M` the error
message would now just be "bind: -M: option requires an argument",
with no indication as to where.
This reverts commit 8a50d47a46.
The print_hints variable was always false, so just remove it.
This caused a cascade of other changes where the parser_t variable
becomes unused, so remove it from the call sites.
No functional change expected here.
When we insert characters that don't yet have highlighting, we use the
highlighting to the left, unless there is nothing to our left. The logic to
check if we are the leftmost character uses an overly loose comparison. Let's
make it more specific.
No functional change.
When there are multiple event handlers for a single event, we would print
the same log statement twice. Let's add the function name to make this
less confusing.
This would print
```
abbr -a -- dotdot --regex ^\\.\\.+\$ --function multicd
```
which expands "dotdot" to "--regex ^\\.\\.+\$...".
Instead, we move the name to right before the replacement, and move
the `--` before that:
```
abbr -a --regex ^\\.\\.+\$ --function -- dotdot multicd
```
It might be possible to improve that, but this at least round-trips.
Historical behavior is to stop option parsing at the first non-option argument.
Since we have added more options, it seemed impractical to keep that behavior.
However people are using options in their abbr expansions ("abbr e emacs
-nw"). To support this, we ignore options. However, we only ignore them
if they are not valid "abbr" options. Let's ignore all options in the
expansion definition, which is a small price to pay to keep most existing
configurations working.
Fixes#9410
This does not fix other cases which used to work, like
abbr x -unknown
Those are hopefully not used by anyone, so I don't think we need to maintain
support for that.
Enhances abbreviations with extra features
- global abbreviations
- trigger on regex match as alternative to literal match
- the ability to expand abbreviations with a user-defined function
- the ability to set cursor position after expansion
Also default the marker to '%'. So you may write:
abbr -a L --position anywhere --set-cursor "% | less"
or set an explicit marker:
abbr -a L --position anywhere --set-cursor=! "! | less"
This renames abbreviation triggers from `--trigger-on entry` and
`--trigger-on exec` to `--on-space` and `--on-enter`. These names are less
precise, as abbreviations trigger on any character that terminates a word
or any key binding that triggers exec, but they're also more human friendly
and that's a better tradeoff.
set-cursor enables abbreviations to specify the cursor location after
expansion, by passing in a string which is expected to be found in the
expansion. For example you may create an abbreviation like `L!`:
abbr L! --position anywhere --set-cursor ! "! | less"
and the cursor will be positioned where the "!" is after expansion, with
the "| less" appearing to its right.
This adds support for the `--function` option of abbreviations, so that the
expansion of an abbreviation may be generated dynamically via a fish
function.
Prior to this change, abbreviations were stored as fish variables, often
universal. However we intend to add additional features to abbreviations
which would be very awkward to shoe-horn into variables.
Re-implement abbreviations using a builtin, managing them internally.
Existing abbreviations stored in universal variables are still imported,
for compatibility. However new abbreviations will need to be added to a
function. A follow-up commit will add it.
Now that abbr is a built-in, remove the abbr function; but leave the
abbr.fish file so that stale files from past installs do not override
the abbr builtin.
This allows adjusting a pattern string so that it matches an entire
string, by wrapping the regex in a group like ^(?:...)$
This is a workaround for the fact that PCRE2_ENDANCHORED is unavailable
on PCRE2 prior to 2017, so we have to adjust the pattern instead.
Also introduce an overload of match() which creates its own
match_data_t.
We have had multiple crashes for relative CDPATH entries. Commit 5e274066e
(Always return absolute path in path_get_cdpath, 2019-10-17) tried to fix
all of them but it failed to do justice to its title. Let's fix this to
actually return absolute paths, always. Take care to to normalize the path
because it is used for autosuggestions. The normalization is mostly relevant
for CDPATH=. (the default) but it doesn't hurt others.
Closes#9407