257 commits

Author SHA1 Message Date
ridiculousfish
d0c902a548 Adopt wstr::split in more places
This simplifies some code that was written before wstr::split existed.
2023-04-23 19:34:52 -07:00
ridiculousfish
fa39113bc6 Tweak the behavior of wstr::split to better match C++
Prior to this change, wstr::split had two weird behaviors:

1. Splitting an empty string would yield nothing, rather than an empty
   string.
2. Splitting a string with the separator character as last character
   would not yield an empty string.

For example L!("x:y:").split(':') would return ["x", "y"] instead of
what it does in C++, which is ["x", "y", ""].

Fix these.
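
As an illustration of the target behavior, the standard library's `str::split` already keeps empty fields in both cases. This is a sketch on `&str`, not on fish's `wstr` type, and `split_keep_empty` is a hypothetical helper name.

```rust
/// Split `s` about `sep`, keeping empty fields.
/// This relies on std `str::split`, which yields an empty string for an
/// empty input and a trailing empty string after a final separator.
fn split_keep_empty(s: &str, sep: char) -> Vec<&str> {
    s.split(sep).collect()
}
```

Under these semantics `split_keep_empty("x:y:", ':')` has three fields and `split_keep_empty("", ':')` has one.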
2023-04-23 19:33:10 -07:00
ridiculousfish
de8288634a Remove Arc from the global abbreviation set
This wasn't needed.
2023-04-23 15:35:05 -07:00
ridiculousfish
705874f2e4 Revert "Warn about unescape_string_xxx() behavior (and tweak slightly)"
This reverts commit 76dc849fca.

The warning added in that commit is incorrect. The functions
unescape_string_url and unescape_string_var will not panic, because
char_at() returns 0 if the index is equal to the string's length.
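
A minimal sketch of the behavior described here, assuming a slice of chars as the string representation. This `char_at` is a hypothetical stand-in, not the actual fish implementation.

```rust
/// Hypothetical stand-in for char_at(): reading at `index == s.len()`
/// returns '\0', mimicking a read of a C string's nul terminator.
/// Indices beyond the length still panic via normal slice indexing.
fn char_at(s: &[char], index: usize) -> char {
    if index == s.len() {
        '\0'
    } else {
        s[index]
    }
}
```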
2023-04-23 15:28:46 -07:00
ridiculousfish
009650b7b5 Revert "Remove unsafe from exit_without_destructors()"
This reverts commit f9c92753c4.

This commit attempted to replace exit_without_destructors() with
std::process::exit; however this is wrong for two reasons:

1. std::process::exit() runs Rust runtime cleanup stuff we don't want
2. std::process::exit() invokes destructors, meaning atexit handlers,
   which we don't want.
2023-04-23 15:23:12 -07:00
Mahmoud Al-Qudsi
76dc849fca Warn about unescape_string_xxx() behavior (and tweak slightly)
The type system no longer guarantees that the input string is nul-terminated,
meaning accessing beyond the range-checked `i` a char-at-a-time is no longer
safe. (In C++, we would either be using a plain C string which is always
nul-terminated or we would be using (w)string::cstr() which similarly grants
access to its nul-terminated buffer.)

Aside from that, there's no need to explicitly check `if c2 == '\0'` because
'\0' is not a valid hex digit so the `?` tacked on to `convert_hex_digit(c2)?`
will abort and return `None` anyway.

convert_hex_digit() is not appreciably faster than char::to_digit(16) and makes
the code less maintainable since it encodes certain assumptions; since it's also
not used consistently just drop it in favor of the std fn.

Since the output string (per the decode logic) is always shorter than or equal
to the input string, just reserve the input string size upfront to prevent vec
reallocations.
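
The points above can be sketched as follows. This is an illustrative percent-decoder, not the actual unescape_string_url code; the name `unescape_percent` and its exact behavior are assumptions.

```rust
/// Decode `%XX` escapes. Returns None on a malformed or truncated escape.
fn unescape_percent(input: &str) -> Option<String> {
    let chars: Vec<char> = input.chars().collect();
    // The decoded output is never longer than the input, so reserve once.
    let mut out = String::with_capacity(input.len());
    let mut i = 0;
    while i < chars.len() {
        if chars[i] == '%' {
            // `get` returns None past the end and to_digit(16) returns None
            // for any non-hex character (including '\0'), so `?` covers
            // both cases without an explicit check.
            let hi = chars.get(i + 1)?.to_digit(16)?;
            let lo = chars.get(i + 2)?.to_digit(16)?;
            out.push(char::from_u32(hi * 16 + lo)?);
            i += 3;
        } else {
            out.push(chars[i]);
            i += 1;
        }
    }
    Some(out)
}
```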
2023-04-23 15:04:37 -05:00
Mahmoud Al-Qudsi
f9c92753c4 Remove unsafe from exit_without_destructors()
std::process::exit() already does what we need and it is safe to call (since
it is not unsafe for destructors not to be called).
2023-04-23 13:05:56 -05:00
Mahmoud Al-Qudsi
3a2033b992 Fix rust version of is_wsl() check (#9746)
Somewhat counter-intuitively, this code is active when compiling under *Linux*
and is always false when compiling under Windows. The logic was incorrectly
reversed before (it's easier to reason about when you realize that fish doesn't
even compile under Windows because it uses tons of libc functions).

As the code was actually never compiled, it wasn't actually tested for validity
either and there were some issues that prevented it from compiling that have
since been fixed. The logic has also been adjusted a bit to make it possible to
use the rust-native int parsing instead of `libc::strtod()`.

The code has been changed to use `once_cell::race::OnceBool` instead of
`once_cell::sync::Lazy<T>` which imposes a greater runtime burden with locking
and other overhead. We don't care if the code runs more than once on init (if
calls were to race, though they probably don't) - just that the code isn't
subsequently executed on each call. The `once_cell::race` module is a better fit
here, though it doesn't expose the ergonomic `Lazy<T>` façade around its types.
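
The "may run more than once on init, but not on every call" idea can be sketched with a plain atomic from the standard library. This is not the once_cell implementation and not the fish code; `CACHED` and `cached_bool` are hypothetical names.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

const UNSET: u8 = 0;
const FALSE: u8 = 1;
const TRUE: u8 = 2;

// Lock-free cache for a single boolean result.
static CACHED: AtomicU8 = AtomicU8::new(UNSET);

/// Return the cached value, computing it if it is not yet set.
/// Racing threads may each run `compute`, but only the first stored
/// result is kept, and later calls skip `compute` entirely.
fn cached_bool(compute: impl FnOnce() -> bool) -> bool {
    match CACHED.load(Ordering::Acquire) {
        TRUE => true,
        FALSE => false,
        _ => {
            let new = if compute() { TRUE } else { FALSE };
            let winner = match CACHED.compare_exchange(
                UNSET,
                new,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => new,
                Err(existing) => existing,
            };
            winner == TRUE
        }
    }
}
```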
2023-04-23 12:28:23 -05:00
Mahmoud Al-Qudsi
ff28f29e8f Move thread stuff out of common.rs (#9745)
is_main_thread() and co were previously ported to threads.rs, so remove the
duplicate code and move everything else related to threads there as well. No
need for common.rs to be as long as our old common.cpp!

I left #[deprecated] stubs in common.rs to help redirect anyone porting code
over; we can remove them after the port has finished.

Additionally, the fork guards had previously been left as a todo!() item but I
ported that over. They're all called from the now-central threads::init()
function so there isn't a need to call each individual thread-management-fn
manually.

The decision was made a while back to try and embrace/use the native rust thread
functionality and utilities so the manual thread management code has been ripped
out and was replaced with code that marshals the native rust values instead. The
values won't line up with what the C++ code sees, but it never lined up anyway
since each was using a separate counter to keep track of the values.
2023-04-23 12:26:10 -05:00
Johannes Altmanninger
0fbefc6be2 Make IO buffer struct elements public again 2023-04-22 22:25:34 +02:00
Johannes Altmanninger
1bffa823d8 Allow to pass slices of owned strings to trace_if_enabled 2023-04-22 22:25:34 +02:00
Johannes Altmanninger
05ec1039ed Rename autoclose_pipes_t to AutoClosePipes 2023-04-22 22:25:34 +02:00
Johannes Altmanninger
48e728e9fb event: make some types public again 2023-04-22 22:25:34 +02:00
Johannes Altmanninger
6c07af9343 Shorthand for escaping with default options
Should probably do this on the C++ side too.
2023-04-22 22:25:34 +02:00
Johannes Altmanninger
19fe0f6a91 AST: implement try_source_range for union fields
Still not sure where the union fields are going.
I don't think they should implement Node.
2023-04-22 22:25:34 +02:00
Johannes Altmanninger
4c46faea99 Make ParsedSource members public again 2023-04-22 22:25:34 +02:00
Johannes Altmanninger
29891cf771 Finish and fix DirIter API 2023-04-22 22:25:34 +02:00
Johannes Altmanninger
07cc33e7aa parse_util: deduplicate append_syntax_error macro 2023-04-22 22:25:34 +02:00
Johannes Altmanninger
56ad7fe0e5 Silence some more clippy lints
They are at odds with some direct translations.
2023-04-22 22:25:34 +02:00
Johannes Altmanninger
ec176dc07e Port path.h 2023-04-21 13:57:29 +02:00
Johannes Altmanninger
629cbe0115 Env stubs for path port 2023-04-21 13:57:29 +02:00
Johannes Altmanninger
eb1598ea9a Port parser_keywords
This drops some of the optimizations, we should probably add them back.
2023-04-21 13:57:29 +02:00
Johannes Altmanninger
12ce42a2f9 Rename kw() to keyword() also in C++ 2023-04-19 22:43:36 +02:00
Johannes Altmanninger
09ffac5a0a Port parse_util_compute_indents 2023-04-19 10:35:22 +02:00
Johannes Altmanninger
c25cc8df5d Adopt rusty parse_util_unescape_wildcards 2023-04-19 10:32:16 +02:00
Johannes Altmanninger
12afb320a3 Port parse_util
Except for the indent visitor bits.

Tests for parse_util_detect_errors* are not ported yet because they depend
on expand.h (and operation_context.h which depends on env.h).
2023-04-19 01:03:16 +02:00
Johannes Altmanninger
36ba912779 Make some names public 2023-04-19 01:03:16 +02:00
Johannes Altmanninger
dc6aead17b ast.rs: add Leaf::has_source() convenience function for now
This is exposed by our FFI bridge for convenience, so this makes porting
easier.
2023-04-19 01:03:16 +02:00
Johannes Altmanninger
966dc0d997 Fix how we pass error list output parameter when parsing AST
This makes it more convenient to pass None.
2023-04-19 01:03:16 +02:00
Johannes Altmanninger
22c8e9f60d Don't leak ParseErrorList FFI crutch type into Rust
Just like 16ea4380c (redirection.rs: don't leak FFI type into Rust code,
2023-04-09).
2023-04-19 01:03:16 +02:00
Johannes Altmanninger
fc5e97e55e Expose u32 source offsets as usize
Computations should use usize, so this makes things more convenient.
Post-FFI we can make SourceRange fields private, to enforce this more easily.
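
A rough sketch of the idea, assuming the range stores a `start` and a `length` as u32. The field and accessor names here are illustrative and may not match the real SourceRange.

```rust
/// Illustrative source range with compact u32 storage.
struct SourceRange {
    start: u32,
    length: u32,
}

impl SourceRange {
    /// Offsets are exposed as usize so callers can index without casts.
    fn start(&self) -> usize {
        self.start as usize
    }

    fn length(&self) -> usize {
        self.length as usize
    }

    fn end(&self) -> usize {
        self.start() + self.length()
    }
}
```

With accessors like these, slicing reads as `&src[r.start()..r.end()]`.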
2023-04-19 01:03:16 +02:00
Johannes Altmanninger
2ca27d2c5b Implement Iterator for Tokenizer 2023-04-19 01:03:16 +02:00
Johannes Altmanninger
6ede7f8009 Delete wcstring_list_t
We don't want it in Rust. Remove it to smoothen the transition.
2023-04-19 01:03:16 +02:00
Johannes Altmanninger
fdeb0d9f06 Port the rest of wcstringutil 2023-04-18 12:54:19 +02:00
Fabian Boehm
3bfe798dbb Fix read_blocked
This caused math to assert out because it never wrote into the buffer.

Now, presumably it wrote somewhere but I don't know where, so fixing
this seems like a good idea.

Fixes #9735.
2023-04-17 17:28:24 +02:00
ridiculousfish
1bf29a5e13 Support constructing a wcstring_list_ffi_t from Rust
This allows passing a vector of strings from Rust to C++
2023-04-16 13:36:13 -07:00
ridiculousfish
f0360efbfa Add path_make_canonical in Rust 2023-04-16 13:36:13 -07:00
ridiculousfish
eecc796b04 Add a widestring split() function
This allows splitting widestrings about a char, similar to C++
split_string.
2023-04-16 13:36:13 -07:00
ridiculousfish
621a3a6a8b Add Rust support for null terminated arrays
This adds support for "null-terminated arrays of nul-terminated strings"
as used in execve, etc.
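
For illustration, such an array can be built from the standard library's CString. This is a sketch under the assumption of narrow C strings; `NullTerminatedArray` and `count_entries` are hypothetical names, not the types added by this commit.

```rust
use std::ffi::CString;
use std::os::raw::c_char;
use std::ptr;

/// Owns the strings plus a pointer array that ends in a null pointer.
struct NullTerminatedArray {
    // Kept alive so that the pointers below stay valid.
    _strings: Vec<CString>,
    pointers: Vec<*const c_char>,
}

impl NullTerminatedArray {
    fn new(items: &[&str]) -> Self {
        let strings: Vec<CString> = items
            .iter()
            .map(|s| CString::new(*s).expect("interior nul byte"))
            .collect();
        // Each CString owns a heap buffer, so these pointers remain valid
        // when the Vec<CString> is moved into the struct.
        let mut pointers: Vec<*const c_char> =
            strings.iter().map(|s| s.as_ptr()).collect();
        pointers.push(ptr::null());
        Self { _strings: strings, pointers }
    }

    /// Pointer suitable for passing as argv or envp.
    fn as_ptr(&self) -> *const *const c_char {
        self.pointers.as_ptr()
    }
}

/// Count entries the way a C consumer would: walk until the null pointer.
fn count_entries(mut p: *const *const c_char) -> usize {
    let mut n = 0;
    // Safety: the caller must pass an array terminated by a null pointer.
    unsafe {
        while !(*p).is_null() {
            n += 1;
            p = p.add(1);
        }
    }
    n
}
```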
2023-04-16 13:36:13 -07:00
Xiretza
ed3fdaa665 Change read_blocked parameter type to RawFd for clarity 2023-04-16 22:26:46 +02:00
Xiretza
14fc11b5b8 wcstod: adjust tests for new implementation 2023-04-16 22:26:46 +02:00
Xiretza
aab2f660a7 Port math builtin, tinyexpr and wcstod_underscores to Rust 2023-04-16 22:26:46 +02:00
Xiretza
cc744d30c0 io: add FFI wrappers for io_streams_t fields 2023-04-16 22:26:46 +02:00
Xiretza
ba5e1dfb69 builtins: port more error messages 2023-04-16 22:26:46 +02:00
Xiretza
be2ea8edf0 wcstod: extract wcstod_inner()
This function can be called with any char iterator, not just IntoCharIter
values.
2023-04-16 22:26:46 +02:00
Xiretza
6b687adb40 Implement IntoCharIter for &[char] 2023-04-16 22:26:46 +02:00
Fabian Boehm
a91689e211 Remove unneeded & 2023-04-16 22:22:04 +02:00
ridiculousfish
ead329db60 Replace a bunch of from_ffi with as_wstr calls
from_ffi copies a CxxWString into a new Rust WString, but as_wstr simply
gets the slice of chars directly.

Too many string types!
2023-04-16 12:50:53 -07:00
Johannes Altmanninger
971d257e67 Port AST to Rust
The translation is fairly direct though it adds some duplication, for example
there are multiple "match" statements that mimic function overloading.

Rust has no overloading, and we cannot have generic methods in the Node trait
(due to a Rust limitation, the error is like "cannot be made into an object")
so we include the type name in method names.
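
A toy sketch of this constraint, with made-up node types rather than the real AST: a generic `accept<V>` method would prevent using `dyn Node`, so the visitor gets one named method per node type.

```rust
// Hypothetical node types for illustration only.
struct Argument {
    text: String,
}
struct Redirection {
    target: String,
}

/// One named method per node type stands in for C++ overloads.
trait NodeVisitor {
    fn visit_argument(&mut self, node: &Argument);
    fn visit_redirection(&mut self, node: &Redirection);
}

trait Node {
    // A generic `fn accept<V: NodeVisitor>(&self, v: &mut V)` here would
    // make the trait unusable as `dyn Node`, so take a trait object.
    fn accept(&self, visitor: &mut dyn NodeVisitor);
}

impl Node for Argument {
    fn accept(&self, visitor: &mut dyn NodeVisitor) {
        visitor.visit_argument(self);
    }
}

impl Node for Redirection {
    fn accept(&self, visitor: &mut dyn NodeVisitor) {
        visitor.visit_redirection(self);
    }
}

/// Example visitor that records what it saw.
struct Collector {
    seen: Vec<String>,
}

impl NodeVisitor for Collector {
    fn visit_argument(&mut self, node: &Argument) {
        self.seen.push(format!("arg:{}", node.text));
    }
    fn visit_redirection(&mut self, node: &Redirection) {
        self.seen.push(format!("redir:{}", node.target));
    }
}

/// Visit a heterogeneous list of nodes through the object-safe trait.
fn collect(nodes: &[Box<dyn Node>]) -> Vec<String> {
    let mut collector = Collector { seen: Vec::new() };
    for node in nodes {
        node.accept(&mut collector);
    }
    collector.seen
}
```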

Give clients like "indent_visitor_t" a Rust companion ("IndentVisitor")
that takes care of the AST traversal while the AST consumption remains
in C++ for now.  In future, "IndentVisitor" should absorb the entirety of
"indent_visitor_t".  This pattern requires that "fish_indent" be exposed
as an includable header to the CXX bridge.

Alternatively, we could define FFI wrappers for recursive AST traversal.

Rust requires we separate the AST visitors for "mut" and "const"
scenarios. Take this opportunity to concretize both visitors:

The only client that requires mutable access is the populator.  To match the
structure of the C++ populator which makes heavy use of function overloading,
we need to add a bunch of functions to the trait. Since there is no other
mutable visit, this seems acceptable.

The "const" visitors never use "will_visit_fields_of()" or
"did_visit_fields_of()", so remove them (though this is debatable).

Like in the C++ implementation, the AST nodes themselves are largely defined
via macros.  Union fields like "Statement" and "ArgumentOrRedirection"
currently do not use macros but may in the future.

This commit also introduces a precedent for a type that is defined in one
CXX bridge and used in another one - "ParseErrorList".  To make this work
we need to manually define "ExternType".

There is one annoyance with CXX: functions that take explicit lifetime
parameters must be marked as unsafe. This makes little sense
because functions that return `&Foo` with implicit lifetime can be
misused the same way on the C++ side.

One notable change is that we cannot directly port "find_block_open_keyword()"
(which is used to compute an error) because it relies on the stack of visited
nodes. We cannot modify a stack of node references while we do the "mut"
walk. Happily, an idiomatic solution is easy: we can tell the AST visitor
to backtrack to the parent node and create the error there.

Since "node_t::accept_base" is no longer a template we don't need the
"node_visitation_t" trampoline anymore.

The added copying at the FFI boundary makes things slower (memcpy dominates
the profile) but it's not unusable, which is good news:

    $ hyperfine ./fish.{old,new}" -c 'source ../share/completions/git.fish'"
    Benchmark 1: ./fish.old -c 'source ../share/completions/git.fish'
      Time (mean ± σ):     195.5 ms ±   2.9 ms    [User: 190.1 ms, System: 4.4 ms]
      Range (min … max):   193.2 ms … 205.1 ms    15 runs

    Benchmark 2: ./fish.new -c 'source ../share/completions/git.fish'
      Time (mean ± σ):     677.5 ms ±  62.0 ms    [User: 665.4 ms, System: 10.0 ms]
      Range (min … max):   611.7 ms … 805.5 ms    10 runs

    Summary
      './fish.old -c 'source ../share/completions/git.fish'' ran
        3.47 ± 0.32 times faster than './fish.new -c 'source ../share/completions/git.fish''

Leftovers:
- Enum variants are still snakecase; I didn't get around to changing this yet.
- "ast_type_to_string()" still returns a snakecase name. This could be
  changed since it's not user visible.
2023-04-16 17:46:56 +02:00
Johannes Altmanninger
915db44fbd Implement printf formatting for some parser types 2023-04-16 17:46:56 +02:00