This
1. Skips access() if we only have "special" permissions like the owner
that need stat
2. Does the geteuid()/getegid() *once* outside of filter_path, if we
need it
In the extreme case of `path filter --perm user,group` it will remove
3 syscalls per file.
Drop support for history file version 1.
ParseExecutionContext no longer contains an OperationContext because in my
first implementation, ParseExecutionContext didn't have interior mutability.
We should probably try to add it back.
Add a few to-do style comments. Search for "todo!" and "PORTING".
Co-authored-by: Xiretza <xiretza@xiretza.xyz>
(complete, wildcard, expand, history, history/file)
Co-authored-by: Henrik Hørlück Berg <36937807+henrikhorluck@users.noreply.github.com>
(builtins/set)
This reduces noise in the upcoming "Port execution" commit.
I accidentally made IoStreams a "class" instead of a "struct". Would be
easy to correct that but this will be deleted soon, so I don't think we care.
* wildcard: Remove file size from the description
We no longer add descriptions for normal file completions, so this was
only ever reached if this was a command completion, and then it was
only added if the file wasn't a regular file... in which case it can't
be an executable.
So this was dead.
* Make possible_link() a maybe
This gives us the full information, not just "no" or "maybe"
* wildcard: Rationalize file/command completions
This keeps the entry_t as long as possible, and asks it, so especially
on systems with working d_type we can get by without a single stat in
most cases.
Then it guts file_get_desc, because that is only used for command
completions - we have been disabling file descriptions for *years*,
and so this is never called there.
That means we have no need to print descriptions about e.g. broken symlinks, because those are not executable.
Put together, what this means is that we, in most cases, only do
an *access(2)* call instead of a stat, because that might be checking
more permissions.
So we have the following constellations:
- If we have d_type:
- We need a stat() for every _symlink_ to get the type (e.g. dir or regular)
(this is for most symlinks, if we want to know if it's a dir or executable)
- We need an access() for every file for executables
- If we do not have d_type:
- We need a stat() for every file
- We need an lstat() for every file if we do descriptions
(i.e. just for command completion)
- We need an access() for every file for executables
As opposed to the current way, where every file gets one lstat whether
with d_type or not, and an additional stat() for links, *and* an
access.
So we go from two syscalls to one for executables.
* Some more comments
* rust link option
* rust remove size
* rust accessovaganza
* Check for .dll first for WSL
This saves quite a few checks if e.g. System32 is in $PATH (which it
is if you inherit windows paths, IIRC).
Note: Our WSL check currently fails for WSL2, where this would
be *more* important because of how abysmal the filesystem performance
on that is.
This is off by one from the C++ version.
It wasn't super obvious why this worked in the first place.
Looks like args[0] is "-" because we are invoked like
fish -c 'exec "${@}"' - "${@}"
and it looks like "-" is treated like "--" by bash, so we emulate that.
See https://github.com/fish-shell/fish-shell/issues/367#issuecomment-11740812
This function only ever returns true if target_os=linux, so we need to invert
the OS check.
In the first invocation, this function may allocate heap memory.
Clarify that this is safe.
[ja: I don't have the original commit handy so I made up the log message]
The following "Port execution" commit will use RefCell for the wait handle
store. If we hold a borrow while we are running an event (which may run
script code) there will be a borrowing conflict. Avoid this by returning
the borrow earlier.
This makes it so expand_intermediate_segment knows about the case
where it's last, only followed by a "/".
When it is, it can do without the file_id for finding links (we don't
resolve the files we get here), which allows us to remove a stat()
call.
This speeds up the case of `...*/` by quite a bit.
If that last component was a directory with 1000 subdirectories we
could skip 1000 stat calls!
One slight weirdness: We refuse to add links to directories that we already visited, even if they are the last component and we don't actually follow them. That means we can't do the fast path here either, but we do know if something is a link (if we get d_type), so it still works in common cases.
This fixes the following deadlock. The C++ functions path_get_config and
path_get_data lazily determine paths and then cache those in a C++ static
variable. The path determination requires inspecting the environment stack.
If these functions are first called while the environment stack is locked
(in this case, when fetching the $history variable) we can get a deadlock.
The fix is to call them eagerly during env_init. This can be removed once
the corresponding C++ functions are removed.
This issue caused fish_config to fail to report colors and themes.
Add a test.
Unlike our C++ tests, our Rust tests fail as soon as an assertion fails.
Whether this is desired is debatable; it seems fine for
most cases and is easier to implement.
This means that Rust tests usually don't need to print anything besides
what assert!/assert_eq! already provide.
One exception is the history merge test. Let's add a simple err!() macro to
support this. Unlike the C++ err() it does not yet print colors.
Currently all of our macros live in common.rs, to keep the import graph simple.
Notably this exposes config.h to Rust (for UVAR_FILE_SET_MTIME_HACK).
In future we should move the CMake checks into build.rs so we can potentially
get rid of CMake.
On the following "Port execution" commit, ASan will complain if we read
beyond a terminating null byte in get_autosuggestion_performer(). This is
actually working as intended but we need to appease ASan somehow..
get_pwd_slash() uses "if var.is_empty()" but it should be "if !var.is_empty()".
This wasn't a problem so far because in practice most code paths use the
get_pwd_slash() override from EnvStackImpl. The generic one is used in the
upcoming unit tests.
This adopts the Rust postfork code, bridging it from C++ exec module.
We use direct function calls for the bridge, rather than cxx/autocxx, so that we
can be sure that no memory allocations or other shenanigans are happening.
This implements the "postfork" code in Rust, including calling fork(),
exec(), and all the bits that have to happen in between. postfork lives
in the fork_exec module.
It is not yet adopted.
This introduces a new module called fork_exec, which will be for posix_spawn,
postfork, and flog_safe - stuff concerned with actually executing binaries,
and error reporting.
Add a FLOG_SAFE! macro which writes errors to the flog fd in an
async-signal-safe way. This implementation differs from the C++ in that we
allow printing integers directly - no requiring them to be converted to a
buffer first.
This makes it so
```fish
if -e foo
# do something
end
```
complains about `-e` not being a command instead of `end` being used
outside of an if-block.
That means both that `-e` could now be used as a command name (it
already can outside of `if`!) *and* that we get a better error!
The only way to get `if` to be a decorated statement now is to use `if
-h` or `if --help` specifically (with a literal option).
The same goes for switch, while and begin.
It would be possible, alternatively, to disallow `if -e` and point
towards using `test` instead, but the "unknown command" message should
already point towards using `test` more than pointing at the
"end" (that might be quite far away).
- This is untested and unused, string ownership is very much subject to change
- Ports the minimally necessary parts of complete.rs as well
- This should fix an infinite loop in `create_directory` in `path.rs`, the first
`wstat` loop only breaks if it fails with an error that's different from
EAGAIN
- wildcard_match is now closer to the original that is linked in a comment, as
pointer-arithmetic translates very poorly. The act of calling wildcard
patterns wc or wildcard is kinda confusing when wc elsewhere is widechar.
This is in regards to a comment on 290d07a833, which resulted in 46c967903d.
Those commits handled the default path when it is unset on startup.
DEFAULT_PATH is used when PATH is unset at runtime as far as I can tell.
As far as I can tell this has had the non-overidding ordering behavior since inception
(or at least 17 years ago ea998b03f2).
We don't change anything about compilation-setup, we just immediately jump to
Rust, making the eventual final swap to a Rust entrypoint very easy.
There are some string-usage and format-string differences that are generally
quite messy.
In CMake this used a `version` file in the CARGO_MANIFEST_DIR, but
relying on that is problematic due to change-detection, as if we add
`cargo-rerun-if-changed:version`, cargo would rerun every time if the file does
not exist, since cargo would expect the file to be generated by the
build-script. We could generate it, but that relies on the output of `git
describe`, whose dependencies we can only limit to anything in the
`.git`-folder, again causing unnecessary build-script runs.
Instead, this reads the `FISH_BUILD_VERSION`-env-variable at compile time
instead of the `version`-file, and falls back to calling git-describe through
the `git_version`-proc-macro. We thus do not need to deal with extraneous
build-script running.
C++ main used getopt (no w!), which appears to internally print
error-messages. The Rust version will use `wgetopter_t`, and therefore needs to
print this itself.
- It is currently never set, but will be set once `main` is ported
- `should_suppress_stderr_for_tests` used to be PROGRAM_NAME !=
TESTS_PROGRAM_NAME, but the equivalent C++ code was
`!std::wcscmp(program_name, TESTS_PROGRAM_NAME)`, and `wcsmp` returns
zero if they are equal, thus is equivalent to `==` in Rust
Similar to `time`, except that one is more common as a command.
Note that this will also allow `builtin and`, which is somewhat
useless, but then it is also useless outside of a pipeline.
Addition to #9985
This allows e.g. `foo | command time`, while still rejecting `foo | time`.
(this should really be done in the ast itself, but tbh most of
parse_util kinda should)
Fixes#9985
- "1.6.0" now supports formatting let-else statements which we use liberally,
and appears to have some fixes in regards to long-indented-lines with macros
like `wgettext_ft!`
- This commit updates the formatting so that devs with the latest stable don't
see random format-fixes upon running `cargo fmt`
Note: This *requires* an argument after the format string:
```rust
FLOGF!(debug, "foo");
```
won't compile. I think that's okay, because in that case you should
just use FLOG.
An alternative is to make it skip the sprintf.
"FLOGF!" is supposed to treat its first argument as a format
string (but doesn't because that part isn't implemented currently).
That means running something like
```rust
FLOGF!(term_support, "curses var", var_name, "=", value);
```
That would rightly just print "curses var", ignoring the other
arguments.
By contrast, FLOG! is the literal "just join these as a string"
version.
- Make CMake use the correct target-path
- Make build.rs use the correct target dir
Workspaces place it in the project root by default, the alternative to making
this change is to add a `.cargo/config.toml` file with
```toml
[build]
target-dir = "fish-rust/target"
```
Which I think is unnecessary, as we likely want to use the new location anyways.
- This allows running `cargo fmt/clippy/test/etc` from root
- Ideally the root should be the fish-rust package instead of being virtual, but
that requires changed to CMake/Corrosion. This change should instead be
completely compatible with our existing setup.
- This also means we will only have on `Cargo.lock` for all current and future
crates.
This was "function", needs to be "function*s*".
It was only an issue in the option parsing because we set cmd there
again instead of passing it. Maybe these should just be file-level constants?
This is an alternative to the very common pattern of
```rust
streams.err.append(output);
streams.err.append1('\n');
```
Which has negative performance implications, see https://github.com/fish-shell/fish-shell/pull/9229
It takes `Into<WString>` to hopefully avoid allocating anew when the argument is
a WString with leftover capacity
This removes some spurious unsafe and some imports.
Note: We don't use it in `test`, because that can be asked to check
arbitrary file descriptors, while this only checks stdout specifically.
Turns out doing `==` on Enums with values will do a deep comparison,
including the values.
So EventDescription::Signal(SIGTERM) is !=
EventDescription::Signal(SIGWINCH).
That's not what we want here, so this does a bit of a roundabout thing.
The `impl<T> Hash for &T` hashes the string itself[^1].
It is unclear if that is actually faster than just calling `keyfunc` multiple times (they should all be linear).
For context, Rust by default uses SipHash 1-3 db1b1919ba
An alternative would be to store it as raw pointers aka `*const T`, which have a cheaper hash impl.
That has a more complicated implementation + removes lifetimes.
This commit rather removes the premature optimization.
[^1]: Source: https://doc.rust-lang.org/std/ptr/fn.hash.html
- The Err-variants will be used by e.g. wildcard, so might as well change it
now.
- `create_directory` should now not infinitely loop until it fails with an
error message that isn't `EAGAIN`
Padding with an unprintable character is now disallowed, like it was for other
zero-length characters.
`string shorten` now ignores escape sequences and non-printable characters
when calculating the visible width of the ellipsis used (except for `\b`,
which is treated as a width of -1).
Previously `fish_wcswidth` returned a length of -1 when the ellipsis-str
contained any non-printable character, causing the command to poentially
print a larger width than expected.
This also fixes an integer overflows in `string shorten`'s
`max` and `max2`, when the cumulative sum of character widths turned negative
(e.g. with any non-printable characters, or `\b` after the changes above).
The overflow potentially caused strings containing non-printable characters
to be truncated.
This adds test that verify the fixed behaviour.
- Add test to verify piped string replace exit code
Ensure fields parsing error messages are the same.
Note: C++ relied upon the value of the parsed value even when `errno` was set,
that is defined behaviour we should not rely on, and cannot easilt be replicated from Rust.
Therefore the Rust version will change the following error behaviour from:
```shell
> string split --fields=a "" abc
string split: Invalid fields value 'a'
> string split --fields=1a "" abc
string split: 1a: invalid integer
```
To:
```shell
> string split --fields=a "" abc
string split: a: invalid integer
> string split --fields=1a "" abc
string split: 1a: invalid integer
```
Empty hash maps muck around with TLS. Per code review, use a boxed slice
of a tuple instead. This has the nice benefit of printing inherited vars
in sorted order.
This adopts the new function store, replacing the C++ version.
It also reimplements builtin_function in Rust, as these was too coupled to
the function store to handle in a separate commit.
DirIter had a serious bug where it would crash on an invalid path. Make it more
robust and rationalize its error handling. Move it into its own module and add
tests.
Prior to this change, we had a silly wrapper type EventDescription which wrapped
EventType, which actually described the event.
Remove this wrapper and rename EventType to EventDescription (since it describes
more than just the type of event).
The RETURN_IN_ORDER argparse mode (enabled via leading '-') causes non-options
(i.e. positionals) to be returned intermixed with options in the original order,
instead of being permuted to the end. Such positionals are identified via the
option sentinel of char code 1. Use a real named constant for this return,
rather than weird stuff like '\u{1}'
This also allows scoped feature tests that makes testing feature flags thread-safe.
As in you can guarantee that the test actually has the correct feature flag
value, regardless of which other tests are running in parallell.