Commit graph

317 commits

Author SHA1 Message Date
Johannes Altmanninger
12ce42a2f9 Rename kw() to keyword() also in C++ 2023-04-19 22:43:36 +02:00
Johannes Altmanninger
e4f6169a01 clang-format C++ files
Forgot to run this after the wcstring_list_t -> std::vector<wcstring> rename.
2023-04-19 22:43:36 +02:00
Johannes Altmanninger
6ede7f8009 Delete wcstring_list_t
We don't want it in Rust. Remove it to smoothen the transition.
2023-04-19 01:03:16 +02:00
Johannes Altmanninger
971d257e67 Port AST to Rust
The translation is fairly direct though it adds some duplication, for example
there are multiple "match" statements that mimic function overloading.

Rust has no overloading, and we cannot have generic methods in the Node trait
(due to a Rust limitation, the error is like "cannot be made into an object")
so we include the type name in method names.

Give clients like "indent_visitor_t" a Rust companion ("IndentVisitor")
that takes care of the AST traversal while the AST consumption remains
in C++ for now.  In future, "IndentVisitor" should absorb the entirety of
"indent_visitor_t".  This pattern requires that "fish_indent" be exposed
includable header to the CXX bridge.

Alternatively, we could define FFI wrappers for recursive AST traversal.

Rust requires we separate the AST visitors for "mut" and "const"
scenarios. Take this opportunity to concretize both visitors:

The only client that requires mutable access is the populator.  To match the
structure of the C++ populator which makes heavy use of function overloading,
we need to add a bunch of functions to the trait. Since there is no other
mutable visit, this seems acceptable.

The "const" visitors never use "will_visit_fields_of()" or
"did_visit_fields_of()", so remove them (though this is debatable).

Like in the C++ implementation, the AST nodes themselves are largely defined
via macros.  Union fields like "Statement" and "ArgumentOrRedirection"
do currently not use macros but may in future.

This commit also introduces a precedent for a type that is defined in one
CXX bridge and used in another one - "ParseErrorList".  To make this work
we need to manually define "ExternType".

There is one annoyance with CXX: functions that take explicit lifetime
parameters require to be marked as unsafe. This makes little sense
because functions that return `&Foo` with implicit lifetime can be
misused the same way on the C++ side.

One notable change is that we cannot directly port "find_block_open_keyword()"
(which is used to compute an error) because it relies on the stack of visited
nodes. We cannot modify a stack of node references while we do the "mut"
walk. Happily, an idiomatic solution is easy: we can tell the AST visitor
to backtrack to the parent node and create the error there.

Since "node_t::accept_base" is no longer a template we don't need the
"node_visitation_t" trampoline anymore.

The added copying at the FFI boundary makes things slower (memcpy dominates
the profile) but it's not unusable, which is good news:

    $ hyperfine ./fish.{old,new}" -c 'source ../share/completions/git.fish'"
    Benchmark 1: ./fish.old -c 'source ../share/completions/git.fish'
      Time (mean ± σ):     195.5 ms ±   2.9 ms    [User: 190.1 ms, System: 4.4 ms]
      Range (min … max):   193.2 ms … 205.1 ms    15 runs

    Benchmark 2: ./fish.new -c 'source ../share/completions/git.fish'
      Time (mean ± σ):     677.5 ms ±  62.0 ms    [User: 665.4 ms, System: 10.0 ms]
      Range (min … max):   611.7 ms … 805.5 ms    10 runs

    Summary
      './fish.old -c 'source ../share/completions/git.fish'' ran
        3.47 ± 0.32 times faster than './fish.new -c 'source ../share/completions/git.fish''

Leftovers:
- Enum variants are still snakecase; I didn't get around to changing this yet.
- "ast_type_to_string()" still returns a snakecase name. This could be
  changed since  it's not user visible.
2023-04-16 17:46:56 +02:00
Clemens Wasser
3ae16a5b95 trace: Port trace to Rust 2023-03-28 20:11:42 -07:00
Xiretza
9ac6cbefb1 Port event.cpp to rust
Port src/event.cpp to fish-rust/event.rs and some needed functions.

Co-authored-by: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
2023-03-12 14:55:50 -05:00
Mahmoud Al-Qudsi
562eeac43e
Port job_group to rust (#9608)
More ugliness with types that cxx bridge can't recognize as being POD. Using
pointers to get/set `termios` values with an assert to make sure we're using
identical definitions on both sides (in cpp from the system headers and in rust
from the libc crate as exported).

I don't know why cxx bridge doesn't allow `SharedPtr<OpaqueRustType>` but we can
work around it in C++ by converting a `Box<T>` to a `shared_ptr<T>` then convert
it back when it needs to be destructed. I can't find a clean way of doing it
from the cxx bridge wrapper so for now it needs to be done manually in the C++
code.

Types/values that are drop-in ready over ffi are renamed to match the old cpp
names but for types that now differ due to ffi difficulties I've left the `_ffi`
in the function names to indicate that this isn't the "correct" way of using the
types/methods.
2023-02-25 16:42:45 -06:00
Mahmoud Al-Qudsi
a1a8bc3d8d Port timer.cpp to rust 2023-02-14 15:54:18 -06:00
Johannes Altmanninger
39f3c894d7 Port tokenizer.cpp to Rust
In hindsight, I should probably have split this into three different commits.
2023-02-09 00:37:22 +01:00
Johannes Altmanninger
7f8d247211 Port parse_constants.h to Rust 2023-02-09 00:37:22 +01:00
Johannes Altmanninger
25816627de Port redirection.cpp to Rust 2023-02-09 00:37:22 +01:00
Johannes Altmanninger
9ca160eac2 Convert parse_error_code_t to a scoped enum
This will make the Rust port's diff smaller.
2023-02-08 21:49:54 +01:00
ridiculousfish
f38543ccb7 Rename ast::job_t to ast::job_pipeline_t
This works around an autocxx limitations where different types cannot
have the same name even if they live in different namespace.

ast::job_t conflicts with job_t.
2023-02-02 19:34:48 -07:00
Fabian Boehm
9ef7fe1a15 Make one error translatable
This is now the same as in `read`
2023-01-13 17:57:04 +01:00
Aaron Gyes
daf5e11179 Spelling fixes
Found with scspell
2022-10-28 20:10:09 -07:00
Mahmoud Al-Qudsi
175caab583 Prevent stack overflow from eval/substitution recursion
It seems to have originally been thought that the only possible way a stack
overflow could happen is via function calls, but there are other possibilities.

Issue #9302 reports how `eval` can be abused to recursively execute a string
substitution ad infinitum, triggering a stack overflow in fish.

This patch extends the stack overflow check to also check the current
`eval_level` against a new constant `FISH_MAX_EVAL_DEPTH`, currently set to a
conservative but hopefully still fair limit of 500. For future reference, with
the default stack size for the main/foreground thread of 8 MiB, we actually have
room for a stack depth around 2800, but that's only with extremely minimal state
stored in each stack frame.

I'm not entirely sure why we don't check `eval_depth` regardless of block type;
it can't be for performance reasons since it's just a simple integer comparison
- and a ridiculously easily one for the branch predictor handle, at that - but
maybe it's to try and support non-recursive nested execution blocks of greater
than `FISH_MAX_STACK_DEPTH`? But even without recursion, the stack can still
overflow so may be we should just bump the limit up some (to 500 like the new
`FISH_MAX_EVAL_DEPTH`?) and check it all the time?

Closes #9302.
2022-10-25 13:40:21 -05:00
Mahmoud Al-Qudsi
85d4834b35 Make maybe_t safer against accidental misuse
Closes #9240.

Squash of the following commits (in reverse-chronological order):

commit 03b5cab3dc40eca9d50a9df07a8a32524338a807
Author: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
Date:   Sun Sep 25 15:09:04 2022 -0500

    Handle differently declared posix_spawnxxx_t on macOS

    On macOS, posix_spawnattr_t and posix_spawn_file_actions_t are declared as void
    pointers, so we can't use maybe_t's bool operator to test if it has a value.

commit aed83b8bb308120c0f287814d108b5914593630a
Author: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
Date:   Sun Sep 25 14:48:46 2022 -0500

    Update maybe_t tests to reflect dynamic bool conversion

    maybe_t<T> is now bool-convertible only if T _isn't_ already bool-convertible.

commit 2b5a12ca97b46f96b1c6b56a41aafcbdb0dfddd6
Author: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
Date:   Sun Sep 25 14:34:03 2022 -0500

    Make maybe_t a little harder to misuse

    We've had a few bugs over the years stemming from accidental misuse of maybe_t
    with bool-convertible types. This patch disables maybe_t's bool operator if the
    type T is already bool convertible, forcing the (barely worth mentioning) need
    to use maybe_t::has_value() instead.

    This patch both removes maybe_t's bool conversion for bool-convertible types and
    updates the existing codebase to use the explicit `has_value()` method in place
    of existing implicit bool conversions.
2022-10-08 11:56:38 -05:00
Mahmoud Al-Qudsi
5d64b56127 Remove needless usage of maybe_t
builtin_function() never returns `none()`; this must have been leftover from a
previous version of the code.
2022-09-25 14:40:49 -05:00
Fabian Boehm
cfecc4cc35 command_not_found: Add special error for ENOTDIR 2022-09-14 18:01:01 +02:00
Aaron Gyes
14d2a6d8ff IWYU-guided #include rejiggering.
Let's hope this doesn't causes build failures for e.g. musl: I just
know it's good on macOS and our Linux CI.

It's been a long time.

One fix this brings, is I discovered we #include assert.h or cassert
in a lot of places. If those ever happen to be in a file that doesn't
include common.h, or we are before common.h gets included, we're
unawaringly working with the system 'assert' macro again, which
may get disabled for debug builds or at least has different
behavior on crash. We undef 'assert' and redefine it in common.h.

Those were all eliminated, except in one catch-22 spot for
maybe.h: it can't include common.h. A fix might be to
make a fish_assert.h that *usually* common.h exports.
2022-08-20 23:55:18 -07:00
Fabian Boehm
a4fd3c194e Pass location of the *command* node without decorators
Fixes error location for unknown commands
2022-08-12 18:38:47 +02:00
Johannes Altmanninger
8729623cec Make ESCAPE_ALL the default and call its inverse ESCAPE_NO_PRINTABLES
ESCAPE_ALL is not really a helpful name. Also it's the most common flag.
Let's make it the default so we can remove this unhelpful name.

While at it, let's add a default value for the flags argument, which helps
most callers.

The absence of ESCAPE_ALL makes it only escape nonprintable characters
(with some exceptions). We use this for displaying strings in the completion
pager as well as for the human-readable output of "set", "set -S", "bind"
and "functions".

No functional change.
2022-07-27 11:24:35 +02:00
ridiculousfish
1127d7d68f clang-format C++ files
No functional change (hopefully!)
2022-06-01 10:02:09 -07:00
ridiculousfish
f45e16e59d Try to rationalize universal variable syncing
Prior to this commit, setting a universal variable may trigger syncing
against the file which will modify other universal variables. But if we
want to support multiple environments we need the parser to decide when to
sync uvars. Shift the decision of when to sync to the parser itself. When a
universal variable is modified, now we just set a flag and it's up to the
(main) parser when to pick it up. This is hopefully just a refactoring with
no user-visible changes.
2022-05-30 14:09:06 -07:00
Fabian Boehm
4612343d6e
Merge pull request #8958 from faho/builtin-path
This adds a path builtin to deal with paths.

It offers the following subcommands:

    filter to go through a list of paths and only print the ones that pass some filter - exist, are a directory, have read permission, ...
    is as a shortcut for filter -q to only return true if one of the paths passed the filter
    basename, dirname and extension to print certain parts of the path
    change-extension to change the extension to a different one (as a string operation)
    normalize and resolve to canonicalize the paths in various flavors
    sort to sort paths, also only using the basename or dirname as a key

The definition of "extension" here was carefully considered and should line up with how extensions are actually used - ~/.bashrc doesn't have an extension, but ~/.conf.d does (".d").

These subcommands all compose well - they can read from arguments or stdin (like string), they can use null-delimited input or output (input is autodetected - if a NULL happens in the first PATH_MAX bytes it switches automatically).

It is both a failglob exception (so like set if a glob passed to it fails it just doesn't get any arguments for it instead of triggering an error), and passes output to command substitution buffers explicitly split (like string split0) so newlines are easy to handle.
2022-05-29 20:15:03 +02:00
Fabian Homborg
3f7e125b57 Also give path nullglob behavior
This is needed because you might feasibly give e.g. `path filter`
globs to further match, and they might already present no results.
It's also well-handled since path simply does nothing if given no paths.
2022-05-29 17:48:11 +02:00
ridiculousfish
d83e51a8a2 Rename check_cancel_from_fish_signal to fish_is_unwinding_for_exit
"unwinding_for_exit" mixes up SIGHUP handling and also the exit builtin;
this is still pretty messy.
2022-05-28 16:35:40 -07:00
ridiculousfish
ed78fd2a5f Rationalize path-getting
This cleans up the path_get_path function which is used to resolve a
command name against $PATH, by removing the dependence on errno and
being explicit about which error is returned.

Should be no user-visible change here.
2022-04-23 15:24:27 -07:00
Aaron Gyes
77d02c1bd6 parse_execution: remove unused 'job' parameters 2022-04-07 09:36:54 -07:00
Fabian Homborg
f13979bfbb Move executable-check to C++
This was already apparently supposed to work, but didn't because we
just overrode errno again.

This now means that, if a correctly named candidate exists, we don't
start the command-not-found handler.

See #8804
2022-03-31 15:16:01 +02:00
ridiculousfish
7b1321f9a1 Remove cancellation groups
Cancellation groups were meant to reflect the following idea: if you ran a
simple block:

    begin
        cmd1
        cmd2
    end

then under job control, cmd1 and cmd2 would get separate groups; however if
either exits due to SIGINT or SIGQUIT we also want to propagate that to the
outer block. So the outermost block and its interior jobs would share a
cancellation group. However this is more complex than necessary; it's
sufficient for the execution context to just store an int internally.

This ought not to affect anything user-visible.
2022-03-20 14:39:00 -07:00
ridiculousfish
3f585cddfc Refactor job pgroup assignment
This is a cleanup of job groups, rationalizing a bunch of stuff. Some
notable changes (none user-visible hopefully):

1. Previously, if a job group wanted a pgid, then we would assign it to the
   first process to run in the job group. Now we deliberately mark which
   process will own the pgroup, via a new `leads_pgrp` flag in process_t. This
   eliminates a source of ambiguity.

2. Previously, if a job were run inside fish's pgroup, we would set fish's
   pgroup as the group of the job. But this meant we had to check if the job
   had fish's pgroup in lots of places, for example when calling tcsetpgrp.
   Now a job group only has a pgrp if that pgrp is external (i.e. the job is
   under job control).
2022-03-19 14:06:18 -07:00
Aaron Gyes
eb990c07c8 Let's make src/ easier to grok, move builins to src/builtins
+ No functional change here, just renames and #include changes.
+ CMake can't have slashes in the target names. I'm suspciious of
  that weird machinery for test, but I made it work.
+ A couple of builtins did not include their own headers, that
  is no longer the case.
2021-11-09 17:39:10 -08:00
ridiculousfish
389b75fe42 Restyle codebase with clang-format 2021-11-08 12:21:11 -08:00
Aaron Gyes
710639f5d6 builtins: work on error messages
- Introduce BUILTIN_ERR_COMBO2_EXCLUSIVE
- Distill generally more terse, unambiguous error descriptions.
  Remember English is not everyone's language.
- Do not capitalize sentence fragments
- Use the modality where problem input is in a %s: prefix, then
  is explained.
- Do not address the user (the "You cannot do ..." kraderism)
- Spell out 'arguments' rather than 'args' for consistency
- Mention 'function' as a scope
2021-11-03 22:54:55 -07:00
Fabian Homborg
31d6abb177 Don't fire variable set event before entering a for-loop
Since #4376, for-loops would set the loop variable outside, so it
stays valid.

They did this by doing the equivalent of

```fish
set -l foo $foo
for foo in 1 2 3
```

And that first imaginary `set -l` would also fire a set-event.

Since there's no use for it and the variable isn't actually set, we
remove it.

Fixes #8384.
2021-10-28 16:32:58 +02:00
ridiculousfish
3848a68e5c Fix a misspeeling 2021-10-27 14:16:32 -07:00
Fabian Homborg
0c3c3eaa99 Reuse the variable event for for-loops
This used to construct a vector, which was then passed down and filled
with a new event_t each go around the loop. That's useless - we fire
one event here, and it's simply the variable event.

This reduces the overhead of a for-loop by ~10%:

```fish
for i in (seq 100000)
    true
end
```

runs in about 90% of the time now.
2021-10-26 17:38:35 +02:00
Fabian Homborg
d9f094db1a Check if the for variable is invalid before trying to set it 2021-10-26 16:59:03 +02:00
ridiculousfish
2ed0105692 Use std::move to populate a processes's args
This could save quite a few string copies.
2021-10-23 10:35:05 -07:00
ridiculousfish
59b63f3aab Use vec_append when expanding a command into arguments
This saves some lines and some allocations.
2021-10-23 10:10:26 -07:00
ridiculousfish
a634e78633 Remove an extra use of process_type_for_command
This just duplicated a previous call above.
2021-10-23 10:07:24 -07:00
ridiculousfish
2ca66cff53 Disable job control inside command substitutions
This disables job control inside command substitutions. Prior to this
change, a cmdsub might get its own process group. This caused it to fail
to cancel loops properly. For example:

    while true ; echo (sleep 5) ; end

could not be control-C cancelled, because the signal would go to sleep,
and so the loop would continue on. The simplest way to fix this is to
match other shells and not use job control in cmdsubs.

Related is #1362
2021-08-18 22:20:03 +08:00
Fabian Homborg
dd3cdbcfc9 Fix crash if $PWD is used as for-loop variable
for PWD in foo; true; end

prints:

>..src/parse_execution.cpp:461: end_execution_reason_t parse_execution_context_t::run_for_statement(const ast::for_header_t&, const ast::job_list_t&): Assertion `retval == ENV_OK' failed.

because this used the wrong way to see if something is read-only.
2021-07-30 15:33:04 +02:00
ridiculousfish
b914c94cc1 Stop storing 'is_block' inside the parser
is_block is a field which supports 'status is-block', and also controls
whether notifications get posted. However there is no reason to store
this as a distinct field since it is trivially computed from the block
list. Stop storing it. No functional changes in this commit.
2021-07-28 13:56:33 -07:00
Johannes Altmanninger
48c1550f61 Point to builtins begin/end when a failed command starts with "{"
Closes #6415
2021-06-23 21:47:40 +02:00
Johannes Altmanninger
565a7e4bc5 Minor refactoring to use early return in "handle_command_not_found" 2021-06-23 21:47:40 +02:00
Fabian Homborg
c96a07dc96 Revert "Prevent redirecting internal processes to file descriptors above 2"
FDs are inherited, and redirecting those is harmless, and forbidding
that is worse than allowing all.

Fixes #7769.

This reverts commit 11a373f121.
2021-03-03 22:26:33 +01:00
ridiculousfish
11a373f121 Prevent redirecting internal processes to file descriptors above 2
The user may write for example:

    echo foo >&5

and fish would try to output to file descriptor 5, within the fish process
itself. This has unpredictable effects and isn't useful. Make this an
error.

Note that the reverse is "allowed" but ignored:

    echo foo 5>&1

this conceptually dup2s stdout to fd 5, but since no builtin writes to fd
5 we ignore it.
2021-02-20 16:16:45 -08:00
ridiculousfish
cd9a035f02 Add a string_output_stream_t to collect builtin output
This is used when creating a function; this breaks a dependency on the
more complicated buffered_output_stream_t to ease refactoring.
2021-02-04 14:12:14 -08:00