fish-shell

mirror of https://github.com/fish-shell/fish-shell synced 2025-02-15 21:49:26 +00:00

Author	SHA1	Message	Date
Johannes Altmanninger	971d257e67	Port AST to Rust The translation is fairly direct though it adds some duplication, for example there are multiple "match" statements that mimic function overloading. Rust has no overloading, and we cannot have generic methods in the Node trait (due to a Rust limitation, the error is like "cannot be made into an object") so we include the type name in method names. Give clients like "indent_visitor_t" a Rust companion ("IndentVisitor") that takes care of the AST traversal while the AST consumption remains in C++ for now. In future, "IndentVisitor" should absorb the entirety of "indent_visitor_t". This pattern requires that "fish_indent" be exposed as an includable header to the CXX bridge. Alternatively, we could define FFI wrappers for recursive AST traversal. Rust requires we separate the AST visitors for "mut" and "const" scenarios. Take this opportunity to concretize both visitors: The only client that requires mutable access is the populator. To match the structure of the C++ populator which makes heavy use of function overloading, we need to add a bunch of functions to the trait. Since there is no other mutable visit, this seems acceptable. The "const" visitors never use "will_visit_fields_of()" or "did_visit_fields_of()", so remove them (though this is debatable). Like in the C++ implementation, the AST nodes themselves are largely defined via macros. Union fields like "Statement" and "ArgumentOrRedirection" currently do not use macros but may in future. This commit also introduces a precedent for a type that is defined in one CXX bridge and used in another one - "ParseErrorList". To make this work we need to manually define "ExternType". There is one annoyance with CXX: functions that take explicit lifetime parameters must be marked as unsafe. This makes little sense because functions that return `&Foo` with implicit lifetime can be misused the same way on the C++ side. 
One notable change is that we cannot directly port "find_block_open_keyword()" (which is used to compute an error) because it relies on the stack of visited nodes. We cannot modify a stack of node references while we do the "mut" walk. Happily, an idiomatic solution is easy: we can tell the AST visitor to backtrack to the parent node and create the error there. Since "node_t::accept_base" is no longer a template we don't need the "node_visitation_t" trampoline anymore. The added copying at the FFI boundary makes things slower (memcpy dominates the profile) but it's not unusable, which is good news: $ hyperfine ./fish.{old,new}" -c 'source ../share/completions/git.fish'" Benchmark 1: ./fish.old -c 'source ../share/completions/git.fish' Time (mean ± σ): 195.5 ms ± 2.9 ms [User: 190.1 ms, System: 4.4 ms] Range (min … max): 193.2 ms … 205.1 ms 15 runs Benchmark 2: ./fish.new -c 'source ../share/completions/git.fish' Time (mean ± σ): 677.5 ms ± 62.0 ms [User: 665.4 ms, System: 10.0 ms] Range (min … max): 611.7 ms … 805.5 ms 10 runs Summary './fish.old -c 'source ../share/completions/git.fish'' ran 3.47 ± 0.32 times faster than './fish.new -c 'source ../share/completions/git.fish'' Leftovers: - Enum variants are still snakecase; I didn't get around to changing this yet. - "ast_type_to_string()" still returns a snakecase name. This could be changed since it's not user visible.	2023-04-16 17:46:56 +02:00
Johannes Altmanninger	05bad5eda1	Port common.{h,cpp} to Rust Most of it is duplicated, hence untested. Functions like mbrtowc are not exposed by the libc crate, so declare them ourselves. Since we don't know the definition of C macros, add two big hacks to make this work: 1. Replace MB_LEN_MAX and mbstate_t with values (resp. types) that should be large enough for any implementation. 2. Detect the definition of MB_CUR_MAX in the build script. This requires more changes for each new libc. We could also use this approach for 1. Additionally, this commit brings a small behavior change to read_unquoted_escape(): we cannot decode surrogate code points like \UDE01 into a Rust char, so use � (\UFFFD, replacement character) instead. Previously, we added such code points to a wcstring; looks like they were ignored when printed.	2023-04-02 15:17:06 +02:00
Johannes Altmanninger	39f3c894d7	Port tokenizer.cpp to Rust In hindsight, I should probably have split this into three different commits.	2023-02-09 00:37:22 +01:00
Johannes Altmanninger	7f8d247211	Port parse_constants.h to Rust	2023-02-09 00:37:22 +01:00
Johannes Altmanninger	9ca160eac2	Convert parse_error_code_t to a scoped enum This will make the Rust port's diff smaller.	2023-02-08 21:49:54 +01:00
Aaron Gyes	daf5e11179	Spelling fixes Found with scspell	2022-10-28 20:10:09 -07:00
ridiculousfish	3eae0a9b6a	clang-format all C++ files This mostly re-sorts headers that got desorted after the IWYU application in `14d2a6d8ff`.	2022-08-21 15:02:19 -07:00
Aaron Gyes	14d2a6d8ff	IWYU-guided #include rejiggering. Let's hope this doesn't cause build failures for e.g. musl: I just know it's good on macOS and our Linux CI. It's been a long time. One fix this brings: I discovered we #include assert.h or cassert in a lot of places. If those ever happen to be in a file that doesn't include common.h, or we are before common.h gets included, we're unknowingly working with the system 'assert' macro again, which may get disabled for debug builds or at least has different behavior on crash. We undef 'assert' and redefine it in common.h. Those were all eliminated, except in one catch-22 spot for maybe.h: it can't include common.h. A fix might be to make a fish_assert.h that usually common.h exports.	2022-08-20 23:55:18 -07:00
Fabian Boehm	5ef457cfd3	Make tokenizer delimiter errors one long This makes the awkward case fish: Unexpected end of string, square brackets do not match echo f[oo # not valid, no matching ] ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~^ (that `]` is simply the last character on the line, it's firmly in a comment) less awkward by only marking the starting brace. The implementation here is awkward mostly because the tok_t communicates two things: The error location and how to carry on. So we need to store the error length separately, and this is the first time we've done so. It's possible we can make this simpler.	2022-08-12 18:38:47 +02:00
Aaron Gyes	9c1b3d6712	fix cppcheck missing return errors	2022-04-08 17:59:09 -07:00
ridiculousfish	247d4b2c8f	Rename EXEC_ERR_MSG to INVALID_PIPELINE_CMD_ERR_MSG This error message was used for more than exec. No functional change here.	2022-03-31 15:49:15 -07:00
Aaron Gyes	9abbcc0ba6	ast.cpp: initialization order member 'extras_' is used to initialize 'tokens_'. members are initialized in the order they are declared in a class	2021-12-20 22:43:58 -08:00
Johannes Altmanninger	b5739ddacf	Report sub-token error locations again This fixes a regression about where we report errors: echo error(here old: ^ fixed: ^ Commit `0c22f67bd` (Remove the old parser bits, 2020-07-02) removed uses of "error_offset_within_token" so we always report errors at token start. Add it back, hopefully restoring the 3.1.2 behavior. Note that for cases like echo "$(" we report "unbalanced quotes" because we treat the $( as double quote. Giving a better error seems hard because of the ambiguity - we don't know if the quote is meant to be inside or outside the command substitution.	2021-12-04 16:52:13 +01:00
Fabian Homborg	66566c1650	ast: Stop dropping down to wchar_t needlessly Very slight performance increase (1% when parsing all .fish scripts in fish-shell), but this removes a useless variable and some .c_str()inging. Theoretically it should also remove some wcslen() calls, but those seem to be optimized out?	2021-11-08 21:24:00 +01:00
ridiculousfish	389b75fe42	Restyle codebase with clang-format	2021-11-08 12:21:11 -08:00
Aaron Gyes	55ab2f6e6d	Remove some unused macros and a template	2021-10-01 04:50:19 -07:00
ridiculousfish	a6010519df	Put ast_t::populator_t into an anonymous namespace This is part of an effort to reduce the number of weak symbols, so we can help prevent ODR errors. No functional change here.	2021-09-30 11:33:03 -07:00
Rosen Penev	a00ebc65af	remove make_pair There are better alternatives with C++11. Signed-off-by: Rosen Penev <rosenp@gmail.com>	2021-08-05 12:12:28 +02:00
Fabian Homborg	8317b8bb8a	Make ast FLOGs slightly less chatty This removes the "did_visit" message because it doesn't really add anything. For example: ``` ast-construction: make job_list 0x55a6d19729f0 ast-construction: make job_conjunction 0x55a6d1971c00 ast-construction: will_visit job_conjunction 0x55a6d1971c00 ast-construction: will_visit job 0x55a6d1971c18 ast-construction: variable_assignment_list size: 0 ast-construction: will_visit statement 0x55a6d1971c48 ast-construction: make decorated_statement 0x55a6d1972650 ast-construction: will_visit decorated_statement 0x55a6d1972650 ast-construction: make argument_or_redirection 0x55a6d1968310 ast-construction: will_visit argument_or_redirection 0x55a6d1968310 ast-construction: make argument 0x55a6d197b0b0 ast-construction: did_visit argument_or_redirection 0x55a6d1968310 ast-construction: argument_or_redirection_list size: 1 ast-construction: did_visit decorated_statement 0x55a6d1972650 ast-construction: did_visit statement 0x55a6d1971c48 ast-construction: job_continuation_list size: 0 ast-construction: did_visit job 0x55a6d1971c18 ast-construction: job_conjunction_continuation_list size: 0 ast-construction: did_visit job_conjunction 0x55a6d1971c00 ast-construction: job_list size: 1 ``` those "did_visit" messages all correspond to "will_visit" ones. They are effectively block delimiters like `end` or `}`. 
If we remove them it turns into: ``` ast-construction: make job_list 0x55a6d19729f0 ast-construction: make job_conjunction 0x55a6d1971c00 ast-construction: will_visit job_conjunction 0x55a6d1971c00 ast-construction: will_visit job 0x55a6d1971c18 ast-construction: variable_assignment_list size: 0 ast-construction: will_visit statement 0x55a6d1971c48 ast-construction: make decorated_statement 0x55a6d1972650 ast-construction: will_visit decorated_statement 0x55a6d1972650 ast-construction: make argument_or_redirection 0x55a6d1968310 ast-construction: will_visit argument_or_redirection 0x55a6d1968310 ast-construction: make argument 0x55a6d197b0b0 ast-construction: argument_or_redirection_list size: 1 ast-construction: job_continuation_list size: 0 ast-construction: job_conjunction_continuation_list size: 0 ast-construction: job_list size: 1 ``` Which is still unambiguous because of the indentation. (this is still super verbose and we might want to remove it from the `*` "all" debug category and only allow turning it on explicitly)	2021-05-03 12:01:21 +02:00
Johannes Altmanninger	86707378cc	Also allow unclosed quotes in some places See #7693	2021-02-09 22:38:16 +01:00
Johannes Altmanninger	38b95defbd	Inside an unclosed subshell, do not report other parse errors In an interactive shell, typing "for x in (<RET>" would print an error: fish: Expected end of the statement, but found a parse_token_type_t::tokenizer_error Our tokenizer converts "(" into a special error token, hence this message. Fix two cases by not reporting errors, but only if we allow parsing incomplete input. I'm not really sure if this is necessary, but it's sufficient. Fixes #7693	2021-02-09 22:19:42 +01:00
Fabian Homborg	575d574691	ast: Remove unused variable	2020-11-15 11:19:23 +01:00
Mahmoud Al-Qudsi	90433f6ea3	Minimize AST node vector reallocations Closes #7201	2020-09-08 11:55:10 -05:00
Johannes Altmanninger	a20721a278	Address some warnings from clangd	2020-08-29 12:02:18 +02:00
Johannes Altmanninger	8d3b66fb52	Reflow comment	2020-08-04 21:44:59 +02:00
ridiculousfish	3319e308d0	Make ast::node_t non-virtual Eliminate its vtable to save 8 bytes per node, which is a lot!	2020-07-12 16:57:30 -07:00
ridiculousfish	8d37be2916	ast lists to use new[] instead of vector Because the list is not changed after construction, we do not need the vector's capacity field. This reduces the size of lists from 48 to 32 bytes.	2020-07-12 16:57:30 -07:00
ridiculousfish	225470493b	Make parse_token_type_t an enum class Improves type safety.	2020-07-09 14:22:04 -07:00
ridiculousfish	35cb449aa1	Make parse_statement_decoration_t a class enum	2020-07-07 16:28:39 -07:00
ridiculousfish	5308223212	Migrate next_parse_token into token_stream_t Cleaning up parse_tree.cpp with an eye to remove it.	2020-07-07 14:01:01 -07:00
ridiculousfish	6976d0ee7e	Simplify infinite loop fix when parsing "a=" This reworks the "a=" detection to be simpler. If we detect a variable assignment that produces an error, simply consume it. We also take the opportunity to not highlight it as an error, and add some tests. Original commit is `1ca05d32d3`.	2020-07-05 12:15:18 -07:00
Johannes Altmanninger	1ca05d32d3	Fix infinite loop when parsing "a=" Typing that command in an interactive prompt would make the highlighter thread eat up CPU and memory. Probably not the right fix; I think the token should already have been consumed when the error is detected, then there is no need to consume it when unwinding.	2020-07-05 13:19:48 +02:00
ridiculousfish	4d4455007d	Introduce a new fish ast This is the first commit of a series intended to replace the existing "parse tree" machinery. It adds a new abstract syntax tree and uses a more normal recursive descent parser. Initially there are no users of the new ast. The following commits will replace parse_tree -> ast for all usages.	2020-07-04 14:58:02 -07:00

33 commits
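
The behavior change described in commit 05bad5eda1 (a surrogate code point such as \UDE01 cannot become a Rust char, so U+FFFD is used instead) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the commit: the helper name `decode_code_point` is hypothetical, and only the standard library's `char::from_u32` and `char::REPLACEMENT_CHARACTER` are used.

```rust
/// Hypothetical helper (not a fish function): turn a raw code point into a
/// Rust `char`, substituting U+FFFD when the value is not a valid Unicode
/// scalar value (for example a surrogate in the range D800..=DFFF).
fn decode_code_point(cp: u32) -> char {
    // `char::from_u32` returns `None` for surrogates and for values above
    // 0x10FFFF, because a Rust `char` must be a Unicode scalar value.
    char::from_u32(cp).unwrap_or(char::REPLACEMENT_CHARACTER)
}

fn main() {
    // An ordinary code point decodes to itself.
    println!("{:?}", decode_code_point(0x41));
    // A lone surrogate falls back to the replacement character.
    println!("{:?}", decode_code_point(0xDE01));
}
```

The fallback keeps the decoding function total, at the cost of losing the original surrogate value, which matches the trade-off the commit message describes.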