Let's hope this doesn't causes build failures for e.g. musl: I just
know it's good on macOS and our Linux CI.
It's been a long time.
One fix this brings, is I discovered we #include assert.h or cassert
in a lot of places. If those ever happen to be in a file that doesn't
include common.h, or we are before common.h gets included, we're
unawaringly working with the system 'assert' macro again, which
may get disabled for debug builds or at least has different
behavior on crash. We undef 'assert' and redefine it in common.h.
Those were all eliminated, except in one catch-22 spot for
maybe.h: it can't include common.h. A fix might be to
make a fish_assert.h that *usually* common.h exports.
This makes the awkward case
fish: Unexpected end of string, square brackets do not match
echo f[oo # not valid, no matching ]
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
(that `]` is simply the last character on the line, it's firmly in a comment)
less awkward by only marking the starting brace.
The implementation here is awkward mostly because the tok_t
communicates two things: The error location and how to carry on.
So we need to store the error length separately, and this is the first
time we've done so.
It's possible we can make this simpler.
This fixes a regression about where we report errors:
echo error(here
old: ^
fixed: ^
Commit 0c22f67bd (Remove the old parser bits, 2020-07-02) removed
uses of "error_offset_within_token" so we always report errors at
token start. Add it back, hopefully restoring the 3.1.2 behavior.
Note that for cases like
echo "$("
we report "unbalanced quotes" because we treat the $( as double
quote. Giving a better error seems hard because of the ambguity -
we don't know if quote is meant to be inside or outside the command
substitution.
Very slight performance increase (1% when parsing *all .fish scripts
in fish-shell*), but this removes a useless variable and some
.c_str()inging.
Theoretically it should also remove some wcslen() calls, but those
seem to be optimized out?
This removes the "did_visit" message because it doesn't really add
anything.
For example:
```
ast-construction: make job_list 0x55a6d19729f0
ast-construction: make job_conjunction 0x55a6d1971c00
ast-construction: will_visit job_conjunction 0x55a6d1971c00
ast-construction: will_visit job 0x55a6d1971c18
ast-construction: variable_assignment_list size: 0
ast-construction: will_visit statement 0x55a6d1971c48
ast-construction: make decorated_statement 0x55a6d1972650
ast-construction: will_visit decorated_statement 0x55a6d1972650
ast-construction: make argument_or_redirection 0x55a6d1968310
ast-construction: will_visit argument_or_redirection 0x55a6d1968310
ast-construction: make argument 0x55a6d197b0b0
ast-construction: did_visit argument_or_redirection 0x55a6d1968310
ast-construction: argument_or_redirection_list size: 1
ast-construction: did_visit decorated_statement 0x55a6d1972650
ast-construction: did_visit statement 0x55a6d1971c48
ast-construction: job_continuation_list size: 0
ast-construction: did_visit job 0x55a6d1971c18
ast-construction: job_conjunction_continuation_list size: 0
ast-construction: did_visit job_conjunction 0x55a6d1971c00
ast-construction: job_list size: 1
```
those "did_visit" messages all correspond to "will_visit" ones. They
are effectively block delimiters like `end` or `}`.
If we remove them it turns into:
```
ast-construction: make job_list 0x55a6d19729f0
ast-construction: make job_conjunction 0x55a6d1971c00
ast-construction: will_visit job_conjunction 0x55a6d1971c00
ast-construction: will_visit job 0x55a6d1971c18
ast-construction: variable_assignment_list size: 0
ast-construction: will_visit statement 0x55a6d1971c48
ast-construction: make decorated_statement 0x55a6d1972650
ast-construction: will_visit decorated_statement 0x55a6d1972650
ast-construction: make argument_or_redirection 0x55a6d1968310
ast-construction: will_visit argument_or_redirection 0x55a6d1968310
ast-construction: make argument 0x55a6d197b0b0
ast-construction: argument_or_redirection_list size: 1
ast-construction: job_continuation_list size: 0
ast-construction: job_conjunction_continuation_list size: 0
ast-construction: job_list size: 1
```
Which is still unambiguous because of the indentation.
(this is still *super verbose* and we might want to remove it from the
`*` "all" debug category and only allow turning it on explicitly)
In an interactive shell, typing "for x in (<RET>" would print an error:
fish: Expected end of the statement, but found a parse_token_type_t::tokenizer_error
Our tokenizer converts "(" into a special error token, hence this message.
Fix two cases by not reporting errors, but only if we allow parsing incomplete
input. I'm not really sure if this is necessary, but it's sufficient.
Fixes#7693
This reworks the "a=" detection to be simpler.
If we detect a variable assignment that produces an error,
simply consume it.
We also take the opportunity to not highlight it as an error,
and add some tests.
Original commit is 1ca05d32d3.
Typing that command in an interactive prompt would make the highlighter thread
eat up CPU and memory. Probably not the right fix; I think the token should
already have been consumed when the error is detected, then there is no need
to consume it when unwinding.
This is the first commit of a series intended to replace the existing
"parse tree" machinery. It adds a new abstract syntax tree and uses a more
normal recursive descent parser.
Initially there are no users of the new ast. The following commits will
replace parse_tree -> ast for all usages.