69: Incremental reparsing for single tokens r=matklad a=darksv
Implement incremental reparsing for `WHITESPACE`, `COMMENT`, `DOC_COMMENT`, `IDENT`, `STRING` and `RAW_STRING`. This allows to avoid reparsing whole blocks when a change was made only within these tokens.
Co-authored-by: darksv <darek969-12@o2.pl>
This commit is an example of fixing a common parser error: infinite
loop due to error recovery.
This error typically happens when we parse a list of items and fail to
parse a specific item at the current position.
One choices is to skip a token and try to parse a list item at the
next position. This is a good, but not universal, default. When
parsing a list of arguments in a function call, you, for example,
don't want to skip over `fn`, because it's most likely that it is a
function declaration, and not a mistyped arg:
```
fn foo() {
quux(1, 2
fn bar() {
}
```
Another choice is to bail out of the loop immediately, but it isn't
perfect either: sometimes skipping over garbage helps:
```
quux(1, foo:, 92) // should skip over `:`, b/c that's part of `foo::bar`
```
In general, parser tries to balance these two cases, though we don't
have a definitive strategy yet.
However, if the parser accidentally neither skips over a token, nor
breaks out of the loop, then it becomes stuck in the loop infinitely
(there's an internal counter to self-check this situation and panic
though), and that's exactly what is demonstrated by the test.
To fix such situation, first of all, add the test case to tests/data/parser/{err,fuzz-failures}.
Then, run
```
RUST_BACKTRACE=short cargo test --package libsyntax2
````
to verify that parser indeed panics, and to get an idea what grammar
production is the culprit (look for `_list` functions!).
In this case, I see
```
10: libsyntax2::grammar::expressions::atom::match_arm_list
at crates/libsyntax2/src/grammar/expressions/atom.rs:309
```
and that's look like it might be a culprit. I verify it by adding
`eprintln!("loopy {:?}", p.current());` and indeed I see that this is
printed repeatedly.
Diagnosing this a bit shows that the problem is that
`pattern::pattern` function does not consume anything if the next
token is `let`. That is a good default to make cases like
```
let
let foo = 92;
```
where the user hasn't typed the pattern yet, to parse in a reasonable
they correctly.
For match arms, pretty much the single thing we expect is a pattern,
so, for a fix, I introduce a special variant of pattern that does not
do recovery.
As described in #61, fuzz testing some parts of this would be ~~fun~~
helpful. So, I started with the most trivial fuzzer I could think of:
Put random stuff into File::parse and see what happens.
To speed things up, I also did
cp src/**/*.rs fuzz/corpus/parser/
in the `crates/libsyntax2/` directory (running the fuzzer once will
generate the necessary directories).
56: Unify lookahead naming between parser and lexer. r=matklad a=zachlute
Resolves Issue #26.
I wanted to play around with libsyntax2, and fixing a random issue seemed like a good way to mess around in the code.
This PR mostly does what's suggested in that issue. I elected to go with `at` and `at_str` instead of trying to do any fancy overloading shenanigans, because...uh, well, frankly I don't really know how to do any fancy overloading shenanigans. The only really questionable bit is `nth_is_p`, which could also have potentially been named `nth_at_p`, but `is` seemed more apropos.
I also added simple tests for `Ptr` so I could be less terrified I broke something.
Comments and criticisms very welcome. I'm still pretty new to Rust.
Co-authored-by: Zach Lute <zach.lute@gmail.com>