Fix OOM caused by term search
The issue came from taking the multi Cartesian product for expressions with many (25+) arguments, each having multiple options.
The solution is twofold:
### Avoid blowing up in Cartesian product
**Before the logic was:**
1. Find expressions for each argument/param - there may be many
2. Take the Cartesian product (which blows up in some cases)
3. If there are more than 2 options, throw them away by squashing them to `Many`
**Now the logic is:**
1. Find expressions for each argument/param and squash them to `Many` if there are more than 2. Otherwise we are guaranteed to also have more than 2 after taking the product, which means squashing them anyway.
2. Take the Cartesian product lazily, as an iterator
3. Start consuming it one by one
4. If there are more than 2 options, throw them away by squashing them to `Many` (same as before)
This is also why I had to update some tests: expressions now get squashed to `Many` more eagerly.
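
The new flow can be sketched roughly as follows. This is a minimal, std-only illustration and not the actual term search code: `Expr`, `MAX_OPTIONS`, `squash`, `Product`, `render` and `call_candidates` are all hypothetical names, and rendering `Many` as `_` is an assumption made only to keep the example printable.

```rust
/// Hypothetical stand-in for a found expression.
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Term(String),
    /// "Too many options": the concrete terms were thrown away.
    Many,
}

/// Keep at most this many options before squashing (assumed threshold).
const MAX_OPTIONS: usize = 2;

/// Step 1: squash the options of a single argument up front.
fn squash(options: Vec<Expr>) -> Vec<Expr> {
    if options.len() > MAX_OPTIONS { vec![Expr::Many] } else { options }
}

/// Step 2: a lazy Cartesian product that builds one combination per `next()`.
struct Product<'a> {
    pools: &'a [Vec<Expr>],
    indices: Vec<usize>,
    done: bool,
}

impl<'a> Product<'a> {
    fn new(pools: &'a [Vec<Expr>]) -> Self {
        // An empty pool means there are no combinations at all.
        let done = pools.iter().any(|pool| pool.is_empty());
        Product { pools, indices: vec![0; pools.len()], done }
    }
}

impl<'a> Iterator for Product<'a> {
    type Item = Vec<Expr>;

    fn next(&mut self) -> Option<Vec<Expr>> {
        if self.done {
            return None;
        }
        let item: Vec<Expr> = self
            .indices
            .iter()
            .zip(self.pools.iter())
            .map(|(&i, pool)| pool[i].clone())
            .collect();
        // Advance like an odometer, starting at the rightmost argument.
        let mut pos = self.indices.len();
        loop {
            if pos == 0 {
                self.done = true;
                break;
            }
            pos -= 1;
            self.indices[pos] += 1;
            if self.indices[pos] < self.pools[pos].len() {
                break;
            }
            self.indices[pos] = 0;
        }
        Some(item)
    }
}

/// Render an expression; showing `Many` as `_` is an assumption of this sketch.
fn render(expr: &Expr) -> String {
    match expr {
        Expr::Term(text) => text.clone(),
        Expr::Many => "_".to_string(),
    }
}

/// Steps 3 and 4: pull combinations one by one and squash once there are too many.
fn call_candidates(name: &str, args: Vec<Vec<Expr>>) -> Vec<Expr> {
    let pools: Vec<Vec<Expr>> = args.into_iter().map(squash).collect();
    // Never materialize more than MAX_OPTIONS + 1 combinations.
    let combos: Vec<Vec<Expr>> = Product::new(&pools).take(MAX_OPTIONS + 1).collect();
    if combos.len() > MAX_OPTIONS {
        return vec![Expr::Many];
    }
    combos
        .iter()
        .map(|combo| {
            let rendered: Vec<String> = combo.iter().map(render).collect();
            Expr::Term(format!("{}({})", name, rendered.join(", ")))
        })
        .collect()
}
```

With 25 arguments of 3 options each, every pool collapses to a single `Many` before the product is taken, so only one combination is ever built instead of 3^25.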
### Use fuel to avoid long search times and high memory usage
Now all the tactics use `should_continue: Fn() -> bool` to check if they should keep iterating _(similarly to chalk)_.
This reduces search times by an order of magnitude, for example from ~139ms/hole to ~14ms/hole for the `ripgrep` crate.
Slightly fewer expressions are found, but I think the speed gain is worth it for usability.
Also note that syntactic hits decrease more because of squashing, so you simply need to run the search multiple times to get full terms.
Even in the worst case (for example the `nalgebra` crate, because it has tons of generics), search times are mostly under 200ms.
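
A rough sketch of the fuel idea, assuming a shared counter that every tactic decrements through the `should_continue` callback. Only the `should_continue: Fn() -> bool` shape comes from the description above; `enumerate_candidates`, `search_with_fuel` and the `Cell`-based counter are hypothetical.

```rust
use std::cell::Cell;

/// Hypothetical tactic: enumerates candidates until it runs out of work
/// or `should_continue` reports that the fuel is spent.
fn enumerate_candidates(limit: u32, should_continue: &dyn Fn() -> bool) -> Vec<u32> {
    let mut found = Vec::new();
    for candidate in 0..limit {
        if !should_continue() {
            break;
        }
        found.push(candidate);
    }
    found
}

/// Runs two tactics that share one fuel budget.
fn search_with_fuel(fuel: u64, limit: u32) -> Vec<u32> {
    // `Cell` lets the closure stay `Fn` while still counting down.
    let fuel = Cell::new(fuel);
    let should_continue = || {
        let remaining = fuel.get();
        fuel.set(remaining.saturating_sub(1));
        remaining > 0
    };
    let mut found = enumerate_candidates(limit, &should_continue);
    found.extend(enumerate_candidates(limit, &should_continue));
    found
}
```

Because the budget is shared, a tactic that burns all the fuel leaves nothing for later tactics, which is one way the search can end up finding slightly fewer expressions.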
Benchmarks on the `ripgrep` crate:
Before:
```
Tail Expr syntactic hits: 291/1692 (17%)
Tail Exprs found: 1253/1692 (74%)
Term search avg time: 139ms
```
After:
```
Tail Expr syntactic hits: 239/1692 (14%)
Tail Exprs found: 1226/1692 (72%)
Term search avg time: 14ms
```
feature: Make generate function assist generate a function as a constructor if the generated function has the name "new" and is an associated function.
close #17050
This PR makes the `generate function` assist generate a function as a constructor if the generated function has the name "new" and is an associated function.
If the type the function is associated with is a record struct, it generates the constructor like this.
```rust
impl Foo {
    fn new() -> Self {
        Self { field_1: todo!(), field_2: todo!() }
    }
}
```
If the type the function is associated with is a tuple struct, it generates the constructor like this.
```rust
impl Foo {
    fn new() -> Self {
        Self(todo!(), todo!())
    }
}
```
If the type the function is associated with is a unit struct, it generates the constructor like this.
```rust
impl Foo {
    fn new() -> Self {
        Self
    }
}
```
If the type the function is associated with is any other ADT, it generates the constructor like this.
```rust
impl Foo {
    fn new() -> Self {
        todo!()
    }
}
```
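
The four cases above amount to a single match on the shape of the type. The sketch below is only an illustration of that dispatch: `AdtShape` and `constructor_body` are hypothetical names, and the real assist works on syntax trees rather than strings.

```rust
/// Hypothetical summary of the type that `new` is being generated for.
enum AdtShape {
    /// Record struct with its field names.
    RecordStruct(Vec<String>),
    /// Tuple struct with its number of fields.
    TupleStruct(usize),
    UnitStruct,
    /// Enums, unions, and anything else.
    Other,
}

/// Builds the text of the generated function body for each case.
fn constructor_body(shape: &AdtShape) -> String {
    match shape {
        AdtShape::RecordStruct(fields) => {
            let inits: Vec<String> =
                fields.iter().map(|field| format!("{}: todo!()", field)).collect();
            format!("Self {{ {} }}", inits.join(", "))
        }
        AdtShape::TupleStruct(arity) => {
            let inits = vec!["todo!()"; *arity];
            format!("Self({})", inits.join(", "))
        }
        AdtShape::UnitStruct => "Self".to_string(),
        AdtShape::Other => "todo!()".to_string(),
    }
}
```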
It is semantically a bitset: many categorical things can be true about
a reference at the same time.
In particular, a reference can be a "test" and a "write" at the same
time.
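
As a rough illustration of the bitset view, here is a hand-rolled flag type. The name `ReferenceCategory`, the `u8` representation and the `READ` flag are assumptions made for this sketch; only "test" and "write" are named above.

```rust
use std::ops::BitOr;

/// Hypothetical category flags for a reference, stored as bits.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ReferenceCategory(u8);

impl ReferenceCategory {
    const WRITE: Self = Self(1 << 0);
    const READ: Self = Self(1 << 1); // assumed extra flag, not named in the text
    const TEST: Self = Self(1 << 2);

    /// True if every bit set in `other` is also set in `self`.
    fn contains(self, other: Self) -> bool {
        (self.0 & other.0) == other.0
    }
}

impl BitOr for ReferenceCategory {
    type Output = Self;

    fn bitor(self, rhs: Self) -> Self {
        Self(self.0 | rhs.0)
    }
}
```

An enum with one variant per category could not express a reference that is both a "test" and a "write", whereas `ReferenceCategory::TEST | ReferenceCategory::WRITE` can.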
internal: improve `TokenSet` implementation and add reserved keywords
The current `TokenSet` type represents "A bit-set of `SyntaxKind`s" as a newtype `u128`.
Internally, the flag for each `SyntaxKind` variant in the bit-set is set as the n-th LSB (least significant bit) via a bit-wise left shift operation, where n is the discriminant.
Edit: This is problematic because there are currently ~121 token `SyntaxKind`s, so adding new token kinds for missing reserved keywords pushes the number of token `SyntaxKind`s above 128, which makes this ["mask"](7a8374c162/crates/parser/src/token_set.rs (L31-L33)) operation overflow.
~~This is problematic because there's currently 266 SyntaxKinds, so this ["mask"](7a8374c162/crates/parser/src/token_set.rs (L31-L33)) operation silently overflows in release mode.~~
~~This leads to a single flag/bit in the bit-set being shared by multiple `SyntaxKind`s~~.
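
To make the overflow concrete, here is a small sketch of the shift on a `u128`. The helper names `checked_mask` and `wrapping_mask` are hypothetical; `checked_shl` and `wrapping_shl` are the standard library methods on `u128`.

```rust
/// Bit mask for a discriminant, or `None` if it does not fit in 128 bits.
fn checked_mask(discriminant: u32) -> Option<u128> {
    1u128.checked_shl(discriminant)
}

/// What a shift with a masked shift amount produces instead:
/// the amount is reduced modulo 128, so bits get reused.
fn wrapping_mask(discriminant: u32) -> u128 {
    1u128.wrapping_shl(discriminant)
}
```

A discriminant of 128 wraps around to the same bit as discriminant 0, which is the kind of collision a wider backing type avoids.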
This PR:
- Changes the wrapped type for `TokenSet` from `u128` to `[u64; 3]` ~~`[u*; N]` (currently `[u16; 17]`) where `u*` can be any desirable unsigned integer type and `N` is the minimum array length needed to represent all token `SyntaxKind`s without any collisions~~.
- Edit: Add assertion that `TokenSet`s only include token `SyntaxKind`s
- Edit: Add ~7 missing [reserved keywords](https://doc.rust-lang.org/stable/reference/keywords.html#reserved-keywords)
- ~~Moves the definition of the `TokenSet` type to grammar codegen in xtask, so that `N` is adjusted automatically (depending on the chosen `u*` "base" type) when new `SyntaxKind`s are added~~.
- ~~Updates the `token_set_works_for_tokens` unit test to include the `__LAST` `SyntaxKind` as a way of catching overflows in tests.~~
~~Currently `u16` is arbitrarily chosen as the `u*` "base" type mostly because it strikes a good balance (IMO) between unused bits and readability of the generated `TokenSet` code (especially the [`union` method](7a8374c162/crates/parser/src/token_set.rs (L26-L28))), but I'm open to other suggestions or a better methodology for choosing `u*` type.~~
~~I considered using a third-party crate for the bit-set, but a direct implementation seems simple enough without adding any new dependencies. I'm not strongly opposed to using a third-party crate though, if that's preferred.~~
~~Finally, I haven't had the chance to review issues, to figure out if there are any parser issues caused by collisions due the current implementation that may be fixed by this PR - I just stumbled upon the issue while adding "new" keywords to solve #16858~~
Edit: fixes #16858
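
For illustration, a minimal sketch of a bit-set backed by `[u64; 3]` could look like the following. It is not the code from this PR: `u16` discriminants stand in for `SyntaxKind`, and the 192-bit bound in the assertion is simply the capacity of three `u64` words.

```rust
/// A bit-set over small integer discriminants, backed by three 64-bit words.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct TokenSet([u64; 3]);

impl TokenSet {
    const EMPTY: TokenSet = TokenSet([0; 3]);

    /// Builds a set from raw discriminants (stand-ins for `SyntaxKind`).
    const fn new(kinds: &[u16]) -> TokenSet {
        let mut res = [0u64; 3];
        let mut i = 0;
        while i < kinds.len() {
            let discriminant = kinds[i] as usize;
            assert!(discriminant < 192, "discriminant does not fit in the set");
            // Pick the word, then the bit inside that word.
            res[discriminant / 64] |= 1u64 << (discriminant % 64);
            i += 1;
        }
        TokenSet(res)
    }

    const fn union(self, other: TokenSet) -> TokenSet {
        TokenSet([
            self.0[0] | other.0[0],
            self.0[1] | other.0[1],
            self.0[2] | other.0[2],
        ])
    }

    const fn contains(&self, kind: u16) -> bool {
        let discriminant = kind as usize;
        let idx = discriminant / 64;
        idx < 3 && (self.0[idx] & (1u64 << (discriminant % 64))) != 0
    }
}
```

Splitting the discriminant into a word index and a bit index keeps `new`, `union` and `contains` usable in `const` contexts while giving every discriminant below 192 its own bit.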