Fix OOM caused by term search
The issue came from multi Cartesian product for exprs with many (25+) arguments, each having multiple options.
The solution is two fold:
### Avoid blowing up in Cartesian product
**Before the logic was:**
1. Find expressions for each argument/param - there may be many
2. Take the Cartesian product (which blows up in some cases)
4. If there are more than 2 options throw them away by squashing them to `Many`
**Now the logic is:**
1. Find expressions for each argument/param and squash them to `Many` if there are more than 2 as otherwise we are guaranteed to also have more than 2 after taking the product which means squashing them anyway.
2. Take the Cartesian product on iterator
3. Start consuming it one by one
4. If there are more than 2 options throw them away by squashing them to `Many` (same as before)
This is also why I had to update some tests as the expressions get squashed to many more eagerly.
### Use fuel to avoid long search times and high memory usage
Now all the tactics use `should_continue: Fn() -> bool` to chech if they should keep iterating _(Similarly to chalk)_.
This reduces the search times by a magnitude, for example from ~139ms/hole to ~14ms/hole for `ripgrep` crate.
There are slightly less expressions found, but I think speed gain worth it for usability.
Also note that syntactic hits decreases more because of squashing so you simple need to run search multiple times to get full terms.
Also the worst case time (For example `nalgebra` crate cus it has tons of generics) has search times mostly under 200ms.
Benchmarks on `ripgrep` crate
Before:
```
Tail Expr syntactic hits: 291/1692 (17%)
Tail Exprs found: 1253/1692 (74%)
Term search avg time: 139ms
````
After:
```
Tail Expr syntactic hits: 239/1692 (14%)
Tail Exprs found: 1226/1692 (72%)
Term search avg time: 14ms
```
fix: keep parentheses when the precedence of inner expr is lower than the outer one
fix#17185
Additionally, this PR simplifies some code in `apply_demorgan`.
If rust-analyzer receives a malformed LSP request, the IO thread
terminates with a meaningful error, but then closes the channel.
Once the channel has closed, the main_loop also terminates, but it
only has RecvError and can't show a meaningful error. As a result,
rust-analyzer would incorrectly claim that the client forgot to
shutdown.
```
$ buggy_lsp_client | rust-analyzer
Error: client exited without proper shutdown sequence
```
Instead, include both error messages when the server shuts down.
Fix source_range for INT_NUMBER in completion
fix#17179.
Previously r-a use `TextRange::empty(self.position.offset)` as `source_range` for `INT_NUMBER`, so the `text_edit` would always be an insertion, which results in #17179.
This PR changed it by using `text_range` of `original_token` (same as `IDENT`).
The documentation for `lens.run.enable` states that it only applies
when `lens.enable` is set. However, the config setting whether to show
the Run lens did not check `lens.enable`, so the Run lens would show
even though lenses were disabled.
In the stabilization attempt of `#[unix_sigpipe = "sig_dfl"]`, a concern
was raised related to using a language attribute for the feature: Long
term, we want `fn lang_start()` to be definable by any crate, not just
libstd. Having a special language attribute in that case becomes
awkward.
So as a first step towards towards the next stabilization attempt, this
PR changes the `#[unix_sigpipe = "..."]` attribute to a compiler flag
`-Zon-broken-pipe=...` to remove that concern, since now the language
is not "contaminated" by this feature.
Another point was also raised, namely that the ui should not leak
**how** it does things, but rather what the **end effect** is. The new
flag uses the proposed naming. This is of course something that can be
iterated on further before stabilization.
fix: Tracing span names should match function names
When viewing traces, it's slightly confusing when the span name doesn't match the function name. Ensure the names are consistent.
(It might be worth moving most of these to use `#[tracing::instrument]` so the name can never go stale. `@davidbarsky` suggested that is marginally slower, so I've just done the simple change here.)
When viewing traces, it's slightly confusing when the span name doesn't
match the function name. Ensure the names are consistent.
(It might be worth moving most of these to use #[tracing::instrument]
so the name can never go stale. @davidbarsky suggested that is marginally
slower, so I've just done the simple change here.)
Before this commit `UseTree::remove_unnecessary_braces` removed the braces
around `{self}` in `use x::y::{self};` but `use x::y::self;` is not valid
rust.
feature: Make generate function assist generate a function as a constructor if the generated function has the name "new" and is an asscociated function.
close#17050
This PR makes `generate function assist` generate a function as a constructor if the generated function has the name "new" and is an asscociated function.
If the asscociate type is a record struct, it generates the constructor like this.
```rust
impl Foo {
fn new() -> Self {
Self { field_1: todo!(), field_2: todo!() }
}
}
```
If the asscociate type is a tuple struct, it generates the constructor like this.
```rust
impl Foo {
fn new() -> Self {
Self(todo!(), todo!())
}
}
```
If the asscociate type is a unit struct, it generates the constructor like this.
```rust
impl Foo {
fn new() -> Self {
Self
}
}
```
If the asscociate type is another adt, it generates the constructor like this.
```rust
impl Foo {
fn new() -> Self {
todo!()
}
}
```
Support hovering limits for adts
Fix#17009
1. Currently, r-a supports limiting the number of struct fields displayed when hovering. This PR extends it to support enum variants and union fields. Since the display of these three (ADTs) is similar, this PR extends 'hover_show_structFields' to 'hover_show_adtFieldsOrVariants'.
2. This PR also resolved the problem that the layout of ADT was not restricted by display limitations when hovering on the Self type.
3. Additionally, this PR changes the default value of display limitations to `10` (instead of the original `null`), which helps users discover this feature.
Make `cargo run` always available for binaries
Previously, items for `cargo test` and `cargo check` would appear as in
the `Select Runnable` quick pick that appears when running
`rust-analyzer: Run`, but `run` would only appear as a runnable if a
`main`` function was selected in the editor. This change adds `cargo
run` as an always available runnable command for binary packages.
This makes it easier to develop cli / tui applications, as now users can
run application from anywhere in their codebase.
chore: add some `tracing` to project loading
I wanted to see what's happening during project loading and if it could be parallelized. I'm thinking maybe, but it's not this PR :)
Try to generate more meaningful names in json converter
I just found out about rust-analyzer json converter, but I think it would be more convenient, if names were more useful, like using the names of the keys.
Let's look at some realistic arbitrary json:
```json
{
"user": {
"address": {
"street": "Main St",
"house": 3
},
"email": "example@example.com"
}
}
```
I think, new generated code is much easier to read and to edit, than the old:
```rust
// Old
struct Struct1{ house: i64, street: String }
struct Struct2{ address: Struct1, email: String }
struct Struct3{ user: Struct2 }
// New
struct Address1{ house: i64, street: String }
struct User1{ address: Address1, email: String }
struct Root1{ user: User1 }
```
Ideally, if we drop the numbers, I can see it being usable just as is (may be rename root)
```rust
struct Address{ house: i64, street: String }
struct User{ address: Address, email: String }
struct Root{ user: User }
```
Sadly, we can't just drop them, because there can be multiple fields (recursive) with the same name, and we can't just easily retroactively add numbers if the name has 2 instances due to parsing being single pass.
We could ignore the `1` and add number only if it's > 1, but I will leave this open to discussion and right now made it the simpler way
In sum, even with numbers, I think this PR still helps in readability
* Added config `runnables.extraTestBinaryArgs` to control the args.
* The default is `--show-output` rather than `--nocapture` to prevent
unreadable output when 2 or more tests fail or print output at once.
* Renamed variables in `CargoTargetSpec::runnable_args()` for clarity.
Fixes <https://github.com/rust-lang/rust-analyzer/issues/12737>.
Instead of using `core::fmt::format` to format panic messages, which may in turn
panic too and cause recursive panics and other messy things, redirect
`panic_fmt` to `const_panic_fmt` like CTFE, which in turn goes to
`panic_display` and does the things normally. See the tests for the full
call stack.
As of #6246, rust-analyzer follows symlinks. This can introduce an
infinite loop if symlinks point to parent directories.
Considering that #6246 was added in 2020 without many bug reports,
this is clearly a rare occurrence. However, I am observing
rust-analyzer hang on projects that have symlinks of the form:
```
test/a_symlink -> ../../
```
Ignore symlinks that only point to the parent directories, as this is
more robust but still allows typical symlink usage patterns.
fix: Replace Just the variable name in Unused Variable Diagnostic Fix
Changes Unused Variable diagnostic to just look at the variable name, not the entire syntax range.
Also added a test for an unused variable in an array destructure.
Closes#17053
It is bitset semantically --- many categorical things can be true about
a reference at the same time.
In parciular, a reference can be a "test" and a "write" at the same
time.
internal : redesign rust-analyzer::config
This PR aims to cover the infrastructural requirements for the `rust-analyzer.toml` ( #13529 ) issue. This means, that
1. We no longer have a single config base. The once single `ConfigData` has been divided into 4 : A tree of `.ratoml` files, a set of configs coming from the client ( this is what was called before the `CrateData` except that now values do not default to anything when they are not defined) , a set of configs that will reflect what the contents of a `ratoml` file defined in user's config directory ( e.g `~/.config/rust-analyzer/.rust-analyzer.toml` and finally a tree root that is populated by default values only.
2. Configs have also been divided into 3 different blocks : `global` , `local` , `client`. The current status of a config may change until #13529 got merged.
Once again many thanks to `@cormacrelf` for doing all the serde work.
internal: improve `TokenSet` implementation and add reserved keywords
The current `TokenSet` type represents "A bit-set of `SyntaxKind`s" as a newtype `u128`.
Internally, the flag for each `SyntaxKind` variant in the bit-set is set as the n-th LSB (least significant bit) via a bit-wise left shift operation, where n is the discriminant.
Edit: This is problematic because there's currently ~121 token `SyntaxKind`s, so adding new token kinds for missing reserved keywords increases the number of token `SyntaxKind`s above 128, thus making this ["mask"](7a8374c162/crates/parser/src/token_set.rs (L31-L33)) operation overflow.
~~This is problematic because there's currently 266 SyntaxKinds, so this ["mask"](7a8374c162/crates/parser/src/token_set.rs (L31-L33)) operation silently overflows in release mode.~~
~~This leads to a single flag/bit in the bit-set being shared by multiple `SyntaxKind`s~~.
This PR:
- Changes the wrapped type for `TokenSet` from `u128` to `[u64; 3]` ~~`[u*; N]` (currently `[u16; 17]`) where `u*` can be any desirable unsigned integer type and `N` is the minimum array length needed to represent all token `SyntaxKind`s without any collisions~~.
- Edit: Add assertion that `TokenSet`s only include token `SyntaxKind`s
- Edit: Add ~7 missing [reserved keywords](https://doc.rust-lang.org/stable/reference/keywords.html#reserved-keywords)
- ~~Moves the definition of the `TokenSet` type to grammar codegen in xtask, so that `N` is adjusted automatically (depending on the chosen `u*` "base" type) when new `SyntaxKind`s are added~~.
- ~~Updates the `token_set_works_for_tokens` unit test to include the `__LAST` `SyntaxKind` as a way of catching overflows in tests.~~
~~Currently `u16` is arbitrarily chosen as the `u*` "base" type mostly because it strikes a good balance (IMO) between unused bits and readability of the generated `TokenSet` code (especially the [`union` method](7a8374c162/crates/parser/src/token_set.rs (L26-L28))), but I'm open to other suggestions or a better methodology for choosing `u*` type.~~
~~I considered using a third-party crate for the bit-set, but a direct implementation seems simple enough without adding any new dependencies. I'm not strongly opposed to using a third-party crate though, if that's preferred.~~
~~Finally, I haven't had the chance to review issues, to figure out if there are any parser issues caused by collisions due the current implementation that may be fixed by this PR - I just stumbled upon the issue while adding "new" keywords to solve #16858~~
Edit: fixes#16858
Better inline preview for postfix completion
Better inline preview for postfix completion, a proper implementation of c5686c8941.
Here editors may filter completion item with the text within `delete_range`, so we need to include the `receiver text` in the `lookup` (aka `FilterText` in LSP spec) for editors to find the completion item. (See https://github.com/rust-lang/rust-analyzer/issues/17036#issuecomment-2056614180, Thanks to [pascalkuthe](https://github.com/pascalkuthe))
fix: Fix inlay hint resolution being broken
So, things broke because we now store a hash (u64) in the resolution payload, but javascript and hence JSON only support integers of up to 53 bits (anything beyond gets truncated in various ways) which caused almost all hashes to always differ when resolving them. This masks the hash to 53 bits to work around that.
Fixes https://github.com/rust-lang/rust-analyzer/issues/16962
fix: VFS should not confuse paths with source roots that have the same prefix
Previously, the VFS would assign paths to the source root that had the longest string prefix match. This would break when we had source roots in subdirectories:
```
/foo
/foo/bar
```
Given a file `/foo/bar_baz.rs`, we would attribute it to the `/foo/bar` source root, which is wrong.
As a result, we would attribute paths to the wrong crate when a crate was in a subdirectory of another one. This is more common in larger monorepos, but could occur in any Rust project.
Fix this in the VFS, and add a test.
internal: make function builder create ast directly
I am working on #17050.
In the process, I noticed a place in the code that could be refactored.
Currently, the `function builder` creates the `ast` through the `function template` , but those two processes can be combined into one function.
I thought I should work on this first and created a PR.
Do not accept the following
```rust
macro_rules! lexes {($($_:tt)*) => {}}
lexes!(🐛"foo");
```
Before, invalid emoji identifiers were gated during parsing instead of lexing in all cases, but this didn't account for macro expansion of literal prefixes.
Fix#123696.
Fix off-by-one error converting to LSP UTF8 offsets with multi-byte char
On this file,
```rust
fn main() {
let 된장 = 1;
}
```
when using `"positionEncodings":["utf-16"]` I get an "unused variable" diagnostic on the variable
name (codepoint offset range `8..10`). So far so good.
When using `positionEncodings":["utf-8"]`, I expect to get the equivalent range in bytes (LSP:
"Character offsets count UTF-8 code units (e.g bytes)."), which is `8..14`, because both
characters are 3 bytes in UTF-8. However I actually get `10..14`.
Looks like this is because we accidentally treat a 1-based index as an offset value: when
converting from our internal char-indices to LSP byte offsets, we look at one character to many.
This causes wrong results if the extra character is a multi-byte one, such as when computing
the start coordinate of 된장.
Fix that by actually passing an offset. While at it, fix the variable name of the line number,
which is not an offset (yet).
Originally reported at https://github.com/kakoune-lsp/kakoune-lsp/issues/740
On this file,
```rust
fn main() {
let 된장 = 1;
}
```
when using `"positionEncodings":["utf-16"]` I get an "unused variable" diagnostic on the variable
name (codepoint offset range `8..10`). So far so good.
When using `positionEncodings":["utf-8"]`, I expect to get the equivalent range in bytes (LSP:
"Character offsets count UTF-8 code units (e.g bytes)."), which is `8..14`, because both
characters are 3 bytes in UTF-8. However I actually get `10..14`.
Looks like this is because we accidentally treat a 1-based index as an offset value: when
converting from our internal char-indices to LSP byte offsets, we look at one character to many.
This causes wrong results if the extra character is a multi-byte one, such as when computing
the start coordinate of 된장.
Fix that by actually passing an offset. While at it, fix the variable name of the line number,
which is not an offset (yet).
Originally reported at https://github.com/kakoune-lsp/kakoune-lsp/issues/740
Changed the completion item source_range to match
the replaced text. Though in VS Code it may not be
disturbing because the snippet is previewed in a
box, but in Helix editor, it's previewed by applying
the main text edit.
pattern analysis: Use contiguous indices for enum variants
The main blocker to using the in-tree version of the `pattern_analysis` crate is that rustc requires enum indices to be contiguous because it uses `IndexVec`/`BitSet` for performance. Currently we swap these out for `FxHashMap`/`FxHashSet` when the `rustc` feature is off, but we can't do that if we use the in-tree crate.
This PR solves the problem by using contiguous indices on the r-a side too.
Fix crate IDs when multiple workspaces are loaded
Previously, we assumed that the crate numbers in a `rust-project.json` always matched the `CrateId` values in the crate graph. This isn't true when there are multiple workspaces, because the crate graphs are merged and the `CrateId` values in the merged graph are different.
This broke flycheck (see first commit), because we were unable to find the workspace when a file changed, so we every single flycheck, producing duplicate compilation errors.
Instead, use the crate root module path to look up the relevant flycheck. This makes `ProjectWorkspace::Json` consistenet with `ProjectWorkspace::Cargo`.
Also, define a separate JSON crate number type, to prevent bugs like this happening again.
feat: Add `rust-analyzer.cargo.allTargets` to configure passing `--all-targets` to cargo invocations
Closes#16859
## Unresolved question:
Should this be a setting for build scripts only ? All the other `--all-targets` I found where already covered by `checkOnSave.allTargets`
Previously, items for `cargo test` and `cargo check` would appear as in
the `Select Runnable` quick pick that appears when running
`rust-analyzer: Run`, but `run` would only appear as a runnable if a
`main`` function was selected in the editor. This change adds `cargo
run` as an always available runnable command for binary packages.
This makes it easier to develop cli / tui applications, as now users can
run application from anywhere in their codebase.
Handle panicking like rustc CTFE does
Instead of using `core::fmt::format` to format panic messages, which may in turn panic too and cause recursive panics and other messy things, redirect `panic_fmt` to `const_panic_fmt` like CTFE, which in turn goes to `panic_display` and does the things normally. See the tests for the full call stack.
The tests don't work yet, I probably missed something in minicore.
fixes#16907 in my local testing, I also need to add a test for it
fix: Prevent stack overflow in recursive const types
In the evaluation of const values of recursive types certain declarations could cause an endless call-loop within the interpreter (hir-ty’s create_memory_map), which would lead to a stack overflow.
This commit adds a check that prevents values that contain an address in their value (such as TyKind::Ref) from being allocated at the address they contain.
The commit also adds a test for this edge case.
Instead of using `core::fmt::format` to format panic messages, which may in turn
panic too and cause recursive panics and other messy things, redirect
`panic_fmt` to `const_panic_fmt` like CTFE, which in turn goes to
`panic_display` and does the things normally. See the tests for the full
call stack.