perf: Speed up search for short associated functions, especially very common identifiers such as `new`
`@Veykril` said in https://github.com/rust-lang/rust-analyzer/pull/17908#issuecomment-2292958068 that people complain searches for `new()` are slow (they are right), so here I am to help!
The search is used by IDE features such as rename and find all references.
The search is slow because we need to verify each candidate, and that requires analyzing it; the key to speeding it up is to avoid the analysis where possible.
I did that with a bunch of tricks that exploits knowledge about the language and its possibilities. The first key insight is that associated methods may only be referenced in the form `ContainerName::func_name` (parentheses are not necessary!) (Rust doesn't include a way to `use Container::func_name`, and even if it will in the future most usages are likely to stay in that form.
Searching for `::` will help only a bit, but searching for `Container` can help considerably, since it is very rare that there will be two identical instances of both a container and a method of it.
However, things are not as simple as they sound. In Rust a container can be aliased in multiple ways, and even aliased from different files/modules. If we will try to resolve the alias, we will lose any gain from the textual search (although very common method names such as `new` will still benefit, most will suffer because there are more instances of a container name than its associated item).
This is where the key trick enters the picture. The key insight is that there is still a textual property: a container namer cannot be aliased, unless its name is mentioned in the alias declaration, or a name of alias of it is mentioned in the alias declaration.
This becomes a fixpoint algorithm: we expand our list of aliases as we collect more and more (possible) aliases, until we eventually reach a fixpoint. A fixpoint is not guaranteed (and we do have guards for the rare cases where it does not happen), but it is almost so: most types have very few aliases, if at all.
We do use some semantic information while analyzing aliases. It's a balance: too much semantic analysis, and the search will become slow. But too few of it, and we will bring many incorrect aliases to our list, and risk it expands and expands and never reach a fixpoint. At the end, based on benchmarks, it seems worth to do a lot to avoid adding an alias (but not too much), while it is worth to do a lot to avoid the need to semantically analyze func_name matches (but again, not too much).
After we collected our list of aliases, we filter matches based on this list. Only if a match can be real, we do semantic analysis for it.
The results are promising: searching for all references on `new()` in `base-db` in the rust-analyzer repository, which previously took around 60 seconds, now takes as least as two seconds and a half (roughly), while searching for `Vec::new()`, almost an upper bound to how much a symbol can be used, that used to take 7-9 minutes(!) now completes in 100-120 seconds, and with less than half of non-verified results (aka. false positives).
This is the less strictly correct (but faster) branch of this patch; it can miss some (rare) cases (there is a test for that - `goto_ref_on_short_associated_function_complicated_type_magic_can_confuse_our_logic()`). There is another branch that have no false negatives but is slower to search (`Vec::new()` never reaches a fixpoint in aliases collection there). I believe it is possible to create a strategy that will have the best of both worlds, but it will involve significant complexity and I didn't bother, especially considering that in the vast majority of the searches the other branch will be more than enough. But all in all, I decided to bring this branch (of course if the maintainers will agree), since our search is already not 100% accurate (it misses macros), and I believe there is value in the additional perf.
You can find the strict branch at https://github.com/ChayimFriedman2/rust-analyzer/tree/speedup-new-usages-strict.
Should fix#7404, I guess (will check now).
fix: run flycheck without rev_deps when target is specified
Since querying for a crate's target is a call to salsa and therefore blocking, flycheck task is now deferred out of main thread by using `GlobalState`s `deferred_task_queue`. Fixes#17829 and https://github.com/rust-lang/rustlings/issues/2071
The search is used by IDE features such as rename and find all references.
The search is slow because we need to verify each candidate, and that requires analyzing it; the key to speeding it up is to avoid the analysis where possible.
I did that with a bunch of tricks that exploits knowledge about the language and its possibilities. The first key insight is that associated methods may only be referenced in the form `ContainerName::func_name` (parentheses are not necessary!) (Rust doesn't include a way to `use Container::func_name`, and even if it will in the future most usages are likely to stay in that form.
Searching for `::` will help only a bit, but searching for `Container` can help considerably, since it is very rare that there will be two identical instances of both a container and a method of it.
However, things are not as simple as they sound. In Rust a container can be aliased in multiple ways, and even aliased from different files/modules. If we will try to resolve the alias, we will lose any gain from the textual search (although very common method names such as `new` will still benefit, most will suffer because there are more instances of a container name than its associated item).
This is where the key trick enters the picture. The key insight is that there is still a textual property: a container namer cannot be aliased, unless its name is mentioned in the alias declaration, or a name of alias of it is mentioned in the alias declaration.
This becomes a fixpoint algorithm: we expand our list of aliases as we collect more and more (possible) aliases, until we eventually reach a fixpoint. A fixpoint is not guaranteed (and we do have guards for the rare cases where it does not happen), but it is almost so: most types have very few aliases, if at all.
We do use some semantic information while analyzing aliases. It's a balance: too much semantic analysis, and the search will become slow. But too few of it, and we will bring many incorrect aliases to our list, and risk it expands and expands and never reach a fixpoint. At the end, based on benchmarks, it seems worth to do a lot to avoid adding an alias (but not too much), while it is worth to do a lot to avoid the need to semantically analyze func_name matches (but again, not too much).
After we collected our list of aliases, we filter matches based on this list. Only if a match can be real, we do semantic analysis for it.
The results are promising: searching for all references on `new()` in `base-db` in the rust-analyzer repository, which previously took around 60 seconds, now takes as least as two seconds and a half (roughly), while searching for `Vec::new()`, almost an upper bound to how much a symbol can be used, that used to take 7-9 minutes(!) now completes in 100-120 seconds, and with less than half of non-verified results (aka. false positives).
This is the less strictly correct (but faster) of this patch; it can miss some (rare) cases (there is a test for that - `goto_ref_on_short_associated_function_complicated_type_magic_can_confuse_our_logic()`). There is another branch that have no false negatives but is slower to search (`Vec::new()` never reaches a fixpoint in aliases collection there). I believe it is possible to create a strategy that will have the best of both worlds, but it will involve significant complexity and I didn't bother, especially considering that in the vast majority of the searches the other branch will be more than enough. But all in all, I decided to bring this branch (of course if the maintainers will agree), since our search is already not 100% accurate (it misses macros), and I believe there is value in the additional perf.
This avoids the need to analyze the file when we are not inside a macro call.
This is especially important for the optimization in the next commit(s), as there the common case will be to descent into macros but then not analyze.
Remove the ability to configure the user config path
Being able to do this makes little sense as this is effectively a cyclic dependency (and we do not want to fixpoint this really).
Priming caches is a performance win, but it takes a lock on the salsa
database and prevents rust-analyzer from responding to e.g. go-to-def
requests.
This causes confusion for users, who see the spinner next to
rust-analyzer in the VS Code footer stop, so they start attempting to
navigate their code.
Instead, set the `quiescent` status in LSP to false during cache
priming, so the VS Code spinner persists until we can respond to any
LSP request.
chore(config): remove `invocationLocation` in favor of `invocationStrategy`
These flags were added to help rust-analyzer integrate with repos requiring non-Cargo invocations. The consensus is that having two independent settings are no longer needed. This change removes `invocationLocation` in favor of `invocationStrategy` and changes the internal representation of `InvocationStrategy::Once` to hold the workspace root.
Closes#17848.
These flags were added to help rust-analyzer integrate with repos
requiring non-Cargo invocations. The consensus is that having two
independent settings are no longer needed. This change removes
`invocationLocation` in favor of `invocationStrategy` and changes
the internal representation of `InvocationStrategy::Once` to hold
the workspace root.
fix: Wrong BoundVar index when lowering impl trait parameter of parent generics
Fixes#17711
From the following test code;
```rust
//- minicore: deref
use core::ops::Deref;
struct Struct<'a, T>(&'a T);
trait Trait {}
impl<'a, T: Deref<Target = impl Trait>> Struct<'a, T> {
fn foo(&self) -> &Self { self }
fn bar(&self) {
let _ = self.foo();
}
}
```
when we call `register_obligations_for_call` for `let _ = self.foo();`,
07659783fd/crates/hir-ty/src/infer/expr.rs (L1939-L1952)
we are querying `generic_predicates` and it has `T: Deref<Target = impl Trait>` predicate from the parent `impl Struct`;
07659783fd/crates/hir-ty/src/lower.rs (L375-L399)
but as we can see above, lowering `TypeRef = impl Trait` doesn't take into account the parent generic parameters, so the `BoundVar` index here is `0`, as `fn foo` has no generic args other than parent's,
But this `BoundVar` is pointing at `'a` in `<'a, T: Deref<Target = impl Trait>>`.
So, in the first code reference `register_obligations_for_call`'s L:1948 - `.substitute(Interner, parameters)`, we are substituting `'a` with `Ty`, not `Lifetime` and this makes panic inside the chalk.
This PR fixes this wrong `BoundVar` index in such cases
Add scip/lsif flag to exclude vendored libaries
#17809 changed StaticIndex to include vendored libraries. This PR adds a flag to disable that behavior.
At work, our monorepo has too many rust targets to index all at once, so we split them up into several shards. Since all of our libraries are vendored, if rust-analyzer includes them, sharding no longer has much benefit, because every shard will have to index the entire transitive dependency graphs of all of its targets. We get around the issue presented in #17809 because some other shard will index the libraries directly.
fix: Properly account for editions in names
This PR touches a lot of parts. But the main changes are changing `hir_expand::Name` to be raw edition-dependently and only when necessary (unrelated to how the user originally wrote the identifier), and changing `is_keyword()` and `is_raw_identifier()` to be edition-aware (this was done in #17896, but the FIXMEs were fixed here).
It is possible that I missed some cases, but most IDE parts should properly escape (or not escape) identifiers now.
The rules of thumb are:
- If we show the identifier to the user, its rawness should be determined by the edition of the edited crate. This is nice for IDE features, but really important for changes we insert to the source code.
- For tests, I chose `Edition::CURRENT` (so we only have to (maybe) update tests when an edition becomes stable, to avoid churn).
- For debugging tools (helper methods and logs), I used `Edition::LATEST`.
Reviewing notes:
This is a really big PR but most of it is mechanical translation. I changed `Name` displayers to require an edition, and followed the compiler errors. Most methods just propagate the edition requirement. The interesting cases are mostly in `ide-assists`, as sometimes the correct crate to fetch the edition from requires awareness (there may be two). `ide-completions` and `ide-diagnostics` were solved pretty easily by introducing an edition field to their context. `ide` contains many features, for most of them it was propagated to the top level function and there the edition was fetched based on the file.
I also fixed all FIXMEs from #17896. Some required introducing an edition parameter (usually not for many methods after the changes to `Name`), some were changed to a new method `is_any_identifier()` because they really want any possible keyword.
Fixes#17895.
Fixes#17774.
This PR touches a lot of parts. But the main changes are changing
`hir_expand::Name` to be raw edition-dependently and only when necessary
(unrelated to how the user originally wrote the identifier),
and changing `is_keyword()` and `is_raw_identifier()` to be edition-aware
(this was done in #17896, but the FIXMEs were fixed here).
It is possible that I missed some cases, but most IDE parts should properly
escape (or not escape) identifiers now.
The rules of thumb are:
- If we show the identifier to the user, its rawness should be determined
by the edition of the edited crate. This is nice for IDE features,
but really important for changes we insert to the source code.
- For tests, I chose `Edition::CURRENT` (so we only have to (maybe) update
tests when an edition becomes stable, to avoid churn).
- For debugging tools (helper methods and logs), I used `Edition::LATEST`.
Allow flycheck process to exit gracefully
Assuming it isn't cancelled. Closes#17902.
The only place CommandHandle::join() is used is when the flycheck command
finishes, so this commit changes the behavior of the method itself.
The only reason I can see for the existing behavior is if the command is somehow holding onto a build lock longer than it should, this would force it to be released. But it would be a pretty heavy-handed way to solve that issue. I'm not aware of this occurring in practice.
Test for word boundary in `FindUsages`
This speeds up short identifiers search significantly, while unlikely to have an effect on long identifiers (the analysis takes much longer than some character comparison).
Tested by finding all references to `eq()` (from `PartialEq`) in the rust-analyzer repo. Total time went down from 100s to 10s (a 10x reduction!).
Feel free to close this if you consider this a non-issue, as most short identifiers are local.
internal: Replace once_cell with std's recently stabilized OnceCell/Lock and LazyCell/Lock
This doesn't get rid of the once_cell dependency, unfortunately, since we have dependencies that use it, but it's a nice to do cleanup. And when our deps will eventually get rid of once_cell we will get rid of it for free.
This speeds up short identifiers search significantly, while unlikely to have an effect on long identifiers (the analysis takes much longer than some character comparison).
Tested by finding all references to `eq()` (from `PartialEq`) in the rust-analyzer repo. Total time went down from 100s to 10s (a 10x reduction!).
This doesn't get rid of the once_cell dependency, unfortunately, since we have dependencies that use it, but it's a nice to do cleanup. And when our deps will eventually get rid of once_cell we will get rid of it for free.
Assuming it isn't cancelled. Closes#17902.
The only place CommandHandle::join is used is when the flycheck command
finishes, so this commit changes the behavior of the method itself.
fix: Panic while hovering associated function with type annotation on generic param that not inherited from its container type
Fixes#17871
We call `generic_args_sans_defaults` here;
64a140527b/crates/hir-ty/src/display.rs (L1021-L1034)
but the following substitution inside that function panic in #17871;
64a140527b/crates/hir-ty/src/display.rs (L1468)
it's because the `Binders.binder` inside `default_parameters` has a same length with the generics of the function we are hovering on, but the generics of it is split into two, `fn_params` and `parent_params`.
Because of this, it may panic if the function has one or more default parameters and both `fn_params` and `parent_params` are non-empty, like the case in the title of this PR.
So, we must call `generic_args_sans_default` first and then split it into `fn_params` and `parent_params`
fix: Panic while canonicalizing erroneous projection type
Fixes#17866
The root cause of #17866 is quite horrifyng 😨
```rust
trait T {
type A;
}
type Foo = <S as T>::A; // note that S isn't defined
fn main() {
Foo {}
}
```
While inferencing alias type `Foo = <S as T>::A`;
78c2bdce86/crates/hir-ty/src/infer.rs (L1388-L1398)
the error type `S` in it is substituted by inference var in L1396 above as below;
78c2bdce86/crates/hir-ty/src/infer/unify.rs (L866-L869)
This new inference var's index is `1`, as the type inferecing procedure here previously inserted another inference var into same `InferenceTable`.
But after that, the projection type made from the above then passed to the following function;
78c2bdce86/crates/hir-ty/src/traits.rs (L88-L96)
here, a whole new `InferenceTable` is made, without any inference var and in the L94, this table calls;
78c2bdce86/crates/hir-ty/src/infer/unify.rs (L364-L370)
And while registering `AliasEq` `obligation`, this obligation contains inference var `?1` made from the previous table, but this table has only one inference var `?0` made at L365.
So, the chalk panics when we try to canonicalize that obligation to register it, because the obligation contains an inference var `?1` that the canonicalizing table doesn't have.
Currently, we are calling `InferenceTable::new()` to do some normalizing, unifying or coercing things to some targets that might contain inference var that the new table doesn't have.
I think that this is quite dangerous footgun because the inference var is just an index that does not contain the information which table does it made from, so sometimes this "foreign" index might cause panic like this case, or point at the wrong variable.
This PR mitigates such behaviour simply by inserting sufficient number of inference vars to new table to avoid such problem.
This strategy doesn't harm current r-a's intention because the inference vars that passed into new tables are just "unresolved" variables in current r-a, so this is just making sure that such "unresolved" variables exist in the new table
internal: Be more resilient to bad language item definitions in binop inference
Fixes#16287Fixes#16286
There's one more in `write_fn_trait_method_resolution`, but I'm not sure if it won't cause further problems in `infer_closures`.
fix: Resolve included files to their calling modules in IDE layer
Fixes https://github.com/rust-lang/rust-analyzer/issues/17390 at the expense of reporting duplicate diagnostics for modules that have includes in them when both the calling and called file are included.
internal: Reply to requests with defaults when vfs is still loading
There is no reason for us to hit the database with queries when we certainly haven't reached a stable state yet. Instead we just reply with default request results until we are in a state where we can do meaningful work. This should save us from wasting resources while starting up at worst, and at best save us from creating query and interning entries that are non-meaningful which ultimately just end up wasting memory.
internal: Optimize the usage of channel senders
Used `Sender` directly instead of a boxed closure. There is no need to use the boxed closure. This also allows the caller to decide to do something other than `unwrap` (not a fan of it BTW).
Reuse recursion limit as expansion limit
A configurable recursion limit was introduced by looking at the recursion_limit crate attribute. Instead of relying on a global constant we will reuse this value for expansion limit as well.
Addresses: https://github.com/rust-lang/rust-analyzer/issues/8640#issuecomment-2271740272
feat: Implement TAIT and fix ATPIT a bit
Closes#16296 (Commented on the issue)
In #16852, I implemented ATPIT, but as I didn't discern ATPIT and other non-assoc TAIT, I guess that it has been working for some TAITs.
As the definining usage of TAIT requires it should be appear in the Def body's type(const blocks' type annotations or functions' signatures), this can be done in simlilar way with ATPIT
And this PR also corrects some defining-usage resolution for ATPIT
A configurable recursion limit was introduced by looking at the
recursion_limit crate attribute. Instead of relying on a global constant
we will reuse this value for expansion limit as well.
Include vendored crates in StaticIndex
`StaticIndex::compute` filters out modules from libraries. This makes an exceptions for vendored libraries, ie libraries actually defined inside the workspace being indexed.
This aims to solve https://bugzilla.mozilla.org/show_bug.cgi?id=1846041 In general StaticIndex is meant for code browsers, which likely want to index all visible source files.
fix: tyck for non-ADT types when searching refs for `Self` kw
See e0276dc5dd (r1389848845)
For ADTs, to handle `{error}` in generic args, we should to convert them to ADT for comparisons; for others, we can directly compare the types.
StaticIndex::compute filters out modules from libraries. This makes an
exceptions for vendored libraries, ie libraries actually defined inside
the workspace being indexed.
This aims to solve https://bugzilla.mozilla.org/show_bug.cgi?id=1846041
In general StaticIndex is meant for code browsers, which likely want to
index all visible source files.
With the lack of a README on the individually published library crates and the somewhat cryptic `ra_ap_` prefix it is hard to figure out where those crates belong to, so mentioning "rust-analyzer" feels like auseful hint there.
internal: Load VFS config changes in parallel
Simple attempt to make some progress f or https://github.com/rust-lang/rust-analyzer/issues/17373
No clue if those atomic orderings are right, though I don't think they are really too relevant either.
A more complete fix would probably need to replace our `ProjectFolders` handling a bit.
fix: remove AbsPath requirement from linkedProjects
Should (fingers crossed!) fix https://github.com/rust-lang/rust-analyzer/issues/17664. I opened the `rustc` workspace with the [suggested configuration](e552c168c7/src/etc/rust_analyzer_settings.json) and I was able to successfully open some rustc crates (`rustc_incremental`) and have IDE functionality.
`@Veykril:` can you try these changes and let me know if it fixed rustc?
feat: Introduce workspace `rust-analyzer.toml`s
In order to globally configure a project it was, prior to this PR, possible to have a `ratoml` at the root path of a project. This is not the case anymore. Instead we now let ratoml files that are placed at the root of any workspace have a new scope called `workspace`. Although there is not a difference between a `workspace` scope and and a `global` scope, future PRs will change that.
feat: Use spans for builtin and declarative macro expansion errors
This should generally improve some error reporting for macro expansion errors. Especially for `compile_error!` within proc-macros
feat(ide-completion): explictly show `async` keyword on `impl trait` methods
OLD:
<img width="676" alt="image" src="https://github.com/user-attachments/assets/f6fa626f-6b6d-4c22-af27-b0755e7a6bf8">
Now:
<img width="684" alt="image" src="https://github.com/user-attachments/assets/efbaac0e-c805-4dd2-859d-3e44b2886dbb">
---
This is an preparation for https://github.com/rust-lang/rust-analyzer/issues/17719.
```rust
use std::future::Future;
trait DesugaredAsyncTrait {
fn foo(&self) -> impl Future<Output = usize> + Send;
fn bar(&self) -> impl Future<Output = usize> + Send;
}
struct Foo;
impl DesugaredAsyncTrait for Foo {
fn foo(&self) -> impl Future<Output = usize> + Send {
async { 1 }
}
//
async fn bar(&self) -> usize {
1
}
}
fn main() {
let fut = Foo.bar();
fn _assert_send<T: Send>(_: T) {}
_assert_send(fut);
}
```
If we don't distinguish `async` or not. It would be confusing to generate sugared version `async fn foo ....` and original form `fn foo` for `async fn in trait` that is defined in desugar form.
fix: let glob imports override other globs' visibility
Follow up to #14930Fixes#11858Fixes#14902Fixes#17704
I haven't reworked the code here at all - I don't feel confident in the codebase to do so - just rebased it onto the current main branch and fixed conflicts.
I'm not _entirely_ sure I understand the structure of the `check` function in `crates/hir-def/src/nameres` tests. I think the change to the test expectation from #14930 is correct, marking the `crate::reexport::inner` imports with `i`, and I understand it to mean there's a specific token in the import that we can match it to (in this case, `Trait`, `function` and `makro` of `pub use crate::defs::{Trait, function, makro};` respectively), but I had some trouble understanding the meaning of the different parts of `PerNs` to be sure.
Does this make sense?
I tested building and using RA locally with `cargo xtask install` and after this change the documentation for `arrow_array::ArrowPrimitiveType` seems to be picked up correctly!