2020-03-04 11:48:50 +00:00
|
|
|
//! Implementation of find-usages functionality.
|
|
|
|
//!
|
|
|
|
//! It is based on the standard IDE trick: first, we run a fast text search to
|
2024-05-15 16:45:06 +00:00
|
|
|
//! get a superset of matches. Then, we confirm each match using precise
|
2020-03-04 11:48:50 +00:00
|
|
|
//! name resolution.
|
|
|
|
|
2023-05-02 14:12:22 +00:00
|
|
|
use std::mem;
|
Speed up search for short associated functions, especially very common identifiers such as `new`
The search is used by IDE features such as rename and find all references.
The search is slow because we need to verify each candidate, and that requires analyzing it; the key to speeding it up is to avoid the analysis where possible.
I did that with a bunch of tricks that exploit knowledge about the language and its possibilities. The first key insight is that associated functions may only be referenced in the form `ContainerName::func_name` (parentheses are not necessary!). Rust doesn't include a way to `use Container::func_name`, and even if it does in the future, most usages are likely to stay in that form.
Searching for `::` will help only a bit, but searching for `Container` can help considerably, since it is very rare for two different containers to share both the container name and the method name.
However, things are not as simple as they sound. In Rust a container can be aliased in multiple ways, and even aliased from different files/modules. If we try to resolve the alias, we lose any gain from the textual search (although very common method names such as `new` will still benefit, most will suffer because there are more instances of a container name than of its associated item).
This is where the key trick enters the picture. The key insight is that there is still a textual property: a container name cannot be aliased unless its name, or the name of one of its aliases, is mentioned in the alias declaration.
This becomes a fixpoint algorithm: we expand our list of aliases as we collect more and more (possible) aliases, until we eventually reach a fixpoint. A fixpoint is not guaranteed (and we do have guards for the rare cases where it does not happen), but it is almost so: most types have very few aliases, if any.
We do use some semantic information while analyzing aliases. It's a balance: too much semantic analysis, and the search will become slow. But too little, and we will bring many incorrect aliases into our list, and risk that it keeps expanding and never reaches a fixpoint. In the end, based on benchmarks, it seems worthwhile to do a lot to avoid adding an alias (but not too much), and likewise worthwhile to do a lot to avoid the need to semantically analyze `func_name` matches (but again, not too much).
After we have collected our list of aliases, we filter matches based on this list. We do semantic analysis only for matches that can be real.
The results are promising: searching for all references to `new()` in `base-db` in the rust-analyzer repository, which previously took around 60 seconds, now takes as little as roughly two and a half seconds, while searching for `Vec::new()`, almost an upper bound on how much a symbol can be used, which used to take 7-9 minutes(!), now completes in 100-120 seconds, and with less than half of the non-verified results (i.e. false positives).
This is the less strictly correct (but faster) branch of this patch; it can miss some (rare) cases (there is a test for that: `goto_ref_on_short_associated_function_complicated_type_magic_can_confuse_our_logic()`). There is another branch that has no false negatives but is slower to search (`Vec::new()` never reaches a fixpoint in alias collection there). I believe it is possible to create a strategy that has the best of both worlds, but it would involve significant complexity and I didn't bother, especially considering that in the vast majority of searches the other branch will be more than enough. All in all, I decided to bring this branch (if the maintainers agree, of course), since our search is already not 100% accurate (it misses macros), and I believe there is value in the additional perf.
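The alias collection described above can be illustrated with a small, purely textual sketch. This is not the code from this patch: `AliasDecl`, `mentions`, `collect_aliases` and the `max_aliases` guard are hypothetical names chosen for the example, and the real implementation also uses some semantic information that is left out here.

```rust
use std::collections::BTreeSet;

/// Hypothetical textual view of an alias declaration, e.g.
/// `type Alias = Container;` or `use foo::Container as Alias;`.
pub struct AliasDecl<'a> {
    /// The name that the declaration introduces.
    pub name: &'a str,
    /// The full source text of the declaration.
    pub text: &'a str,
}

/// Returns true if `text` contains `word` as a whole identifier.
fn mentions(text: &str, word: &str) -> bool {
    text.split(|c: char| !(c.is_alphanumeric() || c == '_'))
        .any(|token| token == word)
}

/// Collects every name that may textually refer to `container`.
///
/// A declaration is added once its text mentions the container or an
/// already-known alias; passes repeat until nothing new is added (the
/// fixpoint). Returns `None` if the set grows beyond `max_aliases`,
/// which stands in for the guard against runaway expansion.
pub fn collect_aliases(
    container: &str,
    decls: &[AliasDecl<'_>],
    max_aliases: usize,
) -> Option<BTreeSet<String>> {
    let mut known = BTreeSet::new();
    known.insert(container.to_owned());
    loop {
        let mut changed = false;
        for decl in decls {
            if known.contains(decl.name) {
                continue;
            }
            if known.iter().any(|name| mentions(decl.text, name)) {
                known.insert(decl.name.to_owned());
                changed = true;
            }
        }
        if known.len() > max_aliases {
            return None;
        }
        if !changed {
            return Some(known);
        }
    }
}
```

With the resulting set, a textual match of the form `Qualifier::func_name` only needs semantic analysis when `Qualifier` is in the set; everything else can be discarded without analysis.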
2024-08-18 22:39:31 +00:00
|
|
|
use std::{cell::LazyCell, cmp::Reverse};
|
2020-03-04 11:05:14 +00:00
|
|
|
|
2024-08-03 16:00:36 +00:00
|
|
|
use base_db::{salsa::Database, SourceDatabase, SourceRootDatabase};
|
2023-01-11 16:10:04 +00:00
|
|
|
use hir::{
|
2024-08-18 22:39:31 +00:00
|
|
|
sym, Adt, AsAssocItem, DefWithBody, FileRange, FileRangeWrapper, HasAttrs, HasContainer,
|
|
|
|
HasSource, HirFileIdExt, InFile, InFileWrapper, InRealFile, ItemContainer, ModuleSource,
|
|
|
|
PathResolution, Semantics, Visibility,
|
2023-01-11 16:10:04 +00:00
|
|
|
};
|
2022-09-16 14:26:54 +00:00
|
|
|
use memchr::memmem::Finder;
|
2022-09-13 12:47:26 +00:00
|
|
|
use parser::SyntaxKind;
|
2024-08-18 22:39:31 +00:00
|
|
|
use rustc_hash::{FxHashMap, FxHashSet};
|
2024-07-17 15:35:40 +00:00
|
|
|
use span::EditionedFileId;
|
2024-08-18 22:39:31 +00:00
|
|
|
use syntax::{
|
|
|
|
ast::{self, HasName},
|
|
|
|
match_ast, AstNode, AstToken, SmolStr, SyntaxElement, SyntaxNode, TextRange, TextSize,
|
|
|
|
ToSmolStr,
|
|
|
|
};
|
2023-05-02 14:12:22 +00:00
|
|
|
use triomphe::Arc;
|
2020-03-04 11:05:14 +00:00
|
|
|
|
2020-03-04 11:14:48 +00:00
|
|
|
use crate::{
|
2021-03-27 20:51:00 +00:00
|
|
|
defs::{Definition, NameClass, NameRefClass},
|
2022-07-20 11:59:31 +00:00
|
|
|
traits::{as_trait_assoc_def, convert_to_def_in_trait},
|
2020-03-04 11:14:48 +00:00
|
|
|
RootDatabase,
|
|
|
|
};
|
2020-03-04 11:05:14 +00:00
|
|
|
|
2021-01-12 14:51:02 +00:00
|
|
|
#[derive(Debug, Default, Clone)]
|
2021-01-12 14:56:24 +00:00
|
|
|
pub struct UsageSearchResult {
|
2024-07-17 15:35:40 +00:00
|
|
|
pub references: FxHashMap<EditionedFileId, Vec<FileReference>>,
|
2021-01-11 23:05:07 +00:00
|
|
|
}
|
|
|
|
|
2021-01-12 14:56:24 +00:00
|
|
|
impl UsageSearchResult {
|
2021-01-12 14:51:02 +00:00
|
|
|
pub fn is_empty(&self) -> bool {
|
|
|
|
self.references.is_empty()
|
|
|
|
}
|
|
|
|
|
|
|
|
pub fn len(&self) -> usize {
|
|
|
|
self.references.len()
|
|
|
|
}
|
|
|
|
|
2024-07-17 15:35:40 +00:00
|
|
|
pub fn iter(&self) -> impl Iterator<Item = (EditionedFileId, &[FileReference])> + '_ {
|
|
|
|
self.references.iter().map(|(&file_id, refs)| (file_id, &**refs))
|
2021-01-12 14:51:02 +00:00
|
|
|
}
|
|
|
|
|
2021-01-11 23:05:07 +00:00
|
|
|
pub fn file_ranges(&self) -> impl Iterator<Item = FileRange> + '_ {
|
2021-01-12 14:51:02 +00:00
|
|
|
self.references.iter().flat_map(|(&file_id, refs)| {
|
|
|
|
refs.iter().map(move |&FileReference { range, .. }| FileRange { file_id, range })
|
|
|
|
})
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2021-01-12 14:56:24 +00:00
|
|
|
impl IntoIterator for UsageSearchResult {
|
2024-07-17 15:35:40 +00:00
|
|
|
type Item = (EditionedFileId, Vec<FileReference>);
|
|
|
|
type IntoIter = <FxHashMap<EditionedFileId, Vec<FileReference>> as IntoIterator>::IntoIter;
|
2021-01-12 14:51:02 +00:00
|
|
|
|
|
|
|
fn into_iter(self) -> Self::IntoIter {
|
|
|
|
self.references.into_iter()
|
2021-01-11 23:05:07 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
#[derive(Debug, Clone)]
|
|
|
|
pub struct FileReference {
|
2022-07-24 10:04:15 +00:00
|
|
|
/// The range of the reference in the original file
|
2021-01-11 23:05:07 +00:00
|
|
|
pub range: TextRange,
|
2022-07-24 10:04:15 +00:00
|
|
|
/// The node of the reference in the (macro-)file
|
2023-12-05 14:42:39 +00:00
|
|
|
pub name: FileReferenceNode,
|
2024-04-16 15:10:36 +00:00
|
|
|
pub category: ReferenceCategory,
|
2020-03-04 11:06:37 +00:00
|
|
|
}
|
|
|
|
|
2023-12-05 14:42:39 +00:00
|
|
|
#[derive(Debug, Clone)]
|
|
|
|
pub enum FileReferenceNode {
|
|
|
|
Name(ast::Name),
|
|
|
|
NameRef(ast::NameRef),
|
|
|
|
Lifetime(ast::Lifetime),
|
|
|
|
FormatStringEntry(ast::String, TextRange),
|
|
|
|
}
|
|
|
|
|
|
|
|
impl FileReferenceNode {
|
|
|
|
pub fn text_range(&self) -> TextRange {
|
|
|
|
match self {
|
|
|
|
FileReferenceNode::Name(it) => it.syntax().text_range(),
|
|
|
|
FileReferenceNode::NameRef(it) => it.syntax().text_range(),
|
|
|
|
FileReferenceNode::Lifetime(it) => it.syntax().text_range(),
|
|
|
|
FileReferenceNode::FormatStringEntry(_, range) => *range,
|
|
|
|
}
|
|
|
|
}
|
|
|
|
pub fn syntax(&self) -> SyntaxElement {
|
|
|
|
match self {
|
|
|
|
FileReferenceNode::Name(it) => it.syntax().clone().into(),
|
|
|
|
FileReferenceNode::NameRef(it) => it.syntax().clone().into(),
|
|
|
|
FileReferenceNode::Lifetime(it) => it.syntax().clone().into(),
|
|
|
|
FileReferenceNode::FormatStringEntry(it, _) => it.syntax().clone().into(),
|
|
|
|
}
|
|
|
|
}
|
|
|
|
pub fn into_name_like(self) -> Option<ast::NameLike> {
|
|
|
|
match self {
|
|
|
|
FileReferenceNode::Name(it) => Some(ast::NameLike::Name(it)),
|
|
|
|
FileReferenceNode::NameRef(it) => Some(ast::NameLike::NameRef(it)),
|
|
|
|
FileReferenceNode::Lifetime(it) => Some(ast::NameLike::Lifetime(it)),
|
|
|
|
FileReferenceNode::FormatStringEntry(_, _) => None,
|
|
|
|
}
|
|
|
|
}
|
|
|
|
pub fn as_name_ref(&self) -> Option<&ast::NameRef> {
|
|
|
|
match self {
|
|
|
|
FileReferenceNode::NameRef(name_ref) => Some(name_ref),
|
|
|
|
_ => None,
|
|
|
|
}
|
|
|
|
}
|
|
|
|
pub fn as_lifetime(&self) -> Option<&ast::Lifetime> {
|
|
|
|
match self {
|
|
|
|
FileReferenceNode::Lifetime(lifetime) => Some(lifetime),
|
|
|
|
_ => None,
|
|
|
|
}
|
|
|
|
}
|
|
|
|
pub fn text(&self) -> syntax::TokenText<'_> {
|
|
|
|
match self {
|
|
|
|
FileReferenceNode::NameRef(name_ref) => name_ref.text(),
|
|
|
|
FileReferenceNode::Name(name) => name.text(),
|
|
|
|
FileReferenceNode::Lifetime(lifetime) => lifetime.text(),
|
|
|
|
FileReferenceNode::FormatStringEntry(it, range) => {
|
|
|
|
syntax::TokenText::borrowed(&it.text()[*range - it.syntax().text_range().start()])
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2024-04-16 15:10:36 +00:00
|
|
|
bitflags::bitflags! {
|
|
|
|
#[derive(Copy, Clone, Default, PartialEq, Eq, Hash, Debug)]
|
|
|
|
pub struct ReferenceCategory: u8 {
|
|
|
|
// FIXME: Add this variant and delete the `retain_adt_literal_usages` function.
|
|
|
|
// const CREATE = 1 << 0;
|
|
|
|
const WRITE = 1 << 0;
|
|
|
|
const READ = 1 << 1;
|
|
|
|
const IMPORT = 1 << 2;
|
|
|
|
const TEST = 1 << 3;
|
|
|
|
}
|
2020-03-04 11:06:37 +00:00
|
|
|
}
|
|
|
|
|
2020-03-04 11:48:50 +00:00
|
|
|
/// Generally, `search_scope` returns files that might contain references for the element.
|
|
|
|
/// For `pub(crate)` things it's a crate, for `pub` things it's a crate and dependent crates.
|
|
|
|
/// In some cases, the location of the references is known to within a `TextRange`,
|
|
|
|
/// e.g. for things like local variables.
|
2021-09-02 15:30:02 +00:00
|
|
|
#[derive(Clone, Debug)]
|
2020-03-04 11:05:14 +00:00
|
|
|
pub struct SearchScope {
|
2024-07-17 15:35:40 +00:00
|
|
|
entries: FxHashMap<EditionedFileId, Option<TextRange>>,
|
2020-03-04 11:05:14 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
impl SearchScope {
|
2024-07-17 15:35:40 +00:00
|
|
|
fn new(entries: FxHashMap<EditionedFileId, Option<TextRange>>) -> SearchScope {
|
2020-03-04 11:05:14 +00:00
|
|
|
SearchScope { entries }
|
|
|
|
}
|
|
|
|
|
2022-04-06 11:58:40 +00:00
|
|
|
/// Build a search scope spanning the entire crate graph of files.
|
2021-03-22 16:11:33 +00:00
|
|
|
fn crate_graph(db: &RootDatabase) -> SearchScope {
|
2024-07-17 15:35:40 +00:00
|
|
|
let mut entries = FxHashMap::default();
|
2021-03-22 16:11:33 +00:00
|
|
|
|
|
|
|
let graph = db.crate_graph();
|
|
|
|
for krate in graph.iter() {
|
|
|
|
let root_file = graph[krate].root_file_id;
|
|
|
|
let source_root_id = db.file_source_root(root_file);
|
|
|
|
let source_root = db.source_root(source_root_id);
|
2024-07-17 15:35:40 +00:00
|
|
|
entries.extend(
|
|
|
|
source_root.iter().map(|id| (EditionedFileId::new(id, graph[krate].edition), None)),
|
|
|
|
);
|
2021-03-22 16:11:33 +00:00
|
|
|
}
|
|
|
|
SearchScope { entries }
|
|
|
|
}
|
|
|
|
|
2022-04-06 11:58:40 +00:00
|
|
|
/// Build a search scope spanning all the reverse dependencies of the given crate.
|
2021-03-22 16:11:33 +00:00
|
|
|
fn reverse_dependencies(db: &RootDatabase, of: hir::Crate) -> SearchScope {
|
2024-07-17 15:35:40 +00:00
|
|
|
let mut entries = FxHashMap::default();
|
2022-04-09 11:40:48 +00:00
|
|
|
for rev_dep in of.transitive_reverse_dependencies(db) {
|
|
|
|
let root_file = rev_dep.root_file(db);
|
|
|
|
let source_root_id = db.file_source_root(root_file);
|
|
|
|
let source_root = db.source_root(source_root_id);
|
2024-07-17 15:35:40 +00:00
|
|
|
entries.extend(
|
|
|
|
source_root.iter().map(|id| (EditionedFileId::new(id, rev_dep.edition(db)), None)),
|
|
|
|
);
|
2022-04-09 11:40:48 +00:00
|
|
|
}
|
2021-03-22 16:11:33 +00:00
|
|
|
SearchScope { entries }
|
|
|
|
}
|
|
|
|
|
2022-04-06 11:58:40 +00:00
|
|
|
/// Build a search scope spanning the given crate.
|
2021-03-22 16:11:33 +00:00
|
|
|
fn krate(db: &RootDatabase, of: hir::Crate) -> SearchScope {
|
|
|
|
let root_file = of.root_file(db);
|
|
|
|
let source_root_id = db.file_source_root(root_file);
|
|
|
|
let source_root = db.source_root(source_root_id);
|
2024-07-17 15:35:40 +00:00
|
|
|
SearchScope {
|
|
|
|
entries: source_root
|
|
|
|
.iter()
|
|
|
|
.map(|id| (EditionedFileId::new(id, of.edition(db)), None))
|
|
|
|
.collect(),
|
|
|
|
}
|
2021-03-22 16:11:33 +00:00
|
|
|
}
|
|
|
|
|
2022-04-06 11:58:40 +00:00
|
|
|
/// Build a search scope spanning the given module and all its submodules.
|
2023-07-09 21:20:18 +00:00
|
|
|
pub fn module_and_children(db: &RootDatabase, module: hir::Module) -> SearchScope {
|
2024-07-17 15:35:40 +00:00
|
|
|
let mut entries = FxHashMap::default();
|
2021-03-22 16:11:33 +00:00
|
|
|
|
2022-04-06 11:58:40 +00:00
|
|
|
let (file_id, range) = {
|
2024-03-12 12:24:52 +00:00
|
|
|
let InFile { file_id, value } = module.definition_source_range(db);
|
2023-12-02 18:32:53 +00:00
|
|
|
if let Some(InRealFile { file_id, value: call_source }) = file_id.original_call_node(db)
|
|
|
|
{
|
2022-04-06 11:58:40 +00:00
|
|
|
(file_id, Some(call_source.text_range()))
|
|
|
|
} else {
|
2024-03-12 12:24:52 +00:00
|
|
|
(file_id.original_file(db), Some(value))
|
2022-04-06 11:58:40 +00:00
|
|
|
}
|
|
|
|
};
|
2024-03-12 12:24:52 +00:00
|
|
|
entries.entry(file_id).or_insert(range);
|
2022-04-06 11:58:40 +00:00
|
|
|
|
|
|
|
let mut to_visit: Vec<_> = module.children(db).collect();
|
2021-03-22 16:11:33 +00:00
|
|
|
while let Some(module) = to_visit.pop() {
|
2023-06-17 08:58:52 +00:00
|
|
|
if let Some(file_id) = module.as_source_file_id(db) {
|
|
|
|
entries.insert(file_id, None);
|
2022-04-06 11:58:40 +00:00
|
|
|
}
|
2021-03-22 16:11:33 +00:00
|
|
|
to_visit.extend(module.children(db));
|
|
|
|
}
|
|
|
|
SearchScope { entries }
|
|
|
|
}
|
|
|
|
|
2022-04-06 11:58:40 +00:00
|
|
|
/// Build an empty search scope.
|
2020-03-04 11:05:14 +00:00
|
|
|
pub fn empty() -> SearchScope {
|
2024-07-17 15:35:40 +00:00
|
|
|
SearchScope::new(FxHashMap::default())
|
2020-03-04 11:05:14 +00:00
|
|
|
}
|
|
|
|
|
2022-04-06 11:58:40 +00:00
|
|
|
/// Build a search scope spanning the given file.
|
2024-07-17 15:35:40 +00:00
|
|
|
pub fn single_file(file: EditionedFileId) -> SearchScope {
|
2020-03-04 11:05:14 +00:00
|
|
|
SearchScope::new(std::iter::once((file, None)).collect())
|
|
|
|
}
|
|
|
|
|
2022-04-06 11:58:40 +00:00
|
|
|
/// Build a search scope spanning the text range of the given file.
|
2021-03-11 14:39:41 +00:00
|
|
|
pub fn file_range(range: FileRange) -> SearchScope {
|
|
|
|
SearchScope::new(std::iter::once((range.file_id, Some(range.range))).collect())
|
2021-02-27 14:59:52 +00:00
|
|
|
}
|
|
|
|
|
2022-04-06 11:58:40 +00:00
|
|
|
/// Build a search scope spanning the given files.
|
2024-07-17 15:35:40 +00:00
|
|
|
pub fn files(files: &[EditionedFileId]) -> SearchScope {
|
2020-07-22 04:01:21 +00:00
|
|
|
SearchScope::new(files.iter().map(|f| (*f, None)).collect())
|
|
|
|
}
|
|
|
|
|
2020-03-04 11:46:40 +00:00
|
|
|
pub fn intersection(&self, other: &SearchScope) -> SearchScope {
|
|
|
|
let (mut small, mut large) = (&self.entries, &other.entries);
|
|
|
|
if small.len() > large.len() {
|
|
|
|
mem::swap(&mut small, &mut large)
|
|
|
|
}
|
|
|
|
|
2022-04-06 11:58:40 +00:00
|
|
|
let intersect_ranges =
|
|
|
|
|r1: Option<TextRange>, r2: Option<TextRange>| -> Option<Option<TextRange>> {
|
|
|
|
match (r1, r2) {
|
|
|
|
(None, r) | (r, None) => Some(r),
|
|
|
|
(Some(r1), Some(r2)) => r1.intersect(r2).map(Some),
|
|
|
|
}
|
|
|
|
};
|
2020-03-04 11:46:40 +00:00
|
|
|
let res = small
|
|
|
|
.iter()
|
2022-04-06 11:58:40 +00:00
|
|
|
.filter_map(|(&file_id, &r1)| {
|
|
|
|
let &r2 = large.get(&file_id)?;
|
|
|
|
let r = intersect_ranges(r1, r2)?;
|
|
|
|
Some((file_id, r))
|
2020-03-04 11:46:40 +00:00
|
|
|
})
|
|
|
|
.collect();
|
|
|
|
|
2022-04-06 11:58:40 +00:00
|
|
|
SearchScope::new(res)
|
2020-03-04 11:46:40 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
impl IntoIterator for SearchScope {
|
2024-07-17 15:35:40 +00:00
|
|
|
type Item = (EditionedFileId, Option<TextRange>);
|
|
|
|
type IntoIter = std::collections::hash_map::IntoIter<EditionedFileId, Option<TextRange>>;
|
2020-03-04 11:46:40 +00:00
|
|
|
|
|
|
|
fn into_iter(self) -> Self::IntoIter {
|
|
|
|
self.entries.into_iter()
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
impl Definition {
|
|
|
|
fn search_scope(&self, db: &RootDatabase) -> SearchScope {
|
2024-06-06 23:52:25 +00:00
|
|
|
let _p = tracing::info_span!("search_scope").entered();
|
2021-03-15 08:32:06 +00:00
|
|
|
|
2021-11-10 21:02:50 +00:00
|
|
|
if let Definition::BuiltinType(_) = self {
|
2021-03-22 16:11:33 +00:00
|
|
|
return SearchScope::crate_graph(db);
|
2021-03-15 08:32:06 +00:00
|
|
|
}
|
|
|
|
|
2021-09-02 15:30:02 +00:00
|
|
|
// def is crate root
|
2021-11-10 21:02:50 +00:00
|
|
|
if let &Definition::Module(module) = self {
|
2023-06-01 12:46:36 +00:00
|
|
|
if module.is_crate_root() {
|
2021-09-02 15:30:02 +00:00
|
|
|
return SearchScope::reverse_dependencies(db, module.krate());
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2020-03-04 11:46:40 +00:00
|
|
|
let module = match self.module(db) {
|
2020-03-04 11:05:14 +00:00
|
|
|
Some(it) => it,
|
|
|
|
None => return SearchScope::empty(),
|
|
|
|
};
|
2021-03-22 16:11:33 +00:00
|
|
|
let InFile { file_id, value: module_source } = module.definition_source(db);
|
|
|
|
let file_id = file_id.original_file(db);
|
2020-03-04 11:05:14 +00:00
|
|
|
|
2020-03-04 11:46:40 +00:00
|
|
|
if let Definition::Local(var) = self {
|
2021-09-14 00:49:06 +00:00
|
|
|
let def = match var.parent(db) {
|
|
|
|
DefWithBody::Function(f) => f.source(db).map(|src| src.syntax().cloned()),
|
|
|
|
DefWithBody::Const(c) => c.source(db).map(|src| src.syntax().cloned()),
|
|
|
|
DefWithBody::Static(s) => s.source(db).map(|src| src.syntax().cloned()),
|
2022-08-06 16:50:21 +00:00
|
|
|
DefWithBody::Variant(v) => v.source(db).map(|src| src.syntax().cloned()),
|
2023-06-05 11:27:19 +00:00
|
|
|
// FIXME: implement
|
|
|
|
DefWithBody::InTypeConst(_) => return SearchScope::empty(),
|
2020-03-04 11:05:14 +00:00
|
|
|
};
|
2021-09-14 00:49:06 +00:00
|
|
|
return match def {
|
2024-01-31 08:57:17 +00:00
|
|
|
Some(def) => SearchScope::file_range(
|
|
|
|
def.as_ref().original_file_range_with_macro_call_body(db),
|
|
|
|
),
|
2021-03-22 16:11:33 +00:00
|
|
|
None => SearchScope::single_file(file_id),
|
|
|
|
};
|
2020-03-04 11:05:14 +00:00
|
|
|
}
|
|
|
|
|
2021-05-08 20:34:55 +00:00
|
|
|
if let Definition::SelfType(impl_) = self {
|
2021-09-14 00:49:06 +00:00
|
|
|
return match impl_.source(db).map(|src| src.syntax().cloned()) {
|
2024-01-31 08:57:17 +00:00
|
|
|
Some(def) => SearchScope::file_range(
|
|
|
|
def.as_ref().original_file_range_with_macro_call_body(db),
|
|
|
|
),
|
2021-05-08 20:34:55 +00:00
|
|
|
None => SearchScope::single_file(file_id),
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2021-01-08 11:28:02 +00:00
|
|
|
if let Definition::GenericParam(hir::GenericParam::LifetimeParam(param)) = self {
|
2021-09-14 00:49:06 +00:00
|
|
|
let def = match param.parent(db) {
|
|
|
|
hir::GenericDef::Function(it) => it.source(db).map(|src| src.syntax().cloned()),
|
|
|
|
hir::GenericDef::Adt(it) => it.source(db).map(|src| src.syntax().cloned()),
|
|
|
|
hir::GenericDef::Trait(it) => it.source(db).map(|src| src.syntax().cloned()),
|
2023-03-03 15:24:07 +00:00
|
|
|
hir::GenericDef::TraitAlias(it) => it.source(db).map(|src| src.syntax().cloned()),
|
2021-09-14 00:49:06 +00:00
|
|
|
hir::GenericDef::TypeAlias(it) => it.source(db).map(|src| src.syntax().cloned()),
|
|
|
|
hir::GenericDef::Impl(it) => it.source(db).map(|src| src.syntax().cloned()),
|
|
|
|
hir::GenericDef::Const(it) => it.source(db).map(|src| src.syntax().cloned()),
|
2020-12-16 20:35:15 +00:00
|
|
|
};
|
2021-09-14 00:49:06 +00:00
|
|
|
return match def {
|
2024-01-31 08:57:17 +00:00
|
|
|
Some(def) => SearchScope::file_range(
|
|
|
|
def.as_ref().original_file_range_with_macro_call_body(db),
|
|
|
|
),
|
2021-03-22 16:11:33 +00:00
|
|
|
None => SearchScope::single_file(file_id),
|
|
|
|
};
|
2020-03-04 11:05:14 +00:00
|
|
|
}
|
|
|
|
|
2021-03-21 19:08:08 +00:00
|
|
|
if let Definition::Macro(macro_def) = self {
|
2022-03-08 22:52:26 +00:00
|
|
|
return match macro_def.kind(db) {
|
2021-09-02 15:30:02 +00:00
|
|
|
hir::MacroKind::Declarative => {
|
2024-07-16 10:05:16 +00:00
|
|
|
if macro_def.attrs(db).by_key(&sym::macro_export).exists() {
|
2021-09-02 15:30:02 +00:00
|
|
|
SearchScope::reverse_dependencies(db, module.krate())
|
|
|
|
} else {
|
|
|
|
SearchScope::krate(db, module.krate())
|
|
|
|
}
|
|
|
|
}
|
|
|
|
hir::MacroKind::BuiltIn => SearchScope::crate_graph(db),
|
|
|
|
hir::MacroKind::Derive | hir::MacroKind::Attr | hir::MacroKind::ProcMacro => {
|
2021-03-22 16:11:33 +00:00
|
|
|
SearchScope::reverse_dependencies(db, module.krate())
|
2021-09-02 15:30:02 +00:00
|
|
|
}
|
|
|
|
};
|
2021-03-21 19:08:08 +00:00
|
|
|
}
|
|
|
|
|
2022-07-24 12:32:39 +00:00
|
|
|
if let Definition::DeriveHelper(_) = self {
|
|
|
|
return SearchScope::reverse_dependencies(db, module.krate());
|
|
|
|
}
|
|
|
|
|
2021-03-22 16:11:33 +00:00
|
|
|
let vis = self.visibility(db);
|
2021-03-21 19:08:08 +00:00
|
|
|
if let Some(Visibility::Public) = vis {
|
2021-03-22 16:11:33 +00:00
|
|
|
return SearchScope::reverse_dependencies(db, module.krate());
|
|
|
|
}
|
2024-01-05 10:00:29 +00:00
|
|
|
if let Some(Visibility::Module(module, _)) = vis {
|
2022-04-06 11:58:40 +00:00
|
|
|
return SearchScope::module_and_children(db, module.into());
|
2020-03-04 11:05:14 +00:00
|
|
|
}
|
|
|
|
|
2021-03-22 16:11:33 +00:00
|
|
|
let range = match module_source {
|
2020-03-04 11:05:14 +00:00
|
|
|
ModuleSource::Module(m) => Some(m.syntax().text_range()),
|
2021-01-20 19:05:48 +00:00
|
|
|
ModuleSource::BlockExpr(b) => Some(b.syntax().text_range()),
|
2020-03-04 11:05:14 +00:00
|
|
|
ModuleSource::SourceFile(_) => None,
|
|
|
|
};
|
2021-03-22 16:11:33 +00:00
|
|
|
match range {
|
|
|
|
Some(range) => SearchScope::file_range(FileRange { file_id, range }),
|
|
|
|
None => SearchScope::single_file(file_id),
|
|
|
|
}
|
2020-03-04 11:05:14 +00:00
|
|
|
}
|
|
|
|
|
2022-07-20 13:02:08 +00:00
|
|
|
pub fn usages<'a>(self, sema: &'a Semantics<'_, RootDatabase>) -> FindUsages<'a> {
|
2021-06-28 14:41:35 +00:00
|
|
|
FindUsages {
|
|
|
|
def: self,
|
2023-01-11 16:10:04 +00:00
|
|
|
assoc_item_container: self.as_assoc_item(sema.db).map(|a| a.container(sema.db)),
|
2021-06-28 14:41:35 +00:00
|
|
|
sema,
|
|
|
|
scope: None,
|
|
|
|
include_self_kw_refs: None,
|
|
|
|
search_self_mod: false,
|
|
|
|
}
|
2020-08-19 16:58:48 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2021-09-26 12:38:58 +00:00
|
|
|
#[derive(Clone)]
|
2020-08-19 16:58:48 +00:00
|
|
|
pub struct FindUsages<'a> {
|
2021-06-11 17:23:59 +00:00
|
|
|
def: Definition,
|
2020-08-19 16:58:48 +00:00
|
|
|
sema: &'a Semantics<'a, RootDatabase>,
|
2023-07-09 21:20:18 +00:00
|
|
|
scope: Option<&'a SearchScope>,
|
2023-01-24 13:11:02 +00:00
|
|
|
/// The container of our definition, if it is an associated item
|
|
|
|
assoc_item_container: Option<hir::AssocItemContainer>,
|
|
|
|
/// Whether to search for the `Self` type of the definition
|
2021-05-08 20:34:55 +00:00
|
|
|
include_self_kw_refs: Option<hir::Type>,
|
2023-01-24 13:11:02 +00:00
|
|
|
/// Whether to search for the `self` module
|
2021-06-28 14:41:35 +00:00
|
|
|
search_self_mod: bool,
|
2020-08-19 16:58:48 +00:00
|
|
|
}
|
|
|
|
|
2023-07-09 21:20:18 +00:00
|
|
|
impl<'a> FindUsages<'a> {
|
2021-06-28 14:41:35 +00:00
|
|
|
/// Enable searching for `Self` when the definition is a type or `self` for modules.
|
2023-06-29 14:27:28 +00:00
|
|
|
pub fn include_self_refs(mut self) -> Self {
|
2021-06-11 17:23:59 +00:00
|
|
|
self.include_self_kw_refs = def_to_ty(self.sema, &self.def);
|
2021-06-28 14:41:35 +00:00
|
|
|
self.search_self_mod = true;
|
2021-04-03 21:01:49 +00:00
|
|
|
self
|
|
|
|
}
|
|
|
|
|
2022-04-06 11:58:40 +00:00
|
|
|
/// Limit the search to a given [`SearchScope`].
|
2023-07-09 21:20:18 +00:00
|
|
|
pub fn in_scope(self, scope: &'a SearchScope) -> Self {
|
2020-08-19 16:58:48 +00:00
|
|
|
self.set_scope(Some(scope))
|
|
|
|
}
|
2021-02-16 18:27:08 +00:00
|
|
|
|
2022-04-06 11:58:40 +00:00
|
|
|
/// Set the search scope, or pass `None` to keep the default. Panics if a scope has already been set.
|
2023-07-09 21:20:18 +00:00
|
|
|
pub fn set_scope(mut self, scope: Option<&'a SearchScope>) -> Self {
|
2020-08-19 16:58:48 +00:00
|
|
|
assert!(self.scope.is_none());
|
|
|
|
self.scope = scope;
|
|
|
|
self
|
|
|
|
}
|
|
|
|
|
2021-09-26 12:38:58 +00:00
|
|
|
pub fn at_least_one(&self) -> bool {
|
2020-08-19 22:54:44 +00:00
|
|
|
let mut found = false;
|
2021-01-11 23:05:07 +00:00
|
|
|
self.search(&mut |_, _| {
|
2020-08-19 22:54:44 +00:00
|
|
|
found = true;
|
|
|
|
true
|
|
|
|
});
|
|
|
|
found
|
2020-08-19 16:58:48 +00:00
|
|
|
}
|
|
|
|
|
2021-01-12 14:56:24 +00:00
|
|
|
pub fn all(self) -> UsageSearchResult {
|
|
|
|
let mut res = UsageSearchResult::default();
|
2021-01-11 23:05:07 +00:00
|
|
|
self.search(&mut |file_id, reference| {
|
2021-01-12 14:51:02 +00:00
|
|
|
res.references.entry(file_id).or_default().push(reference);
|
2020-08-19 22:54:44 +00:00
|
|
|
false
|
|
|
|
});
|
|
|
|
res
|
|
|
|
}
|
|
|
|
|
Speed up search for short associated functions, especially very common identifiers such as `new`
The search is used by IDE features such as rename and find all references.
The search is slow because we need to verify each candidate, and that requires analyzing it; the key to speeding it up is to avoid the analysis where possible.
I did that with a bunch of tricks that exploit knowledge about the language and its possibilities. The first key insight is that associated functions may only be referenced in the form `ContainerName::func_name` (parentheses are not necessary!). Rust doesn't include a way to `use Container::func_name`, and even if it does in the future, most usages are likely to stay in that form.
Searching for `::` will help only a bit, but searching for `Container` can help considerably, since it is very rare that there will be two identical instances of both a container and a method of it.
However, things are not as simple as they sound. In Rust a container can be aliased in multiple ways, and even aliased from different files/modules. If we try to resolve the alias, we lose any gain from the textual search (although very common method names such as `new` will still benefit, most will suffer because there are more instances of a container name than of its associated item).
This is where the key trick enters the picture. The key insight is that there is still a textual property: a container name cannot be aliased unless its name, or the name of one of its aliases, is mentioned in the alias declaration.
This becomes a fixpoint algorithm: we expand our list of aliases as we collect more and more (possible) aliases, until we eventually reach a fixpoint. A fixpoint is not guaranteed (and we do have guards for the rare cases where it does not happen), but it is almost so: most types have very few aliases, if at all.
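To make the fixpoint idea concrete, here is a minimal, purely textual sketch. It is not the code from this patch: `alias_declared_by`, `collect_aliases` and the `max_aliases` guard are hypothetical names, and only two toy declaration forms (`type New = Known;` and `use path::Known as New;`) are recognized. The real implementation works on syntax nodes and uses some semantic checks.

```rust
use std::collections::HashSet;

/// Hypothetical helper: if `line` declares an alias of `known`, return the new name.
/// Only `type New = Known;` and `use path::Known as New;` are recognized.
fn alias_declared_by(line: &str, known: &str) -> Option<String> {
    let line = line.trim().trim_end_matches(';');
    if let Some(rest) = line.strip_prefix("type ") {
        let (new, target) = rest.split_once('=')?;
        return (target.trim() == known).then(|| new.trim().to_string());
    }
    if let Some(rest) = line.strip_prefix("use ") {
        let (path, new) = rest.split_once(" as ")?;
        // Only the last path segment names the item being renamed.
        let last = path.rsplit("::").next()?;
        return (last.trim() == known).then(|| new.trim().to_string());
    }
    None
}

/// Worklist-style fixpoint: every newly found alias is itself searched for.
/// Returns `None` when more than `max_aliases` names pile up (the guard for
/// the rare case where the set keeps growing).
fn collect_aliases(
    lines: &[&str],
    container: &str,
    max_aliases: usize,
) -> Option<HashSet<String>> {
    let mut aliases = HashSet::new();
    aliases.insert(container.to_string());
    let mut to_process = vec![container.to_string()];

    while let Some(current) = to_process.pop() {
        for line in lines {
            if let Some(new) = alias_declared_by(line, &current) {
                // `insert` returns false for names we already know, which is
                // what makes the loop terminate once nothing new is found.
                if aliases.insert(new.clone()) {
                    if aliases.len() > max_aliases {
                        return None;
                    }
                    to_process.push(new);
                }
            }
        }
    }
    Some(aliases)
}
```

The loop stops when the worklist is empty, i.e. when a full pass over the known names discovers no new alias.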
We do use some semantic information while analyzing aliases. It's a balance: with too much semantic analysis, the search becomes slow. But with too little, we bring many incorrect aliases into our list, and risk that it keeps expanding and never reaches a fixpoint. In the end, based on benchmarks, it seems worthwhile to do a lot to avoid adding an alias (but not too much), and likewise to do a lot to avoid the need to semantically analyze `func_name` matches (but again, not too much).
After we have collected our list of aliases, we filter matches based on this list. We run semantic analysis only on matches that can be real.
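As a rough illustration of that filtering step, the sketch below keeps only the `Qualifier::func_name` occurrences whose qualifier is in the alias set; everything else is dropped before any expensive verification. This is a toy, text-only version under stated assumptions: `candidate_offsets` and `is_ident_char` are hypothetical names, and it ignores `Self`, generic arguments and macros, which the real code has to handle on syntax nodes.

```rust
use std::collections::HashSet;

fn is_ident_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

/// Returns the byte offsets of `func_name` occurrences that are worth
/// verifying semantically, i.e. those directly preceded by `Alias::`.
fn candidate_offsets(text: &str, func_name: &str, aliases: &HashSet<&str>) -> Vec<usize> {
    let needle = format!("::{func_name}");
    text.match_indices(needle.as_str())
        .filter_map(|(idx, _)| {
            // Reject matches where `func_name` is only a prefix of a longer identifier.
            let after = text[idx + needle.len()..].chars().next();
            if after.is_some_and(is_ident_char) {
                return None;
            }
            // The qualifier is the identifier that ends right before `::`.
            let before = &text[..idx];
            let start = before.trim_end_matches(is_ident_char).len();
            let qualifier = &before[start..];
            // `idx + 2` skips the `::` so the offset points at `func_name`.
            aliases.contains(qualifier).then_some(idx + 2)
        })
        .collect()
}
```

Only the offsets returned here would be handed to name resolution; all other textual hits are discarded cheaply.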
The results are promising: searching for all references on `new()` in `base-db` in the rust-analyzer repository, which previously took around 60 seconds, now takes as little as roughly two and a half seconds, while searching for `Vec::new()`, almost an upper bound on how much a symbol can be used, which used to take 7-9 minutes(!), now completes in 100-120 seconds, and with less than half of the non-verified results (i.e. false positives).
This is the less strictly correct (but faster) branch of this patch; it can miss some (rare) cases (there is a test for that: `goto_ref_on_short_associated_function_complicated_type_magic_can_confuse_our_logic()`). There is another branch that has no false negatives but is slower to search (`Vec::new()` never reaches a fixpoint in alias collection there). I believe it is possible to create a strategy that has the best of both worlds, but it would involve significant complexity and I didn't bother, especially considering that in the vast majority of searches the other branch will be more than enough. All in all, I decided to bring this branch (if the maintainers agree, of course), since our search is already not 100% accurate (it misses macros), and I believe there is value in the additional perf.
2024-08-18 22:39:31 +00:00
|
|
|
fn scope_files<'b>(
|
|
|
|
db: &'b RootDatabase,
|
|
|
|
scope: &'b SearchScope,
|
|
|
|
) -> impl Iterator<Item = (Arc<str>, EditionedFileId, TextRange)> + 'b {
|
|
|
|
scope.entries.iter().map(|(&file_id, &search_range)| {
|
|
|
|
let text = db.file_text(file_id.file_id());
|
|
|
|
let search_range =
|
|
|
|
search_range.unwrap_or_else(|| TextRange::up_to(TextSize::of(&*text)));
|
|
|
|
|
|
|
|
(text, file_id, search_range)
|
|
|
|
})
|
|
|
|
}
|
|
|
|
|
|
|
|
fn match_indices<'b>(
|
|
|
|
text: &'b str,
|
|
|
|
finder: &'b Finder<'b>,
|
|
|
|
search_range: TextRange,
|
|
|
|
) -> impl Iterator<Item = TextSize> + 'b {
|
|
|
|
finder.find_iter(text.as_bytes()).filter_map(move |idx| {
|
|
|
|
let offset: TextSize = idx.try_into().unwrap();
|
|
|
|
if !search_range.contains_inclusive(offset) {
|
|
|
|
return None;
|
|
|
|
}
|
|
|
|
// If this is not a word boundary, that means this is only part of an identifier,
|
|
|
|
// so it can't be what we're looking for.
|
|
|
|
// This speeds up short identifiers significantly.
|
|
|
|
if text[..idx]
|
|
|
|
.chars()
|
|
|
|
.next_back()
|
|
|
|
.is_some_and(|ch| matches!(ch, 'A'..='Z' | 'a'..='z' | '_'))
|
|
|
|
|| text[idx + finder.needle().len()..]
|
|
|
|
.chars()
|
|
|
|
.next()
|
|
|
|
.is_some_and(|ch| matches!(ch, 'A'..='Z' | 'a'..='z' | '_' | '0'..='9'))
|
|
|
|
{
|
|
|
|
return None;
|
|
|
|
}
|
|
|
|
Some(offset)
|
|
|
|
})
|
|
|
|
}
|
|
|
|
|
|
|
|
fn find_nodes<'b>(
|
|
|
|
sema: &'b Semantics<'_, RootDatabase>,
|
|
|
|
name: &str,
|
|
|
|
node: &syntax::SyntaxNode,
|
|
|
|
offset: TextSize,
|
|
|
|
) -> impl Iterator<Item = SyntaxNode> + 'b {
|
|
|
|
node.token_at_offset(offset)
|
|
|
|
.find(|it| {
|
|
|
|
// `name` is stripped of raw ident prefix. See the comment on name retrieval below.
|
|
|
|
it.text().trim_start_matches("r#") == name
|
|
|
|
})
|
|
|
|
.into_iter()
|
|
|
|
.flat_map(move |token| {
|
|
|
|
sema.descend_into_macros_exact_if_in_macro(token)
|
|
|
|
.into_iter()
|
|
|
|
.filter_map(|it| it.parent())
|
|
|
|
})
|
|
|
|
}
|
|
|
|
|
|
|
|
/// Performs a special fast search for associated functions. This is mainly intended
|
|
|
|
/// to speed up `new()` which can take a long time.
|
|
|
|
///
|
|
|
|
/// The trick is to search for `TypeThatContainsContainerName::func_name` instead of searching for `func_name`.
|
|
|
|
/// We cannot search exactly that (not even in tokens), because `ContainerName` may be aliased.
|
|
|
|
/// Instead, we perform a textual search for `ContainerName`. Then, we look for all cases where
|
|
|
|
/// `ContainerName` may be aliased (that includes `use ContainerName as Xyz` and
|
|
|
|
/// `type Xyz = ContainerName`). We collect a list of all possible aliases of `ContainerName`.
|
|
|
|
/// The list can have false positives (because there may be multiple types named `ContainerName`),
|
|
|
|
/// but it cannot have false negatives. Then, we look for `TypeThatContainsContainerNameOrAnyAlias::func_name`.
|
|
|
|
/// Matches found this way are very likely to be actual hits (although we still need to verify them).
|
|
|
|
///
|
|
|
|
/// Returns `true` if the search was completed.
|
|
|
|
// FIXME: Extend this to other cases, such as associated types/consts/enum variants (note those can be `use`d).
|
|
|
|
fn short_associated_function_fast_search(
|
|
|
|
&self,
|
|
|
|
sink: &mut dyn FnMut(EditionedFileId, FileReference) -> bool,
|
|
|
|
search_scope: &SearchScope,
|
|
|
|
name: &str,
|
|
|
|
) -> bool {
|
2024-08-25 01:35:58 +00:00
|
|
|
if self.scope.is_some() {
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2024-08-18 22:39:31 +00:00
|
|
|
let _p = tracing::info_span!("short_associated_function_fast_search").entered();
|
|
|
|
|
|
|
|
let container = (|| {
|
|
|
|
let Definition::Function(function) = self.def else {
|
|
|
|
return None;
|
|
|
|
};
|
|
|
|
if function.has_self_param(self.sema.db) {
|
|
|
|
return None;
|
|
|
|
}
|
|
|
|
match function.container(self.sema.db) {
|
|
|
|
// Only freestanding `impl`s qualify; methods from traits
|
|
|
|
// can be called from within subtraits and bounds.
|
|
|
|
ItemContainer::Impl(impl_) => {
|
|
|
|
let has_trait = impl_.trait_(self.sema.db).is_some();
|
|
|
|
if has_trait {
|
|
|
|
return None;
|
|
|
|
}
|
|
|
|
let adt = impl_.self_ty(self.sema.db).as_adt()?;
|
|
|
|
Some(adt)
|
|
|
|
}
|
|
|
|
_ => None,
|
|
|
|
}
|
|
|
|
})();
|
|
|
|
let Some(container) = container else {
|
|
|
|
return false;
|
|
|
|
};
|
|
|
|
|
|
|
|
fn has_any_name(node: &SyntaxNode, mut predicate: impl FnMut(&str) -> bool) -> bool {
|
|
|
|
node.descendants().any(|node| {
|
|
|
|
match_ast! {
|
|
|
|
match node {
|
|
|
|
ast::Name(it) => predicate(it.text().trim_start_matches("r#")),
|
|
|
|
ast::NameRef(it) => predicate(it.text().trim_start_matches("r#")),
|
|
|
|
_ => false
|
|
|
|
}
|
|
|
|
}
|
|
|
|
})
|
|
|
|
}
|
|
|
|
|
|
|
|
// This is a fixpoint algorithm with O(number of aliases), but most types have no or few aliases,
|
|
|
|
// so this should stay fast.
|
|
|
|
//
|
|
|
|
/// Returns `(aliases, ranges_where_Self_can_refer_to_our_type)`.
|
|
|
|
fn collect_possible_aliases(
|
|
|
|
sema: &Semantics<'_, RootDatabase>,
|
|
|
|
container: Adt,
|
|
|
|
) -> Option<(FxHashSet<SmolStr>, Vec<FileRangeWrapper<EditionedFileId>>)> {
|
|
|
|
fn insert_type_alias(
|
|
|
|
db: &RootDatabase,
|
|
|
|
to_process: &mut Vec<(SmolStr, SearchScope)>,
|
|
|
|
alias_name: &str,
|
|
|
|
def: Definition,
|
|
|
|
) {
|
|
|
|
let alias = alias_name.trim_start_matches("r#").to_smolstr();
|
|
|
|
tracing::debug!("found alias: {alias}");
|
|
|
|
to_process.push((alias, def.search_scope(db)));
|
|
|
|
}
|
|
|
|
|
|
|
|
let _p = tracing::info_span!("collect_possible_aliases").entered();
|
|
|
|
|
|
|
|
let db = sema.db;
|
|
|
|
let container_name = container.name(db).unescaped().display(db).to_smolstr();
|
|
|
|
let search_scope = Definition::from(container).search_scope(db);
|
|
|
|
let mut seen = FxHashSet::default();
|
|
|
|
let mut completed = FxHashSet::default();
|
|
|
|
let mut to_process = vec![(container_name, search_scope)];
|
|
|
|
let mut is_possibly_self = Vec::new();
|
|
|
|
let mut total_files_searched = 0;
|
|
|
|
|
|
|
|
while let Some((current_to_process, current_to_process_search_scope)) = to_process.pop()
|
|
|
|
{
|
|
|
|
let is_alias = |alias: &ast::TypeAlias| {
|
|
|
|
let def = sema.to_def(alias)?;
|
|
|
|
let ty = def.ty(db);
|
|
|
|
let is_alias = ty.as_adt()? == container;
|
|
|
|
is_alias.then_some(def)
|
|
|
|
};
|
|
|
|
|
|
|
|
let finder = Finder::new(current_to_process.as_bytes());
|
|
|
|
for (file_text, file_id, search_range) in
|
|
|
|
FindUsages::scope_files(db, &current_to_process_search_scope)
|
|
|
|
{
|
|
|
|
let tree = LazyCell::new(move || sema.parse(file_id).syntax().clone());
|
|
|
|
|
|
|
|
for offset in FindUsages::match_indices(&file_text, &finder, search_range) {
|
|
|
|
let usages =
|
|
|
|
FindUsages::find_nodes(sema, &current_to_process, &tree, offset)
|
|
|
|
.filter(|it| {
|
|
|
|
matches!(it.kind(), SyntaxKind::NAME | SyntaxKind::NAME_REF)
|
|
|
|
});
|
|
|
|
for usage in usages {
|
|
|
|
if let Some(alias) = usage.parent().and_then(|it| {
|
|
|
|
let path = ast::PathSegment::cast(it)?.parent_path();
|
|
|
|
let use_tree = ast::UseTree::cast(path.syntax().parent()?)?;
|
|
|
|
use_tree.rename()?.name()
|
|
|
|
}) {
|
|
|
|
if seen.insert(InFileWrapper::new(
|
|
|
|
file_id,
|
|
|
|
alias.syntax().text_range(),
|
|
|
|
)) {
|
|
|
|
tracing::debug!("found alias: {alias}");
|
2024-08-22 11:19:56 +00:00
|
|
|
cov_mark::hit!(container_use_rename);
|
2024-08-18 22:39:31 +00:00
|
|
|
// FIXME: `use`s have no easy way to determine their search scope, but they are rare.
|
|
|
|
to_process.push((
|
|
|
|
alias.text().to_smolstr(),
|
|
|
|
current_to_process_search_scope.clone(),
|
|
|
|
));
|
|
|
|
}
|
|
|
|
} else if let Some(alias) =
|
|
|
|
usage.ancestors().find_map(ast::TypeAlias::cast)
|
|
|
|
{
|
|
|
|
if let Some(name) = alias.name() {
|
|
|
|
if seen.insert(InFileWrapper::new(
|
|
|
|
file_id,
|
|
|
|
name.syntax().text_range(),
|
|
|
|
)) {
|
|
|
|
if let Some(def) = is_alias(&alias) {
|
2024-08-22 11:19:56 +00:00
|
|
|
cov_mark::hit!(container_type_alias);
|
2024-08-18 22:39:31 +00:00
|
|
|
insert_type_alias(
|
|
|
|
sema.db,
|
|
|
|
&mut to_process,
|
|
|
|
name.text().as_str(),
|
|
|
|
def.into(),
|
|
|
|
);
|
2024-08-22 11:19:56 +00:00
|
|
|
} else {
|
|
|
|
cov_mark::hit!(same_name_different_def_type_alias);
|
2024-08-18 22:39:31 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
// We need to account for `Self`. It can only refer to our type inside an impl.
|
|
|
|
let impl_ = 'impl_: {
|
|
|
|
for ancestor in usage.ancestors() {
|
|
|
|
if let Some(parent) = ancestor.parent() {
|
|
|
|
if let Some(parent) = ast::Impl::cast(parent) {
|
|
|
|
// Only if the GENERIC_PARAM_LIST is directly under impl, otherwise it may be in the self ty.
|
|
|
|
if matches!(
|
|
|
|
ancestor.kind(),
|
|
|
|
SyntaxKind::ASSOC_ITEM_LIST
|
|
|
|
| SyntaxKind::WHERE_CLAUSE
|
|
|
|
| SyntaxKind::GENERIC_PARAM_LIST
|
|
|
|
) {
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
if parent
|
|
|
|
.trait_()
|
|
|
|
.is_some_and(|trait_| *trait_.syntax() == ancestor)
|
|
|
|
{
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
// Otherwise, found an impl where its self ty may be our type.
|
|
|
|
break 'impl_ Some(parent);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
None
|
|
|
|
};
|
|
|
|
(|| {
|
|
|
|
let impl_ = impl_?;
|
|
|
|
is_possibly_self.push(sema.original_range(impl_.syntax()));
|
|
|
|
let assoc_items = impl_.assoc_item_list()?;
|
|
|
|
let type_aliases = assoc_items
|
|
|
|
.syntax()
|
|
|
|
.descendants()
|
|
|
|
.filter_map(ast::TypeAlias::cast);
|
|
|
|
for type_alias in type_aliases {
|
|
|
|
let Some(ty) = type_alias.ty() else { continue };
|
|
|
|
let Some(name) = type_alias.name() else { continue };
|
|
|
|
let contains_self = ty
|
|
|
|
.syntax()
|
|
|
|
.descendants_with_tokens()
|
|
|
|
.any(|node| node.kind() == SyntaxKind::SELF_TYPE_KW);
|
|
|
|
if !contains_self {
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
if seen.insert(InFileWrapper::new(
|
|
|
|
file_id,
|
|
|
|
name.syntax().text_range(),
|
|
|
|
)) {
|
|
|
|
if let Some(def) = is_alias(&type_alias) {
|
2024-08-22 11:19:56 +00:00
|
|
|
cov_mark::hit!(self_type_alias);
|
2024-08-18 22:39:31 +00:00
|
|
|
insert_type_alias(
|
|
|
|
sema.db,
|
|
|
|
&mut to_process,
|
|
|
|
name.text().as_str(),
|
|
|
|
def.into(),
|
|
|
|
);
|
2024-08-22 11:19:56 +00:00
|
|
|
} else {
|
|
|
|
cov_mark::hit!(same_name_different_def_type_alias);
|
2024-08-18 22:39:31 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
Some(())
|
|
|
|
})();
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
completed.insert(current_to_process);
|
|
|
|
|
|
|
|
total_files_searched += current_to_process_search_scope.entries.len();
|
|
|
|
// FIXME: Maybe this needs to be relative to the project size, or at least to the initial search scope?
|
|
|
|
if total_files_searched > 20_000 && completed.len() > 100 {
|
|
|
|
// This case is extremely unlikely (even searching for `Vec::new()` on rust-analyzer does not enter
|
|
|
|
// here - it searches less than 10,000 files, and it does so in five seconds), but if we get here,
|
|
|
|
// we are at risk of entering an almost-infinite loop of growing the aliases list. So just stop and
|
|
|
|
// let normal search handle this case.
|
|
|
|
tracing::info!(aliases_count = %completed.len(), "too many aliases; leaving fast path");
|
|
|
|
return None;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
// Impls can contain each other, so we need to deduplicate their ranges.
|
|
|
|
is_possibly_self.sort_unstable_by_key(|position| {
|
|
|
|
(position.file_id, position.range.start(), Reverse(position.range.end()))
|
|
|
|
});
|
|
|
|
is_possibly_self.dedup_by(|pos2, pos1| {
|
|
|
|
pos1.file_id == pos2.file_id
|
|
|
|
&& pos1.range.start() <= pos2.range.start()
|
|
|
|
&& pos1.range.end() >= pos2.range.end()
|
|
|
|
});
|
|
|
|
|
|
|
|
tracing::info!(aliases_count = %completed.len(), "aliases search completed");
|
|
|
|
|
|
|
|
Some((completed, is_possibly_self))
|
|
|
|
}
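The sort-then-dedup step at the end of `collect_possible_aliases` can be tried in isolation. The following is a minimal sketch, assuming a hypothetical `FileRange` struct in place of the real file-relative range type; it relies only on the documented behaviour of `Vec::dedup_by`, which passes the later element first and removes it when the closure returns `true`.

```rust
use std::cmp::Reverse;

/// Hypothetical stand-in for a file-relative text range (not the real type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FileRange {
    file_id: u32,
    start: u32,
    end: u32,
}

/// Drops every range that is fully contained in another range of the same file.
fn dedup_nested(mut ranges: Vec<FileRange>) -> Vec<FileRange> {
    // Within a file, order by start, then by *descending* end, so an enclosing
    // range always comes before the ranges nested inside it.
    ranges.sort_unstable_by_key(|r| (r.file_id, r.start, Reverse(r.end)));
    // `dedup_by` calls the closure with (current, previously kept);
    // returning `true` removes `current`.
    ranges.dedup_by(|cur, prev| {
        prev.file_id == cur.file_id && prev.start <= cur.start && prev.end >= cur.end
    });
    ranges
}

fn main() {
    let ranges = vec![
        FileRange { file_id: 0, start: 20, end: 30 }, // nested in the next one
        FileRange { file_id: 0, start: 10, end: 50 },
        FileRange { file_id: 0, start: 60, end: 70 },
        FileRange { file_id: 1, start: 20, end: 30 }, // other file, kept
    ];
    let kept = dedup_nested(ranges);
    assert_eq!(kept.len(), 3);
    println!("{kept:?}");
}
```

Sorting by descending end within equal starts matters: without it, an inner range could be visited before the outer range that contains it and would survive the dedup.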
|
|
|
|
|
|
|
|
fn search(
|
|
|
|
this: &FindUsages<'_>,
|
|
|
|
finder: &Finder<'_>,
|
|
|
|
name: &str,
|
|
|
|
files: impl Iterator<Item = (Arc<str>, EditionedFileId, TextRange)>,
|
|
|
|
mut container_predicate: impl FnMut(
|
|
|
|
&SyntaxNode,
|
|
|
|
InFileWrapper<EditionedFileId, TextRange>,
|
|
|
|
) -> bool,
|
|
|
|
sink: &mut dyn FnMut(EditionedFileId, FileReference) -> bool,
|
|
|
|
) {
|
|
|
|
for (file_text, file_id, search_range) in files {
|
|
|
|
let tree = LazyCell::new(move || this.sema.parse(file_id).syntax().clone());
|
|
|
|
|
|
|
|
for offset in FindUsages::match_indices(&file_text, finder, search_range) {
|
|
|
|
let usages = FindUsages::find_nodes(this.sema, name, &tree, offset)
|
|
|
|
.filter_map(ast::NameRef::cast);
|
|
|
|
for usage in usages {
|
|
|
|
let found_usage = usage
|
|
|
|
.syntax()
|
|
|
|
.parent()
|
|
|
|
.and_then(ast::PathSegment::cast)
|
|
|
|
.map(|path_segment| {
|
|
|
|
container_predicate(
|
|
|
|
path_segment.parent_path().syntax(),
|
|
|
|
InFileWrapper::new(file_id, usage.syntax().text_range()),
|
|
|
|
)
|
|
|
|
})
|
|
|
|
.unwrap_or(false);
|
|
|
|
if found_usage {
|
|
|
|
this.found_name_ref(&usage, sink);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
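The `search` helper above runs a cheap text match first and only confirms a candidate semantically once `container_predicate` accepts its path. A rough, purely textual sketch of that idea follows. It assumes plain `str::match_indices` in place of `Finder`, and a hypothetical `qualifier_before` helper that only recognises the simple `Alias::name` shape; the real code inspects the syntax tree instead, so forms such as `Vec::<u8>::new` are out of scope here.

```rust
use std::collections::HashSet;

fn is_ident_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

/// Hypothetical helper: the identifier directly before `::` at `offset`, if any.
fn qualifier_before(text: &str, offset: usize) -> Option<&str> {
    let head = text[..offset].strip_suffix("::")?;
    // Walk backwards over identifier characters to find where the qualifier starts.
    let start = head
        .char_indices()
        .rev()
        .take_while(|&(_, c)| is_ident_char(c))
        .last()
        .map(|(i, _)| i)?;
    Some(&head[start..])
}

/// Offsets of `name` that are worth confirming with (expensive) name resolution.
fn candidate_offsets(text: &str, name: &str, aliases: &HashSet<&str>) -> Vec<usize> {
    text.match_indices(name)
        .map(|(offset, _)| offset)
        // Reject matches that are only a prefix of a longer identifier.
        .filter(|&offset| {
            let after = text[offset + name.len()..].chars().next();
            !after.is_some_and(is_ident_char)
        })
        // Keep only matches qualified by the container or one of its possible aliases.
        .filter(|&offset| qualifier_before(text, offset).is_some_and(|q| aliases.contains(q)))
        .collect()
}

fn main() {
    let text = "let a = Foo::new(); let b = Bar::new(); let c = Alias::new_with(1); let d = Alias::new();";
    let aliases: HashSet<&str> = ["Foo", "Alias"].into_iter().collect();
    let hits = candidate_offsets(text, "new", &aliases);
    // `Bar::new` and `Alias::new_with` are filtered out before any analysis.
    assert_eq!(hits.len(), 2);
    println!("{hits:?}");
}
```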
|
|
|
|
|
|
|
|
let Some((container_possible_aliases, is_possibly_self)) =
|
|
|
|
collect_possible_aliases(self.sema, container)
|
|
|
|
else {
|
|
|
|
return false;
|
|
|
|
};
|
|
|
|
|
2024-08-22 11:19:56 +00:00
|
|
|
cov_mark::hit!(short_associated_function_fast_search);
|
|
|
|
|
2024-08-18 22:39:31 +00:00
|
|
|
// FIXME: If Rust ever gains the ability to `use Struct::method` we'll also need to account for free
|
|
|
|
// functions.
|
|
|
|
let finder = Finder::new(name.as_bytes());
|
|
|
|
// The search for `Self` may return duplicate results with `ContainerName`, so deduplicate them.
|
|
|
|
let mut self_positions = FxHashSet::default();
|
|
|
|
tracing::info_span!("Self_search").in_scope(|| {
|
|
|
|
search(
|
|
|
|
self,
|
|
|
|
&finder,
|
|
|
|
name,
|
|
|
|
is_possibly_self.into_iter().map(|position| {
|
|
|
|
(
|
|
|
|
self.sema.db.file_text(position.file_id.file_id()),
|
|
|
|
position.file_id,
|
|
|
|
position.range,
|
|
|
|
)
|
|
|
|
}),
|
|
|
|
|path, name_position| {
|
|
|
|
let has_self = path
|
|
|
|
.descendants_with_tokens()
|
|
|
|
.any(|node| node.kind() == SyntaxKind::SELF_TYPE_KW);
|
|
|
|
if has_self {
|
|
|
|
self_positions.insert(name_position);
|
|
|
|
}
|
|
|
|
has_self
|
|
|
|
},
|
|
|
|
sink,
|
|
|
|
)
|
|
|
|
});
|
|
|
|
tracing::info_span!("aliases_search").in_scope(|| {
|
|
|
|
search(
|
|
|
|
self,
|
|
|
|
&finder,
|
|
|
|
name,
|
|
|
|
FindUsages::scope_files(self.sema.db, search_scope),
|
|
|
|
|path, name_position| {
|
|
|
|
has_any_name(path, |name| container_possible_aliases.contains(name))
|
|
|
|
&& !self_positions.contains(&name_position)
|
|
|
|
},
|
|
|
|
sink,
|
|
|
|
)
|
|
|
|
});
|
|
|
|
|
|
|
|
true
|
|
|
|
}
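The alias collection that this fast path depends on is a worklist fixpoint with a bail-out guard. The sketch below is a simplified model, not the real implementation: `AliasDecl` and `max_aliases` are hypothetical, declarations are reduced to the names they mention, and the guard is a plain cap on the number of aliases rather than the files-searched threshold used above.

```rust
use std::collections::HashSet;

/// Hypothetical textual model of an alias declaration, e.g. `type Name = ...;`
/// or `use path::Target as Name;`, reduced to the identifiers it mentions.
struct AliasDecl<'a> {
    name: &'a str,
    mentions: &'a [&'a str],
}

/// Collects every name that may alias `container`, or `None` if the guard trips.
fn collect_aliases<'a>(
    container: &'a str,
    decls: &[AliasDecl<'a>],
    max_aliases: usize,
) -> Option<HashSet<&'a str>> {
    let mut completed = HashSet::new();
    let mut to_process = vec![container];
    while let Some(current) = to_process.pop() {
        if !completed.insert(current) {
            continue; // already handled
        }
        if completed.len() > max_aliases {
            return None; // give up and let the slow path handle it
        }
        // A name can only be aliased by a declaration that mentions it.
        for decl in decls {
            if decl.mentions.contains(&current) && !completed.contains(decl.name) {
                to_process.push(decl.name);
            }
        }
    }
    Some(completed)
}

fn main() {
    let decls = [
        AliasDecl { name: "A", mentions: &["Container"] },
        AliasDecl { name: "B", mentions: &["A"] },
        AliasDecl { name: "C", mentions: &["Other"] },
    ];
    let aliases = collect_aliases("Container", &decls, 100).unwrap();
    assert_eq!(aliases.len(), 3); // Container, A, B
    assert!(collect_aliases("Container", &decls, 2).is_none());
}
```

Returning `None` mirrors how the fast path returns `false` when alias collection fails, so the caller falls back to the regular search.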
|
|
|
|
|
2024-07-17 15:35:40 +00:00
|
|
|
pub fn search(&self, sink: &mut dyn FnMut(EditionedFileId, FileReference) -> bool) {
|
2024-06-06 23:52:25 +00:00
|
|
|
let _p = tracing::info_span!("FindUsages:search").entered();
|
2020-08-19 16:58:48 +00:00
|
|
|
let sema = self.sema;
|
2020-03-04 11:17:41 +00:00
|
|
|
|
|
|
|
let search_scope = {
|
2023-01-11 16:10:04 +00:00
|
|
|
// FIXME: Is the trait scope needed for trait impl assoc items?
|
|
|
|
let base =
|
|
|
|
as_trait_assoc_def(sema.db, self.def).unwrap_or(self.def).search_scope(sema.db);
|
2020-10-09 17:55:30 +00:00
|
|
|
match &self.scope {
|
2020-03-04 11:17:41 +00:00
|
|
|
None => base,
|
2020-10-09 17:55:30 +00:00
|
|
|
Some(scope) => base.intersection(scope),
|
2020-03-04 11:17:41 +00:00
|
|
|
}
|
|
|
|
};
|
2020-03-04 11:14:48 +00:00
|
|
|
|
2021-12-20 16:48:47 +00:00
|
|
|
let name = match self.def {
|
|
|
|
// special case crate modules as these do not have a proper name
|
2023-06-01 12:46:36 +00:00
|
|
|
Definition::Module(module) if module.is_crate_root() => {
|
2023-08-29 08:08:34 +00:00
|
|
|
// FIXME: This assumes the crate name is always equal to its display name when it
|
|
|
|
// really isn't
|
|
|
|
// we should instead look at the dependency edge name and recursively search our way
|
|
|
|
// up the ancestors
|
2021-12-20 16:48:47 +00:00
|
|
|
module
|
|
|
|
.krate()
|
|
|
|
.display_name(self.sema.db)
|
2024-07-16 10:05:16 +00:00
|
|
|
.map(|crate_name| crate_name.crate_name().symbol().as_str().into())
|
2021-12-20 16:48:47 +00:00
|
|
|
}
|
|
|
|
_ => {
|
|
|
|
let self_kw_refs = || {
|
|
|
|
self.include_self_kw_refs.as_ref().and_then(|ty| {
|
|
|
|
ty.as_adt()
|
|
|
|
.map(|adt| adt.name(self.sema.db))
|
|
|
|
.or_else(|| ty.as_builtin().map(|builtin| builtin.name()))
|
|
|
|
})
|
|
|
|
};
|
2022-08-16 01:08:45 +00:00
|
|
|
// We need to unescape the name in case it is written without "r#" in earlier
|
|
|
|
// editions of Rust where it isn't a keyword.
|
2024-07-16 10:43:58 +00:00
|
|
|
self.def
|
|
|
|
.name(sema.db)
|
|
|
|
.or_else(self_kw_refs)
|
|
|
|
.map(|it| it.unescaped().display(sema.db).to_smolstr())
|
2021-12-20 16:48:47 +00:00
|
|
|
}
|
|
|
|
};
|
|
|
|
let name = match &name {
|
|
|
|
Some(s) => s.as_str(),
|
2020-08-19 22:54:44 +00:00
|
|
|
None => return,
|
2020-03-04 11:17:41 +00:00
|
|
|
};
|
2020-03-04 11:14:48 +00:00
|
|
|
|
2024-08-18 22:39:31 +00:00
|
|
|
// FIXME: This should probably depend on the number of results (specifically, the number of false positives).
|
|
|
|
if name.len() <= 7 && self.short_associated_function_fast_search(sink, &search_scope, name)
|
|
|
|
{
|
|
|
|
return;
|
2021-12-20 16:48:47 +00:00
|
|
|
}
|
2022-04-06 11:58:40 +00:00
|
|
|
|
2024-08-18 22:39:31 +00:00
|
|
|
let finder = &Finder::new(name);
|
|
|
|
let include_self_kw_refs =
|
|
|
|
self.include_self_kw_refs.as_ref().map(|ty| (ty, Finder::new("Self")));
|
2022-11-05 11:04:21 +00:00
|
|
|
|
Speed up search for short associated functions, especially very common identifiers such as `new`
The search is used by IDE features such as rename and find all references.
The search is slow because we need to verify each candidate, and that requires analyzing it; the key to speeding it up is to avoid the analysis where possible.
I did that with a bunch of tricks that exploits knowledge about the language and its possibilities. The first key insight is that associated methods may only be referenced in the form `ContainerName::func_name` (parentheses are not necessary!) (Rust doesn't include a way to `use Container::func_name`, and even if it will in the future most usages are likely to stay in that form.
Searching for `::` will help only a bit, but searching for `Container` can help considerably, since it is very rare that there will be two identical instances of both a container and a method of it.
However, things are not as simple as they sound. In Rust a container can be aliased in multiple ways, and even aliased from different files/modules. If we will try to resolve the alias, we will lose any gain from the textual search (although very common method names such as `new` will still benefit, most will suffer because there are more instances of a container name than its associated item).
This is where the key trick enters the picture. The key insight is that there is still a textual property: a container namer cannot be aliased, unless its name is mentioned in the alias declaration, or a name of alias of it is mentioned in the alias declaration.
This becomes a fixpoint algorithm: we expand our list of aliases as we collect more and more (possible) aliases, until we eventually reach a fixpoint. A fixpoint is not guaranteed (and we do have guards for the rare cases where it does not happen), but it is almost so: most types have very few aliases, if at all.
We do use some semantic information while analyzing aliases. It's a balance: too much semantic analysis, and the search will become slow. But too few of it, and we will bring many incorrect aliases to our list, and risk it expands and expands and never reach a fixpoint. At the end, based on benchmarks, it seems worth to do a lot to avoid adding an alias (but not too much), while it is worth to do a lot to avoid the need to semantically analyze func_name matches (but again, not too much).
After we collected our list of aliases, we filter matches based on this list. Only if a match can be real, we do semantic analysis for it.
The results are promising: searching for all references on `new()` in `base-db` in the rust-analyzer repository, which previously took around 60 seconds, now takes as least as two seconds and a half (roughly), while searching for `Vec::new()`, almost an upper bound to how much a symbol can be used, that used to take 7-9 minutes(!) now completes in 100-120 seconds, and with less than half of non-verified results (aka. false positives).
This is the less strictly correct (but faster) branch of this patch; it can miss some (rare) cases (there is a test for that: `goto_ref_on_short_associated_function_complicated_type_magic_can_confuse_our_logic()`). There is another branch that has no false negatives but is slower to search (`Vec::new()` never reaches a fixpoint in alias collection there). I believe it is possible to create a strategy that has the best of both worlds, but it would involve significant complexity and I didn't bother, especially considering that in the vast majority of searches the other branch will be more than enough. All in all, I decided to bring this branch (if the maintainers agree, of course), since our search is already not 100% accurate (it misses macros), and I believe there is value in the additional perf.
2024-08-18 22:39:31 +00:00
|
|
|
for (text, file_id, search_range) in Self::scope_files(sema.db, &search_scope) {
|
2023-09-13 20:01:04 +00:00
|
|
|
self.sema.db.unwind_if_cancelled();
|
2024-08-16 06:53:37 +00:00
|
|
|
let tree = LazyCell::new(move || sema.parse(file_id).syntax().clone());
|
2021-12-20 16:48:47 +00:00
|
|
|
|
|
|
|
// Search for occurrences of the item's name
|
2024-08-18 22:39:31 +00:00
|
|
|
for offset in Self::match_indices(&text, finder, search_range) {
|
2024-01-18 12:59:49 +00:00
|
|
|
tree.token_at_offset(offset).for_each(|token| {
|
2023-12-05 14:42:39 +00:00
|
|
|
let Some(str_token) = ast::String::cast(token.clone()) else { return };
|
|
|
|
if let Some((range, nameres)) =
|
2024-01-06 23:17:48 +00:00
|
|
|
sema.check_for_format_args_template(token, offset)
|
2023-12-05 14:42:39 +00:00
|
|
|
{
|
2024-01-18 12:59:49 +00:00
|
|
|
if self.found_format_args_ref(file_id, range, str_token, nameres, sink) {}
|
2023-12-05 14:42:39 +00:00
|
|
|
}
|
|
|
|
});
|
|
|
|
|
2024-08-18 22:39:31 +00:00
|
|
|
for name in
|
|
|
|
Self::find_nodes(sema, name, &tree, offset).filter_map(ast::NameLike::cast)
|
|
|
|
{
|
2023-02-14 08:34:19 +00:00
|
|
|
if match name {
|
|
|
|
ast::NameLike::NameRef(name_ref) => self.found_name_ref(&name_ref, sink),
|
|
|
|
ast::NameLike::Name(name) => self.found_name(&name, sink),
|
|
|
|
ast::NameLike::Lifetime(lifetime) => self.found_lifetime(&lifetime, sink),
|
|
|
|
} {
|
|
|
|
return;
|
2020-03-04 11:17:41 +00:00
|
|
|
}
|
2020-03-04 11:14:48 +00:00
|
|
|
}
|
2021-04-21 13:42:47 +00:00
|
|
|
}
|
2021-12-20 16:48:47 +00:00
|
|
|
// Search for occurrences of the `Self` referring to our type
|
2022-09-16 14:26:54 +00:00
|
|
|
if let Some((self_ty, finder)) = &include_self_kw_refs {
|
2024-08-18 22:39:31 +00:00
|
|
|
for offset in Self::match_indices(&text, finder, search_range) {
|
|
|
|
for name_ref in
|
|
|
|
Self::find_nodes(sema, "Self", &tree, offset).filter_map(ast::NameRef::cast)
|
2023-02-14 08:34:19 +00:00
|
|
|
{
|
|
|
|
if self.found_self_ty_name_ref(self_ty, &name_ref, sink) {
|
|
|
|
return;
|
2021-05-08 20:34:55 +00:00
|
|
|
}
|
2021-04-21 13:42:47 +00:00
|
|
|
}
|
|
|
|
}
|
2020-03-04 11:14:48 +00:00
|
|
|
}
|
|
|
|
}
|
2021-06-28 14:41:35 +00:00
|
|
|
|
2021-12-20 16:48:47 +00:00
|
|
|
// Search for `super` and `crate` resolving to our module
|
2023-01-10 18:48:51 +00:00
|
|
|
if let Definition::Module(module) = self.def {
|
|
|
|
let scope =
|
|
|
|
search_scope.intersection(&SearchScope::module_and_children(self.sema.db, module));
|
2021-12-20 16:48:47 +00:00
|
|
|
|
2023-06-01 12:46:36 +00:00
|
|
|
let is_crate_root = module.is_crate_root().then(|| Finder::new("crate"));
|
2023-01-10 18:48:51 +00:00
|
|
|
let finder = &Finder::new("super");
|
2021-12-20 16:48:47 +00:00
|
|
|
|
2024-08-18 22:39:31 +00:00
|
|
|
for (text, file_id, search_range) in Self::scope_files(sema.db, &scope) {
|
2023-09-13 20:01:04 +00:00
|
|
|
self.sema.db.unwind_if_cancelled();
|
2024-08-16 06:53:37 +00:00
|
|
|
let tree = LazyCell::new(move || sema.parse(file_id).syntax().clone());
|
2021-12-20 16:48:47 +00:00
|
|
|
|
2024-08-18 22:39:31 +00:00
|
|
|
for offset in Self::match_indices(&text, finder, search_range) {
|
|
|
|
for name_ref in Self::find_nodes(sema, "super", &tree, offset)
|
|
|
|
.filter_map(ast::NameRef::cast)
|
2023-02-14 08:34:19 +00:00
|
|
|
{
|
|
|
|
if self.found_name_ref(&name_ref, sink) {
|
|
|
|
return;
|
2023-01-10 18:48:51 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if let Some(finder) = &is_crate_root {
|
2024-08-18 22:39:31 +00:00
|
|
|
for offset in Self::match_indices(&text, finder, search_range) {
|
|
|
|
for name_ref in Self::find_nodes(sema, "crate", &tree, offset)
|
|
|
|
.filter_map(ast::NameRef::cast)
|
2023-02-14 08:34:19 +00:00
|
|
|
{
|
|
|
|
if self.found_name_ref(&name_ref, sink) {
|
|
|
|
return;
|
2021-12-20 16:48:47 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2021-06-28 14:41:35 +00:00
|
|
|
// search for module `self` references in our module's definition source
|
|
|
|
match self.def {
|
2021-11-10 21:02:50 +00:00
|
|
|
Definition::Module(module) if self.search_self_mod => {
|
2021-06-28 14:41:35 +00:00
|
|
|
let src = module.definition_source(sema.db);
|
|
|
|
let file_id = src.file_id.original_file(sema.db);
|
|
|
|
let (file_id, search_range) = match src.value {
|
|
|
|
ModuleSource::Module(m) => (file_id, Some(m.syntax().text_range())),
|
|
|
|
ModuleSource::BlockExpr(b) => (file_id, Some(b.syntax().text_range())),
|
|
|
|
ModuleSource::SourceFile(_) => (file_id, None),
|
|
|
|
};
|
|
|
|
|
2021-12-20 16:48:47 +00:00
|
|
|
let search_range = if let Some(&range) = search_scope.entries.get(&file_id) {
|
|
|
|
match (range, search_range) {
|
|
|
|
(None, range) | (range, None) => range,
|
|
|
|
(Some(range), Some(search_range)) => match range.intersect(search_range) {
|
|
|
|
Some(range) => Some(range),
|
|
|
|
None => return,
|
|
|
|
},
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
return;
|
|
|
|
};
|
|
|
|
|
2024-07-17 15:35:40 +00:00
|
|
|
let text = sema.db.file_text(file_id.file_id());
|
2021-06-28 14:41:35 +00:00
|
|
|
let search_range =
|
2023-04-22 07:48:37 +00:00
|
|
|
search_range.unwrap_or_else(|| TextRange::up_to(TextSize::of(&*text)));
|
2021-06-28 14:41:35 +00:00
|
|
|
|
2024-08-16 06:53:37 +00:00
|
|
|
let tree = LazyCell::new(|| sema.parse(file_id).syntax().clone());
|
2022-09-16 14:26:54 +00:00
|
|
|
let finder = &Finder::new("self");
|
2021-06-28 14:41:35 +00:00
|
|
|
|
2024-08-18 22:39:31 +00:00
|
|
|
for offset in Self::match_indices(&text, finder, search_range) {
|
|
|
|
for name_ref in
|
|
|
|
Self::find_nodes(sema, "self", &tree, offset).filter_map(ast::NameRef::cast)
|
2023-02-14 08:34:19 +00:00
|
|
|
{
|
|
|
|
if self.found_self_module_name_ref(&name_ref, sink) {
|
|
|
|
return;
|
2021-06-28 14:41:35 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
_ => {}
|
|
|
|
}
|
2020-03-04 11:14:48 +00:00
|
|
|
}
|
2020-10-09 17:55:30 +00:00
|
|
|
|
2021-05-08 20:34:55 +00:00
|
|
|
fn found_self_ty_name_ref(
|
|
|
|
&self,
|
|
|
|
self_ty: &hir::Type,
|
|
|
|
name_ref: &ast::NameRef,
|
2024-07-17 15:35:40 +00:00
|
|
|
sink: &mut dyn FnMut(EditionedFileId, FileReference) -> bool,
|
2021-05-08 20:34:55 +00:00
|
|
|
) -> bool {
|
2024-08-06 13:52:43 +00:00
|
|
|
// See https://github.com/rust-lang/rust-analyzer/pull/15864/files/e0276dc5ddc38c65240edb408522bb869f15afb4#r1389848845
|
|
|
|
let ty_eq = |ty: hir::Type| match (ty.as_adt(), self_ty.as_adt()) {
|
|
|
|
(Some(ty), Some(self_ty)) => ty == self_ty,
|
|
|
|
(None, None) => ty == *self_ty,
|
|
|
|
_ => false,
|
|
|
|
};
|
|
|
|
|
2021-06-13 03:54:16 +00:00
|
|
|
match NameRefClass::classify(self.sema, name_ref) {
|
2021-05-08 20:34:55 +00:00
|
|
|
Some(NameRefClass::Definition(Definition::SelfType(impl_)))
|
2024-08-06 13:52:43 +00:00
|
|
|
if ty_eq(impl_.self_ty(self.sema.db)) =>
|
2021-05-08 20:34:55 +00:00
|
|
|
{
|
|
|
|
let FileRange { file_id, range } = self.sema.original_range(name_ref.syntax());
|
|
|
|
let reference = FileReference {
|
|
|
|
range,
|
2023-12-05 14:42:39 +00:00
|
|
|
name: FileReferenceNode::NameRef(name_ref.clone()),
|
2024-04-16 15:10:36 +00:00
|
|
|
category: ReferenceCategory::empty(),
|
2021-05-08 20:34:55 +00:00
|
|
|
};
|
|
|
|
sink(file_id, reference)
|
|
|
|
}
|
|
|
|
_ => false,
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2021-06-28 14:41:35 +00:00
|
|
|
fn found_self_module_name_ref(
|
|
|
|
&self,
|
|
|
|
name_ref: &ast::NameRef,
|
2024-07-17 15:35:40 +00:00
|
|
|
sink: &mut dyn FnMut(EditionedFileId, FileReference) -> bool,
|
2021-06-28 14:41:35 +00:00
|
|
|
) -> bool {
|
|
|
|
match NameRefClass::classify(self.sema, name_ref) {
|
2021-11-10 21:02:50 +00:00
|
|
|
Some(NameRefClass::Definition(def @ Definition::Module(_))) if def == self.def => {
|
2021-06-28 14:41:35 +00:00
|
|
|
let FileRange { file_id, range } = self.sema.original_range(name_ref.syntax());
|
2024-04-16 15:10:36 +00:00
|
|
|
let category = if is_name_ref_in_import(name_ref) {
|
|
|
|
ReferenceCategory::IMPORT
|
|
|
|
} else {
|
|
|
|
ReferenceCategory::empty()
|
|
|
|
};
|
2021-06-28 14:41:35 +00:00
|
|
|
let reference = FileReference {
|
|
|
|
range,
|
2023-12-05 14:42:39 +00:00
|
|
|
name: FileReferenceNode::NameRef(name_ref.clone()),
|
2024-04-16 15:10:36 +00:00
|
|
|
category,
|
2021-06-28 14:41:35 +00:00
|
|
|
};
|
|
|
|
sink(file_id, reference)
|
|
|
|
}
|
|
|
|
_ => false,
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2023-12-05 14:42:39 +00:00
|
|
|
fn found_format_args_ref(
|
|
|
|
&self,
|
2024-07-17 15:35:40 +00:00
|
|
|
file_id: EditionedFileId,
|
2023-12-05 14:42:39 +00:00
|
|
|
range: TextRange,
|
|
|
|
token: ast::String,
|
|
|
|
res: Option<PathResolution>,
|
2024-07-17 15:35:40 +00:00
|
|
|
sink: &mut dyn FnMut(EditionedFileId, FileReference) -> bool,
|
2023-12-05 14:42:39 +00:00
|
|
|
) -> bool {
|
|
|
|
match res.map(Definition::from) {
|
|
|
|
Some(def) if def == self.def => {
|
|
|
|
let reference = FileReference {
|
|
|
|
range,
|
|
|
|
name: FileReferenceNode::FormatStringEntry(token, range),
|
2024-04-16 15:10:36 +00:00
|
|
|
category: ReferenceCategory::READ,
|
2023-12-05 14:42:39 +00:00
|
|
|
};
|
|
|
|
sink(file_id, reference)
|
|
|
|
}
|
|
|
|
_ => false,
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2020-12-16 20:35:15 +00:00
|
|
|
fn found_lifetime(
|
|
|
|
&self,
|
|
|
|
lifetime: &ast::Lifetime,
|
2024-07-17 15:35:40 +00:00
|
|
|
sink: &mut dyn FnMut(EditionedFileId, FileReference) -> bool,
|
2020-12-16 20:35:15 +00:00
|
|
|
) -> bool {
|
|
|
|
match NameRefClass::classify_lifetime(self.sema, lifetime) {
|
2021-06-11 17:23:59 +00:00
|
|
|
Some(NameRefClass::Definition(def)) if def == self.def => {
|
2021-01-11 23:05:07 +00:00
|
|
|
let FileRange { file_id, range } = self.sema.original_range(lifetime.syntax());
|
2021-02-09 15:03:39 +00:00
|
|
|
let reference = FileReference {
|
|
|
|
range,
|
2023-12-05 14:42:39 +00:00
|
|
|
name: FileReferenceNode::Lifetime(lifetime.clone()),
|
2024-04-16 15:10:36 +00:00
|
|
|
category: ReferenceCategory::empty(),
|
2021-02-09 15:03:39 +00:00
|
|
|
};
|
2021-01-11 23:05:07 +00:00
|
|
|
sink(file_id, reference)
|
2020-12-16 20:35:15 +00:00
|
|
|
}
|
2021-05-08 20:34:55 +00:00
|
|
|
_ => false,
|
2020-12-16 20:35:15 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2020-10-09 17:55:30 +00:00
|
|
|
fn found_name_ref(
|
|
|
|
&self,
|
|
|
|
name_ref: &ast::NameRef,
|
2024-07-17 15:35:40 +00:00
|
|
|
sink: &mut dyn FnMut(EditionedFileId, FileReference) -> bool,
|
2020-10-09 17:55:30 +00:00
|
|
|
) -> bool {
|
2021-06-13 03:54:16 +00:00
|
|
|
match NameRefClass::classify(self.sema, name_ref) {
|
2022-06-24 11:15:16 +00:00
|
|
|
Some(NameRefClass::Definition(def))
|
2023-01-11 16:10:04 +00:00
|
|
|
if self.def == def
|
2023-01-24 13:11:02 +00:00
|
|
|
// is our def a trait assoc item? then we want to find all assoc items from trait impls of our trait
|
2023-01-11 16:10:04 +00:00
|
|
|
|| matches!(self.assoc_item_container, Some(hir::AssocItemContainer::Trait(_)))
|
|
|
|
&& convert_to_def_in_trait(self.sema.db, def) == self.def =>
|
|
|
|
{
|
|
|
|
let FileRange { file_id, range } = self.sema.original_range(name_ref.syntax());
|
|
|
|
let reference = FileReference {
|
|
|
|
range,
|
2023-12-05 14:42:39 +00:00
|
|
|
name: FileReferenceNode::NameRef(name_ref.clone()),
|
2024-01-29 10:42:41 +00:00
|
|
|
category: ReferenceCategory::new(self.sema, &def, name_ref),
|
2023-01-11 16:10:04 +00:00
|
|
|
};
|
|
|
|
sink(file_id, reference)
|
|
|
|
}
|
|
|
|
// FIXME: special case type aliases, we can't filter between impl and trait defs here as we lack the substitutions
|
|
|
|
// so we always resolve all assoc type aliases to both their trait def and impl defs
|
|
|
|
Some(NameRefClass::Definition(def))
|
|
|
|
if self.assoc_item_container.is_some()
|
|
|
|
&& matches!(self.def, Definition::TypeAlias(_))
|
|
|
|
&& convert_to_def_in_trait(self.sema.db, def)
|
|
|
|
== convert_to_def_in_trait(self.sema.db, self.def) =>
|
2022-06-24 11:15:16 +00:00
|
|
|
{
|
2021-04-03 21:04:31 +00:00
|
|
|
let FileRange { file_id, range } = self.sema.original_range(name_ref.syntax());
|
|
|
|
let reference = FileReference {
|
|
|
|
range,
|
2023-12-05 14:42:39 +00:00
|
|
|
name: FileReferenceNode::NameRef(name_ref.clone()),
|
2024-01-29 10:42:41 +00:00
|
|
|
category: ReferenceCategory::new(self.sema, &def, name_ref),
|
2021-04-03 21:04:31 +00:00
|
|
|
};
|
|
|
|
sink(file_id, reference)
|
|
|
|
}
|
2021-05-08 20:34:55 +00:00
|
|
|
Some(NameRefClass::Definition(def)) if self.include_self_kw_refs.is_some() => {
|
2021-05-08 20:43:26 +00:00
|
|
|
if self.include_self_kw_refs == def_to_ty(self.sema, &def) {
|
2021-05-08 20:34:55 +00:00
|
|
|
let FileRange { file_id, range } = self.sema.original_range(name_ref.syntax());
|
|
|
|
let reference = FileReference {
|
|
|
|
range,
|
2023-12-05 14:42:39 +00:00
|
|
|
name: FileReferenceNode::NameRef(name_ref.clone()),
|
2024-01-29 10:42:41 +00:00
|
|
|
category: ReferenceCategory::new(self.sema, &def, name_ref),
|
2021-05-08 20:34:55 +00:00
|
|
|
};
|
|
|
|
sink(file_id, reference)
|
|
|
|
} else {
|
|
|
|
false
|
2021-04-03 21:01:49 +00:00
|
|
|
}
|
|
|
|
}
|
2020-10-15 15:33:32 +00:00
|
|
|
Some(NameRefClass::FieldShorthand { local_ref: local, field_ref: field }) => {
|
2021-01-11 23:05:07 +00:00
|
|
|
let FileRange { file_id, range } = self.sema.original_range(name_ref.syntax());
|
2023-03-09 14:10:26 +00:00
|
|
|
|
|
|
|
let field = Definition::Field(field);
|
|
|
|
let local = Definition::Local(local);
|
2021-05-08 20:34:55 +00:00
|
|
|
let access = match self.def {
|
2021-10-02 09:18:18 +00:00
|
|
|
Definition::Field(_) if field == self.def => {
|
2024-01-29 10:42:41 +00:00
|
|
|
ReferenceCategory::new(self.sema, &field, name_ref)
|
2021-10-02 09:18:18 +00:00
|
|
|
}
|
2023-03-09 14:10:26 +00:00
|
|
|
Definition::Local(_) if local == self.def => {
|
2024-01-29 10:42:41 +00:00
|
|
|
ReferenceCategory::new(self.sema, &local, name_ref)
|
2021-05-08 20:34:55 +00:00
|
|
|
}
|
|
|
|
_ => return false,
|
2020-10-09 17:55:30 +00:00
|
|
|
};
|
2021-10-02 09:18:18 +00:00
|
|
|
let reference = FileReference {
|
|
|
|
range,
|
2023-12-05 14:42:39 +00:00
|
|
|
name: FileReferenceNode::NameRef(name_ref.clone()),
|
2021-10-02 09:18:18 +00:00
|
|
|
category: access,
|
|
|
|
};
|
2021-01-11 23:05:07 +00:00
|
|
|
sink(file_id, reference)
|
2020-10-09 17:55:30 +00:00
|
|
|
}
|
2021-05-08 20:34:55 +00:00
|
|
|
_ => false,
|
2020-10-09 17:55:30 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2021-01-11 23:05:07 +00:00
|
|
|
fn found_name(
|
|
|
|
&self,
|
|
|
|
name: &ast::Name,
|
2024-07-17 15:35:40 +00:00
|
|
|
sink: &mut dyn FnMut(EditionedFileId, FileReference) -> bool,
|
2021-01-11 23:05:07 +00:00
|
|
|
) -> bool {
|
2020-10-15 15:27:50 +00:00
|
|
|
match NameClass::classify(self.sema, name) {
|
2021-02-23 22:31:07 +00:00
|
|
|
Some(NameClass::PatFieldShorthand { local_def: _, field_ref })
|
|
|
|
if matches!(
|
2021-07-11 12:03:35 +00:00
|
|
|
self.def, Definition::Field(_) if Definition::Field(field_ref) == self.def
|
2021-02-23 22:31:07 +00:00
|
|
|
) =>
|
|
|
|
{
|
2021-01-11 23:05:07 +00:00
|
|
|
let FileRange { file_id, range } = self.sema.original_range(name.syntax());
|
|
|
|
let reference = FileReference {
|
|
|
|
range,
|
2023-12-05 14:42:39 +00:00
|
|
|
name: FileReferenceNode::Name(name.clone()),
|
2021-01-11 23:05:07 +00:00
|
|
|
// FIXME: mutable patterns should have `Write` access
|
2024-04-16 15:10:36 +00:00
|
|
|
category: ReferenceCategory::READ,
|
2020-10-09 17:55:30 +00:00
|
|
|
};
|
2021-01-11 23:05:07 +00:00
|
|
|
sink(file_id, reference)
|
2020-10-09 17:55:30 +00:00
|
|
|
}
|
2021-06-11 17:23:59 +00:00
|
|
|
Some(NameClass::ConstReference(def)) if self.def == def => {
|
2021-02-23 22:31:07 +00:00
|
|
|
let FileRange { file_id, range } = self.sema.original_range(name.syntax());
|
2021-10-02 09:18:18 +00:00
|
|
|
let reference = FileReference {
|
|
|
|
range,
|
2023-12-05 14:42:39 +00:00
|
|
|
name: FileReferenceNode::Name(name.clone()),
|
2024-04-16 15:10:36 +00:00
|
|
|
category: ReferenceCategory::empty(),
|
2021-10-02 09:18:18 +00:00
|
|
|
};
|
2021-02-23 22:31:07 +00:00
|
|
|
sink(file_id, reference)
|
|
|
|
}
|
2021-11-10 21:02:50 +00:00
|
|
|
Some(NameClass::Definition(def)) if def != self.def => {
|
2023-01-24 13:11:02 +00:00
|
|
|
match (&self.assoc_item_container, self.def) {
|
|
|
|
// for type aliases we always want to reference the trait def and all the trait impl counterparts
|
|
|
|
// FIXME: only until we can resolve them correctly, see FIXME above
|
|
|
|
(Some(_), Definition::TypeAlias(_))
|
|
|
|
if convert_to_def_in_trait(self.sema.db, def)
|
|
|
|
!= convert_to_def_in_trait(self.sema.db, self.def) =>
|
|
|
|
{
|
|
|
|
return false
|
|
|
|
}
|
|
|
|
(Some(_), Definition::TypeAlias(_)) => {}
|
|
|
|
// We're looking at an assoc item of a trait definition, so reference all the
|
|
|
|
// corresponding assoc items belonging to this trait's trait implementations
|
|
|
|
(Some(hir::AssocItemContainer::Trait(_)), _)
|
|
|
|
if convert_to_def_in_trait(self.sema.db, def) == self.def => {}
|
|
|
|
_ => return false,
|
2022-07-20 11:59:31 +00:00
|
|
|
}
|
|
|
|
let FileRange { file_id, range } = self.sema.original_range(name.syntax());
|
|
|
|
let reference = FileReference {
|
|
|
|
range,
|
2023-12-05 14:42:39 +00:00
|
|
|
name: FileReferenceNode::Name(name.clone()),
|
2024-04-16 15:10:36 +00:00
|
|
|
category: ReferenceCategory::empty(),
|
2022-07-20 11:59:31 +00:00
|
|
|
};
|
|
|
|
sink(file_id, reference)
|
2021-06-11 17:23:59 +00:00
|
|
|
}
|
2021-05-08 20:34:55 +00:00
|
|
|
_ => false,
|
2020-10-09 17:55:30 +00:00
|
|
|
}
|
|
|
|
}
|
2020-03-04 11:14:48 +00:00
|
|
|
}
|
|
|
|
|
2022-07-20 13:02:08 +00:00
|
|
|
fn def_to_ty(sema: &Semantics<'_, RootDatabase>, def: &Definition) -> Option<hir::Type> {
|
2021-05-08 20:34:55 +00:00
|
|
|
match def {
|
2021-11-10 21:02:50 +00:00
|
|
|
Definition::Adt(adt) => Some(adt.ty(sema.db)),
|
|
|
|
Definition::TypeAlias(it) => Some(it.ty(sema.db)),
|
2022-03-26 20:22:35 +00:00
|
|
|
Definition::BuiltinType(it) => Some(it.ty(sema.db)),
|
2021-05-08 20:43:26 +00:00
|
|
|
Definition::SelfType(it) => Some(it.self_ty(sema.db)),
|
2021-05-08 20:34:55 +00:00
|
|
|
_ => None,
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2021-10-02 09:18:18 +00:00
|
|
|
impl ReferenceCategory {
|
2024-01-29 10:42:41 +00:00
|
|
|
fn new(
|
|
|
|
sema: &Semantics<'_, RootDatabase>,
|
|
|
|
def: &Definition,
|
|
|
|
r: &ast::NameRef,
|
2024-04-16 15:10:36 +00:00
|
|
|
) -> ReferenceCategory {
|
|
|
|
let mut result = ReferenceCategory::empty();
|
2024-01-29 10:42:41 +00:00
|
|
|
if is_name_ref_in_test(sema, r) {
|
2024-04-16 15:10:36 +00:00
|
|
|
result |= ReferenceCategory::TEST;
|
2024-01-28 10:28:13 +00:00
|
|
|
}
|
|
|
|
|
2021-10-02 09:18:18 +00:00
|
|
|
// Only Locals and Fields have accesses for now.
|
|
|
|
if !matches!(def, Definition::Local(_) | Definition::Field(_)) {
|
2024-04-16 15:10:36 +00:00
|
|
|
if is_name_ref_in_import(r) {
|
|
|
|
result |= ReferenceCategory::IMPORT;
|
|
|
|
}
|
|
|
|
return result;
|
2021-10-02 09:18:18 +00:00
|
|
|
}
|
2020-03-04 11:14:48 +00:00
|
|
|
|
2021-10-02 09:18:18 +00:00
|
|
|
let mode = r.syntax().ancestors().find_map(|node| {
|
2024-04-16 15:10:36 +00:00
|
|
|
match_ast! {
|
|
|
|
match node {
|
|
|
|
ast::BinExpr(expr) => {
|
|
|
|
if matches!(expr.op_kind()?, ast::BinaryOp::Assignment { .. }) {
|
|
|
|
// If the variable or field ends on the LHS's end then it's a Write
|
|
|
|
// (covers fields and locals). FIXME: This is not terribly accurate.
|
|
|
|
if let Some(lhs) = expr.lhs() {
|
|
|
|
if lhs.syntax().text_range().end() == r.syntax().text_range().end() {
|
|
|
|
return Some(ReferenceCategory::WRITE)
|
|
|
|
}
|
2020-03-04 11:14:48 +00:00
|
|
|
}
|
|
|
|
}
|
2024-04-16 15:10:36 +00:00
|
|
|
Some(ReferenceCategory::READ)
|
|
|
|
},
|
|
|
|
_ => None,
|
|
|
|
}
|
2020-03-04 11:14:48 +00:00
|
|
|
}
|
2024-04-16 15:10:36 +00:00
|
|
|
}).unwrap_or(ReferenceCategory::READ);
|
2020-03-04 11:14:48 +00:00
|
|
|
|
2024-04-16 15:10:36 +00:00
|
|
|
result | mode
|
2021-10-02 09:18:18 +00:00
|
|
|
}
|
2020-03-04 11:14:48 +00:00
|
|
|
}
|
2022-09-13 12:47:26 +00:00
|
|
|
|
|
|
|
fn is_name_ref_in_import(name_ref: &ast::NameRef) -> bool {
|
|
|
|
name_ref
|
|
|
|
.syntax()
|
|
|
|
.parent()
|
|
|
|
.and_then(ast::PathSegment::cast)
|
|
|
|
.and_then(|it| it.parent_path().top_path().syntax().parent())
|
|
|
|
.map_or(false, |it| it.kind() == SyntaxKind::USE_TREE)
|
|
|
|
}
|
2024-01-28 10:28:13 +00:00
|
|
|
|
2024-01-29 10:42:41 +00:00
|
|
|
fn is_name_ref_in_test(sema: &Semantics<'_, RootDatabase>, name_ref: &ast::NameRef) -> bool {
|
|
|
|
name_ref.syntax().ancestors().any(|node| match ast::Fn::cast(node) {
|
|
|
|
Some(it) => sema.to_def(&it).map_or(false, |func| func.is_test(sema.db)),
|
|
|
|
None => false,
|
|
|
|
})
|
2024-01-28 10:28:13 +00:00
|
|
|
}
|