rust-analyzer/crates/syntax/src/ptr.rs

//! In rust-analyzer, syntax trees are transient objects.
//!
//! That means that we create trees when we need them, and tear them down to
//! save memory. In this architecture, hanging on to a particular syntax node
//! for a long time is ill-advised, as that keeps the whole tree resident.
//!
//! Instead, we provide a [`SyntaxNodePtr`] type, which stores information about
//! the *location* of a particular syntax node in a tree. It's a small type
//! which can be cheaply stored, and which can be resolved to a real
//! [`SyntaxNode`] when necessary.

use std::{
    hash::{Hash, Hasher},
    iter::successors,
    marker::PhantomData,
};

use crate::{AstNode, SyntaxKind, SyntaxNode, TextRange};

/// A pointer to a syntax node inside a file. It can be used to remember a
/// specific node across reparses of the same file.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct SyntaxNodePtr {
    // Don't expose this field further. At some point, we might want to replace
    // range with node id.
    pub(crate) range: TextRange,
    kind: SyntaxKind,
}

impl SyntaxNodePtr {
    pub fn new(node: &SyntaxNode) -> SyntaxNodePtr {
        SyntaxNodePtr { range: node.text_range(), kind: node.kind() }
    }

    /// "Dereference" the pointer to get the node it points to.
    ///
    /// Panics if the node is not found, so make sure that the `root` syntax
    /// tree is equivalent (is built from the same text) to the tree which was
    /// originally used to get this [`SyntaxNodePtr`].
    ///
    /// The complexity is linear in the depth of the tree and logarithmic in
    /// tree width. As most trees are shallow, thinking about this as
    /// `O(log(N))` in the size of the tree is not too wrong!
    pub fn to_node(&self, root: &SyntaxNode) -> SyntaxNode {
        assert!(root.parent().is_none());
        successors(Some(root.clone()), |node| {
            node.child_or_token_at_range(self.range).and_then(|it| it.into_node())
        })
        .find(|it| it.text_range() == self.range && it.kind() == self.kind)
        .unwrap_or_else(|| panic!("can't resolve local ptr to SyntaxNode: {:?}", self))
    }

    pub fn cast<N: AstNode>(self) -> Option<AstPtr<N>> {
        if !N::can_cast(self.kind) {
            return None;
        }
        Some(AstPtr { raw: self, _ty: PhantomData })
    }
}

/// Like `SyntaxNodePtr`, but remembers the type of the node.
#[derive(Debug)]
pub struct AstPtr<N: AstNode> {
    raw: SyntaxNodePtr,
    _ty: PhantomData<fn() -> N>,
}

impl<N: AstNode> Clone for AstPtr<N> {
    fn clone(&self) -> AstPtr<N> {
        AstPtr { raw: self.raw.clone(), _ty: PhantomData }
    }
}

impl<N: AstNode> Eq for AstPtr<N> {}

impl<N: AstNode> PartialEq for AstPtr<N> {
    fn eq(&self, other: &AstPtr<N>) -> bool {
        self.raw == other.raw
    }
}

impl<N: AstNode> Hash for AstPtr<N> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.raw.hash(state);
    }
}

impl<N: AstNode> AstPtr<N> {
    pub fn new(node: &N) -> AstPtr<N> {
        AstPtr { raw: SyntaxNodePtr::new(node.syntax()), _ty: PhantomData }
    }

    pub fn to_node(&self, root: &SyntaxNode) -> N {
        let syntax_node = self.raw.to_node(root);
        N::cast(syntax_node).unwrap()
    }

    pub fn syntax_node_ptr(&self) -> SyntaxNodePtr {
        self.raw.clone()
    }

    pub fn cast<U: AstNode>(self) -> Option<AstPtr<U>> {
        if !U::can_cast(self.raw.kind) {
            return None;
        }
        Some(AstPtr { raw: self.raw, _ty: PhantomData })
    }
}

impl<N: AstNode> From<AstPtr<N>> for SyntaxNodePtr {
    fn from(ptr: AstPtr<N>) -> SyntaxNodePtr {
        ptr.raw
    }
}

#[test]
fn test_local_syntax_ptr() {
    use crate::{ast, AstNode, SourceFile};

    let file = SourceFile::parse("struct Foo { f: u32, }").ok().unwrap();
    let field = file.syntax().descendants().find_map(ast::RecordField::cast).unwrap();
    let ptr = SyntaxNodePtr::new(field.syntax());
    let field_syntax = ptr.to_node(file.syntax());
    assert_eq!(field.syntax(), &field_syntax);
}
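
// A hedged companion sketch to the test above, exercising the typed `AstPtr`
// wrapper. It relies only on items defined in this file plus
// `ast::RecordField` and `ast::Struct`. The test name is hypothetical, and
// `ast::Struct` is assumed to exist under that name in this revision.
#[test]
fn test_ast_ptr_round_trip() {
    use crate::{ast, AstNode, SourceFile};

    let file = SourceFile::parse("struct Foo { f: u32, }").ok().unwrap();
    let field = file.syntax().descendants().find_map(ast::RecordField::cast).unwrap();

    // A typed pointer should resolve back to the same node in the same tree.
    let ptr = AstPtr::new(&field);
    let resolved = ptr.to_node(file.syntax());
    assert_eq!(field.syntax(), resolved.syntax());

    // Casting an untyped pointer should succeed only when the kind matches.
    let raw = ptr.syntax_node_ptr();
    assert!(raw.clone().cast::<ast::RecordField>().is_some());
    assert!(raw.cast::<ast::Struct>().is_none());
}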