Freshen Architecture.md document

Aleksey Kladov 2020-01-29 15:08:31 +01:00
parent 1065c2bf1d
commit 84dfbfbd1d
2 changed files with 45 additions and 38 deletions


@@ -106,6 +106,10 @@ communication, and `print!` would break it.
If I need to fix something simultaneously in the server and in the client, I If I need to fix something simultaneously in the server and in the client, I
feel even more sad. I don't have a specific workflow for this case. feel even more sad. I don't have a specific workflow for this case.
Additionally, I use `cargo run --release -p ra_cli -- analysis-stats
path/to/some/rust/crate` to run a batch analysis. This is primarily useful for
performance optimizations, or for bug minimization.
# Logging # Logging
Logging is done by both rust-analyzer and VS Code, so it might be tricky to Logging is done by both rust-analyzer and VS Code, so it might be tricky to


@@ -12,6 +12,9 @@ analyzer:
https://www.youtube.com/playlist?list=PL85XCvVPmGQho7MZkdW-wtPtuJcFpzycE https://www.youtube.com/playlist?list=PL85XCvVPmGQho7MZkdW-wtPtuJcFpzycE
Note that the guide and videos are pretty dated; this document should be
generally fresher.
## The Big Picture ## The Big Picture
![](https://user-images.githubusercontent.com/1711539/50114578-e8a34280-0255-11e9-902c-7cfc70747966.png) ![](https://user-images.githubusercontent.com/1711539/50114578-e8a34280-0255-11e9-902c-7cfc70747966.png)
@@ -20,13 +23,12 @@ On the highest level, rust-analyzer is a thing which accepts input source code
from the client and produces a structured semantic model of the code. from the client and produces a structured semantic model of the code.
More specifically, input data consists of a set of text files (`(PathBuf, More specifically, input data consists of a set of text files (`(PathBuf,
String)` pairs) and information about project structure, captured in the so called String)` pairs) and information about project structure, captured in the so
`CrateGraph`. The crate graph specifies which files are crate roots, which cfg called `CrateGraph`. The crate graph specifies which files are crate roots,
flags are specified for each crate (TODO: actually implement this) and what which cfg flags are specified for each crate and what dependencies exist between
dependencies exist between the crates. The analyzer keeps all this input data in the crates. The analyzer keeps all this input data in memory and never does any
memory and never does any IO. Because the input data is source code, which IO. Because the input data are source code, which typically measures in tens of
typically measures in tens of megabytes at most, keeping all input data in megabytes at most, keeping everything in memory is OK.
memory is OK.
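
The input model described above can be sketched roughly as follows. This is a minimal illustration only: `CrateId`, `CrateData`, `AnalysisInput`, and the method names and signatures are assumptions made for the example, not the actual rust-analyzer API.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Hypothetical crate identifier (an index into the graph).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct CrateId(pub u32);

/// Illustrative per-crate data: root file, cfg flags, and dependencies.
#[derive(Debug, Clone)]
pub struct CrateData {
    pub root_file: PathBuf,
    pub cfg_flags: Vec<String>,
    pub dependencies: Vec<CrateId>,
}

/// Simplified stand-in for the crate graph; the real type may differ.
#[derive(Debug, Default)]
pub struct CrateGraph {
    crates: Vec<CrateData>,
}

impl CrateGraph {
    /// Registers a crate root and returns its id.
    pub fn add_crate_root(&mut self, root_file: PathBuf, cfg_flags: Vec<String>) -> CrateId {
        let id = CrateId(self.crates.len() as u32);
        self.crates.push(CrateData { root_file, cfg_flags, dependencies: Vec::new() });
        id
    }

    /// Records that crate `from` depends on crate `to`.
    pub fn add_dep(&mut self, from: CrateId, to: CrateId) {
        self.crates[from.0 as usize].dependencies.push(to);
    }

    /// Lists the direct dependencies of `krate`.
    pub fn dependencies(&self, krate: CrateId) -> &[CrateId] {
        &self.crates[krate.0 as usize].dependencies
    }
}

/// All analyzer input: file texts held in memory plus the crate graph.
/// Nothing here performs IO; the client supplies every file's contents.
#[derive(Debug, Default)]
pub struct AnalysisInput {
    pub files: HashMap<PathBuf, String>,
    pub crate_graph: CrateGraph,
}
```

In this sketch the client fills `files` and `crate_graph` itself, which mirrors the statement that the analyzer never reads from disk.
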
A "structured semantic model" is basically an object-oriented representation of A "structured semantic model" is basically an object-oriented representation of
modules, functions and types which appear in the source code. This representation modules, functions and types which appear in the source code. This representation
@@ -43,37 +45,39 @@ can be quickly updated for small modifications.
## Code generation ## Code generation
Some of the components of this repository are generated through automatic Some of the components of this repository are generated through automatic
processes. These are outlined below: processes. `cargo xtask codegen` runs all generation tasks. Generated code is
committed to the git repository.
- `cargo xtask codegen`: The kinds of tokens that are reused in several places, so a generator In particular, `cargo xtask codegen` generates:
is used. We use `quote!` macro to generate the files listed below, based on
the grammar described in [grammar.ron]:
- [ast/generated.rs][ast generated]
- [syntax_kind/generated.rs][syntax_kind generated]
[grammar.ron]: ../../crates/ra_syntax/src/grammar.ron 1. [`syntax_kind/generated`](https://github.com/rust-analyzer/rust-analyzer/blob/a0be39296d2925972cacd9fbf8b5fb258fad6947/crates/ra_parser/src/syntax_kind/generated.rs)
[ast generated]: ../../crates/ra_syntax/src/ast/generated.rs -- the set of terminals and non-terminals of the Rust grammar.
[syntax_kind generated]: ../../crates/ra_parser/src/syntax_kind/generated.rs
2. [`ast/generated`](https://github.com/rust-analyzer/rust-analyzer/blob/a0be39296d2925972cacd9fbf8b5fb258fad6947/crates/ra_syntax/src/ast/generated.rs)
-- AST data structure.
3. [`doc_tests/generated`](https://github.com/rust-analyzer/rust-analyzer/blob/a0be39296d2925972cacd9fbf8b5fb258fad6947/crates/ra_assists/src/doc_tests/generated.rs),
[`test_data/parser/inline`](https://github.com/rust-analyzer/rust-analyzer/tree/a0be39296d2925972cacd9fbf8b5fb258fad6947/crates/ra_syntax/test_data/parser/inline)
-- tests for assists and the parser.
The source for 1 and 2 is in [`ast_src.rs`](https://github.com/rust-analyzer/rust-analyzer/blob/a0be39296d2925972cacd9fbf8b5fb258fad6947/xtask/src/ast_src.rs).
## Code Walk-Through ## Code Walk-Through
### `crates/ra_syntax`, `crates/ra_parser` ### `crates/ra_syntax`, `crates/ra_parser`
Rust syntax tree structure and parser. See Rust syntax tree structure and parser. See
[RFC](https://github.com/rust-lang/rfcs/pull/2256) for some design notes. [RFC](https://github.com/rust-lang/rfcs/pull/2256) and [./syntax.md](./syntax.md) for some design notes.
- [rowan](https://github.com/rust-analyzer/rowan) library is used for constructing syntax trees. - [rowan](https://github.com/rust-analyzer/rowan) library is used for constructing syntax trees.
- `grammar` module is the actual parser. It is a hand-written recursive descent parser, which - `grammar` module is the actual parser. It is a hand-written recursive descent parser, which
produces a sequence of events like "start node X", "finish node Y". It works similarly to [kotlin's parser](https://github.com/JetBrains/kotlin/blob/4d951de616b20feca92f3e9cc9679b2de9e65195/compiler/frontend/src/org/jetbrains/kotlin/parsing/KotlinParsing.java), produces a sequence of events like "start node X", "finish node Y". It works similarly to [kotlin's parser](https://github.com/JetBrains/kotlin/blob/4d951de616b20feca92f3e9cc9679b2de9e65195/compiler/frontend/src/org/jetbrains/kotlin/parsing/KotlinParsing.java),
which is a good source of inspiration for dealing with syntax errors and incomplete input. Original [libsyntax parser](https://github.com/rust-lang/rust/blob/6b99adeb11313197f409b4f7c4083c2ceca8a4fe/src/libsyntax/parse/parser.rs) which is a good source of inspiration for dealing with syntax errors and incomplete input. Original [libsyntax parser](https://github.com/rust-lang/rust/blob/6b99adeb11313197f409b4f7c4083c2ceca8a4fe/src/libsyntax/parse/parser.rs)
is what we use for the definition of the Rust language. is what we use for the definition of the Rust language.
- `parser_api/parser_impl` bridges the tree-agnostic parser from `grammar` with `rowan` trees. - `TreeSink` and `TokenSource` traits bridge the tree-agnostic parser from `grammar` with `rowan` trees.
This is the thing that turns a flat list of events into a tree (see `EventProcessor`)
- `ast` provides a type safe API on top of the raw `rowan` tree. - `ast` provides a type safe API on top of the raw `rowan` tree.
- `grammar.ron` RON description of the grammar, which is used to - `ast_src` description of the grammar, which is used to generate `syntax_kinds`
generate `syntax_kinds` and `ast` modules, using `cargo xtask codegen` command. and `ast` modules, using `cargo xtask codegen` command.
- `algo`: generic tree algorithms, including `walk` for O(1) stack
space tree traversal (this is cool).
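
To illustrate how a flat sequence of "start node X" / "finish node Y" events can be turned into a tree, here is a minimal sketch. The `Event` and `Element` types and the `build_tree` function are hypothetical names for this example; the actual parser events and `rowan` tree types are richer.

```rust
/// Hypothetical parser event; real events carry more information.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    Start(&'static str),
    Token(&'static str),
    Finish,
}

/// A tree element: an inner node with children, or a leaf token.
#[derive(Debug, Clone, PartialEq)]
pub enum Element {
    Node { kind: &'static str, children: Vec<Element> },
    Token(&'static str),
}

/// Turns a flat, well-nested event list into a tree using an explicit stack.
/// Returns `None` if the events are unbalanced or do not form a single root.
pub fn build_tree(events: &[Event]) -> Option<Element> {
    // Each stack entry is a node that has been started but not yet finished.
    let mut stack: Vec<(&'static str, Vec<Element>)> = Vec::new();
    let mut root = None;
    for event in events {
        match event {
            Event::Start(kind) => {
                if root.is_some() {
                    return None; // a second root is not allowed
                }
                stack.push((*kind, Vec::new()));
            }
            // A token must appear inside some open node.
            Event::Token(text) => stack.last_mut()?.1.push(Element::Token(*text)),
            Event::Finish => {
                let (kind, children) = stack.pop()?;
                let node = Element::Node { kind, children };
                match stack.last_mut() {
                    Some(parent) => parent.1.push(node),
                    None => root = Some(node),
                }
            }
        }
    }
    if stack.is_empty() { root } else { None }
}
```

Keeping the grammar code limited to emitting events is what makes it tree-agnostic: only the consumer of the events decides which tree representation to build.
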
Tests for ra_syntax are mostly data-driven: `test_data/parser` contains subdirectories with a bunch of `.rs` Tests for ra_syntax are mostly data-driven: `test_data/parser` contains subdirectories with a bunch of `.rs`
(test vectors) and `.txt` files with corresponding syntax trees. During testing, we check (test vectors) and `.txt` files with corresponding syntax trees. During testing, we check
@@ -81,6 +85,10 @@ Tests for ra_syntax are mostly data-driven: `test_data/parser` contains subdirec
tests). Additionally, running `cargo xtask codegen` will walk the grammar module and collect tests). Additionally, running `cargo xtask codegen` will walk the grammar module and collect
all `// test test_name` comments into files inside `test_data/parser/inline` directory. all `// test test_name` comments into files inside `test_data/parser/inline` directory.
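
As a rough sketch of how `// test test_name` comments could be collected, consider the function below. It assumes that the test body consists of the `//` comment lines directly following the `// test NAME` line, and that test blocks are separated by non-comment lines; `InlineTest` and `collect_inline_tests` are hypothetical names, not the actual codegen implementation.

```rust
/// An inline test: a name plus the Rust source text of the test.
#[derive(Debug, Clone, PartialEq)]
pub struct InlineTest {
    pub name: String,
    pub text: String,
}

/// Collects `// test NAME` comment blocks from `source`.
/// Assumption: the body is made of the `//` lines right after the header.
pub fn collect_inline_tests(source: &str) -> Vec<InlineTest> {
    let mut tests = Vec::new();
    let mut lines = source.lines().map(str::trim).peekable();
    while let Some(line) = lines.next() {
        // Skip everything that is not a `// test NAME` header.
        let name = match line.strip_prefix("// test ") {
            Some(name) if !name.trim().is_empty() => name.trim().to_string(),
            _ => continue,
        };
        let mut text = String::new();
        while let Some(&next) = lines.peek() {
            let body = match next.strip_prefix("//") {
                Some(body) => body,
                None => break, // first non-comment line ends the test body
            };
            // Drop at most one space after `//`, keeping further indentation.
            text.push_str(body.strip_prefix(' ').unwrap_or(body));
            text.push('\n');
            lines.next();
        }
        tests.push(InlineTest { name, text });
    }
    tests
}
```

Each collected test could then be written to a file named after the test, to be compared against the expected syntax tree during testing.
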
Note
[`api_walkthrough`](https://github.com/rust-analyzer/rust-analyzer/blob/2fb6af89eb794f775de60b82afe56b6f986c2a40/crates/ra_syntax/src/lib.rs#L190-L348)
in particular: it shows off various methods of working with the syntax tree.
See [#93](https://github.com/rust-analyzer/rust-analyzer/pull/93) for an example PR which See [#93](https://github.com/rust-analyzer/rust-analyzer/pull/93) for an example PR which
fixes a bug in the grammar. fixes a bug in the grammar.
@@ -94,18 +102,22 @@ defines most of the "input" queries: facts supplied by the client of the
analyzer. Reading the docs of the `ra_db::input` module should be useful: analyzer. Reading the docs of the `ra_db::input` module should be useful:
everything else is strictly derived from those inputs. everything else is strictly derived from those inputs.
### `crates/ra_hir` ### `crates/ra_hir*` crates
HIR provides high-level "object oriented" access to Rust code. HIR provides high-level "object oriented" access to Rust code.
The principal difference between HIR and syntax trees is that HIR is bound to a The principal difference between HIR and syntax trees is that HIR is bound to a
particular crate instance. That is, it has cfg flags and features applied (in particular crate instance. That is, it has cfg flags and features applied. So,
theory, in practice this is to be implemented). So, the relation between the relation between syntax and HIR is many-to-one. The `source_binder` module
syntax and HIR is many-to-one. The `source_binder` module is responsible for is responsible for guessing a HIR for a particular source position.
guessing a HIR for a particular source position.
Underneath, HIR works on top of salsa, using a `HirDatabase` trait. Underneath, HIR works on top of salsa, using a `HirDatabase` trait.
`ra_hir_xxx` crates have a strong ECS flavor, in that they work with raw ids and
directly query the database.
The top-level `ra_hir` façade crate wraps ids into a more OO-flavored API.
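
The difference between the two styles can be sketched as follows. All names (`FunctionId`, `FunctionData`, `Database`, `Function`) are hypothetical and chosen for illustration; in particular, the plain struct below only stands in for the salsa database.

```rust
/// Raw id, in the ECS-like style: just an index, with no behavior.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FunctionId(pub u32);

/// Data stored for each function (illustrative fields only).
#[derive(Debug, Clone)]
pub struct FunctionData {
    pub name: String,
    pub param_count: usize,
}

/// Stand-in for the database that owns all the data.
#[derive(Debug, Default)]
pub struct Database {
    functions: Vec<FunctionData>,
}

impl Database {
    /// Stores `data` and hands back a raw id for it.
    pub fn alloc_function(&mut self, data: FunctionData) -> FunctionId {
        let id = FunctionId(self.functions.len() as u32);
        self.functions.push(data);
        id
    }

    /// ECS-flavored access: the caller passes a raw id and queries directly.
    pub fn function_data(&self, id: FunctionId) -> &FunctionData {
        &self.functions[id.0 as usize]
    }
}

/// Façade wrapper: hides the id behind methods for a more OO-flavored API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Function {
    id: FunctionId,
}

impl Function {
    pub fn new(id: FunctionId) -> Function {
        Function { id }
    }

    /// Looks up the function's name through the database.
    pub fn name(self, db: &Database) -> &str {
        &db.function_data(self.id).name
    }

    /// Looks up the number of parameters through the database.
    pub fn param_count(self, db: &Database) -> usize {
        db.function_data(self.id).param_count
    }
}
```

Both styles read the same underlying data; the wrapper only changes how the lookup is spelled at the call site.
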
### `crates/ra_ide` ### `crates/ra_ide`
A stateful library for analyzing many Rust files as they change. `AnalysisHost` A stateful library for analyzing many Rust files as they change. `AnalysisHost`
@@ -135,18 +147,9 @@ different from data on disk. This is more or less the single really
platform-dependent component, so it lives in a separate repository and has an platform-dependent component, so it lives in a separate repository and has an
extensive cross-platform CI testing. extensive cross-platform CI testing.
### `crates/gen_lsp_server`
A language server scaffold, exposing a synchronous crossbeam-channel based API.
This crate handles protocol handshaking and parsing messages, while you
control the message dispatch loop yourself.
Run with `RUST_LOG=sync_lsp_server=debug` to see all the messages.
### `crates/ra_cli` ### `crates/ra_cli`
A CLI interface to rust-analyzer. A CLI interface to rust-analyzer, mainly for testing.
## Testing Infrastructure ## Testing Infrastructure