Commit graph

61 commits

Author SHA1 Message Date
Florian Diebold
cff209f152 WIP: Actually fix up syntax errors in attribute macro input 2022-02-07 18:12:51 +01:00
Jonas Schievink
5088926ec3 Make syntax bridge fully infallible 2022-01-24 17:27:39 +01:00
Aleksey Kladov
3836b195dd minor: replace panics with types 2022-01-02 19:05:37 +03:00
Aleksey Kladov
174c439c56 minor: drop dead code 2022-01-02 19:03:38 +03:00
Aleksey Kladov
d846afdeef check top level entry point invariants 2022-01-02 18:41:32 +03:00
Aleksey Kladov
7989d567e2 internal: more macro tests 2022-01-02 17:18:21 +03:00
Lukas Wirth
8fad24d3c2 minor: Simplify 2022-01-02 12:40:46 +01:00
Lukas Wirth
65a1538dd1 internal: Use basic NonEmptyVec in mbe::syntax_bridge 2022-01-02 03:48:19 +01:00
Lukas Wirth
a0e0e4575b Simplify 2022-01-02 02:39:14 +01:00
Aleksey Kladov
afffa096f6 add TopEntryPoint 2021-12-28 17:00:55 +03:00
Aleksey Kladov
8e7fc7be65 simplify 2021-12-28 17:00:55 +03:00
Aleksey Kladov
74de79b1da internal: rename 2021-12-25 22:02:26 +03:00
Aleksey Kladov
d0d05075ed internal: replace TreeSink with a data structure
The general theme of this is to make the parser a better independent
library.

The specific thing we do here is replacing callback based TreeSink with
a data structure. That is, rather than calling user-provided tree
construction methods, the parser now spits out a very bare-bones tree,
effectively a log of a DFS traversal.

This makes the parser usable without any *specific* tree sink, and allows
us to, e.g., move tests into this crate.

Now, it's also true that this is a distinction without a difference, as
the old and the new interface are equivalent in expressiveness. Still,
this new thing seems somewhat simpler. But yeah, I admit I don't have a
super strong motivation here, just a hunch that this is better.
2021-12-25 22:02:26 +03:00
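A minimal sketch of the "log of a DFS traversal" idea from d0d05075ed; the `Event` shape and kind strings below are assumptions for illustration, not the parser crate's actual types:

```
// Hypothetical event log produced by the parser: a flat record of a DFS
// traversal that any consumer can replay into its own tree.
#[derive(Debug)]
enum Event {
    StartNode { kind: &'static str },
    Token { kind: &'static str, text: String },
    FinishNode,
}

fn main() {
    // One possible log for `fn main() {}`.
    let events = vec![
        Event::StartNode { kind: "FN" },
        Event::Token { kind: "fn_kw", text: "fn".into() },
        Event::Token { kind: "ident", text: "main".into() },
        Event::Token { kind: "l_paren", text: "(".into() },
        Event::Token { kind: "r_paren", text: ")".into() },
        Event::StartNode { kind: "BLOCK_EXPR" },
        Event::Token { kind: "l_curly", text: "{".into() },
        Event::Token { kind: "r_curly", text: "}".into() },
        Event::FinishNode,
        Event::FinishNode,
    ];

    // Replaying the log builds a tree without the parser ever calling into
    // user-provided construction methods (the old `TreeSink` approach).
    let mut depth: usize = 0;
    for event in &events {
        match event {
            Event::StartNode { kind } => {
                println!("{}{kind}", "  ".repeat(depth));
                depth += 1;
            }
            Event::Token { kind, text } => {
                println!("{}{kind} {text:?}", "  ".repeat(depth));
            }
            Event::FinishNode => depth -= 1,
        }
    }
}
```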
Aleksey Kladov
a022ad68c9 internal: move all the lexing to the parser crate 2021-12-18 17:20:38 +03:00
Aleksey Kladov
1055a6111a port mbe to soa tokens 2021-12-12 19:06:40 +03:00
Aleksey Kladov
5a83d1be66 internal: replace L_DOLLAR/R_DOLLAR with parenthesis hack
The general problem we are dealing with here is this:

```
macro_rules! thrice {
    ($e:expr) => { $e * 3}
}

fn main() {
    let x = thrice!(1 + 2);
}
```

we really want `x` here to be 9 rather than 7.

The way rustc solves this is rather ad-hoc. In rustc, token trees are
allowed to include whole AST fragments, so `1 + 2` is passed through macro
expansion as a single unit. This is a significant violation of the token
tree model.

In rust-analyzer, we intended to handle this in a more elegant way,
using token trees with "invisible" delimiters. The idea was that we
introduce a new kind of parenthesis, "left $"/"right $", and let the
parser intelligently handle this.

The idea was inspired by the relevant comment in the proc_macro crate:

https://doc.rust-lang.org/stable/proc_macro/enum.Delimiter.html#variant.None

> An implicit delimiter, that may, for example, appear around tokens
> coming from a “macro variable” $var. It is important to preserve
> operator priorities in cases like $var * 3 where $var is 1 + 2.
> Implicit delimiters might not survive roundtrip of a token stream
> through a string.

Now that we are older and wiser, we conclude that the idea doesn't work.

_First_, the comment in the proc-macro crate is wishful thinking. Rustc
currently completely ignores `None` delimiters. It solves the (1 + 2) * 3
problem by having magical token trees which can't be duplicated:

* https://rust-lang.zulipchat.com/#narrow/stream/185405-t-compiler.2Frust-analyzer/topic/TIL.20that.20token.20streams.20are.20magic
* https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Handling.20of.20Delimiter.3A.3ANone.20by.20the.20parser

_Second_, it's not like our implementation in rust-analyzer works. We
special-case expressions (as opposed to treating all kinds of $var
captures the same) and we don't know how parser error recovery should
work with these dollar-parentheses.

So, in this PR we simplify the whole thing away by not pretending that
we are doing something proper and instead just explicitly special-casing
expressions by wrapping them into real `()`.

In the future, to maintain bug-parity with `rustc`, what we will probably
do is add an explicit `CAPTURED_EXPR` *token* which we can explicitly
account for in the parser.

If/when rustc starts handling delimiter=none properly, we'll port that
logic as well, in addition to special handling.
2021-10-23 20:44:31 +03:00
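A tiny illustration of the 9-vs-7 point in 5a83d1be66 (plain arithmetic, not the mbe code): substituting `1 + 2` for `$e` textually lets the `*` bind too tightly, while wrapping the capture in real `()` preserves the intended grouping.

```
fn main() {
    // What a purely textual substitution of `$e` into `$e * 3` would mean:
    let textual = 1 + 2 * 3;
    // What wrapping the captured expression in real parentheses produces:
    let wrapped = (1 + 2) * 3;
    assert_eq!(textual, 7);
    assert_eq!(wrapped, 9);
    println!("textual = {textual}, wrapped = {wrapped}");
}
```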
Aleksey Kladov
a3470a8114 move tests 2021-10-10 11:39:08 +03:00
Aleksey Kladov
f4ee0d736c move tests 2021-10-09 14:39:24 +03:00
Aleksey Kladov
1abe3f8275 internal: move tests 2021-10-09 14:22:49 +03:00
Aleksey Kladov
49f5fecf06 internal: move test 2021-10-09 14:18:53 +03:00
Aleksey Kladov
78ca43ef3d internal: move test 2021-10-09 13:51:02 +03:00
Lukas Wirth
d99adc5738 Make hover work for intra doc links in macro invocations 2021-09-23 17:32:39 +02:00
bors[bot]
f1d7f98ed0 Merge #10293
10293: fix: Don't bail on parse errors in macro input for builtin expansion r=Veykril a=Veykril

Fixes https://github.com/rust-analyzer/rust-analyzer/issues/8158

Co-authored-by: Lukas Wirth <lukastw97@gmail.com>
2021-09-19 22:33:42 +00:00
Lukas Wirth
e7e87fc69d Don't bail on parse errors in macro input for builtin expansion 2021-09-20 00:33:13 +02:00
Lukas Wirth
a6dde501df Only strip derive attributes when preparing macro input 2021-09-19 23:38:38 +02:00
Aleksey Kladov
104cd0ce88 internal: make name consistent with usage 2021-09-06 18:34:03 +03:00
Aleksey Kladov
dbb702cfc1 internal: remove accidental code re-use
FragmentKind played two roles:

* entry point to the parser
* syntactic category of a macro call

These are different use-cases, and warrant different types. For example,
a macro can't expand to a visibility, but we have such a fragment today.

This PR introduces an `ExpandsTo` enum to separate these two use-cases.

I suspect we might further split `FragmentKind` into a `$x:specifier` enum
specific to MBE, and a general parser entry point, but that's for
another PR!
2021-09-05 22:36:36 +03:00
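A rough sketch of the split described in dbb702cfc1; the variant lists are assumptions for illustration, not the crate's actual definitions:

```
// Two types for the two roles that FragmentKind used to play at once.

/// What a macro call is allowed to expand to at its call site.
#[allow(dead_code)]
enum ExpandsTo {
    Expr,
    Statements,
    Items,
    Pattern,
    Type,
}

/// Where the parser can start, independent of macros. Note `Visibility`:
/// a valid entry point, but not something a macro should expand to.
#[allow(dead_code)]
enum ParserEntryPoint {
    SourceFile,
    Expr,
    Statements,
    Items,
    Pattern,
    Type,
    Visibility,
}

fn main() {
    // With separate types, "macro expands to a visibility" can no longer be
    // expressed by accident.
    let _call_site = ExpandsTo::Expr;
    let _entry_point = ParserEntryPoint::Visibility;
}
```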
Lukas Wirth
1195cb50c2 Add simple test for syntax_node_to_token_tree_censored 2021-08-25 19:57:18 +02:00
Lukas Wirth
d6134b6802 Don't mutate syntax trees when preparing proc-macro input 2021-08-25 18:57:24 +02:00
Aleksey Kladov
78c7940f5c internal: remove dead code 2021-08-14 20:29:46 +03:00
Aleksey Kladov
9aa6be71a5 internal: remove useless helpers
We generally avoid "syntax only" helper wrappers, which don't do much:
they make code easier to write, but harder to read. They also make
investigations harder, as "find_usages" needs to be invoked both for the
wrapped and unwrapped APIs
2021-08-09 15:58:21 +03:00
Jonas Schievink
c6669776e1 Rewrite convert_tokens to use an explicit stack 2021-06-23 00:21:11 +02:00
Jonas Schievink
6504c3c32a Move subtree collection out of TokenConvertor 2021-06-23 00:19:54 +02:00
Clemens Wasser
47747cd412 Apply some clippy suggestions 2021-06-21 16:40:21 +02:00
Maan2003
c9b4ac5be4 clippy::redundant_borrow 2021-06-13 09:24:16 +05:30
Clemens Wasser
629e8d1ed0 Apply more clippy suggestions and update generated 2021-06-03 12:46:56 +02:00
Jonas Schievink
27bf62b70e Move TokenMap to its own file 2021-05-24 18:43:42 +02:00
Aleksey Kladov
dc1577d58d Add even more docs 2021-05-22 17:20:22 +03:00
bors[bot]
2ace128dd4 Merge #8560
8560: Escape characters in doc comments in macros correctly r=jonas-schievink a=ChayimFriedman2

Previously they were escaped twice, both by `.escape_default()` and the debug view of strings (`{:?}`). This leads to things like newlines or tabs in documentation comments being `\\n`, but we unescape literals only once, ending up with `\n`.

This was hard to spot because CMark unescaped them (at least for `'` and `"`), but it did not do so in code blocks.

This also was the root cause of #7781. This issue was solved by using `.escape_debug()` instead of `.escape_default()`, but the real issue remained.
We could bring `.escape_default()` back now; however, I didn't do it because it is probably slower than `.escape_debug()` (more work to do), and also to keep the change minimal.

Example (the keyword and primitive docs are `include!()`d at https://doc.rust-lang.org/src/std/lib.rs.html#570-578, and thus originate from a macro):

Before:
![image](https://user-images.githubusercontent.com/24700207/115130096-40544300-9ff5-11eb-847b-969e7034e8a4.png)

After:
![image](https://user-images.githubusercontent.com/24700207/115130143-9cb76280-9ff5-11eb-9281-323746089440.png)


Co-authored-by: Chayim Refael Friedman <chayimfr@gmail.com>
2021-04-18 02:14:27 +00:00
Chayim Refael Friedman
f92be7eaab Escape characters in doc comments in macros correctly
Previously they were escaped twice, both by `.escape_default()` and the debug view of strings (`{:?}`). This leads to things like newlines or tabs in documentation comments being `\\n`, but we unescape literals only once, ending up with `\n`.

This was hard to spot because CMark unescaped them (at least for `'` and `"`), but it did not do so in code blocks.

This also was the root cause of #7781. This issue was solved by using `.escape_debug()` instead of `.escape_default()`, but the real issue remained.
We could bring `.escape_default()` back now; however, I didn't do it because it is probably slower than `.escape_debug()` (more work to do), and also to keep the change minimal.
2021-04-18 03:16:38 +03:00
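A standard-library-only demonstration of the double escaping described above (nothing here is rust-analyzer's actual code): escaping once and then taking the `{:?}` view escapes the backslash a second time.

```
fn main() {
    let doc = "line one\nline two";

    // First escape: the newline becomes the two characters `\` and `n`.
    let once = doc.escape_default().to_string();
    assert_eq!(once, r"line one\nline two");

    // Second escape via the Debug view: the backslash itself gets escaped,
    // so the text now contains `\\n`.
    let twice = format!("{:?}", once);
    assert_eq!(twice, r#""line one\\nline two""#);

    // Unescaping such a string once only turns `\\` back into `\`, leaving a
    // literal `\n` in the rendered documentation instead of a newline.
}
```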
Jonas Schievink
3abcdc03ba Make ast_to_token_tree infallible
It could never return `None`, so reflect that in the return type
2021-04-04 01:46:45 +02:00
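A toy before/after of the change in 3abcdc03ba, with stand-in types rather than the real `ast_to_token_tree` signature: once the conversion can't fail, the `Option` wrapper (and every caller's unwrap) goes away.

```
// Stand-in result type; the real function produces a token tree and a map.
#[allow(dead_code)]
struct Subtree(Vec<String>);

// Before: the signature advertises a failure that never happens.
fn to_subtree_fallible(tokens: &[&str]) -> Option<Subtree> {
    Some(Subtree(tokens.iter().map(|t| t.to_string()).collect()))
}

// After: the return type matches reality, and callers stop unwrapping.
fn to_subtree(tokens: &[&str]) -> Subtree {
    Subtree(tokens.iter().map(|t| t.to_string()).collect())
}

fn main() {
    let tokens = ["fn", "main"];
    let _old = to_subtree_fallible(&tokens).expect("never actually fails");
    let _new = to_subtree(&tokens);
}
```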
Edwin Cheng
20d55ce44d Allow include! an empty content file 2021-04-03 12:50:55 +08:00
Matthias Krüger
202b51bc7b a lot of clippy::style fixes 2021-03-21 16:15:41 +01:00
Kevin Mehall
0a0e22235b Make bare underscore token an Ident rather than Punct in proc-macro 2021-03-20 12:28:44 -06:00
Matthias Krüger
966c23f529 avoid converting types into themselves via .into() (clippy::useless-conversion)
example: `let x: String = String::from("hello world").into();`
2021-03-17 01:27:56 +01:00
Matthias Krüger
cad617bba0 some clippy::performance fixes
* use vec![] instead of Vec::new() + push()
* avoid redundant clones
* use chars instead of &str for single char patterns in ends_with() and starts_with()
* allocate some Vecs with capacity to avoid unnecessary resizing
2021-03-15 10:19:59 +01:00
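Generic examples of the rewrites listed in cad617bba0 (illustrative only, not the actual diffs from that commit):

```
fn main() {
    let words = ["alpha", "beta", "gamma"];

    // Allocate with capacity instead of growing with repeated push().
    let mut upper = Vec::with_capacity(words.len());
    for word in &words {
        upper.push(word.to_uppercase());
    }

    // A char pattern instead of a one-character &str.
    let has_trailing_slash = "crates/mbe/".ends_with('/');

    // Borrow instead of cloning when a reference is enough.
    let first = &upper[0];

    println!("{upper:?} {has_trailing_slash} {first}");
}
```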
Edwin Cheng
bf8bc5c882 Fix non-latin characters doc comment for mbe 2021-02-28 13:49:08 +08:00
Edwin Cheng
f5bf1a9650 Fix builtin macros split exprs on comma 2021-02-28 13:06:17 +08:00
Aleksey Kladov
3429b32ad1 ⬆️ rowan
It now stores text inline with tokens
2021-01-20 14:04:53 +03:00
Aleksey Kladov
46b4f89c92 . 2021-01-20 01:56:11 +03:00