I'm fairly sure this is more correct, and saves space(~90mb to 82mb
for Zed's index). I'm checking in about this with SCIP folks in
https://github.com/sourcegraph/scip/pull/299.
Before this change `SymbolInformation` provided by a document was the
info for all encountered symbols that have not yet been emitted. So,
the symbol information on a `Document` was a mishmash of symbols
defined in the documents, symbols from other documents, and external
symbols.
After this change, the `SymbolInformation` on documents is just the
locals and defined symbols from the document. All symbols referenced
and not from emitted documents are included in `external_symbols`.
In particular, the symbol generation before this change creates a lot
of symbols with the same name for different definitions. This change
makes progress on symbol uniqueness, but does not fix a couple cases
where it was unclear to me how to fix (see TODOs in `scip.rs`)
Behavior changes:
* `scip` command now reports symbol information omitted due to symbol
collisions. Iterating with this on a large codebase (Zed!) resulted in
the other improvements in this change.
* Generally fixes providing the path to nested definitions in
symbols. Instead of having special cases for a couple limited cases of
nesting, implements `Definition::enclosing_definition` and uses this
to walk definitions.
* Parameter variables are now treated like locals.
- This fixes a bug where closure captures also received symbols
scoped to the containing function. To bring back parameter
symbols I would want a way to filter these out, since they can
cause symbol collisions.
- Having symbols for them seems to be intentional in
27e2eea54f, but no particular use is
specified there. For the typical indexing purposes of SCIP I don't see
why parameter symbols are useful or sensible, as function parameters
are not referencable by anything but position. I can imagine they
might be useful in representing diagnostics or something.
* Inherent impls are now represented as `impl#[SelfType]` - a type
named `impl` which takes a single type parameter.
* Trait impls are now represented as `impl#[SelfType][TraitType]` - a
type named `impl` which takes two type parameters.
* Associated types in traits and impls are now treated like types
instead of type parameters, and so are now suffixed with `#` instead
of wrapped with `[]`. Treating them as type parameters seems to have
been intentional in 73d9c77f2a but it
doesn't make sense to me, so changing it.
* Static variables are now treated as terms instead of `Meta`, and so
receive `.` suffix instead of `:`.
* Attributes are now treated as `Meta` instead of `Macro`, and so
receive `:` suffix instead of `!`.
* `enclosing_symbol` is now provided for labels and generic params,
which are local symbols.
* Fixes a bug where presence of `'` causes a descriptor name to get
double wrapped in backticks, since both `fn new_descriptor` and
`scip::symbol::format_symbol` have logic for wrapping in
backticks. Solution is to simply delete the redundant logic.
* Deletes a couple tests in moniker.rs because the cases are
adequeately covered in scip.rs and the format for identifiers used in
moniker.rs is clunky with the new representation for trait impls
Because it was a mess.
Previously, pretty much you had to handle all path diagnostics manually: remember to check for them and handle them. Now, we wrap the resolver in `TyLoweringContext` and ensure proper error reporting.
This means that you don't have to worry about them: most of the things are handled automatically, and things that cannot will create a compile-time error (forcing you top `drop(ty_lowering_context);`) if forgotten, instead of silently dropping the diagnostics.
The real place for error reporting is in the hir-def resolver, because there are other things resolving, both in hir-ty and in hir-def, and they all need to ensure proper diagnostics. But this is a good start, and future compatible.
This commit also ensures proper path diagnostics for value/pattern paths, which is why it's marked "feat".
For Windows, this removes the need to add a breakpoint and modify a value to exit the debugger wait loop.
As a ridealong, this adds a 100ms sleep for all platforms such that waiting for the debugger doesn't hog the CPU thread.
cleanup `TypeVerifier`
We should merge it with the `TypeChecker` as we no longer bail in cases where it encounters an error since #111863.
It's quite inconsistent whether a check lives in the verifier or the `TypeChecker`, so this feels like a quite impactful cleanup. I expect that for this we may want to change the `TypeChecker` to also be a MIR visitor 🤔 this is non-trivial so I didn't fully do it in this PR.
Best reviewed commit by commit.
r? `@compiler-errors` feel free to reassign however
Asserts the maximum value that can be returned from `Vec::len`
Currently, casting `Vec<i32>` to `Vec<u32>` takes O(1) time:
```rust
// See <https://godbolt.org/z/hxq3hnYKG> for assembly output.
pub fn cast(vec: Vec<i32>) -> Vec<u32> {
vec.into_iter().map(|e| e as _).collect()
}
```
But the generated assembly is not the same as the identity function, which prevents us from casting `Vec<Vec<i32>>` to `Vec<Vec<u32>>` within O(1) time:
```rust
// See <https://godbolt.org/z/7n48bxd9f> for assembly output.
pub fn cast(vec: Vec<Vec<i32>>) -> Vec<Vec<u32>> {
vec.into_iter()
.map(|e| e.into_iter().map(|e| e as _).collect())
.collect()
}
```
This change tries to fix the problem. You can see the comparison here: <https://godbolt.org/z/jdManrKvx>.
Use `PtrMetadata` instead of `Len` in slice drop shims
I tried to do a bigger change in #134297 which didn't work, so here's the part I really wanted: Removing another use of `Len`, in favour of `PtrMetadata`.
Split into two commits where the first just adds a test, so you can look at the second commit to see how the drop shim for an array changes with this PR.
Reusing the same reviewer from the last one:
r? BoxyUwU
Optimize `is_ascii` for `str` and `[u8]` further
Replace the existing optimized function with one that enables auto-vectorization.
This is especially beneficial on x86-64 as `pmovmskb` can be emitted with careful structuring of the code. The instruction can detect non-ASCII characters one vector register width at a time instead of the current `usize` at a time check.
The resulting implementation is completely safe.
`case00_libcore` is the current implementation, `case04_while_loop` is this PR.
```
benchmarks:
ascii::is_ascii_slice::long::case00_libcore 22.25/iter +/- 1.09
ascii::is_ascii_slice::long::case04_while_loop 6.78/iter +/- 0.92
ascii::is_ascii_slice::medium::case00_libcore 2.81/iter +/- 0.39
ascii::is_ascii_slice::medium::case04_while_loop 1.56/iter +/- 0.78
ascii::is_ascii_slice::short::case00_libcore 5.55/iter +/- 0.85
ascii::is_ascii_slice::short::case04_while_loop 3.75/iter +/- 0.22
ascii::is_ascii_slice::unaligned_both_long::case00_libcore 26.59/iter +/- 0.66
ascii::is_ascii_slice::unaligned_both_long::case04_while_loop 5.78/iter +/- 0.16
ascii::is_ascii_slice::unaligned_both_medium::case00_libcore 2.97/iter +/- 0.32
ascii::is_ascii_slice::unaligned_both_medium::case04_while_loop 2.41/iter +/- 0.10
ascii::is_ascii_slice::unaligned_head_long::case00_libcore 23.71/iter +/- 0.79
ascii::is_ascii_slice::unaligned_head_long::case04_while_loop 7.83/iter +/- 1.31
ascii::is_ascii_slice::unaligned_head_medium::case00_libcore 3.69/iter +/- 0.54
ascii::is_ascii_slice::unaligned_head_medium::case04_while_loop 7.05/iter +/- 0.32
ascii::is_ascii_slice::unaligned_tail_long::case00_libcore 24.44/iter +/- 1.41
ascii::is_ascii_slice::unaligned_tail_long::case04_while_loop 5.12/iter +/- 0.18
ascii::is_ascii_slice::unaligned_tail_medium::case00_libcore 3.24/iter +/- 0.40
ascii::is_ascii_slice::unaligned_tail_medium::case04_while_loop 2.86/iter +/- 0.14
```
`unaligned_head_medium` is the main regression in the benchmarks. It is a 32 byte string being sliced `bytes[1..]`.
The first commit can be used to run the benchmarks against the current core implementation.
Previous implementation was done in #74066
---
Two potential drawbacks of this implementation are that it increases instruction count and may regress other platforms/architectures. The benches here may also be too artificial to glean much insight from.
https://rust.godbolt.org/z/G9znGfY36
Foundations of location-sensitive polonius
I'd like to land the prototype I'm describing in the [polonius project goal](https://github.com/rust-lang/rust-project-goals/issues/118). It still is incomplete and naive and terrible but it's working "well enough" to consider landing.
I'd also like to make review easier by not opening a huge PR, but have a couple small-ish ones (the +/- line change summary of this PR looks big, but >80% is moving datalog to a single place).
This PR starts laying the foundation for that work:
- it refactors and collects 99% of the old datalog fact gen, which was spread around everywhere, into a single dedicated module. It's still present at 3 small places (one of which we should revert anyways) that are kinda deep within localized components and are not as easily extractable into the rest of fact gen, so it's fine for now.
- starts introducing the localized constraints, the building blocks of the naive way of implementing the location-sensitive analysis in-tree, which is roughly sketched out in https://smallcultfollowing.com/babysteps/blog/2023/09/22/polonius-part-1/ and https://smallcultfollowing.com/babysteps/blog/2023/09/29/polonius-part-2/ but with a different vibe than per-point environments described in these posts, just `r1@p: r2@q` constraints.
- sets up the skeleton of generating these localized constraints: converting NLL typeck constraints, and creating liveness constraints
- introduces the polonius dual to NLL MIR to help development and debugging. It doesn't do much currently but is a way to see these localized constraints: it's an NLL MIR dump + a dumb listing of the constraints, that can be dumped with `-Zdump-mir=polonius -Zpolonius=next`. Its current state is not intended to be a long-term thing, just for testing purposes -- I will replace its contents in the future with a different approach (an HTML+js file where we can more easily explore/filter/trace these constraints and loan reachability, have mermaid graphs of the usual graphviz dumps, etc).
I've started documenting the approach in this PR, I'll add more in the future. It's quite simple, and should be very clear when more constraints are introduced anyways.
r? `@matthewjasper`
Best reviewed per commit so that the datalog move is less bothersome to read, but if you'd prefer we separate that into a different PR, I can do that (and michael has offered to review these more mechanical changes if it'd help).
handle member constraints directly in the mir type checker
cleaner, faster, easier to change going forward :> fixes#109654
r? `@oli-obk` `@compiler-errors`