next solver: provisional cache
this adds the cache removed in #115843. However, it should now correctly track whether a provisional result depends on an inductive or coinductive stack.
While working on this, I was using the following doc: https://hackmd.io/VsQPjW3wSTGUSlmgwrDKOA. I don't think it's too helpful to understanding this, but am somewhat hopeful that the inline comments are more useful.
There are quite a few future perf improvements here. Given that this is already very involved I don't believe it is worth it (for now). While working on this PR one of my few attempts to significantly improve perf ended up being unsound again because I was not careful enough ✨
r? `@compiler-errors`
libtest: Fix padding of benchmarks run as tests
### Summary
The first commit adds regression tests for libtest padding.
The second commit fixes padding for benches run as tests and updates the blessed output of the regression tests to make it clear what effect the fix has on padding.
Closes#104092 which is **E-help-wanted** and **regression-from-stable-to-stable**
### More details
Before this fix we applied padding _before_ manually doing what `convert_benchmarks_to_tests()` does which affects padding calculations. Instead use `convert_benchmarks_to_tests()` first if applicable and then apply padding afterwards so it becomes correct.
Benches should only be padded when run as benches to make it easy to compare the benchmark numbers. Not when run as tests.
r? `@ghost` until CI passes.
bump bootstrap dependencies
This PR removes hard-coded patch versions, updates bootstrap's dependency stack to recent versions (some of the versions were released 3-4 years ago), and removes a few dependencies from bootstrap.
Removed dependencies:
![image](https://github.com/rust-lang/rust/assets/39852038/95e86325-aea0-4055-bee5-245c144f662e)
Reorder early post-inlining passes.
`RemoveZsts`, `RemoveUnneededDrops` and `UninhabitedEnumBranching` only depend on types, so they should be executed together early after MIR inlining introduces those types.
This does not change the end-result, but this makes the pipeline a bit more consistent.
Exhaustiveness: use an `Option` instead of allocating fictitious patterns
In the process of exhaustiveness checking, `Matrix` stores a 2D array of patterns. Those are subpatterns of the patterns we were provided as input, _except_ sometimes we allocate some extra wildcard patterns to fill a hole during specialization.
Morally though, we could store `Option<&'p DeconstructedPat>` in the matrix, where `None` signifies a wildcard. That way we'd only have "real" patterns in the matrix and we wouldn't need the arena to allocate these wildcards. This is what this PR does.
This is part of me splitting up https://github.com/rust-lang/rust/pull/119581 for ease of review.
r? `@compiler-errors`
A more efficient slice comparison implementation for T: !BytewiseEq
(This is a follow up PR on #113654)
This PR changes the implementation for `[T]` slice comparison when `T: !BytewiseEq`. The previous implementation using zip was not optimized properly by the compiler, which didn't leverage the fact that both length were equal. Performance improvements are for example 20% when testing that `[Some(0_u64); 4096].as_slice() == [Some(0_u64); 4096].as_slice()`.
unify query canonicalization mode
Exclude from canonicalization only the static lifetimes that appear in the param env because of #118965 . Any other occurrence can be canonicalized safely AFAICT.
r? `@lcnr`
Support async recursive calls (as long as they have indirection)
Before #101692, we stored coroutine witness types directly inside of the coroutine. That means that a coroutine could not contain itself (as a witness field) without creating a cycle in the type representation of the coroutine, which we detected with the `OpaqueTypeExpander`, which is used to detect cycles when expanding opaque types after that are inferred to contain themselves.
After `-Zdrop-tracking-mir` was stabilized, we no longer store these generator witness fields directly, but instead behind a def-id based query. That means there is no technical obstacle in the compiler preventing coroutines from containing themselves per se, other than the fact that for a coroutine to have a non-infinite layout, it must contain itself wrapped in a layer of allocation indirection (like a `Box`).
This means that it should be valid for this code to work:
```
async fn async_fibonacci(i: u32) -> u32 {
if i == 0 || i == 1 {
i
} else {
Box::pin(async_fibonacci(i - 1)).await
+ Box::pin(async_fibonacci(i - 2)).await
}
}
```
Whereas previously, you'd need to coerce the future to `Pin<Box<dyn Future<Output = ...>>` before `await`ing it, to prevent the async's desugared coroutine from containing itself across as await point.
This PR does two things:
1. Only report an error if an opaque expansion cycle is detected *not* through coroutine witness fields.
* Instead, if we find an opaque cycle through coroutine witness fields, we compute the layout of the coroutine. If that results in a cycle error, we report it as a recursive async fn.
4. Reworks the way we report layout errors having to do with coroutines, to make up for the diagnostic regressions introduced by (1.). We actually do even better now, pointing out the call sites of the recursion!
Adding alignment to the cases to test for specific error messages.
Adding alignment to the list of cases to test for specific error message. Covers `>`, `^` and `<`.
Pinging people who chimed in last time ( https://github.com/rust-lang/rust/pull/106805 ): ``@estebank`` , ``@compiler-errors`` and ``@Nilstrieb``
mir-opt and custom target fixes
From https://github.com/rust-lang/rust/issues/115642#issuecomment-1879589022
> > Could you please test the last two commits from https://github.com/onur-ozkan/rust/commits/panic-abort-mir-opt when you have the time? The first commit should resolve the error of using the nightly flag with a stable compiler, and the second one should resolve the custom target issue.
> I tested with the two commits and the errors of using nightly flag and custom target specs were not seen.
Testing was completed for the test suites like ui, run-pass-valgrind, coverage, mir-opt, codegen, assembly, incremental.
Fixes#115642
Use `assert_unsafe_precondition` for `char::from_u32_unchecked`
Use `assert_unsafe_precondition` in `char::from_u32_unchecked` so that it can be stabilized as `const`.
Add -Zuse-sync-unwind
Currently Rust uses async unwind by default, but async unwind will bring non-negligible size overhead. it would be nice to allow users to choose this.
In addition, async unwind currently prevents LLVM from generate compact unwind for MachO, if one wishes to generate compact unwind for MachO, then also needs this flag.
Rewrite Iterator::position default impl
Storing the accumulating value outside the fold in an attempt to improve code generation has shown speedups on various handwritten benchmarks, see discussion at #119551.
Avoid specialization in the metadata serialization code
With the exception of a perf-only specialization for byte slices and byte vectors.
This uses the same trick of introducing a new trait and having the Encodable and Decodable derives add a bound to it as used for TyEncoder/TyDecoder. The new code is clearer about which encoder/decoder uses which impl and it reduces the dependency of rustc on specialization, making it easier to remove support for specialization entirely or turn it into a construct that is only allowed for perf optimizations if we decide to do this.
Exhaustiveness: Statically enforce revealing of opaques
In https://github.com/rust-lang/rust/pull/116821 it was decided that exhaustiveness should operate on the hidden type of an opaque type when relevant. This PR makes sure we consistently reveal opaques within exhaustiveness. This makes it possible to remove `reveal_opaque_ty` from the `TypeCx` trait which was an unfortunate implementation detail.
r? `@compiler-errors`
Replace a number of FxHashMaps/Sets with stable-iteration-order alternatives
This PR replaces almost all of the remaining `FxHashMap`s in query results with either `FxIndexMap` or `UnordMap`. The only case that is missing is the `EffectiveVisibilities` struct which turned out to not be straightforward to transform. Once that is done too, we can remove the `HashStable` implementation from `HashMap`.
The first commit adds the `StableCompare` trait which is a companion trait to `StableOrd`. Some types like `Symbol` can be compared in a cross-session stable way, but their `Ord` implementation is not stable. In such cases, a `StableCompare` implementation can be provided to offer a lightweight way for stable sorting. The more heavyweight option is to sort via `ToStableHashKey`, but then sorting needs to have access to a stable hashing context and `ToStableHashKey` can also be expensive as in the case of `Symbol` where it has to allocate a `String`.
The rest of the commits are rather mechanical and don't overlap, so they are best reviewed individually.
Part of [MCP 533](https://github.com/rust-lang/compiler-team/issues/533).
Separate immediate and in-memory ScalarPair representation
Currently, we assume that ScalarPair is always represented using a two-element struct, both as an immediate value and when stored in memory.
This currently works fairly well, but runs into problems with https://github.com/rust-lang/rust/pull/116672, where a ScalarPair involving an i128 type can no longer be represented as a two-element struct in memory. For example, the tuple `(i32, i128)` needs to be represented in-memory as `{ i32, [3 x i32], i128 }` to satisfy alignment requirements. Using `{ i32, i128 }` instead will result in the second element being stored at the wrong offset (prior to LLVM 18).
Resolve this issue by no longer requiring that the immediate and in-memory type for ScalarPair are the same. The in-memory type will now look the same as for normal struct types (and will include padding filler and similar), while the immediate type stays a simple two-element struct type. This also means that booleans in immediate ScalarPair are now represented as i1 rather than i8, just like we do everywhere else.
The core change here is to llvm_type (which now treats ScalarPair as a normal struct) and immediate_llvm_type (which returns the two-element struct that llvm_type used to produce). The rest is fixing things up to no longer assume these are the same. In particular, this switches places that try to get pointers to the ScalarPair elements to use byte-geps instead of struct-geps.
internal: Only compare relevant parts in `ide::{runnables,inlay_hints}` tests
This PR limits the data being compared. Therefore the tests should be more readable, as well as being more robust to changes to the data structure.
Part of https://github.com/rust-lang/rust-analyzer/issues/14268.
internal: clean and enhance readability for `generate_delegate_trait`
Continue from #16112
This PR primarily involves some cleanup and simple refactoring work, including:
- Adding numerous comments to layer the code and explain the behavior of each step.
- Renaming some variables to make them more sensible.
- Simplify certain operations using a more elegant approach.
The goal is to make this intricate implementation clearer and facilitate future maintenance.
In addition to this, the PR also removes redundant `path_transform` operations for `type_gen_args`.
Taking the example of `impl Trait<T1> for S<S1>`, where `S1` is considered. The struct `S` must be in the file where the user triggers code actions, so there's no need for the `path_transform`. Furthermore, before performing the transform, we've already renamed `S1`, ensuring it won't clash with existing generics parameters. Therefore, there's no need to transform it.
internal: Speed up import searching some more
Pushes the sorting to the caller, meaning additional filtering can be done pre-sorting. Similarly a collect call was pushed to the caller for allowing some other filters to run pre-collecting.