Remove more `PtrToPtr` casts in GVN
This addresses two things I noticed in MIR:
1. `NonNull::<T>::eq` does `(a as *mut T) == (b as *mut T)`, but it could just compare the `*const T`s, so this removes `PtrToPtr` casts that are on both sides of a pointer comparison, so long as they're not fat-to-thin casts.
2. `NonNull::<T>::addr` does `transmute::<_, usize>(p as *const ())`, but so long as `T: Thin` that cast doesn't do anything, and thus we can directly transmute the `*const T` instead.
r? mir-opt
Various refactorings to rustc_interface
This should make it easier to move the driver interface away from queries in the future. Many custom drivers call queries like `queries.global_ctxt()` before they are supposed to be called, breaking some things like certain `--print` and `-Zunpretty` options, `-Zparse-only` and emitting the dep info at the wrong point in time. They are also not actually necessary at all. Passing around the query output manually would avoid recomputation too and would be just as easy. Removing driver queries would also reduce the amount of global mutable state of the compiler. I'm not removing driver queries in this PR to avoid breaking the aforementioned custom drivers.
transmute size check: properly account for alignment
Fixes another place where ZST alignment was ignored when checking whether something is a newtype. I wonder how many more of these there are...
Fixes https://github.com/rust-lang/rust/issues/101084
Allow constraining opaque types during various unsizing casts
allows unsizing of tuples, arrays and Adts to constraint opaque types in their generic parameters to concrete types on either side of the unsizing cast.
Also allows constraining opaque types during trait object casts that only differ in auto traits or lifetimes.
cc #116652
Save 2 pointers in `TerminatorKind` (96 → 80 bytes)
These things don't need to be `Vec`s; boxed slices are enough.
The frequent one here is call arguments, but MIR building knows the number of arguments from the THIR, so the collect is always getting the allocation right in the first place, and thus this shouldn't ever add the shrink-in-place overhead.
std: refactor the TLS implementation
As discovered by Mara in #110897, our TLS implementation is a total mess. In the past months, I have simplified the actual macros and their expansions, but the majority of the complexity comes from the platform-specific support code needed to create keys and register destructors. In keeping with #117276, I have therefore moved all of the `thread_local_key`/`thread_local_dtor` modules to the `thread_local` module in `sys` and merged them into a new structure, so that future porters of `std` can simply mix-and-match the existing code instead of having to copy the same (bad) implementation everywhere. The new structure should become obvious when looking at `sys/thread_local/mod.rs`.
Unfortunately, the documentation changes associated with the refactoring have made this PR rather large. That said, this contains no functional changes except for two small ones:
* the key-based destructor fallback now, by virtue of sharing the implementation used by macOS and others, stores its list in a `#[thread_local]` static instead of in the key, eliminating one indirection layer and drastically simplifying its code.
* I've switched over ZKVM (tier 3) to use the same implementation as WebAssembly, as the implementation was just a way worse version of that
Please let me know if I can make this easier to review! I know these large PRs aren't optimal, but I couldn't think of any good intermediate steps.
`@rustbot` label +A-thread-locals
Remove confusing `use_polonius` flag and do less cloning
The `use_polonius` flag is both redundant and confusing since every function it's propagated to also checks if `all_facts` is `Some`, the true test of whether to generate Polonius facts for Polonius or for external consumers. This PR makes that path clearer by simply doing away with the argument and handling the logic in precisely two places: where facts are populated (check for `Some`), and where `all_facts` are initialised. It also delays some statements until after that check to avoid the miniscule performance penalty of executing them when Polonius is disabled.
This also addresses `@lqd's` concern in #125652 by reducing the size of what is cloned out of Polonius facts to just the facts being added, as opposed to the entire vector of potential inputs, and added descriptive comments.
*Reviewer note*: the comments in `add_extra_drop_facts` should be inspected by a reviewer, in particular the one on [L#259](https://github.com/rust-lang/rust/compare/master...amandasystems:you-dropped-this-again?expand=1#diff-aa727290e6670264df2face84f012897878e11a70e9c8b156543cfcd9619bac3R259) in this PR, which should be trivial for someone with the right background knowledge to address.
I also included some lints I found on the way there that I couldn't help myself from addressing.
fix: use ItemInNs::Macros to convert ModuleItem to ItemInNs
fix#17425.
When converting `PathResolution` to `ItemInNs`, we should convert `ModuleDef::Macro` to `ItemInNs::Macros` to ensure that it can be found in `DefMap`.
Stop sorting `Span`s' `SyntaxContext`, as that is incompatible with incremental
work towards https://github.com/rust-lang/rust/issues/90317
Luckily no one actually needed these to be sorted, so it didn't even affect diagnostics. I'm guessing they'd have been sorted by creation time anyway, so it wouldn't really have mattered.
r? `@cjgillot`
Replace sort implementations
This PR replaces the sort implementations with tailor-made ones that strike a balance of run-time, compile-time and binary-size, yielding run-time and compile-time improvements. Regressing binary-size for `slice::sort` while improving it for `slice::sort_unstable`. All while upholding the existing soft and hard safety guarantees, and even extending the soft guarantees, detecting strict weak ordering violations with a high chance and reporting it to users via a panic.
* `slice::sort` -> driftsort [design document](https://github.com/Voultapher/sort-research-rs/blob/main/writeup/driftsort_introduction/text.md), includes detailed benchmarks and analysis.
* `slice::sort_unstable` -> ipnsort [design document](https://github.com/Voultapher/sort-research-rs/blob/main/writeup/ipnsort_introduction/text.md), includes detailed benchmarks and analysis.
#### Why should we change the sort implementations?
In the [2023 Rust survey](https://blog.rust-lang.org/2024/02/19/2023-Rust-Annual-Survey-2023-results.html#challenges), one of the questions was: "In your opinion, how should work on the following aspects of Rust be prioritized?". The second place was "Runtime performance" and the third one "Compile Times". This PR aims to improve both.
#### Why is this one big PR and not multiple?
* The current documentation gives performance recommendations for `slice::sort` and `slice::sort_unstable`. If for example only one of them were to be changed, this advice would be misleading for some Rust versions. By replacing them atomically, the advice remains largely unchanged, and users don't have to change their code.
* driftsort and ipnsort share a substantial part of their implementations.
* The implementation of `select_nth_unstable` uses internals of `slice::sort_unstable`, which makes it impractical to split changes.
---
This PR is a collaboration with `@orlp.`
Implement LLVM x86 bmi intrinsics
This implements the intrinsics for both the bmi1 and bmi2 ISA extensions. All of these intrinsics live inside the same namespace as far as LLVM is concerned, which is why it is arguably better to bundle the implementations of these two extensions.
Filter builtin macro expansion
This PR adds a filter on the types of built in macros that are allowed to be expanded.
Currently, This list of allowed macros contains, `stringify, cfg, core_panic, std_panic, concat, concat_bytes, include, include_str, include_bytes, env` and `option_env`.
Fixes#14177
Remove panicbit.cargo extension warning
A warning was introduced regarding the incompatabilities between `rust-analyzer` and `panicbit.cargo`'s diagnostics / `cargo check` functionality.
This functionality has been removed in the latest version of the cargo extension (`0.3.0`), which is why the warning can be removed now.
fix: ensure there are no cycles in the source_root_parent_map
See #17409
We can view the connections between roots as a graph. The problem is that this graph may contain cycles, so when adding edges, it is necessary to check whether it will lead to a cycle.
Since we ensure that each node has at most one outgoing edge (because each SourceRoot can have only one parent), we can use a disjoint-set to maintain the connectivity between nodes. If an edge’s two nodes belong to the same set, they are already connected.
Additionally, this PR includes the following three changes:
1. Removed the workaround from #17409.
2. Added an optimization: If `map.contains_key(&SourceRootId(*root_id as u32))`, we can skip the current loop iteration since we have already found its parent.
3. Modified the inner loop to iterate in reverse order with `roots[..idx].iter().rev()` at line 319. This ensures that if we are looking for the parent of `a/b/c`, and both `a` and `a/b` meet the criteria, we will choose the longer match (`a/b`).
fix(completion): complete async keyword
Fixes#17452
Not entirely confident of the fix here, but my logic is that `async` should in general be offered in similar semantic scenarios as other keywords like `static` or `pub`.