Move predicate, region, and const stuff into their own modules in middle
This PR mostly moves things around, and in a few cases adds some `ty::` to the beginning of names to avoid one-off imports.
I don't mean this to be the most *thorough* move/refactor. I just generally wanted to begin to split up `ty/mod.rs` and `ty/sty.rs` which are huge and hard to distinguish, and have a lot of non-ty stuff in them.
r? lcnr
CI: Use ninja on apple builders
This switches the apple builders to use ninja when building LLVM. My hope is that this should resolve the timeouts we have been experiencing since December. My theory is that something in the image update from [Dec 20](dec20a5272) is causing an issue with our build (or, perhaps more remotely, some [update to LLVM itself](https://github.com/rust-lang/rust/commits/master/src/llvm-project)).
The symptom is that the LLVM build hangs just before the install step: the last thing it prints is `[100%] Built target llvm-reduce`, where normally the next part would be `Install the project...` as it starts installing LLVM. I'm able to reproduce this without too much difficulty. I've been testing ninja, and it seems to be working better (however, my test isn't quite equivalent, since I'm getting sccache misses and I can't update the S3 bucket).
Installing ninja takes about 7 to 10 seconds, so it shouldn't add meaningful overhead. I can't determine whether it will affect the overall build timing, because I'm not able to test with a warm S3 cache.
Bump Fuchsia, build tests, and use 8 core bots
- Build Fuchsia on 8 cores instead of 16
- Skip building cranelift for Fuchsia
- Bump Fuchsia (includes building tests)
This includes a change to the upstream build_fuchsia_from_rust_ci script that builds a minimal set of tests, to improve coverage on this builder. This would have caught https://github.com/rust-lang/rust-clippy/issues/11952 and #119593.
See prior discussion on #119400 about building on 8 cores instead of 16. This PR combines changes from that and #119399, plus cleanup.
r? `@Mark-Simulacrum`
Further improve `space_between`
`space_between` is used by `print_tts` to decide when spaces should be put between tokens. This PR improves it in two ways:
- avoid unnecessary spaces before semicolons, and
- don't omit necessary spaces before/after certain punctuation symbols (see the illustration below).
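As a rough illustration (my own example, not taken from the PR's tests): `stringify!` renders its tokens with the same pretty-printer, so the spacing rules are directly observable:

```rust
fn main() {
    // Under the old rules this printed "let x = 1 ;" (stray space before
    // the semicolon); with the improvements it prints "let x = 1;".
    // The exact output depends on the compiler version.
    println!("{}", stringify!(let x = 1;));
}
```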
r? `@petrochenkov`
Cache CI Docker images in ghcr registry
This PR changes the way `rust-lang` caches Docker images used in CI workflows. Previously, the intermediate Docker layers were manually exported from `docker history` and backed up in S3. However, this approach no longer works with the Docker version used by GitHub Actions since August 2023. We had to revert to disabling Docker BuildKit to make the old caching work, but that workaround will eventually stop working once GitHub updates Docker again and removes the old build backend.
This PR changes the caching to use [Docker caching](https://docs.docker.com/build/cache/) instead. There are several backends for the cache; for our use case, S3 and a Docker registry make sense. This PR uses the registry backend, backed by the ghcr.io registry.
The caching creates a Docker image labeled `rust-ci`, which is currently stored in the `ghcr.io/rust-lang-ci` package registry. The image appears [here](https://ghcr.io/rust-lang-ci/rust-ci). It is stored under `rust-lang-ci` rather than `rust-lang` because `try` and `auto` builds run in the context of that repository, so the `GITHUB_TOKEN` they use has permissions for it (unlike for `rust-lang`).
For pull request CI runs, the provided `GITHUB_TOKEN` automatically has its permissions reduced to `packages: read`, which means we can't write the Docker image, and if we can't write it, there's nothing for us to read either. I have therefore disabled the caching entirely for PR runs (this actually makes building the Docker image slightly faster, since we don't have to deal with exporting and a separate build driver). Note that before this PR, we also weren't able to read or write the cache on PR runs.
Rustup part of this change is [here](https://github.com/rust-lang/rustup/pull/3648).
Related issue: https://github.com/rust-lang/infra-team/issues/81
r? `@Mark-Simulacrum`
Merge into larger interval set
This reduces the work done while merging rows. In at least one case (#50450), we have thousands of calls of the form union([1 range], [20,000 ranges]), which previously inserted each of the 20,000 ranges one by one. Now we copy the larger set over and insert just the single range from the other side.
This cuts the runtime of the test case in #50450 from ~26 seconds to ~6 seconds locally, though it doesn't change the memory usage peak (~9.5GB).
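A minimal sketch of the idea, using plain sorted vectors of disjoint inclusive `(start, end)` ranges rather than rustc's actual `IntervalSet` (all names here are assumptions):

```rust
// Union `src` into `dest`. If `dest` is much smaller than `src`, copy
// `src` over wholesale and re-insert the few old ranges, instead of
// inserting all of `src`'s ranges one by one.
fn union(dest: &mut Vec<(u32, u32)>, src: &[(u32, u32)]) {
    if dest.len() < src.len() {
        let old = std::mem::replace(dest, src.to_vec());
        for &r in &old {
            insert(dest, r);
        }
    } else {
        for &r in src {
            insert(dest, r);
        }
    }
}

// Insert one range, coalescing any overlapping or adjacent ranges.
fn insert(set: &mut Vec<(u32, u32)>, (start, end): (u32, u32)) {
    let lo = set.partition_point(|&(_, e)| e.saturating_add(1) < start);
    let hi = set.partition_point(|&(s, _)| s <= end.saturating_add(1));
    let (mut s, mut e) = (start, end);
    if lo < hi {
        s = s.min(set[lo].0);
        e = e.max(set[hi - 1].1);
    }
    set.splice(lo..hi, [(s, e)]);
}
```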
llvm: change data layout bug to an error and make it trigger more
Fixes #33446.
Don't skip the inconsistent data layout check for custom LLVMs or non-built-in targets.
With #118708, all targets will have a simple test that would trigger this error if LLVM's data layouts do change - so data layouts would be corrected during the LLVM upgrade. Therefore, with builtin targets, this error won't happen with our LLVM because each target will have been confirmed to work. With non-builtin targets, this error is probably useful to have because you can change the data layout in your target and if it is wrong then that could lead to bugs.
When using a custom LLVM, the same justification applies for non-builtin targets as with our LLVM: the user can update their target to match their LLVM, and that's probably a good thing to do. However, with a custom LLVM, the user cannot change the builtin targets' data layouts if they don't match. Given that the compiler's data layout is used for layout computation and a bunch of other things, a mismatch could lead to bugs, and the user probably wants to know about that. I'm not sure if this is something that people do and is okay, but I doubt it.
`CFG_LLVM_ROOT` was also always set during local development with `download-ci-llvm`, so this bug would never trigger locally.
In #33446, two points are raised:
- In the issue itself, the suggestion is to change this from a `bug!` to a proper error by using `isCompatibleDataLayout` from LLVM, but that function still just does the same thing that we do and checks for equality, so I've avoided the additional code necessary to do that FFI call.
- `@Mark-Simulacrum` suggests a different check is necessary to maintain backwards compatibility with old LLVM versions. I don't know how often this comes up, but we can do that with some simple string manipulation + LLVM version checks as happens already for LLVM 17 just above this diff.
Add the unstable option to reduce the binary size of dynamic library…
# Motivation
The average length of symbol names in the Rust standard library is about 100 bytes, while the average length of symbol names in the C++ standard library is about 65 bytes. In some embedded environments where dynamic libraries are widely used, the space taken up by Rust dynamic library symbol names has become one of the key bottlenecks for applications, especially when existing C/C++ modules are rewritten in Rust.
The unstable option `-Z symbol-mangling-version=hashed` is added to address the bottleneck caused by overly long dynamic library symbol names.
## Test data
The following is a set of test data gathered in an Ubuntu 18.04 LTS environment. With this option, the space saving for dynamic libraries can reach about 20%.
The test objects are the Rust standard library (built with Xargo), the tokio crate, and the hyper crate.
The Cargo.toml used to build all three dynamic libraries contains the following:
```toml
# Cargo.toml
[profile.release]
panic = "abort"
opt-level = "z"
codegen-units = 1
strip = true
debug = true
```
Before comparing sizes, the `.rustc` section (not needed at run time) was also removed from the built dynamic libraries. The detailed data is as follows:
1. libstd.so
> | symbol_mangling_version | size | saving rate |
> | --- | --- | --- |
> | legacy | 804896 | (baseline) |
> | hashed | 608288 | 0.244 (vs legacy) |
> | v0 | 858144 | (baseline) |
> | hashed | 608288 | 0.291 (vs v0) |
2. libhyper.so
> | symbol_mangling_version(libhyper.so) | symbol_mangling_version(libstd.so) | size | saving rate |
> | --- | --- | --- | --- |
> | legacy | legacy | 866312 ||
> | hashed | legacy | 645128 |0.255|
> | legacy | hashed | 854024 ||
> | hashed | hashed | 632840 |0.259|
Don't fire `OPAQUE_HIDDEN_INFERRED_BOUND` on sized return of AFIT
Conceptually, we should probably not fire `OPAQUE_HIDDEN_INFERRED_BOUND` for methods like:
```rust
trait Foo { async fn bar() -> Self; }
```
Even though we technically cannot prove that `Self: Sized`, which is one of the item bounds of the `Output` type in the `-> impl Future<Output = Self>` from the async desugaring.
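For reference, here's a sketch of that desugaring (simplified, eliding lifetime-capture details; my rendering, not code from the PR):

```rust
use std::future::Future;

trait Foo2 {
    // `async fn bar() -> Self;` desugars roughly to:
    fn bar() -> impl Future<Output = Self>;
    // `Future::Output` carries an implicit `Sized` bound, so the opaque
    // type's item bounds include `Self: Sized`, which a trait cannot
    // prove in general (hence the lint fires on the desugared form).
}
```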
This is somewhat justifiable along the same lines as how we allow regular methods to return `-> Self` even though `Self` isn't sized.
Fixes #113538
(side-note: some days I wonder if we should just remove the `OPAQUE_HIDDEN_INFERRED_BOUND` lint... it does make me sad that we have non-well-formed types in signatures, though.)
privacy: Refactor top-level visiting in `NamePrivacyVisitor`
Full hierarchical visiting (`nested_filter::All`) is not necessary; visiting all item-likes in isolation is enough.
Tracking the current item is not necessary; passing any `HirId` with the same parent module to `adjust_ident_and_get_scope` is enough.
Follow up to https://github.com/rust-lang/rust/pull/120284.
Remove special-case handling of `vec.split_off(0)`
#76682 added special handling to `Vec::split_off` for the case where `at == 0`. Instead of copying the vector's contents into a freshly-allocated vector and returning it, the special-case code steals the old vector's allocation, and replaces it with a new (empty) buffer with the same capacity.
That eliminates the need to copy the existing elements, but comes at a surprising cost, as seen in #119913. The returned vector's capacity is no longer determined by the size of its contents (as would be expected for a freshly-allocated vector), and instead uses the full capacity of the old vector.
In cases where the capacity is large but the size is small, that results in a much larger capacity than would be expected from reading the documentation of `split_off`. This is especially bad when `split_off` is called in a loop (to recycle a buffer), and the returned vectors have a wide variety of lengths.
I believe it's better to remove the special-case code, and treat `at == 0` just like any other value:
- The current documentation states that `split_off` returns a “newly allocated vector”, which is not actually true in the current implementation when `at == 0`.
- If the value of `at` could be non-zero at runtime, then the caller has already agreed to the cost of a full memcpy of the taken elements in the general case. Avoiding that copy would be nice if it were close to free, but the different handling of capacity means that it is not.
- If the caller specifically wants to avoid copying in the case where `at == 0`, they can easily implement that behaviour themselves using `mem::replace`, as sketched below.
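For instance, a sketch of that caller-side workaround (using `Vec::with_capacity` to mirror the old behaviour of leaving a preallocated buffer behind):

```rust
use std::mem;

// Equivalent of the old `split_off(0)` fast path: steal the full buffer
// and leave an empty vector with the same capacity in its place.
fn take_all<T>(v: &mut Vec<T>) -> Vec<T> {
    let cap = v.capacity();
    mem::replace(v, Vec::with_capacity(cap))
}
```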
Fixes #119913.
Add portable-atomic-util bug to "bugs found" list
At least, reading https://notgull.net/cautionary-unsafe-tale/ it seems fair to say Miri found this bug. `@notgull` please let me know if you are okay with having this listed here.
Modify GenericArg and Term structs to use strict provenance rules
This is the first PR toward solving issue #119217. In this PR, I have modified the `GenericArg` struct to use `NonNull` as the pointer instead of `NonZeroUsize`. The change was tested by running `./x test compiler/rustc_middle`.
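For illustration, here is the general pattern with stand-in types (a sketch, not the actual `GenericArg` internals): a `NonNull`-based tagged pointer keeps the pointer's provenance, which a `NonZeroUsize` representation discards.

```rust
use std::ptr::NonNull;

const TAG_MASK: usize = 0b11; // low bits freed up by alignment

struct Packed(NonNull<()>);

impl Packed {
    fn new<T>(ptr: NonNull<T>, tag: usize) -> Packed {
        // Requires `T`'s alignment to leave the tag bits free.
        debug_assert!(std::mem::align_of::<T>() > TAG_MASK && tag <= TAG_MASK);
        // `map_addr` changes only the address, keeping the provenance.
        Packed(NonNull::new(ptr.as_ptr().map_addr(|a| a | tag).cast()).unwrap())
    }

    fn tag(&self) -> usize {
        self.0.as_ptr().addr() & TAG_MASK
    }

    fn ptr<T>(&self) -> NonNull<T> {
        NonNull::new(self.0.as_ptr().map_addr(|a| a & !TAG_MASK).cast()).unwrap()
    }
}
```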
Resolves https://github.com/rust-lang/rust/issues/119217
r? `@WaffleLapkin`
Replacement of #114390: Add new intrinsic `is_val_statically_known` and optimize pow for powers of two
This adds a new intrinsic `is_val_statically_known` that lowers to [`@llvm.is.constant.*`](https://llvm.org/docs/LangRef.html#llvm-is-constant-intrinsic). It also applies the intrinsic in the `int_pow` methods to recognize and optimize the idiom `2isize.pow(x)`. See #114390 for more discussion.
While I have extended the scope of the power of two optimization from #114390, I haven't added any new uses for the intrinsic. That can be done in later pull requests.
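A sketch of the arithmetic behind the power-of-two case (illustrative only, not the actual library code): if the base is `2^k`, then `base.pow(x) == 2^(k*x)`, i.e. a shift.

```rust
// Assumes `base` is a power of two and the result fits in a usize.
fn pow_of_two(base: usize, exp: u32) -> usize {
    debug_assert!(base.is_power_of_two());
    1usize << (base.trailing_zeros() * exp)
}

fn main() {
    assert_eq!(pow_of_two(2, 10), 2usize.pow(10)); // the idiom being optimized
    assert_eq!(pow_of_two(8, 4), 4096);
}
```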
Note: When testing or using the library, be sure to use `--stage 1` or higher. Otherwise, the intrinsic will be a noop and the doctests will be skipped. If you are trying out edits, you may be interested in [`--keep-stage 0`](https://rustc-dev-guide.rust-lang.org/building/suggested.html#faster-builds-with---keep-stage).
Fixes #47234
Resolves #114390
`@Centri3`
`unescape_literal` becomes `unescape_unicode`, and `unescape_c_string` becomes `unescape_mixed`, because RFC 3349 will mean that C string literals are no longer the only mixed UTF-8 literals.
- Rename it to `MixedUnit`, because it will soon be used in more than just C string literals.
- Change the `Byte` variant to `HighByte` and use it only for `\x80`..`\xff` cases. This fixes the old inexactness where ASCII chars could be encoded with either `Byte` or `Char`. (A sketch of the resulting shape follows this list.)
- Add useful comments.
- Remove `is_ascii`, in favour of `u8::is_ascii`.
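A sketch of the resulting shape (assumed; not the exact `rustc_lexer` definition):

```rust
// One code unit in a mixed-UTF-8 literal (e.g. a C string literal).
enum MixedUnit {
    // A valid Unicode scalar value, from ordinary chars or unicode escapes.
    Char(char),
    // A non-ASCII byte, produced only by `\x80`..`\xff` escapes; ASCII
    // always goes through `Char`, removing the old ambiguity.
    HighByte(u8),
}
```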
Use upstream exhaustiveness checker!
Because it has been librarified!
The extra `Apache-2.0 WITH LLVM-exception` license is for `rustc_apfloat`. This also duplicates `rustc_index`, because the other upstream deps are still on an earlier version; they should be bumpable now, though. The good thing is that we don't need this new crate to be synchronized with the others, which will make our lives easier.
Return a finite number of AllocIds per ConstAllocation in Miri
Before this, every evaluation of a const slice would produce a new AllocId. So in Miri, this program used to have unbounded memory use:
```rust
fn main() {
    loop {
        helper();
    }
}

fn helper() {
    "ouch";
}
```
Every trip around the loop creates a new AllocId which we need to keep track of a base address for. And the provenance GC can never clean up that AllocId -> u64 mapping, because the AllocId is for a const allocation which will never be deallocated.
So this PR moves the logic of producing an AllocId for a ConstAllocation to the Machine trait, and the implementation that Miri provides will only produce 16 AllocIds for each allocation. The cache is also keyed on the Instance that the const is evaluated in, so that equal consts evaluated in two functions will have disjoint base addresses.
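A minimal sketch of that caching scheme, with plain integers standing in for Miri's `Instance`, `ConstAllocation`, and `AllocId` types (the `salt` parameter and all names here are assumptions):

```rust
use std::collections::HashMap;

const MAX_ALLOC_IDS_PER_CONST: usize = 16;

struct AllocIdCache {
    // (instance, const allocation) -> bounded pool of ids
    pool: HashMap<(u64, u64), Vec<u64>>,
    next_id: u64,
}

impl AllocIdCache {
    // `salt` cycles through 0..MAX, so repeated evaluations rotate through
    // a bounded pool instead of minting a fresh id every time.
    fn alloc_id_for_const(&mut self, instance: u64, alloc: u64, salt: usize) -> u64 {
        let ids = self.pool.entry((instance, alloc)).or_default();
        let slot = salt % MAX_ALLOC_IDS_PER_CONST;
        while ids.len() <= slot {
            self.next_id += 1;
            ids.push(self.next_id);
        }
        ids[slot]
    }
}
```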
r? RalfJung