Further improve `space_between`
`space_between` is used by `print_tts` to decide when spaces should be put between tokens. This PR improves it in two ways:
- avoid unnecessary spaces before semicolons, and
- don't omit some necessary spaces before/after some punctuation symbols.
r? `@petrochenkov`
Cache CI Docker images in ghcr registry
This PR changes the way `rust-lang` caches Docker images used in CI workflows. Before, the intermediate Docker layers were manually exported from `docker history` and backed up in S3. However, this approach doesn't work any more with the Docker version used by GitHub Actions since August 2023. We had to revert to disabling Docker BuildKit to make the old caching work, but this workaround will stop working eventually, after GitHub updates Docker again and the old build backend will be removed.
This PR changes the caching to use [Docker caching](https://docs.docker.com/build/cache/) instead. There are several backends for the cache, for our use-case S3 and Docker registry makes sense. This PR uses the Docker registry backend and uses the ghcr.io registry.
The caching creates a Docker image labeled `rust-ci`, which is currently stored to the `ghcr.io/rust-lang-ci` package registry. This image appears [here](https://ghcr.io/rust-lang-ci/rust-ci). The image is stored in `rust-lang-ci` and not `rust-lang`, because `try` and `auto` builds run in the context of that repository, so the used `GITHUB_TOKEN` has permissions for it (unlike for `rust-lang`).
For pull request CI runs, the provided `GITHUB_TOKEN` reduces its permissions automatically to `packages: read`, which means that we won't be able to write the Docker image. If we're not able to write, we won't have anything to read. So I disabled the caching entirely for PR runs (it makes it slightly faster to build the Docker image if we don't have to deal with exporting and using a separate build driver). Note that before this PR, we also weren't able to read or write the cache on PR runs.
Rustup part of this change is [here](https://github.com/rust-lang/rustup/pull/3648).
Related issue: https://github.com/rust-lang/infra-team/issues/81
r? `@Mark-Simulacrum`
Merge into larger interval set
This reduces the work done while merging rows. In at least one case (#50450), we have thousands of union([range], [20,000 ranges]), which previously inserted each of the 20,000 ranges one by one. Now we only insert one range into the right hand set after copying the set over.
This cuts the runtime of the test case in #50450 from ~26 seconds to ~6 seconds locally, though it doesn't change the memory usage peak (~9.5GB).
llvm: change data layout bug to an error and make it trigger more
Fixes#33446.
Don't skip the inconsistent data layout check for custom LLVMs or non-built-in targets.
With #118708, all targets will have a simple test that would trigger this error if LLVM's data layouts do change - so data layouts would be corrected during the LLVM upgrade. Therefore, with builtin targets, this error won't happen with our LLVM because each target will have been confirmed to work. With non-builtin targets, this error is probably useful to have because you can change the data layout in your target and if it is wrong then that could lead to bugs.
When using a custom LLVM, the same justification makes sense for non-builtin targets as with our LLVM, the user can update their target to match their LLVM and that's probably a good thing to do. However, with a custom LLVM, the user cannot change the builtin target data layouts if they don't match - though given that the compiler's data layout is used for layout computation and a bunch of other things - you could get some bugs because of the mismatch and probably want to know about that. I'm not sure if this is something that people do and is okay, but I doubt it?
`CFG_LLVM_ROOT` was also always set during local development with `download-ci-llvm` so this bug would never trigger locally.
In #33446, two points are raised:
- In the issue itself, changing this from a `bug!` to a proper error is what is suggested, by using `isCompatibleDataLayout` from LLVM, but that function still just does the same thing that we do and check for equality, so I've avoided the additional code necessary to do that FFI call.
- `@Mark-Simulacrum` suggests a different check is necessary to maintain backwards compatibility with old LLVM versions. I don't know how often this comes up, but we can do that with some simple string manipulation + LLVM version checks as happens already for LLVM 17 just above this diff.
Add the unstable option to reduce the binary size of dynamic library…
# Motivation
The average length of symbol names in the rust standard library is about 100 bytes, while the average length of symbol names in the C++ standard library is about 65 bytes. In some embedded environments where dynamic library are widely used, rust dynamic library symbol name space hash become one of the key bottlenecks of application, Especially when the existing C/C++ module is reconstructed into the rust module.
The unstable option `-Z symbol_mangling_version=hashed` is added to solve the bottleneck caused by too long dynamic library symbol names.
## Test data
The following is a set of test data on the ubuntu 18.04 LTS environment. With this plug-in, the space saving rate of dynamic libraries can reach about 20%.
The test object is the standard library of rust (built based on Xargo), tokio crate, and hyper crate.
The contents of the Cargo.toml file in the construction project of the three dynamic libraries are as follows:
```txt
# Cargo.toml
[profile.release]
panic = "abort"
opt-leve="z"
codegen-units=1
strip=true
debug=true
```
The built dynamic library also removes the `.rustc` segments that are not needed at run time and then compares the size. The detailed data is as follows:
1. libstd.so
> | symbol_mangling_version | size | saving rate |
> | --- | --- | --- |
> | legacy | 804896 ||
> | hashed | 608288 | 0.244 |
> | v0 | 858144 ||
> | hashed | 608288 | 0.291 |
2. libhyper.so
> | symbol_mangling_version(libhyper.so) | symbol_mangling_version(libstd.so) | size | saving rate |
> | --- | --- | --- | --- |
> | legacy | legacy | 866312 ||
> | hashed | legacy | 645128 |0.255|
> | legacy | hashed | 854024 ||
> | hashed | hashed | 632840 |0.259|
Don't fire `OPAQUE_HIDDEN_INFERRED_BOUND` on sized return of AFIT
Conceptually, we should probably not fire `OPAQUE_HIDDEN_INFERRED_BOUND` for methods like:
```
trait Foo { async fn bar() -> Self; }
```
Even though we technically cannot prove that `Self: Sized`, which is one of the item bounds of the `Output` type in the `-> impl Future<Output = Sized>` from the async desugaring.
This is somewhat justifiable along the same lines as how we allow regular methods to return `-> Self` even though `Self` isn't sized.
Fixes#113538
(side-note: some days i wonder if we should just remove the `OPAQUE_HIDDEN_INFERRED_BOUND` lint... it does make me sad that we have non-well-formed types in signatures, though.)
privacy: Refactor top-level visiting in `NamePrivacyVisitor`
Full hierarchical visiting (`nested_filter::All`) is not necessary, visiting all item-likes in isolation is enough.
Tracking current item is not necessary, passing any `HirId` with the same parent module to `adjust_ident_and_get_scope` is enough.
Follow up to https://github.com/rust-lang/rust/pull/120284.
Remove special-case handling of `vec.split_off(0)`
#76682 added special handling to `Vec::split_off` for the case where `at == 0`. Instead of copying the vector's contents into a freshly-allocated vector and returning it, the special-case code steals the old vector's allocation, and replaces it with a new (empty) buffer with the same capacity.
That eliminates the need to copy the existing elements, but comes at a surprising cost, as seen in #119913. The returned vector's capacity is no longer determined by the size of its contents (as would be expected for a freshly-allocated vector), and instead uses the full capacity of the old vector.
In cases where the capacity is large but the size is small, that results in a much larger capacity than would be expected from reading the documentation of `split_off`. This is especially bad when `split_off` is called in a loop (to recycle a buffer), and the returned vectors have a wide variety of lengths.
I believe it's better to remove the special-case code, and treat `at == 0` just like any other value:
- The current documentation states that `split_off` returns a “newly allocated vector”, which is not actually true in the current implementation when `at == 0`.
- If the value of `at` could be non-zero at runtime, then the caller has already agreed to the cost of a full memcpy of the taken elements in the general case. Avoiding that copy would be nice if it were close to free, but the different handling of capacity means that it is not.
- If the caller specifically wants to avoid copying in the case where `at == 0`, they can easily implement that behaviour themselves using `mem::replace`.
Fixes#119913.
Add portable-atomic-util bug to "bugs found" list
At least, reading https://notgull.net/cautionary-unsafe-tale/ it seems fair to say Miri found this bug. `@notgull` please let me know if you are okay with having this listed here.
Modify GenericArg and Term structs to use strict provenance rules
This is the first PR to solve issue #119217 . In this PR, I have modified the GenericArg struct to use the `NonNull` struct as the pointer instead of `NonZeroUsize`. The change were tested by running `./x test compiler/rustc_middle`.
Resolves https://github.com/rust-lang/rust/issues/119217
r? `@WaffleLapkin`
Replacement of #114390: Add new intrinsic `is_var_statically_known` and optimize pow for powers of two
This adds a new intrinsic `is_val_statically_known` that lowers to [``@llvm.is.constant.*`](https://llvm.org/docs/LangRef.html#llvm-is-constant-intrinsic).` It also applies the intrinsic in the int_pow methods to recognize and optimize the idiom `2isize.pow(x)`. See #114390 for more discussion.
While I have extended the scope of the power of two optimization from #114390, I haven't added any new uses for the intrinsic. That can be done in later pull requests.
Note: When testing or using the library, be sure to use `--stage 1` or higher. Otherwise, the intrinsic will be a noop and the doctests will be skipped. If you are trying out edits, you may be interested in [`--keep-stage 0`](https://rustc-dev-guide.rust-lang.org/building/suggested.html#faster-builds-with---keep-stage).
Fixes#47234Resolves#114390
`@Centri3`
`unescape_literal` becomes `unescape_unicode`, and `unescape_c_string`
becomes `unescape_mixed`. Because rfc3349 will mean that C string
literals will no longer be the only mixed utf8 literals.
- Rename it as `MixedUnit`, because it will soon be used in more than
just C string literals.
- Change the `Byte` variant to `HighByte` and use it only for
`\x80`..`\xff` cases. This fixes the old inexactness where ASCII chars
could be encoded with either `Byte` or `Char`.
- Add useful comments.
- Remove `is_ascii`, in favour of `u8::is_ascii`.
Return a finite number of AllocIds per ConstAllocation in Miri
Before this, every evaluation of a const slice would produce a new AllocId. So in Miri, this program used to have unbounded memory use:
```rust
fn main() {
loop {
helper();
}
}
fn helper() {
"ouch";
}
```
Every trip around the loop creates a new AllocId which we need to keep track of a base address for. And the provenance GC can never clean up that AllocId -> u64 mapping, because the AllocId is for a const allocation which will never be deallocated.
So this PR moves the logic of producing an AllocId for a ConstAllocation to the Machine trait, and the implementation that Miri provides will only produce 16 AllocIds for each allocation. The cache is also keyed on the Instance that the const is evaluated in, so that equal consts evaluated in two functions will have disjoint base addresses.
r? RalfJung
Use `assert_unchecked` instead of `assume` intrinsic in the standard library
Now that a public wrapper for the `assume` intrinsic exists, we can use it in the standard library.
CC #119131
Pack u128 in the compiler to mitigate new alignment
This is based on #116672, adding a new `#[repr(packed(8))]` wrapper on `u128` to avoid changing any of the compiler's size assertions. This is needed in two places:
* `SwitchTargets`, otherwise its `SmallVec<[u128; 1]>` gets padded up to 32 bytes.
* `LitKind::Int`, so that entire `enum` can stay 24 bytes.
* This change definitely has far-reaching effects though, since it's public.
Add `#[track_caller]` to the "From implies Into" impl
This pr implements what was mentioned in https://github.com/rust-lang/rust/issues/77474#issuecomment-1074480790
This follows from my URLO https://users.rust-lang.org/t/104497
```rust
#![allow(warnings)]
fn main() {
// Gives a good location
let _: Result<(), Loc> = dbg!(Err::<(), _>(()).map_err(|e| e.into()));
// still doesn't work, gives location of `FnOnce::call_once()`
let _: Result<(), Loc> = dbg!(Err::<(), _>(()).map_err(Into::into));
}
#[derive(Debug)]
pub struct Loc {
pub l: &'static std::panic::Location<'static>,
}
impl From<()> for Loc {
#[track_caller]
fn from(_: ()) -> Self {
Loc {
l: std::panic::Location::caller(),
}
}
}
```
Document some alternatives to `Vec::split_off`
One of the discussion points that came up in #119917 is that some people use `Vec::split_off` in cases where they probably shouldn't, because the alternatives (like `mem::take`) are hard to discover.
This PR adds some suggestions to the documentation of `split_off` that should point people towards alternatives that might be more appropriate for their use-case.
I've deliberately tried to keep these changes as simple and uncontroversial as possible, so that they don't depend on how the team decides to handle the concerns raised in #119917. That's why I haven't touched the existing documentation for `split_off`, and haven't added links to `split_off` to the documentation of other methods.
`rustc_mir_dataflow`: Restore removed exports
Added back previously available exports:
* `Forward`/`Backward`: used when implementing `AnalysisDomain`
* `Engine`: used in user's code to solve the dataflow problem
* `SwitchIntEdgeEffects`: used when implementing functions of the `Analysis` trait
* `graphviz`: potentially useful for debugging purposes
Closes#120130
fix: Drop guard was deallocating with the incorrect size
InPlaceDstBufDrop holds onto the allocation before the shrinking happens which means it must deallocate the destination elements but the source allocation.
Thanks `@cuviper` for spotting this.
Make stable_mir::with_tables sound
See the first commit for the actual soundness fix. The rest is just fallout from that and is entirely safe code. Includes most of #120120
The major difference to #120120 is that we don't need an unsafe trait, as we can now rely on the type system (the only unsafe part, and the actual source of the unsoundness was in `with_tables`)
r? `@celinval`
Implement iterator specialization traits on more adapters
This adds
* `TrustedLen` to `Skip` and `StepBy`
* `TrustedRandomAccess` to `Skip`
* `InPlaceIterable` and `SourceIter` to `Copied` and `Cloned`
The first two might improve performance in the compiler itself since `skip` is used in several places. Constellations that would exercise the last point are probably rare since it would require an owning iterator that has references as Items somewhere in its iterator pipeline.
Improvements for `Skip`:
```
# old
test iter::bench_skip_trusted_random_access ... bench: 8,335 ns/iter (+/- 90)
# new
test iter::bench_skip_trusted_random_access ... bench: 2,753 ns/iter (+/- 27)
```
Rollup of 8 pull requests
Successful merges:
- #116090 (Implement strict integer operations that panic on overflow)
- #118811 (Use `bool` instead of `PartiolOrd` as return value of the comparison closure in `{slice,Iteraotr}::is_sorted_by`)
- #119081 (Add Ipv6Addr::is_ipv4_mapped)
- #119461 (Use an interpreter in MIR jump threading)
- #119996 (Move OS String implementation into `sys`)
- #120015 (coverage: Format all coverage tests with `rustfmt`)
- #120027 (pattern_analysis: Remove `Ty: Copy` bound)
- #120084 (fix(rust-analyzer): use new pkgid spec to compare)
r? `@ghost`
`@rustbot` modify labels: rollup