Support linking to rust dylib with --crate-type staticlib
This allows for example dynamically linking libstd, while statically linking the user crate into an executable or C dynamic library. For this two unstable flags (`-Z staticlib-allow-rdylib-deps` and `-Z staticlib-prefer-dynamic`) are introduced. Without the former you get an error. The latter is the equivalent to `-C prefer-dynamic` for the staticlib crate type to indicate that dynamically linking is preferred when both options are available, like for libstd. Care must be taken to ensure that no crate ends up being merged into two distinct staticlibs that are linked together. Doing so will cause a linker error at best and undefined behavior at worst. In addition two distinct staticlibs compiled by different rustc may not be combined under any circumstances due to some rustc private symbols not being mangled.
To successfully link a staticlib, `--print native-static-libs` can be used while compiling to ask rustc for the linker flags necessary when linking the staticlib. This is an existing flag which previously only listed native libraries. It has been extended to list rust dylibs too. Trying to locate libstd yourself to link against it is not supported and may break if for example the libstd of multiple rustc versions are put in the same directory.
For an example on how to use this see the `src/test/run-make-fulldeps/staticlib-dylib-linkage/` test.
Start using `windows sys` for Windows FFI bindings in std
Switch to using windows-sys for FFI. In order to avoid some currently contentious issues, this uses windows-bindgen to generate a smaller set of bindings instead of using the full crate.
Unlike the windows-sys crate, the generated bindings uses `*mut c_void` for handle types instead of `isize`. This to sidestep opsem concerns about mixing pointer types and integers between languages. Note that `SOCKET` remains defined as an integer but instead of being a usize, it's changed to fit the [standard library definition](a41fc00eaf/library/std/src/os/windows/raw.rs (L12-L16)):
```rust
#[cfg(target_pointer_width = "32")]
pub type SOCKET = u32;
#[cfg(target_pointer_width = "64")]
pub type SOCKET = u64;
```
The generated bindings also customizes the `#[link]` imports. I hope to switch to using raw-dylib but I don't want to tie that too closely with the switch to windows-sys.
---
Changes outside of the bindings are, for the most part, fairly minimal (e.g. some differences in `*mut` vs. `*const` or a few types differ). One issue is that our own bindings sometimes mix in higher level types, like `BorrowedHandle`. This is pretty adhoc though.
Add `#[inline]` to functions that are never called
This makes libcore binary size reduce by ~300 bytes. Not much, but these functions are never called so it doesn't make sense for them to get into the binary anyway.
Always const-evaluate the GCD in `slice::align_to_offsets`
Use an inline `const`-block to force the compiler to calculate the GCD at compile time, even in debug mode. This shouldn't affect the behavior of the program at all, but it drastically cuts down on the number of instructions emitted with optimizations disabled.
With the current implementation, a single `slice::align_to` instantiation (specifically `<[u8]>::align_to::<u128>()`) generates 676 instructions (on x86-64). Forcing the GCD computation to be const cuts it down to 327 instructions, so just over 50% less. This is obviously not representative of actual runtime gains, but I still see it as a significant win as long as it doesn't degrade compile times.
Not having to worry about LLVM const-evaluating the GCD function also allows it to use the textbook recursive euclidean algorithm instead of a much more complicated iterative implementation with multiple `unsafe`-blocks.
Disable nrvo mir opt
See #111005 and #110902 . The ICE can definitely be hit on stable, the miscompilation I'm not sure about. The pass makes some pretty sketchy assumptions though, and we should not have it on while that's the case.
I'm not going to work on actually fixing this, it's probably not excessively difficult though.
r? rust-lang/mir-opt
ConstProp into PlaceElem::Index.
Noticed this while looking at keccak output MIR.
This pass aims to replace `ProjectionElem::Index` with `ProjectionElem::ConstantIndex` during ConstProp.
r? `@ghost`
Mark s390x condition code register as clobbered in inline assembly
Various s390x instructions (arithmetic operations, logical operations, comparisons, etc. see also "Condition Codes" section in [z/Architecture Reference Summary](https://www.ibm.com/support/pages/zarchitecture-reference-summary)) modify condition code register `cc`, but AFAIK there is currently no way to mark it as clobbered in `asm!`.
`cc` register definition in LLVM:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/SystemZ/SystemZRegisterInfo.td#L320
This PR also updates asm_experimental_arch docs in the unstable-book to mention s390x registers.
cc `@uweigand`
r? `@Amanieu`
Remove `identity_future` from stdlib
This function/lang_item was introduced in #104321 as a temporary workaround of future lowering. The usage and need for it went away in #104833.
After a bootstrap update, the function itself can be removed from `std`.
Update max_atomic_width of armv7r and armv7_sony_vita targets to 64.
All armv7a and armv7r implementations support `ldrexd`/`strexd`, only armv7m does not.
enable `rust_2018_idioms` lint group for doctests
With this change, `rust_2018_idioms` lint group will be enabled for compiler/libstd doctests.
Resolves#106086Resolves#99144
Signed-off-by: ozkanonur <work@onurozkan.dev>
This function/lang_item was introduced in #104321 as a temporary workaround of future lowering.
The usage and need for it went away in #104833.
After a bootstrap update, the function itself can be removed from `std`.
Update the version of musl used on `*-linux-musl` targets to 1.2.3
Update the version of musl used on our Linux musl targets from 1.1.24 to 1.2.3 as proposed in rust-lang/compiler-team#572. musl 1.2.3 is the latest version of musl and supports the same range of Linux kernels as the 1.1 series. As such, it does not affect the minimum supported version of Linux for any of the musl targets.
One of the major musl 1.2 features is support for [time64](https://musl.libc.org/time64.html). This support is both source and ABI compatible with programs built against musl 1.1 and so updating the musl version for these targets should not cause Rust programs to fail to run or compile (a [crater run](https://github.com/rust-lang/rust/pull/107129#issuecomment-1407196104) has been completed which demonstrates this for the `i686-unknown-linux-musl` target).
Once this change reaches stable, the `libc` crate will then be able to [update their definitions to support 64-bit time](https://github.com/rust-lang/libc/pull/3068), matching the default musl 1.2 APIs exactly.
Fixes#91178
ci: upgrade and refactor crosstool-ng builders
The first commit upgrades our builders from crosstool-ng 1.24.0 to 1.25.0. There are otherwise no changes intended to the toolchains we're using, but there are some minor version upgrades as a result, especially GCC 8.3.0 to 8.5.0. The newer crosstool-ng will position us well to make toolchain upgrades in the future though, as we were maxed out before and it now goes up to GCC 11.
The second commit refactors our config management to only commit a "mini-defconfig" for each target, produced by `ct-ng savedefconfig`. This makes it much clearer which settings we're actually changing, and also makes it easier to ensure consistency for things like mirror management.
Optimize builder sizes
The infra-team is continuously monitoring the efficiency of the CI system in an effort to improve overall build times and resource usage. Some builders have used much less than their allocated resources, so we are testing smaller builder sizes for them.
r? `@pietroalbini`
Don't validate constants in const propagation
Validation is neither necessary nor desirable.
The constant validation is already omitted at mir-opt-level >= 3, so there there are not changes in MIR test output (the propagation of invalid constants is covered by an existing test in tests/mir-opt/const_prop/invalid_constant.rs).
Parallelize initial Rust download in bootstrap
Parallelize the initial download of Rust in `bootstrap.py`
`time ./x.py --help` after `rm -r build`
Before: 33s
After: 27s
`inline(always)` for `lt`/`le`/`ge`/`gt` on integers and floats
I happened to notice one of these not getting inlined as part of `Range::next` in <https://rust.godbolt.org/z/4WKWWxj1G>
```rust
bb1: {
StorageLive(_5);
_6 = &mut _4;
StorageLive(_21);
StorageLive(_14);
StorageLive(_15);
_15 = &((*_6).0: usize);
StorageLive(_16);
_16 = &((*_6).1: usize);
_14 = <usize as PartialOrd>::lt(move _15, move _16) -> bb7;
}
```
So since a call for something that's just one instruction is never the right choice, `#[inline(always)]` seems appropriate, like we have it on things like the rotate methods on integers.
Remove `QueryEngine` trait
This removes the `QueryEngine` trait and `Queries` from `rustc_query_impl` and replaced them with function pointers and fields in `QuerySystem`. As a side effect `OnDiskCache` is moved back into `rustc_middle` and the `OnDiskCache` trait is also removed.
This has a couple of benefits.
- `TyCtxt` is used in the query system instead of the removed `QueryCtxt` which is larger.
- Function pointers are more flexible to work with. A variant of https://github.com/rust-lang/rust/pull/107802 is included which avoids the double indirection. For https://github.com/rust-lang/rust/pull/108938 we can name entry point `__rust_end_short_backtrace` to avoid some overhead. For https://github.com/rust-lang/rust/pull/108062 it avoids the duplicate `QueryEngine` structs.
- `QueryContext` now implements `DepContext` which avoids many `dep_context()` calls in `rustc_query_system`.
- The `rustc_driver` size is reduced by 0.33%, hopefully that means some bootstrap improvements.
- This avoids the unsafe code around the `QueryEngine` trait.
r? `@cjgillot`
Improve niche placement by trying two strategies and picking the better result
Fixes#104807Fixes#105371
Determining which sort order is better requires calculating the struct size (so we can calculate the niche offset). But that in turn depends on the field order, so happens after sorting. So the simple way to solve that is to run the whole thing twice and pick the better result.
1st commit is just code motion, the meat is in the later ones.
Use MIR's `Offset` for pointer `add` too
~~Status: draft while waiting for #110822 to land, since this is built atop that.~~
~~r? `@ghost~~`
Canonical Rust code has mostly moved to `add`/`sub` on pointers, which take `usize`, instead of `offset` which takes `isize`. (And, relatedly, when `sub_ptr` was added it turned out it replaced every single in-tree use of `offset_from`, because `usize` is just so much more useful than `isize` in Rust.)
Unfortunately, `intrinsics::offset` could only accept `*const` and `isize`, so there's a *huge* amount of type conversions back and forth being done. They're identity conversions in the backend, but still end up producing quite a lot of unhelpful MIR.
This PR changes `intrinsics::offset` to accept `*const` *and* `*mut` along with `isize` *and* `usize`. Conveniently, the backends and CTFE already handle this, since MIR's `BinOp::Offset` [already supports all four combinations](adaac6b166/compiler/rustc_const_eval/src/transform/validate.rs (L523-L528)).
To demonstrate the difference, I added some `mir-opt/pre-codegen/` tests around slice indexing. Here's the difference to `[T]::get_mut`, since it uses `<*mut _>::add` internally:
```diff
`@@` -79,30 +70,21 `@@` fn slice_get_mut_usize(_1: &mut [u32], _2: usize) -> Option<&mut u32> {
StorageLive(_12); // scope 3 at $SRC_DIR/core/src/slice/index.rs:LL:COL
StorageLive(_9); // scope 6 at $SRC_DIR/core/src/slice/index.rs:LL:COL
_9 = _8 as *mut u32 (PtrToPtr); // scope 11 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- StorageLive(_13); // scope 13 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- _13 = _2 as isize (IntToInt); // scope 13 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- StorageLive(_14); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- StorageLive(_15); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- _15 = _9 as *const u32 (Pointer(MutToConstPointer)); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- _14 = Offset(move _15, _13); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- StorageDead(_15); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- _7 = move _14 as *mut u32 (PtrToPtr); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- StorageDead(_14); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- StorageDead(_13); // scope 13 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
+ _7 = Offset(_9, _2); // scope 13 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
StorageDead(_9); // scope 6 at $SRC_DIR/core/src/slice/index.rs:LL:COL
StorageDead(_12); // scope 3 at $SRC_DIR/core/src/slice/index.rs:LL:COL
StorageDead(_11); // scope 3 at $SRC_DIR/core/src/slice/index.rs:LL:COL
```
1c1c8e442a (diff-a841b6a4538657add3f39bc895744331453d0625e7aace128b1f604f0b63c8fdR80)
`remove_dir_all`: try deleting the directory even if `FILE_LIST_DIRECTORY` access is denied
If opening a directory with `FILE_LIST_DIRECTORY` access fails then we should try opening without requesting that access. We may still be able to delete it if it's empty or a link.
Fixes https://github.com/rust-lang/cargo/issues/12042
Add some missing built-in lints
(and also sort them, so this is best reviewed one commit at a time)
Fixes#110911
I wonder if there's a good way to detect when a lint is built-in (i.e. not associated to a lint pass). If so, it needs to be added to this list, or else we're unable to `allow` or `deny` it. Leaving that for future work, I guess...
Migrate trivially translatable `rustc_parse` diagnostics
cc #100717
Migrate diagnostics in `rustc_parse` which are emitted in a single statement. I worked on this by expanding the lint introduced in #108760, although that isn't included here as there is much more work to be done to satisfy it
More core::fmt::rt cleanup.
- Removes the `V1` suffix from the `Argument` and `Flag` types.
- Moves more of the format_args lang items into the `core::fmt::rt` module. (The only remaining lang item in `core::fmt` is `Arguments` itself, which is a public type.)
Part of https://github.com/rust-lang/rust/issues/99012
Follow-up to https://github.com/rust-lang/rust/pull/110616
Move the WorkerLocal type from the rustc-rayon fork into rustc_data_structures
This PR moves the definition of the `WorkerLocal` type from `rustc-rayon` into `rustc_data_structures`. This is enabled by the introduction of the `Registry` type which allows you to group up threads to be used by `WorkerLocal` which is basically just an array with an per thread index. The `Registry` type mirrors the one in Rayon and each Rayon worker thread is also registered with the new `Registry`. Safety for `WorkerLocal` is ensured by having it keep a reference to the registry and checking on each access that we're still on the group of threads associated with the registry used to construct it.
Accessing a `WorkerLocal` is micro-optimized due to it being hot since it's used for most arena allocations.
Performance is slightly improved for the parallel compiler:
<table><tr><td rowspan="2">Benchmark</td><td colspan="1"><b>Before</b></th><td colspan="2"><b>After</b></th></tr><tr><td align="right">Time</td><td align="right">Time</td><td align="right">%</th></tr><tr><td>🟣 <b>clap</b>:check</td><td align="right">1.9992s</td><td align="right">1.9949s</td><td align="right"> -0.21%</td></tr><tr><td>🟣 <b>hyper</b>:check</td><td align="right">0.2977s</td><td align="right">0.2970s</td><td align="right"> -0.22%</td></tr><tr><td>🟣 <b>regex</b>:check</td><td align="right">1.1335s</td><td align="right">1.1315s</td><td align="right"> -0.18%</td></tr><tr><td>🟣 <b>syn</b>:check</td><td align="right">1.8235s</td><td align="right">1.8171s</td><td align="right"> -0.35%</td></tr><tr><td>🟣 <b>syntex_syntax</b>:check</td><td align="right">6.9047s</td><td align="right">6.8930s</td><td align="right"> -0.17%</td></tr><tr><td>Total</td><td align="right">12.1586s</td><td align="right">12.1336s</td><td align="right"> -0.21%</td></tr><tr><td>Summary</td><td align="right">1.0000s</td><td align="right">0.9977s</td><td align="right"> -0.23%</td></tr></table>
cc `@SparrowLii`
Revert "Download the GCC sources insecurely"
This reverts commit 3da037f82988b8b3aca2ce13c5c81ba975923cab.
This workaround was added after TLS problems with Debian 6 were noted in <https://github.com/rust-lang/rust/pull/86586#issuecomment-868355356>, but we should be well past that since #95026, where our oldest images are now based on CentOS 7.
Rewrite MemDecoder around pointers not a slice
This is basically https://github.com/rust-lang/rust/pull/109910 but I'm being a lot more aggressive. The pointer-based structure means that it makes a lot more sense to absorb more complexity into `MemDecoder`, most of the diff is just complexity moving from one place to another.
The primary argument for this structure is that we only incur a single bounds check when doing multi-byte reads from a `MemDecoder`. With the slice-based implementation we need to do those with `data[position..position + len]` , which needs to account for `position + len` wrapping. It would be possible to dodge the first bounds check if we stored a slice that starts at `position`, but that would require updating the pointer and length on every read.
This PR also embeds the failure path in a separate function, which means that this PR should subsume all the perf wins observed in https://github.com/rust-lang/rust/pull/109867.
Add shortcut for Grisu3 algorithm.
While Grisu3 is way more faster for most numbers compare to Dragon4, the fall back to Dragon4 procedure for certain numbers could cause some performance regressions compare to use Dragon4 directly. Mitigating the regression caused by falling back is important for a largely used core library.
In Grisu3 algorithm implementation, there's a shortcut to jump out earlier when the fractional or integrals cannot meet the requirement of requested digits. This could significantly improve the performance of converting floating number to string as it falls back even without starting trying the algorithm.
The original idea is from the [.NET implementation](https://github.com/dotnet/runtime/blob/main/src/libraries/System.Private.CoreLib/src/System/Number.Grisu3.cs#L602-L615) and the code was originally added in [this PR](https://github.com/dotnet/coreclr/pull/14646#issuecomment-350942050). This shortcut has been shipped long time ago and has been proved working.
Fix#110129
Break up long function in trait selection error reporting + clean up nearby code
- Move blocks of code into their own functions
- Replace a few function argument types with their type aliases
- Create "AppendConstMessage" enum to replace a nested `Option`.
Allow older LLVM versions to have missing components
This check was introduced by #77280 to ensure that all tests that are filtered by LLVM component are actually tested in CI. However this causes issues for new targets (e.g. #101069) where support is only available on the latest LLVM version.
This PR restricts the tests to only CI jobs that use the latest LLVM version.