optimize str.replace
Adds a fast path for str.replace for the ascii to ascii case. This allows for autovectorizing the code. Also should this instead be done with specialization? This way we could remove one branch. I think it is the kind of branch that is easy to predict though.
Benchmark for the fast path (replace all "a" with "b" in the rust wikipedia article, using criterion) :
| N | Speedup | Time New (ns) | Time Old (ns) |
|----------|---------|---------------|---------------|
| 2 | 2.03 | 13.567 | 27.576 |
| 8 | 1.73 | 17.478 | 30.259 |
| 11 | 2.46 | 18.296 | 45.055 |
| 16 | 2.71 | 17.181 | 46.526 |
| 37 | 4.43 | 18.526 | 81.997 |
| 64 | 8.54 | 18.670 | 159.470 |
| 200 | 9.82 | 29.634 | 291.010 |
| 2000 | 24.34 | 81.114 | 1974.300 |
| 20000 | 30.61 | 598.520 | 18318.000 |
| 1000000 | 29.31 | 33458.000 | 980540.000 |
Make destructors on `extern "C"` frames to be executed
This would make the example in #123231 print "Noisy Drop". I didn't mark this as fixing the issue because the behaviour is yet to be spec'ed.
Tracking:
- https://github.com/rust-lang/rust/issues/74990
internal: Use local time when formatting logs
When debugging rust-analyzer and looking at logs, it's much easier to read when the timestamp is in the local timezone.
Before:
2024-08-28T20:55:38.792321Z INFO ParseQuery: invoked at R18460
After:
2024-08-28T13:55:38.792321-07:00 INFO ParseQuery: invoked at R18460
When debugging rust-analyzer and looking at logs, it's much easier to read
when the timestamp is in the local timezone.
Before:
2024-08-28T20:55:38.792321Z INFO ParseQuery: invoked at R18460
After:
2024-08-28T13:55:38.792321-07:00 INFO ParseQuery: invoked at R18460
fix: incorrect autofix for missing wrapped unit in return expr
fix#18298.
We should insert `Ok(())` or `Some(())` instead of wrapping `return` with variants.
minor: `ra-salsa` in `profile.dev.package`
Since `ra-salsa`'s package name is actually `salsa` it makes the following warning in `cargo` commands;
```
warning: profile package spec `ra-salsa` in profile `dev` did not match any packages
```
and the opt level isn't applied to it.
chore: rename `salsa` to `ra_salsa`
Laying some groundwork to start before I import the new Salsa crate. Here's why:
1. As part of the migration, `@darichey,` `@Wilfred,` and I will create new Salsa equivalents of the existing databases/query groups. We'll get them to compile crate-by-crate.
2. Once we wrote all equivalents of all queries, we'd start to refactor usage sites of the vendored Salsa to use the new Salsa databases.
3. Starting porting usage sites of old Salsa to the new Salsa.
4. Remove the vendored `ra_salsa`; declare victory.
internal: switch remaining OpQueues to use named structs
Building atop of https://github.com/rust-lang/rust-analyzer/pull/18195, I switched `GlobalState::fetch_build_data_queue` to use a struct instead of a tuple.
(I didn't switch `fetch_proc_macros_queue` to not return a bool, as the return value is only used in one spot.)
feat: respect references.exclude_tests in call-hierarchy
close#18212
### Changes
1. feat: respect `references.exclude_tests` in call-hierarchy
2. Modified the description of `references.exclude_tests`
fix: Do not consider mutable usage of deref to `*mut T` as deref_mut
Fixes#15799
We are doing some heuristics for deciding whether the given deref is deref or deref_mut here;
5982d9c420/crates/hir-ty/src/infer/mutability.rs (L182-L200)
But this heuristic is erroneous if we are dereferencing to a mut ptr and normally those cases are filtered out here as builtin;
5982d9c420/crates/hir-ty/src/mir/lower/as_place.rs (L165-L177)
Howerver, this works not so well if the given dereferencing is double dereferencings like the case in the #15799.
```rust
struct WrapPtr(*mut u32);
impl core::ops::Deref for WrapPtr {
type Target = *mut u32;
fn deref(&self) -> &Self::Target {
&self.0
}
}
fn main() {
let mut x = 0u32;
let wrap = WrapPtr(&mut x);
unsafe {
**wrap = 6;
}
}
```
Here are two - outer and inner - dereferences here, and the outer dereference is marked as deref_mut because there is an assignment operation.
And this deref_mut marking is propagated into the inner dereferencing.
In the later MIR lowering, the outer dereference is filtered out as it's expr type is `*mut u32`, but the expr type in the inner dereference is an ADT, so this false-mutablility is not filtered out.
This PR cuts propagation of this false mutablilty chain if the expr type is mut ptr.
Since this happens before the resolve_all, it may have some limitations when the expr type is determined as mut ptr at the very end of inferencing, but I couldn't find simple fix for it 🤔
internal: Don't resolve extern crates in import fix point resolution
The fix point loop won't progress them given the potential extern crate candidates are set up at build time.
fix: Join rustfmt overrideCommand with project root
When providing a custom rustfmt command, join it with the project root instead of the workspace root. This fixes rust-analyzer getting the wrong invocation path in projects containing subprojects.
This makes the behaviour consistent with how a custom path provided in rust-analyzer.procMacro.server behaves already.
Resolves issue #18222
feat: Highlight exit points of async blocks
Async blocks act similar to async functions in that the await keywords are related, but also act like functions where the exit points are related.
Fixes#18147
Optimize `escape_ascii` using a lookup table
Based upon my suggestion here: https://github.com/rust-lang/rust/pull/125340#issuecomment-2130441817
Effectively, we can take advantage of the fact that ASCII only needs 7 bits to make the eighth bit store whether the value should be escaped or not. This adds a 256-byte lookup table, but 256 bytes *should* be small enough that very few people will mind, according to my probably not incontrovertible opinion.
The generated assembly isn't clearly better (although has fewer branches), so, I decided to benchmark on three inputs: first on a random 200KiB, then on `/bin/cat`, then on `Cargo.toml` for this repo. In all cases, the generated code ran faster on my machine. (an old i7-8700)
But, if you want to try my benchmarking code for yourself:
<details><summary>Criterion code below. Replace <code>/home/ltdk/rustsrc</code> with the appropriate directory.</summary>
```rust
#![feature(ascii_char)]
#![feature(ascii_char_variants)]
#![feature(const_option)]
#![feature(let_chains)]
use core::ascii;
use core::ops::Range;
use criterion::{criterion_group, criterion_main, Criterion};
use rand::{thread_rng, Rng};
const HEX_DIGITS: [ascii::Char; 16] = *b"0123456789abcdef".as_ascii().unwrap();
#[inline]
const fn backslash<const N: usize>(a: ascii::Char) -> ([ascii::Char; N], Range<u8>) {
const { assert!(N >= 2) };
let mut output = [ascii::Char::Null; N];
output[0] = ascii::Char::ReverseSolidus;
output[1] = a;
(output, 0..2)
}
#[inline]
const fn hex_escape<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
const { assert!(N >= 4) };
let mut output = [ascii::Char::Null; N];
let hi = HEX_DIGITS[(byte >> 4) as usize];
let lo = HEX_DIGITS[(byte & 0xf) as usize];
output[0] = ascii::Char::ReverseSolidus;
output[1] = ascii::Char::SmallX;
output[2] = hi;
output[3] = lo;
(output, 0..4)
}
#[inline]
const fn verbatim<const N: usize>(a: ascii::Char) -> ([ascii::Char; N], Range<u8>) {
const { assert!(N >= 1) };
let mut output = [ascii::Char::Null; N];
output[0] = a;
(output, 0..1)
}
/// Escapes an ASCII character.
///
/// Returns a buffer and the length of the escaped representation.
const fn escape_ascii_old<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
const { assert!(N >= 4) };
match byte {
b'\t' => backslash(ascii::Char::SmallT),
b'\r' => backslash(ascii::Char::SmallR),
b'\n' => backslash(ascii::Char::SmallN),
b'\\' => backslash(ascii::Char::ReverseSolidus),
b'\'' => backslash(ascii::Char::Apostrophe),
b'\"' => backslash(ascii::Char::QuotationMark),
0x00..=0x1F => hex_escape(byte),
_ => match ascii::Char::from_u8(byte) {
Some(a) => verbatim(a),
None => hex_escape(byte),
},
}
}
/// Escapes an ASCII character.
///
/// Returns a buffer and the length of the escaped representation.
const fn escape_ascii_new<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
/// Lookup table helps us determine how to display character.
///
/// Since ASCII characters will always be 7 bits, we can exploit this to store the 8th bit to
/// indicate whether the result is escaped or unescaped.
///
/// We additionally use 0x80 (escaped NUL character) to indicate hex-escaped bytes, since
/// escaped NUL will not occur.
const LOOKUP: [u8; 256] = {
let mut arr = [0; 256];
let mut idx = 0;
loop {
arr[idx as usize] = match idx {
// use 8th bit to indicate escaped
b'\t' => 0x80 | b't',
b'\r' => 0x80 | b'r',
b'\n' => 0x80 | b'n',
b'\\' => 0x80 | b'\\',
b'\'' => 0x80 | b'\'',
b'"' => 0x80 | b'"',
// use NUL to indicate hex-escaped
0x00..=0x1F | 0x7F..=0xFF => 0x80 | b'\0',
_ => idx,
};
if idx == 255 {
break;
}
idx += 1;
}
arr
};
let lookup = LOOKUP[byte as usize];
// 8th bit indicates escape
let lookup_escaped = lookup & 0x80 != 0;
// SAFETY: We explicitly mask out the eighth bit to get a 7-bit ASCII character.
let lookup_ascii = unsafe { ascii::Char::from_u8_unchecked(lookup & 0x7F) };
if lookup_escaped {
// NUL indicates hex-escaped
if matches!(lookup_ascii, ascii::Char::Null) {
hex_escape(byte)
} else {
backslash(lookup_ascii)
}
} else {
verbatim(lookup_ascii)
}
}
fn escape_bytes(bytes: &[u8], f: impl Fn(u8) -> ([ascii::Char; 4], Range<u8>)) -> Vec<ascii::Char> {
let mut vec = Vec::new();
for b in bytes {
let (buf, range) = f(*b);
vec.extend_from_slice(&buf[range.start as usize..range.end as usize]);
}
vec
}
pub fn criterion_benchmark(c: &mut Criterion) {
let mut group = c.benchmark_group("escape_ascii");
group.sample_size(1000);
let rand_200k = &mut [0; 200 * 1024];
thread_rng().fill(&mut rand_200k[..]);
let cat = include_bytes!("/bin/cat");
let cargo_toml = include_bytes!("/home/ltdk/rustsrc/Cargo.toml");
group.bench_function("old_rand", |b| {
b.iter(|| escape_bytes(rand_200k, escape_ascii_old));
});
group.bench_function("new_rand", |b| {
b.iter(|| escape_bytes(rand_200k, escape_ascii_new));
});
group.bench_function("old_bin", |b| {
b.iter(|| escape_bytes(cat, escape_ascii_old));
});
group.bench_function("new_bin", |b| {
b.iter(|| escape_bytes(cat, escape_ascii_new));
});
group.bench_function("old_cargo_toml", |b| {
b.iter(|| escape_bytes(cargo_toml, escape_ascii_old));
});
group.bench_function("new_cargo_toml", |b| {
b.iter(|| escape_bytes(cargo_toml, escape_ascii_new));
});
group.finish();
}
criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
```
</details>
My benchmark results:
```
escape_ascii/old_rand time: [1.6965 ms 1.7006 ms 1.7053 ms]
Found 22 outliers among 1000 measurements (2.20%)
4 (0.40%) high mild
18 (1.80%) high severe
escape_ascii/new_rand time: [1.6749 ms 1.6953 ms 1.7158 ms]
Found 38 outliers among 1000 measurements (3.80%)
38 (3.80%) high mild
escape_ascii/old_bin time: [224.59 µs 225.40 µs 226.33 µs]
Found 39 outliers among 1000 measurements (3.90%)
17 (1.70%) high mild
22 (2.20%) high severe
escape_ascii/new_bin time: [164.86 µs 165.63 µs 166.58 µs]
Found 107 outliers among 1000 measurements (10.70%)
43 (4.30%) high mild
64 (6.40%) high severe
escape_ascii/old_cargo_toml
time: [23.397 µs 23.699 µs 24.014 µs]
Found 204 outliers among 1000 measurements (20.40%)
21 (2.10%) high mild
183 (18.30%) high severe
escape_ascii/new_cargo_toml
time: [16.404 µs 16.438 µs 16.483 µs]
Found 88 outliers among 1000 measurements (8.80%)
56 (5.60%) high mild
32 (3.20%) high severe
```
Random: 1.7006ms => 1.6953ms (<1% speedup)
Binary: 225.40µs => 165.63µs (26% speedup)
Text: 23.699µs => 16.438µs (30% speedup)
Run subprocesses async in vscode extension
Extensions should not block the vscode extension host. Replace uses of `spawnSync` with `spawnAsync`, a convenience wrapper around `spawn`.
These `spawnSync`s are unlikely to cause a real issue in practice, because they spawn very short-lived processes, so we aren't blocking for very long. That said, blocking the extension host is poor practice, and if they _do_ block for too long for whatever reason, vscode becomes useless.
Retire the `unnamed_fields` feature for now
`#![feature(unnamed_fields)]` was implemented in part in #115131 and #115367, however work on that feature has (afaict) stalled and in the mean time there have been some concerns raised (e.g.[^1][^2]) about whether `unnamed_fields` is worthwhile to have in the language, especially in its current desugaring. Because it represents a compiler implementation burden including a new kind of anonymous ADT and additional complication to field selection, and is quite prone to bugs today, I'm choosing to remove the feature.
However, since I'm not one to really write a bunch of words, I'm specifically *not* going to de-RFC this feature. This PR essentially *rolls back* the state of this feature to "RFC accepted but not yet implemented"; however if anyone wants to formally unapprove the RFC from the t-lang side, then please be my guest. I'm just not totally willing to summarize the various language-facing reasons for why this feature is or is not worthwhile, since I'm coming from the compiler side mostly.
Fixes#117942Fixes#121161Fixes#121263Fixes#121299Fixes#121722Fixes#121799Fixes#126969Fixes#131041
Tracking:
* https://github.com/rust-lang/rust/issues/49804
[^1]: https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Unnamed.20struct.2Funion.20fields
[^2]: https://github.com/rust-lang/rust/issues/49804#issuecomment-1972619108
disable `download-rustc` if LLVM submodule has changes in CI
We can't use CI rustc while using in-tree LLVM (which happens in LLVM submodule update PRs) and this PR handles that by ignoring CI-rustc in CI and failing in non-CI environments.