mirror of
https://github.com/uutils/coreutils
synced 2024-12-14 15:22:38 +00:00
735db78b3d
* wc: specialize scanning loop on settings. The primary computational loop in wc (iterating over all the characters, computing word lengths, etc.) is configured by a number of boolean options that control the text-scanning behavior. If we monomorphize the loop for each possible combination of scanning configurations, rustc is able to generate better code for each instantiation, at the very least by removing the conditional checks on each iteration, and possibly by allowing things like vectorization. On my computer (aarch64/macos), I am seeing at least a 5% performance improvement in release builds on all wc flag configurations (other than those that were already specialized) against odyssey1024.txt, with wc -l showing the greatest improvement at 15%. * Reduce the size of the wc dispatch table by half. By moving the handling of the hand-written fast paths into the same dispatch as the automatic specializations, we can avoid passing `show_bytes` as a const generic to `word_count_from_reader_specialized`. Eliminating this parameter halves the number of arms in the dispatch. |
||
---|---|---|
.. | ||
benches/factor | ||
by-util | ||
common | ||
fixtures | ||
test_util_name.rs | ||
tests.rs |
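
The commit message above describes monomorphizing the wc scanning loop over its boolean settings and selecting an instantiation through a dispatch table. The following is a minimal sketch of that general technique, not the actual uutils code: `Settings`, `Counts`, `count_specialized`, and `count` are hypothetical names invented for illustration, and the loop is ASCII-only and far simpler than the real one.

```rust
/// Hypothetical runtime settings (illustrative only, not the uutils types).
#[derive(Clone, Copy)]
struct Settings {
    show_lines: bool,
    show_words: bool,
}

/// Hypothetical result type for this sketch.
#[derive(Debug, Default, PartialEq, Eq)]
struct Counts {
    bytes: usize,
    lines: usize,
    words: usize,
}

/// Scanning loop parameterized by const generics. LINES and WORDS are
/// compile-time constants, so each instantiation contains only the checks
/// it needs; the `if LINES` / `if WORDS` tests are not evaluated per byte
/// at run time.
fn count_specialized<const LINES: bool, const WORDS: bool>(input: &[u8]) -> Counts {
    // In this sketch the byte count is simply the slice length.
    let mut counts = Counts {
        bytes: input.len(),
        ..Counts::default()
    };
    let mut in_word = false;
    for &b in input {
        if LINES && b == b'\n' {
            counts.lines += 1;
        }
        if WORDS {
            let is_space = b.is_ascii_whitespace();
            // A word starts at the first non-space byte after whitespace.
            if !is_space && !in_word {
                counts.words += 1;
            }
            in_word = !is_space;
        }
    }
    counts
}

/// Dispatch table: map the runtime booleans to one monomorphized instance.
/// Every extra boolean parameter doubles the number of arms, which is why
/// dropping one parameter halves the table.
fn count(settings: Settings, input: &[u8]) -> Counts {
    match (settings.show_lines, settings.show_words) {
        (false, false) => count_specialized::<false, false>(input),
        (false, true) => count_specialized::<false, true>(input),
        (true, false) => count_specialized::<true, false>(input),
        (true, true) => count_specialized::<true, true>(input),
    }
}
```

Calling `count(Settings { show_lines: true, show_words: false }, data)` runs the instantiation that never touches the word-tracking state, which is the kind of per-iteration check the commit message says the compiler can remove.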