
## Benchmarking cut

### Performance profile

In typical use, a significant share of `cut`'s total execution time is spent
performing I/O. When invoked with the `-f` option (cut fields), some CPU time
is spent detecting fields (in `Searcher::next`). Beyond that, a small amount
of CPU time is spent breaking the input stream into lines.


### How to

When fixing bugs or adding features, you might want to compare
performance before and after your code changes.

-   `hyperfine` can be used to accurately measure and compare the total
    execution time of one or more commands.

    ```
    $ cargo build --release --package uu_cut

    $ hyperfine -w3 "./target/release/cut -f2-4,8 -d' ' input.txt" "cut -f2-4,8 -d' ' input.txt"
    ```
    You can put those two commands in a shell script to be sure that you don't
    forget to build after making any changes.
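
A minimal sketch of such a script, written as a shell function so that it can be sourced or called at the end of a script file. The function name `bench_cut` and the default `input.txt` are assumptions for illustration; the two commands are the ones shown above.

```shell
# bench_cut [INPUT]: rebuild uu_cut, then compare it against the system cut.
# The name and the default input file are illustrative, not part of the repo.
bench_cut() {
    input=${1:-input.txt}

    # Build first so the benchmark never measures a stale binary;
    # stop here if the build fails.
    cargo build --release --package uu_cut || return 1

    hyperfine -w3 \
        "./target/release/cut -f2-4,8 -d' ' $input" \
        "cut -f2-4,8 -d' ' $input"
}
```

Run it from the repository root, for example as `bench_cut input.txt`.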

When optimizing or fixing performance regressions, it can be useful to see how
many times a function is called and how much time it takes.

-   `cargo flamegraph` generates flame graphs from function-level metrics it records using `perf` or `dtrace`.

    ```
    $ cargo flamegraph --bin cut --package uu_cut -- -f1,3-4 input.txt > /dev/null
    ```


### What to benchmark

There are four different performance paths in `cut` to benchmark.

-   Byte ranges (`-c`/`--characters` or `-b`/`--bytes`), e.g. `cut -c 2,4,6-`
-   Byte ranges with output delimiters, e.g. `cut -c 4- --output-delimiter=/`
-   Fields, e.g. `cut -f -4`
-   Fields with output delimiters, e.g. `cut -f 7-10 --output-delimiter=:`
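
One way to cover all four paths in a single run is sketched below. The function name `bench_all_paths` and its defaults are hypothetical; the argument strings are the examples from the list above. The field cases carry no `-d`, so they assume a tab-delimited input.

```shell
# bench_all_paths [BINARY] [INPUT]: run hyperfine once per performance path,
# comparing BINARY against the system cut. Name and defaults are illustrative.
bench_all_paths() {
    bin=${1:-./target/release/cut}
    input=${2:-input.txt}

    for args in \
        "-c 2,4,6-" \
        "-c 4- --output-delimiter=/" \
        "-f -4" \
        "-f 7-10 --output-delimiter=:"
    do
        # Each invocation compares the same arguments on both binaries.
        hyperfine -w3 "$bin $args $input" "cut $args $input" || return 1
    done
}
```

Running each path in its own `hyperfine` invocation keeps the comparison between the two binaries rather than between the paths themselves.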

Choose a test input file with a large number of lines so that program startup time does not significantly affect the benchmark.
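
If no suitable file is at hand, one can be generated. The script below is a sketch under assumed values: the variable names, the one million lines, and the ten fields per line are all illustrative choices, not requirements of `cut`.

```shell
#!/bin/sh
# Generate a delimited test file; all names and defaults here are illustrative.
NUM_LINES=${NUM_LINES:-1000000}   # assumed to be large enough to dwarf startup time
NUM_FIELDS=${NUM_FIELDS:-10}      # the examples above select up to field 10
DELIM=${DELIM:-" "}               # a space, matching the -d' ' example
OUT=${OUT:-input.txt}

awk -v n="$NUM_LINES" -v f="$NUM_FIELDS" -v d="$DELIM" 'BEGIN {
    for (i = 1; i <= n; i++) {
        # Each field encodes its row and column, e.g. r3f2.
        line = "r" i "f1"
        for (j = 2; j <= f; j++)
            line = line d "r" i "f" j
        print line
    }
}' > "$OUT"
```

Set `DELIM='\t'` to produce a tab-delimited file for the field examples that do not pass `-d`.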
