# Benchmarking cut

## Performance profile
In normal use cases a significant amount of the total execution time of `cut`
is spent performing I/O. When invoked with the `-f` option (cut fields), some
CPU time is spent on detecting fields (in `Searcher::next`). Other than that,
some small amount of CPU time is spent on breaking the input stream into lines.
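
A rough way to observe this split (a sketch only; `input.txt` stands for any
large test file) is to run the two modes under `time` and compare the user
time, which is higher on the field path:

```shell
# Byte-range mode: dominated by I/O and line splitting.
$ time ./target/release/cut -c 2-4 input.txt > /dev/null

# Field mode: additionally spends user time detecting fields.
$ time ./target/release/cut -f 2-4 -d' ' input.txt > /dev/null
```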
## How to
When fixing bugs or adding features, you might want to compare performance before and after your code changes.
- [hyperfine](https://github.com/sharkdp/hyperfine) can be used to accurately
  measure and compare the total execution time of one or more commands.

  ```shell
  $ cargo build --release --package uu_cut

  $ hyperfine -w3 "./target/release/cut -f2-4,8 -d' ' input.txt" "cut -f2-4,8 -d' ' input.txt"
  ```

  You can put those two commands in a shell script to be sure that you don't
  forget to build after making any changes.
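
  For example, a minimal sketch of such a script (the names `bench.sh` and
  `input.txt` are placeholders, not part of the project):

  ```shell
  #!/bin/sh
  # bench.sh: rebuild the release binary, then compare it against the system cut.
  set -e
  cargo build --release --package uu_cut
  hyperfine -w3 \
      "./target/release/cut -f2-4,8 -d' ' input.txt" \
      "cut -f2-4,8 -d' ' input.txt"
  ```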
When optimizing or fixing performance regressions, seeing the number of times a function is called and the amount of time it takes can be useful.
- [cargo flamegraph](https://github.com/flamegraph-rs/flamegraph) generates
  flame graphs from the function-level metrics it records using `perf` or
  `dtrace`:

  ```shell
  $ cargo flamegraph --bin cut --package uu_cut -- -f1,3-4 input.txt > /dev/null
  ```
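
  By default `cargo flamegraph` writes its output to `flamegraph.svg` in the
  current directory; open it in a browser to explore which functions dominate
  the recorded samples.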
## What to benchmark
There are four different performance paths in `cut` to benchmark; a combined
run covering all four is sketched after the list.
- Byte ranges: `-c`/`--characters` or `-b`/`--bytes`, e.g. `cut -c 2,4,6-`
- Byte ranges with output delimiters, e.g. `cut -c 4- --output-delimiter=/`
- Fields, e.g. `cut -f -4`
- Fields with output delimiters, e.g. `cut -f 7-10 --output-delimiter=:`
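
A single `hyperfine` invocation along these lines exercises all four paths
(the `-d' '` flags and the `input.txt` name are assumptions, matching a
space-delimited test file):

```shell
$ hyperfine -w3 \
    "./target/release/cut -c 2,4,6- input.txt" \
    "./target/release/cut -c 4- --output-delimiter=/ input.txt" \
    "./target/release/cut -f -4 -d' ' input.txt" \
    "./target/release/cut -f 7-10 -d' ' --output-delimiter=: input.txt"
```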
Choose a test input file with a large number of lines so that program startup time does not significantly affect the benchmark.
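
If you need to create such a file, something like the following works (the
line count and field contents are arbitrary):

```shell
# Generate ~10 million lines of space-delimited fields.
$ seq 1 10000000 | awk '{print $1, $1 % 97, "lorem", "ipsum", "dolor", "sit", "amet", $1}' > input.txt
```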