mirror of https://github.com/uutils/coreutils
synced 2024-12-13 23:02:38 +00:00

dd: add BENCHMARKING instructions

This commit is contained in: parent a186adbff1, commit 881f0c3d06
1 changed file with 62 additions and 0 deletions: src/uu/dd/BENCHMARKING.md (new file)

# Benchmarking dd

`dd` is a utility used for copying and converting files. It is often used for
writing directly to devices, such as when writing an `.iso` file directly to a
drive.

## Understanding dd

At its core, `dd` has a simple loop of operation: it reads `blocksize` bytes
from an input, optionally performs a conversion on the bytes, and then writes
`blocksize` bytes to an output file.

In typical usage, the performance of `dd` is dominated by the speed at which it
can read from or write to the filesystem, so in those scenarios it is best to
tune the blocksize to the devices being used. Devices typically have an optimal
block size at which they work best, so for maximum performance `dd` should use
that block size, or a multiple of it.
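
On Linux, one way to inspect the block sizes a device reports is through sysfs.
This is an illustrative sketch, not part of the benchmarking procedure itself,
and it assumes a Linux system with `/sys` mounted:

```
# Print the reported block sizes for each block device on the system.
# physical_block_size is the smallest addressable unit; optimal_io_size is
# the device's preferred I/O size (0 means the device reports none).
for q in /sys/block/*/queue; do
    dev=$(basename "$(dirname "$q")")
    echo "$dev: physical_block_size=$(cat "$q/physical_block_size")" \
         "optimal_io_size=$(cat "$q/optimal_io_size")"
done
```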

For benchmarking `dd` itself, we use the fast special files `/dev/zero` and
`/dev/null`, which the operating system backs with RAM. This reduces the time
spent reading and writing files to a minimum and maximizes the fraction of time
spent in `dd` itself, but care still needs to be taken to understand when we
are benchmarking the `dd` tool and when we are just benchmarking memory
performance.

The main parameter to vary in a `dd` benchmark is the blocksize, but benchmarks
exercising the conversions that `dd` supports could also be interesting.
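
A sketch of one such conversion benchmark, assuming a built release binary and
`hyperfine` installed: `conv=swab` swaps each pair of input bytes, so comparing
it against a plain copy isolates the cost of that conversion.

```
# Compare a plain copy against the same copy with a byte-swapping
# conversion (conv=swab) applied to every block:
hyperfine \
    "./target/release/dd bs=64k count=10000 < /dev/zero > /dev/null" \
    "./target/release/dd bs=64k count=10000 conv=swab < /dev/zero > /dev/null"
```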

`dd` has a convenient `count` argument that copies `count` blocks of data from
the input to the output, which is useful for benchmarking.
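
For instance, a copy with a bounded total size (the system `dd` is shown here;
the uutils binary takes the same arguments):

```
# Copy exactly 1000 blocks of 1 MiB (~1 GiB in total), then stop:
dd if=/dev/zero of=/dev/null bs=1M count=1000
```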

## Blocksize Benchmarks

When measuring the impact of blocksize on throughput, we want to avoid testing
the startup time of `dd`. `dd` itself reports its throughput once complete, but
it is probably better to use an external tool, such as `hyperfine`, to measure
the performance.

Benchmarks should be sized so that they run for at least a handful of seconds,
to avoid the startup time dominating the measurement. The total time will be
roughly proportional to the total bytes copied (`blocksize` x `count`).
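
As a quick sizing aid, the `count` for a given block size can be derived from a
target total; the 8 GiB target below is just an example value:

```
# Choose count so that blocksize * count hits a fixed total, keeping runs
# comparable across different block sizes:
total=$((8 * 1024 * 1024 * 1024))   # ~8 GiB copied in total
bs=$((4 * 1024))                    # 4k blocks
count=$((total / bs))
echo "bs=4k count=$count"           # prints: bs=4k count=2097152
```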

Some useful invocations for testing would be the following:

```
hyperfine "./target/release/dd bs=4k count=1000000 < /dev/zero > /dev/null"
hyperfine "./target/release/dd bs=1M count=20000 < /dev/zero > /dev/null"
hyperfine "./target/release/dd bs=1G count=10 < /dev/zero > /dev/null"
```
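
`hyperfine`'s `--parameter-list` option can also sweep block sizes in a single
run. A sketch, again assuming a built release binary; note that `count` is held
fixed here, so the total bytes copied differs per block size:

```
# Benchmark several block sizes in one hyperfine run; {bs} is substituted
# with each value from the list in turn:
hyperfine --parameter-list bs 4k,64k,1M \
    "./target/release/dd bs={bs} count=10000 < /dev/zero > /dev/null"
```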

Choosing what to benchmark depends greatly on what you want to measure.
Typically you would choose a small blocksize when measuring the performance of
`dd` itself, as that maximizes the overhead introduced by the tool: `dd` does a
fixed amount of work per block, and that work depends on the size of the block
only when conversions are used.

As an example, https://github.com/uutils/coreutils/pull/3600 made a change to
reuse the same buffer between block copies, avoiding the need to allocate a new
block of memory for each copy. That change mostly affected copies with a large
block size, because those are the circumstances where memory performance
dominated the total performance.