Add some helper functions and adjust some error-handling to make the
`Output::dd_out()` method, containing the main loop of the `dd`
program, more concise. This commit also adds documentation and
comments describing the main loop procedure in more detail.
This lets us use fewer reallocations when parsing each line.
The current guess is set to the maximum fields in a line so far. This is
a free performance win in the common case where each line has the same
number of fields, but comes with some memory overhead in the case where
there is a line with lots of fields at the beginning of the file, and
fewer later, but each of these lines is typically not kept for very
long anyway.
Using indexes into the line instead of Vec<u8>s means we don't have to copy
the line to store the fields (indexes instead of slices because it avoids
self-referential structs). Using memchr also empirically saves a lot of
intermediate allocations.
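
A minimal sketch of the idea, not the actual implementation: fields are stored as byte ranges into the line, and the capacity guess is the largest field count seen so far. The names `field_ranges` and `parse_all` are hypothetical, and `Iterator::position` stands in for `memchr` so the example needs only the standard library.

```rust
use std::ops::Range;

/// Split `line` on `sep`, recording each field as a byte range into `line`
/// instead of copying it into an owned `Vec<u8>`. `guess` pre-sizes the output.
fn field_ranges(line: &[u8], sep: u8, guess: usize) -> Vec<Range<usize>> {
    let mut fields = Vec::with_capacity(guess);
    let mut start = 0;
    // Assumption: `position` is a std-only stand-in for `memchr`.
    while let Some(offset) = line[start..].iter().position(|&b| b == sep) {
        fields.push(start..start + offset);
        start += offset + 1;
    }
    // The final field runs to the end of the line.
    fields.push(start..line.len());
    fields
}

/// Parse several lines, reusing the maximum field count so far as the guess.
fn parse_all(lines: &[&[u8]], sep: u8) -> Vec<Vec<Range<usize>>> {
    let mut guess = 0;
    let mut parsed = Vec::with_capacity(lines.len());
    for line in lines {
        let fields = field_ranges(line, sep, guess);
        guess = guess.max(fields.len());
        parsed.push(fields);
    }
    parsed
}
```

A field is recovered on demand by slicing the line with its range, for example `&line[fields[1].clone()]`.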
Refactor the code for representing the `df` data table into `Header`
and `Row` structs. These structs live in a new module `table.rs`. When
combined with the `Options` struct, these structs can be
`Display`ed. Organizing the code this way makes it possible to test
the display settings independently of the machinery for getting the
filesystem data. New unit tests have been added to `table.rs` to
demonstrate this benefit.
Show a warning if the `skip=N` command-line argument would cause `dd`
to skip past the end of the input. For example:
$ printf "abcd" | dd bs=1 skip=5 count=0 status=noxfer
'standard input': cannot skip to specified offset
0+0 records in
0+0 records out
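
The check can be sketched roughly as follows, assuming the skip is performed by reading and discarding bytes. The helper names `skip_bytes` and `skip_warning` are hypothetical and are not taken from the `dd` source.

```rust
use std::io::{self, Read};

/// Discard up to `n` bytes from `reader` and return how many were skipped.
fn skip_bytes<R: Read>(reader: &mut R, n: u64) -> io::Result<u64> {
    io::copy(&mut reader.by_ref().take(n), &mut io::sink())
}

/// Return a warning message if the input ended before `skip` bytes were read.
fn skip_warning<R: Read>(
    reader: &mut R,
    skip: u64,
    name: &str,
) -> io::Result<Option<String>> {
    let skipped = skip_bytes(reader, skip)?;
    if skipped < skip {
        Ok(Some(format!("'{}': cannot skip to specified offset", name)))
    } else {
        Ok(None)
    }
}
```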
Add some structure to errors that can be created during parsing of
settings from command-line options. This commit creates
`StrategyError` and `SettingsError` enumerations to represent the
various parsing and other errors that can arise when transforming
`ArgMatches` into `Settings`.
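
One possible shape for such nested error types is sketched below. Only the names `StrategyError`, `SettingsError`, and `Settings` come from this commit; the variants, fields, and parsing helpers are hypothetical and are here only to show how the inner error converts into the outer one through `From` and `?`.

```rust
use std::fmt;

/// Hypothetical variants; the real enum may differ.
#[derive(Debug, PartialEq)]
enum StrategyError {
    Lines(String),
}

/// Hypothetical variants; wraps `StrategyError` plus other parsing failures.
#[derive(Debug, PartialEq)]
enum SettingsError {
    Strategy(StrategyError),
    SuffixLength(String),
}

impl From<StrategyError> for SettingsError {
    fn from(e: StrategyError) -> Self {
        SettingsError::Strategy(e)
    }
}

impl fmt::Display for SettingsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SettingsError::Strategy(StrategyError::Lines(s)) => {
                write!(f, "invalid number of lines: {}", s)
            }
            SettingsError::SuffixLength(s) => write!(f, "invalid suffix length: {}", s),
        }
    }
}

/// Hypothetical settings with two numeric fields.
#[derive(Debug, PartialEq)]
struct Settings {
    lines: usize,
    suffix_length: usize,
}

fn parse_lines(arg: &str) -> Result<usize, StrategyError> {
    arg.parse::<usize>()
        .map_err(|_| StrategyError::Lines(arg.to_string()))
}

fn parse_settings(lines: &str, suffix_length: &str) -> Result<Settings, SettingsError> {
    // `?` converts a `StrategyError` into a `SettingsError` via `From`.
    let lines = parse_lines(lines)?;
    let suffix_length = suffix_length
        .parse::<usize>()
        .map_err(|_| SettingsError::SuffixLength(suffix_length.to_string()))?;
    Ok(Settings { lines, suffix_length })
}
```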
Show a warning when a block size includes "0x" since this is
ambiguous: the user may have meant "multiply the next number by zero"
or they may have meant "the following characters should be interpreted
as a hexadecimal number".
When specifying `seek=N` and *not* specifying `conv=notrunc`, truncate
the output file to `N` blocks instead of truncating it to zero before
starting to write output. For example:
$ printf "abc" > outfile
$ printf "123" | dd bs=1 skip=1 seek=1 count=1 status=noxfer of=outfile
1+0 records in
1+0 records out
$ cat outfile
a2
Fixes #3068.
When this option is present, the files argument is not processed. Instead, the file list is read from the provided file, with entries separated by the ASCII NUL (\0) character. When files0-from is '-', the file list is read from stdin.
Correct the behavior of `dd` when the same argument is provided more
than once. Before this commit, providing an argument multiple times
caused a validation error to be returned.
For example:
```
$ printf '' | ./target/debug/dd status=none status=noxfer
error: The argument '--status=<LEVEL>' was provided more than once, but cannot be used multiple times
USAGE:
dd [OPTIONS]
For more information try --help
```
A unit test was added for this case.
This avoids hacking around the short options of these command line
arguments that have been introduced by clap. Additionally, we test and
correctly handle the combination of both version and help. The GNU
binary will ignore both arguments in this case while clap would perform
the first one. A test for this edge case was added.
Now handles recognized command line options and ignores unrecognized
ones instead of returning a special exit status for
them.
There is one point of interest, which is related to an implementation
detail in GNU `true`. It may return a non-zero exit status (in
particular `EXIT_FAILURE`) if writing the diagnostics of a GNU specific
option fails. For example `true --version > /dev/full` would fail and
have exit status 1.
This behavior was acknowledged in GNU in commit
<9a6a486e6503520fd2581f2d3356b7149f1b225d>. No further
justification was provided for keeping this quirk.
POSIX knows no such options, and requires an exit status of 0 in all
cases. We replicate GNU here which is a consistency improvement over the
prior implementation. This commit also adds documentation to clarify
the intended behavior.
Exit with status code 1 for argument parsing errors in `truncate`. When
`clap` encounters an error during argument parsing, it exits with status
code 2. This causes some GNU tests to fail since they expect status code
1.
Refactor the `Mode` enum in the `head.rs` module so that it includes
not only the mode type---lines or bytes---but also whether to read the
first NUM items of that type or all but the last NUM. Before this
commit, these two pieces of information were stored separately. This
made it difficult to read the code through several function calls and
understand at a glance which strategy was being employed.
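
As an illustration, a combined enum could look like the sketch below. The variant names, the `new` constructor, and `describe` are assumptions made for this example and may not match `head.rs` exactly.

```rust
/// Hypothetical combined mode: the item type and the direction in one value.
#[derive(Debug, PartialEq)]
enum Mode {
    FirstLines(usize),
    AllButLastLines(usize),
    FirstBytes(usize),
    AllButLastBytes(usize),
}

impl Mode {
    /// Build a mode from the two pieces of information that used to be
    /// stored separately.
    fn new(bytes: bool, all_but_last: bool, n: usize) -> Self {
        match (bytes, all_but_last) {
            (false, false) => Mode::FirstLines(n),
            (false, true) => Mode::AllButLastLines(n),
            (true, false) => Mode::FirstBytes(n),
            (true, true) => Mode::AllButLastBytes(n),
        }
    }
}

/// A single `match` now shows which strategy is in use.
fn describe(mode: &Mode) -> String {
    match mode {
        Mode::FirstLines(n) => format!("first {} lines", n),
        Mode::AllButLastLines(n) => format!("all but the last {} lines", n),
        Mode::FirstBytes(n) => format!("first {} bytes", n),
        Mode::AllButLastBytes(n) => format!("all but the last {} bytes", n),
    }
}
```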
This allows for `-t` to take invalid unicode (but still single-byte) values
on unix-like platforms. Other platforms, which as of the time of this commit
do not support `OsStr::as_bytes()`, could possibly be supported in the future,
but would require design decisions as to what that means.
Replace the `FilenameFactory` with `FilenameIterator` and calls to
`FilenameFactory::make()` with calls to `FilenameIterator::next()`. We
did not need the full generality of being able to produce the
filename for an arbitrary chunk index. Instead we need only iterate
over filenames one after another. This allows for a less
mathematically dense algorithm that is easier to understand and
maintain. Furthermore, it can be connected to some familiar concepts
from the representation of numbers as a sequence of digits.
This does not change the behavior of the `split` program, just the
implementation of how filenames are produced.
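
The digit analogy can be sketched as follows. This is a simplified stand-in, not the real `FilenameIterator`: it assumes a fixed-width, lowercase alphabetic suffix, and the struct fields and constructor are hypothetical.

```rust
/// Yields "xaa", "xab", ..., "xzz" by treating the suffix as a base-26 number.
struct FilenameIterator {
    prefix: String,
    /// Each entry is in 0..26, most significant digit first.
    digits: Vec<u8>,
    exhausted: bool,
}

impl FilenameIterator {
    fn new(prefix: &str, suffix_len: usize) -> Self {
        FilenameIterator {
            prefix: prefix.to_string(),
            digits: vec![0; suffix_len],
            exhausted: false,
        }
    }
}

impl Iterator for FilenameIterator {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        if self.exhausted {
            return None;
        }
        let suffix: String = self.digits.iter().map(|&d| (b'a' + d) as char).collect();
        let name = format!("{}{}", self.prefix, suffix);
        // Increment like an odometer: bump the last digit and carry leftwards.
        // If every digit wraps around, there are no more names to produce.
        self.exhausted = true;
        for d in self.digits.iter_mut().rev() {
            if *d < 25 {
                *d += 1;
                self.exhausted = false;
                break;
            }
            *d = 0;
        }
        Some(name)
    }
}
```

Each call to `next()` only needs the previous state, which is why no arbitrary chunk index has to be converted into a filename.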
Co-authored-by: Terts Diepraam <terts.diepraam@gmail.com>
- Change the main! proc_macro to a bin! macro_rules macro.
- Reexport uucore_procs from uucore
- Make utils not import uucore_procs directly
- Remove the `syn` dependency and don't parse proc_macro input (hopefully for faster compile times)
Prevent usize underflow when reducing the size of a file by more than
its current size. For example, if `f` is a file with 3 bytes, then
truncate -s-5 f
will now set the size of the file to 0 instead of causing a panic.
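
One way to avoid the underflow is a saturating subtraction, sketched below. The function name `reduced_size` is hypothetical, and the actual fix may use a different mechanism.

```rust
/// New size after shrinking a file of `current` bytes by `reduction` bytes.
/// `saturating_sub` clamps the result at 0 instead of panicking on underflow.
fn reduced_size(current: usize, reduction: usize) -> usize {
    current.saturating_sub(reduction)
}
```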
Improve the error message that gets printed when a directory does not
exist. After this commit, the error message is
truncate: cannot open '{file}' for writing: No such file or directory
where `{file}` is the name of a file in a directory that does not
exist.
Change a word in the error message displayed when an increment value
of 0 is provided to `seq`. This commit changes the message from "Zero
increment argument" to "Zero increment value" to match the GNU `seq`
error message.
Add an error for division by zero. Previously, running `truncate -s /0
file` or `-s %0` would panic due to division by zero. After this
change, it writes an error message "division by zero" to stderr and
terminates with an error code.
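
A sketch of how the zero divisor could be rejected before any arithmetic happens, assuming `/N` rounds the size down to a multiple of `N` and `%N` rounds it up. The error type and function names are hypothetical.

```rust
use std::fmt;

#[derive(Debug, PartialEq)]
enum SizeError {
    DivisionByZero,
}

impl fmt::Display for SizeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SizeError::DivisionByZero => write!(f, "division by zero"),
        }
    }
}

/// `-s /N`: round `size` down to a multiple of `n`.
fn round_down(size: u64, n: u64) -> Result<u64, SizeError> {
    // `checked_rem` returns `None` when `n` is 0 instead of panicking.
    match size.checked_rem(n) {
        Some(r) => Ok(size - r),
        None => Err(SizeError::DivisionByZero),
    }
}

/// `-s %N`: round `size` up to a multiple of `n`.
/// Overflow near `u64::MAX` is not handled in this sketch.
fn round_up(size: u64, n: u64) -> Result<u64, SizeError> {
    match size.checked_rem(n) {
        Some(0) => Ok(size),
        Some(r) => Ok(size + (n - r)),
        None => Err(SizeError::DivisionByZero),
    }
}
```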
Replace some uses of `crash!()` and move `UError` handling down into
the `truncate()` function. This does not change the behavior of the
program, just organizes the code to facilitate introducing code to
handle other types of errors in the future.
Add support for the `-f FORMAT` option to `seq`. This option instructs
the program to render each value in the generated sequence using a
given `printf`-style floating point format. For example,
$ seq -f %.2f 0.0 0.1 0.5
0.00
0.10
0.20
0.30
0.40
0.50
Fixes issue #2616.
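
To illustrate the effect of the option, here is a deliberately limited sketch that accepts only formats of the exact form `%.Nf` with a positive step. It does not use the real `printf` formatting logic, and the function names are hypothetical.

```rust
/// Parse a format of the exact form `%.Nf` and return the precision `N`.
fn parse_precision(format: &str) -> Option<usize> {
    format.strip_prefix("%.")?.strip_suffix('f')?.parse().ok()
}

/// Render the sequence `first, first + step, ...` up to `last`.
/// Returns `None` for unsupported formats or a non-positive step.
fn seq_formatted(format: &str, first: f64, step: f64, last: f64) -> Option<Vec<String>> {
    let precision = parse_precision(format)?;
    if !(step > 0.0) {
        return None;
    }
    let mut out = Vec::new();
    let mut i = 0u32;
    loop {
        // Compute each value from its index to avoid accumulating rounding error.
        let value = first + step * f64::from(i);
        if value > last {
            break;
        }
        out.push(format!("{:.*}", precision, value));
        i += 1;
    }
    Some(out)
}
```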
Fix a bug where `tail -f` would terminate with an error due to failing
to parse a UTF-8 string from a sequence of bytes read from the
followed file. This commit replaces the call to `BufRead::read_line()`
with a call to `BufRead::read_until()` so that any sequence of bytes
regardless of encoding can be read.
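
The difference between the two calls can be shown with a small sketch. The helper name `read_lines_as_bytes` is hypothetical; `BufRead::read_until()` and `BufRead::read_line()` are the standard library methods named above.

```rust
use std::io::BufRead;

/// Read every line as raw bytes, so invalid UTF-8 does not cause an error.
fn read_lines_as_bytes<R: BufRead>(mut reader: R) -> std::io::Result<Vec<Vec<u8>>> {
    let mut lines = Vec::new();
    loop {
        let mut buf = Vec::new();
        // `read_until` returns Ok(0) once the end of the input is reached.
        if reader.read_until(b'\n', &mut buf)? == 0 {
            break;
        }
        lines.push(buf);
    }
    Ok(lines)
}
```

With input such as `b"ok\n\xff\xfe\nend"`, this returns three byte vectors, whereas `read_line()` into a `String` fails on the second line because `\xff\xfe` is not valid UTF-8.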
Fixes #1050.
Correct the behavior of `dd` with the `status=noxfer` option. Before
this commit, the status output was entirely suppressed (as happens
with `status=none`). This was incorrect behavior. After this commit,
the input/output counts are printed to stderr as expected.
For example,
$ printf "" | dd status=noxfer
0+0 records in
0+0 records out
This commit also updates a unit test that was enforcing the wrong
behavior.
Fix the behavior of truncate when given a non-existent file so that it
correctly creates the file before truncating it (unless the
`--no-create` option is also given).
Fix a bug when getting all but the first NUM lines or bytes of a file
via `tail -n +NUM <file>` or `tail -c +NUM <file>`. The bug only
existed when a file is given as an argument; it did not exist when the
input data came from stdin.
Support `-z` option when the input is not a seekable file. Previously,
the option was accepted by the argument parser, but it was being
ignored by the application logic.
This expands the error message that is printed if either input file has
an unsorted line. Both the program name (join) and the offending line
are printed out with the message to match the behaviour of the GNU
utility.
Factor out a loop for finding the index of the byte immediately
following the `n`th line from the end of a file. This does not change
the behavior of the code, just its organization.
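
A sketch of what such a loop might compute, written over an in-memory byte slice rather than a seekable file. The function name `start_of_last_lines` is hypothetical, and the treatment of a trailing newline is an assumption made for this example.

```rust
/// Index of the first byte of the last `n` lines of `data`,
/// or 0 when `data` contains `n` or fewer lines.
fn start_of_last_lines(data: &[u8], n: usize) -> usize {
    if n == 0 {
        return data.len();
    }
    // A trailing newline terminates the final line; it does not start a new one.
    let end = if data.last() == Some(&b'\n') {
        data.len() - 1
    } else {
        data.len()
    };
    let mut remaining = n;
    // Scan backwards, counting newlines until `n` of them have been seen.
    for i in (0..end).rev() {
        if data[i] == b'\n' {
            remaining -= 1;
            if remaining == 0 {
                return i + 1;
            }
        }
    }
    0
}
```

The last `n` lines are then `&data[start_of_last_lines(data, n)..]`.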
This commit replaces generic Results with UResults in some key
functions in numfmt. As a result of this, we can provide different
exit codes for different errors, which resolves ~70 failing test
cases in the GNU numfmt.pl test suite.
Create a `Settings::from` method that converts a `clap::ArgMatches`
instance into a `Settings` instance. This eliminates the unnecessary
use of a mutable variable when initializing the settings.
Move the `printf::memo` module to `uucore` so that it can be used by
other programs, not just `printf`. For example, the `-f` option to `seq`
requires parsing and formatting numbers according to the same logic as
`printf`.