My previous commits meant to bring our wc's output and behavior in line
with GNU's. There should be tests that check for these changes!
I found a stupid bug in my own changes, I was not adding 1 to the
indexes produced by .enumerate() when printing errors.
for the algorithms md5, sha[1,224,256,384,512], blake2b, and sm3 from
<digest> <filesize> <filename>
to
<algo name> (<filename>) = <digest>
to use the same format as GNU cksum
While the rust coreutils semantics were arguably more correct,
they were different than the gnu split semantics when handling a
file without a trailing EOF. This patch addresses that difference
and allows passing one more GNU test suite.
These are the first half of changes needed to pass the dd/bytes.sh tests:
- Add iseek and oseek options (additive with skip and seek options)
- Implement tests for the new flags, matching those from dd/bytes.sh
Implement the `--line-bytes` option to `split`. In this mode, the
program tries to write as many lines of the input as possible to each
chunk of output without exceeding a specified byte limit. The new
`LineBytesChunkWriter` struct represents this functionality.
Implement `-n l/k/N` option, where the `k`th chunk of the input file
is written to stdout. For example,
$ seq -w 0 99 > f; split -n l/3/10 f
20
21
22
23
24
25
26
27
28
29
Add the `-e` flag, which indicates whether to elide (that is, remove)
empty files that would have been created by the `-n` option.
The `-n` command-line argument gives a specific number of chunks into
which the input files will be split. If the number of chunks is
greater than the number of bytes, then empty files will be created for
the excess chunks. But if `-e` is given, then empty files will not be
created.
For example, contrast
$ printf 'a\n' > f && split -e -n 3 f && cat xaa xab xac
a
cat: xac: No such file or directory
with
$ printf 'a\n' > f && split -n 3 f && cat xaa xab xac
a
Clean up unit tests in the `dd` crate to make them easier to
manage. This commit does a few things.
* move test cases that test the complete functionality of the `dd`
program from the `dd_unit_tests` module up to the
`tests/by-util/test_dd.rs` module so that they can take advantage of
the testing framework and common testing tools provided by uutils,
* move test cases that test internal functions of the `dd`
implementation into the `tests` module within `dd.rs` so that they
live closer to the code they are testing,
* replace test cases defined by macros with test cases defined by
plain old functions to make the test cases easier to read at a
glance.
Add support for the `-x` command-line option to `split`. This option
causes `split` to produce filenames with hexadecimal suffixes instead
of the default alphabetic suffixes.
Replace `ByteSplitter` and `LineSplitter` with `ByteChunkWriter` and
`LineChunkWriter` respectively. This results in a more maintainable
design and an increase in the speed of splitting by lines.
Correct the `test_split::test_suffixes_exhausted` test case so that it
actually exercises the intended behavior of `split`. Previously, the
test fixture contained 26 bytes. After this commit, the test fixture
contains 27 bytes. When using a suffix width of one, only 26 filenames
should be available when naming chunk files---one for each lowercase
ASCII letter. This commit ensures that the filenames will be exhausted
as intended by the test.
When this option is present, the files argument is not processed. This option processes the file list from provided file, splitting them by the ascii NUL (\0) character. When files0-from is '-', the file list is processed from stdin.
This allows for `-t` to take invalid unicode (but still single-byte) values
on unix-like platforms. Other platforms, which as of the time of this commit
do not support `OsStr::as_bytes()`, could possibly be supported in the future,
but would require design decisions as to what that means.