Add some abstractions to simplify the `rbuf_but_last_n_lines()`
function, which implements the "take all but the last `n` lines"
functionality of the `head` program. This commit adds
- `RingBuffer`, a fixed-size ring buffer,
- `ZLines`, an iterator over zero-terminated "lines",
- `TakeAllBut`, an iterator over all but the last `n` elements of an
iterator.
These three together make the implementation of
`rbuf_but_last_n_lines()` concise.
`sort` supports three ways to specify the sort mode: a long option
(e.g. --numeric-sort), a short option (e.g. -n) and the sort flag
(e.g. --sort=numeric).
This adds support for the sort flag.
Additionally, sort modes now conflict, which means that an error is
shown when multiple modes are passed, instead of silently picking a mode.
For consistency, I added the `random` sort mode to the `SortMode` enum,
instead of it being a bool flag.
Instead of overflowing when calculating the buffer size, use
saturating_{pow, mul}.
When failing to parse the buffer size, we now crash instead of silently
ignoring the error.
To make this work we make default sort a special case of external sort.
External sorting uses auxiliary files for intermediate chunks. However,
when we can keep our intermediate chunks in memory, we don't write them
to the file system at all. Only when we notice that we can't keep them
in memory they are written to the disk.
Additionally, we don't allocate buffers with the capacity of their
maximum size anymore. Instead, they start with a capacity of 8kb and are
grown only when needed.
This makes sorting smaller files about as fast as it was before
(I'm seeing a regression of ~3%), and allows us to seamlessly continue
with auxiliary files when needed.
For any commandline arguments, ls should print the argument as is (and
not truncate to just the file name)
For any other files it reaches (say through recursive exploration), ls
should print just the filename (as path is printed once when we enter
the directory)
A lot of tests depend on GNU's coreutils to be installed in order
to obtain reference values during testing.
In these cases testing is limited to `target_os = linux`.
This PR installs GNU's coreutils on "github actions" and adjusts the
tests for `who`, `stat` and `pinky` in order to be compatible with macOS.
* `brew install coreutils` (prefix is 'g', e.g. `gwho`, `gstat`, etc.
* switch paths for testing to something that's available on both OSs,
e.g. `/boot` -> `/bin`, etc.
* switch paths for testing to the macOS equivalent,
e.g. `/dev/pts/ptmx` -> `/dev/ptmx`, etc.
* exclude paths when no equivalent is available,
e.g. `/proc`, `/etc/fstab`, etc.
* refactor tests to make better use of the testing API
* fix a warning in utmpx.rs to print to stderr instead of stdout
* fix long_usage text in `who`
* fix minor output formatting in `stat`
* the `expected_result` function should be refactored
to reduce duplicate code
* more tests should be adjusted to not only run on `target_os = linux`
Fix a bug in which the incorrect character was being used to indicate
"round up to the nearest multiple" mode. The character was "*" but it
should be "%". This commit corrects that.
Change the error message for when the reference file (the `-r` argument)
is not found to match GNU coreutils. This commit also eliminates a
redundant call to `File::open`; the file need not be opened because the
size in bytes can be read from the result of `std::fs::metadata()`.
Change the interface provided by the `parse_size()` function to reduce
its responsibilities to just a single task: parsing a number of bytes
from a string of the form '123KB', etc. Previously, the function was
also responsible for deciding which mode truncate would operate in.
Furthermore, this commit simplifies the code for parsing the number and
unit to be less verbose and use less mutable state.
Finally, this commit adds some unit tests for the `parse_size()`
function.
Previous version would perform an amount of work proportional to `CHUNK_SIZE`,
so this wasn't a valid way to benchmark at multiple values of that constant.
The `TryInto` implementation for `&mut [T]` to `&mut [T; N]` relies on `const`
generics, and is available in (stable) Rust v1.51 and later.
Change the behavior of `head` to display an error for each problematic
file, instead of displaying an error message for the first problematic
file and terminating immediately at that point. This change now matches
the behavior of GNU `head`.
Before this commit, the first error caused the program to terminate
immediately:
$ head a b c
head: error: head: cannot open 'a' for reading: No such file or directory
After this commit:
$ head a b c
head: cannot open 'a' for reading: No such file or directory
head: cannot open 'b' for reading: No such file or directory
head: cannot open 'c' for reading: No such file or directory
Instead of using a BufReader and reading each line separately,
allocating a String for each one, we read to a chunk. Lines are
references to this chunk. This makes the allocator's job much easier
and yields performance improvements.
Chunks are read on a separate thread to further improve performance.
Fix a bug in which `head` failed to print headings for `stdin` inputs
when reading from multiple files, and fix another bug in which `head`
failed to print a blank line between the contents of a file and the
heading for the next file when reading multiple files. The output now
matches that of GNU `head`.