This commit aim to correct #2645.
The origin of the problem was that `handle_obsolete` was not looking for
named signals but only for numbers preceded by '-'. I have made a few
changes:
- Change `handle_obsolete` to use `signal_by_name_or_value` to parse the
possible signal.
- Since now the signal is already parsed return an `Option<usize>`
instead of `Option<&str>` to parse later.
- Change the way to decide the pid to use from a `match` to a `if`
because the tested element are actually not the same for each branch.
- Parse the signal from the `-s` argument outside of `kill` function for
consistency with the "obsolete" signal case.
- Again for consistency, parse the pids outside the `kill` function.
This makes it no longer possible to pass multiple filenames, but every
other implementation I tried (GNU, busybox, FreeBSD, sbase) also
forbids that so I think it's for the best.
* chown: support '.' as 'user.group' separator (like ':')
It also manages the case:
user.name:group (valid too)
Should fix:
gnu/tests/chown/separator.sh
* Fixed some documentation of display_item_long that was missed in #2623
* Implemented coloring of `ls -l` symlink targets( after the arrow `->`).
* Documented display_file_name to some extent.
* Ran rustfmt as part of mitigating CI chain errors.
* Removed unused variables and code in test_ls_long_format as per #2623
specifically, as per
https://github.com/uutils/coreutils/pull/2623#pullrequestreview-742386707
* Added a thorough test for `ls -laR --color` symlink coloring implemented in this branch.
* renamed test files and dirs to shorter names and ran rustfmt.
* Changed the order with which files are expected to match the change in their name.
* Bettered some comments.
* Removed needless borrow. Fixed that one clippy warning.
* Moved the cfg not windows up to the function level
because this function is meant to only run in non-windows OS (with
groups and unames).
Fixes the unused variable warning in CI.
Part of the code was transliterated from GNU's implementation in C,
and used flags to store settings. Instead, we can use enums to avoid
magic values or binary operations to extract flags.
This way later code can assume `src` and `dest` to be the actual paths
of source and destination, and do not have to constantly check
`options.dereference`.
This requires moving the error context calculation to the beginning as
well, since later steps no longer operate with the same file paths as
supplied by the user.
Fix a mix-up between ownership and mode. The latter (mode / file permissions)
can also be set on windows (which however only affects the read-only flag),
while there doesn't seem to be a straight-forward way to change file ownership
on windows.
Copy the acl as well when copying the mode. This is a non-default feature and can be
enabled with --features feat_acl, because it doesn't seem to work on CI.
It is only available for unix so far.
Copy the SELinux context if possible.
Makes the -o auto option construct a format at initialization, rather
than try to handle it as a special case when printing lines. Fixes bugs
when combined with -e, especially when combined with -a.
* Used .as_path() and .as_str() when required:
when the argument required is a Path and not a PathBuf,
or an str and not a Path, respectively.
* Changed display_items to take Vec<PathData>, which is passed, instead of [PathData]
* Added a pad_right function.
* Implemented column-formating to mimic the behavior of GNU coreutils's ls
Added returns in display_dir_entry_size that keep track of uname and
group lengths.
Renamed variables to make more sense.
Changed display_item_long to take all the lengths it needs to render
correctly.
Implemented owner, group, and author padding right to mimic GNU ls.
* Added a todo for future quality-of-life cache addition.
* Documented display_item_long, as a first step in documenting all functions.
* Revert "Used .as_path() and .as_str() when required:"
This reverts commit b88db6a817.
* Revert "Changed display_items to take Vec<PathData>, which is passed, instead of [PathData]"
This reverts commit 0c690dda8d.
* Ran cargo fmt to get rid of Style/format `fmt` testing error.
* Added a test for `ls -l` and `ls -lan` line formats.
* Changed uname -> username for cspell. Removed extra blank line for rustfmt.
Fix a bug in `seq` where the number of characters needed to print the
number was computed incorrectly in some cases. This commit changes the
computation of the width to be after parsing the number instead of
before, in order to accommodate inputs like `1e3`, which requires four
digits when printing the number, not three.
Combine the `first`, `increment`, and `last` parameters of the
`print_seq()` and `print_seq_integers()` functions into a `RangeF64` or
`RangeInt` type, respectively.
- Implement all of GNU's fiddly little details
- Don't assume Linux for error codes
- Accept badly-encoded filenames
- Report errors after the fact instead of checking ahead of time
- General cleanup
rmdir now passes GNU's tests.
If reading fails midway through then a count should be reported for
what could be read.
If opening a file fails then no count should be reported.
The exit code shouldn't report the number of failures, that's fragile
in case of many failures.
Errors are now always shown with the corresponding filename.
Errors are no longer converted into warnings. Previously `wc < .`
would cause a loop.
Checking whether something is a directory is no longer done in
advance. This removes race conditions and the edge case where stdin is
a directory.
The custom error type is removed because io::Error is now enough.
- Reuse allocations for read lines
- Increase splice size
- Check if /dev/null was opened correctly
- Do not discard read bytes after I/O error
- Add fast line counting with bytecount
Unify the usage of macros `return_if_err` and `crash_if_err`. As
`return_if_err` is used only in `uumain` routines of utilities, it
achieves the same thing as `crash_if_err`, which calls the `crash!`
macro to terminate function execution immediately.
Unify the usage of macros `safe_unwrap` and `crash_if_err` that are
identical to each other except for the assumption of a default error
code. Use the more generic `crash_if_err` so that `safe_unwrap` is now
obsolete and can be removed.
Remove a copy operation of the input buffer being read for digest when
reading in text mode on Windows. Previously, the code was copying the
buffer to a completely new `Vec`, replacing "\r\n" with "\n". Instead,
the code now scans for the indices at which each "\r\n" occurs in the
input buffer and inputs into the digest only the characters before the
"\r" and after it.
Report errors properly instead of panicking.
Replace zero_copy by a simpler specialized private module.
Do not assume splices move all data at once.
Use the modern uutils machinery.
Remove the "latency" feature. The time it takes to prepare the buffer
is drowned out by the startup time anyway.
yes: Add tests
yes: Fix long input test on Windows
Fix a bug in `tac` where multi-character line separators would cause
incorrect behavior when there was overlap between candidate matches in
the input string. This commit adds a dependency on `memchr` in order to
use the `memchr::memmem::rfind_iter()` function to scan for
non-overlapping instances of the specified line separator characters,
scanning from right to left.
Fixes#2580.
* hashsum: support --check for var. length outputs
Add the ability for `hashsum --check` to work with algorithms with
variable output length. Previously, the program would terminate with an
error due to constructing an invalid regular expression.
* fixup! hashsum: support --check for var. length outputs
* tac: correct behavior of -b option
Correct the behavior of `tac -b` to match that of GNU coreutils
`tac`. Specifically, this changes `tac -b` to assume *leading* line
separators instead of the default *trailing* line separators.
Before this commit, the (incorrect) behavior was
$ printf "/abc/def" | tac -b -s "/"
def/abc/
After this commit, the behavior is
$ printf "/abc/def" | tac -b -s "/"
/def/abc
Fixes#2262.
* fixup! tac: correct behavior of -b option
* fixup! tac: correct behavior of -b option
Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
Hopefully will be feature parity with GNU `tr`.
Signed-off-by: Hanif Bin Ariffin <hanif.ariffin.4326@gmail.com>
Implemented a bit of new expansion module
Signed-off-by: Hanif Bin Ariffin <hanif.ariffin.4326@gmail.com>
Implemented delete operation
Signed-off-by: Hanif Bin Ariffin <hanif.ariffin.4326@gmail.com>
Partially implemented delete operation
Will go through translate next.
Signed-off-by: Hanif Bin Ariffin <hanif.ariffin.4326@gmail.com>
Fix formatting...
Signed-off-by: Hanif Bin Ariffin <hanif.ariffin.4326@gmail.com>
Implemented translation feature
Signed-off-by: Hanif Bin Ariffin <hanif.ariffin.4326@gmail.com>
Move the creation of temporary files into next_file so that it happens
while the lock is taken.
Previously, only the computation of the new file path happened while
the lock was taken, while the creation happened later.
csplit fails when suffix has no flags:
$ csplit result.expected -f /tmp/EXPECT -b "%d" "/^## alternative ##$/" {*}
csplit: error: incorrect conversion specification in suffix
This is supported by original csplit
If options::WIDTH is not given, we should try to use the terminal width.
If that is unavailable, we should fall back to the 'COLUMNS' environment variable.
If that is unavailable (or invalid), we should fall back to a default of 80.
As custom errors are prefered over wrapping around common errors, the
distinction between UCommonError and UCustomError is removed. This
reduces the number of types and makes the error handling easier to
understand.
The following test case read stdin instead of file:
```
echo abcdefg > file
cargo run -- od --format x1 file
```
This is because the -t/--format argument was able to absorb multiple
arguments after it. This has now been fixed, and a test case is added
to ensure it will not happen again.
When hitting Ctrl+C sort now deletes any temporary files. To make this easier I
created a new struct `TmpDirWrapper` that handles the creation of new temporary
files and the registration of the signal handler.
The ToDo list was updated to mark `chcon` as done.
Building and testing `chcon` requires enabling the `feat_selinux` feature. If `make` is used for building, then please specify `SELINUX_ENABLED=1` if building and testing on a system where SELinux is not enabled.
Fix an issue where `basename` would print the empty path when provided
an input comprising only slashes (for example, "/" or "//" or
"///"). Now it correctly prints the path "/".
In such cases we have to create a temporary copy of the input file to prevent
overwriting the input with the output. This only affects merge sort, because it
is the only mode where we start writing to the output before having read all inputs.
Since lines that compare equal should be sorted together, we need to first
compare the lines (taking settings into account). Only if they do not compare
equal we should compare the hashes.
Use new macro to construct UIoError types from `std::io::Error` before
printing. This gives consistent error messages across all utilities as it
prepends custom errors to the error description for the respective application
and error type. This saves the developers from manually appending the
`std::io::Error`-specific error messages.
Drop the previous flags that would tell whether a noncritical error occured
during execution in favor of the `show!` macro from the error submodule.
This allows us to generate regular error types during execution to signify
failures inside the program, but without prematurely aborting program execution
if not needed or specified.
Also make verbose outputs use `print!` and friends instead of `show_error!` to
ensure verbose output is redirected to stdout, not stderr.
Add support for
* space-separated list of tab stops, like `--tabs="2 4 6"`,
* mixed comma- and space-separated lists, like `--tabs="2,4 6"`,
* the slash specifier in the last tab stop, like `--tabs=1,/5`,
* the plus specifier in the last tab stop, like `--tabs=1,+5`.
For whatever reason, the following is equivalent,
cargo run -- rm --presume-input-tty
cargo run -- rm ---presume-input-tty
cargo run -- rm -----presume-input-tty
cargo run -- rm ---------presume-input-tty
Signed-off-by: Hanif Bin Ariffin <hanif.ariffin.4326@gmail.com>
The custom usage string does not have to include the "sort\nUsage:" part,
because this part is already printed by clap.
It prevents the following duplication:
USAGE:
sort
Usage:
sort [OPTION]... [FILE]..
Now, only the following is printed:
USAGE:
sort [OPTION]... [FILE]...
- Adds more jargon to the spellchecker jargon file.
- Adds more ignores to the per-project ignores.
- Fixes clippy warnigns about uneeded clones on non-linux.
- Block/Unblock now account for system dependend newlines.
Change the behavior of `tac` when there are no line separators in the
input. Previously, it was appending an extra line separator; this commit
prevents that from happening. The input is now writted directly to
stdout.
Correct the error message produced by `tac` when trying to read from a
directory. Previously if the path 'a' referred to a directory, then
running `tac a` would produce the error message
dir: read error: Invalid argument
after this commit it produces
a: read error: Invalid argument
which matches GNU `tac`.
Handle additional edge cases arising from test(1)’s lack of reserved words.
For example, left parenthesis followed by an operator could indicate
either
* string comparison of a literal left parenthesis, e.g. `( = foo`
* parenthesized expression using an operator as a literal, e.g. `( = != foo )`
- Adds words to cspell exceptions
- Converts test macros to use Default trait.
- Converts parser to use Default trait.
- Adds Windows-friendly test files for block/unblock when nl is present
in test/spec file.
- Removes dd from feat_require_unix (keeps it in feat_common_core)
- Changes name of make_linux_iflags parameter from oflags to iflags.
- Removes roughed out SIGINFO impl.
- Renames plen -> target_len.
- Removes internal fn def for build_blocks and replaces with a call to
chunks from std.
- Renames apply_ct to the more descriptive apply_conversion
- Replaces manual swap steps in perform_swab with call to swap from std.
- Replaces manual solution with chunks where appropriate (in Read, and Write impl).
- Renames xfer -> transfer (in local variable names).
- Improves documentation for dd_filout/dd_stdout issue.
- Removes commented debug_assert statements.
- Modifies ProdUpdate to contain ReadStat and WriteStat rather than
copying their fields.
- Addresses verbose return in `Output<File>::new(...)`
- Resoves compiler warning when built as release when signal handler fails to
register.
- Derives _Default_ trait on ReadStat.
- Adds comments for truncated lines in block unblock tests
- Removes `as u8` in block unblock tests.
- Removes unecessary `#[inline]`
- Delegates multiplier string parsing to uucore::parse_size.
- Renames 'unfailed' -> 'succeeded' for clairity.
- Removes #dead_code warnings. No clippy warnings on my local machine.
- Reworks signal handler to better accomodate platform-specific signals.
- Removes explicit references to "if" and "of" in dd.rs.
- Removes explicit references to "bs", "ibs", "cbs" and "status" in
parseargs.rs.
- Removes `#[allow(deadcode)]` for OFlags, and IFlags.
- Removes spellchecker ignore from all dd files.
- Adds tests for 'traditional' and 'modern' CLI.
- Address Rust fmt issue
- ignores run-without-noatime test which fails on some build machines
- adds cspell:disable to all project files.
- adds .../dd/fixtures/cspell.json to ignore test fixtures.
- Adds cspell.json. Hopefully this will make you happy, spellchecker.
- Removes non-functional spellchecker-ignore tags
- Adds a sleep call to the no noatime test. Some systems were did not notice a changed
atime without the option present.
- Adds a test with a unicode filename.
- Addresses clippy lints and rustfmt issues.
- adds project header to multiple files
- updates spell check skip words
- removes linux only flags direct,noatime from mac_os build
- applies rustfmt to test_dd
- runs rustfmt on test_dd.rs
- eliminates compiler warnings
- adds many words to spellchecker ignore list
- adds sanity test for vexing conv=nocreat issue. Still WIP.
The `uname` `-o` switch reports the operating system used. If the GNU C
standard library (glibc) is not in use, for example if musl is being
used instead, report "Linux" instead of "GNU/Linux".
Fails with:
```
error: use of irregular braces for `write!` macro
--> src/uucore/src/lib/features/encoding.rs:19:17
|
19 | #[derive(Debug, Error)]
| ^^^^^
|
= note: `-D clippy::nonstandard-macro-braces` implied by `-D warnings`
help: consider writing `Error`
--> src/uucore/src/lib/features/encoding.rs:19:17
|
19 | #[derive(Debug, Error)]
| ^^^^^
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#nonstandard_macro_braces
= note: this error originates in the derive macro `Error` (in Nightly builds, run with -Z macro-backtrace for more info)
```
This makes clap wrap the help text according to the terminal width,
which improves readability for terminal widths < 120 chars,
because clap defaults to a width of 120 chars without this feature.
This reimplements version_cmp, which is used in sort and ls to sort
according to versions.
However, it is not bug-for-bug identical with GNU's implementation.
I reported a bug with GNU here:
https://lists.gnu.org/archive/html/bug-coreutils/2021-06/msg00045.html
This implementation does not contain the bugs regarding the handling of
file extensions and null bytes.
Adds the ability to perform file backups before installing newer files on top
of existing ones. Adds a status message about backups to stdout if running in
verbose mode.
Signed-off-by: Hanif Bin Ariffin <hanif.ariffin.4326@gmail.com>
Keep one of the texts in-place
Signed-off-by: Hanif Bin Ariffin <hanif.ariffin.4326@gmail.com>
Reduced the fix to just formatting changes
Signed-off-by: Hanif Bin Ariffin <hanif.ariffin.4326@gmail.com>
The '--backup' option would previously accept arguments separated from the
option either by a space or an equals sign. The GNU implementation strictly
requires an "equals" for argument separation.
As the argument to '--backup' is optional, the equals sign mustn't be ommited
as otherwise there is no way to tell a file argument apart from an argument
that's meant for the '--backup' option. This ensures that if '--backup' is
present it either has no further associated arguments (i.e. fallback to the
default), or the arguments are separated by an equals sign.
Gerenal numeric sort works by comparing pre-parsed floating point
numbers. That means that we don't have to store the &str the float was
parsed from.
As a result, memory usage was slightly reduced for general numeric sort.
`LANGUAGE=C` is not enough, `LC_ALL=C` is needed as the environment
variable that overrides all the other localization settings.
e.g.
```bash
$ LANGUAGE=C id foobar
id: ‘foobar’: no such user
$ LC_ALL=C id foobar
id: 'foobar': no such user
```
* replace `LANGUAGE` with `LC_ALL` as environment variable in the tests
* fix the the date string of affected uutils
* replace `‘` and `’` with `'`
Data that was previously boxed inside the `Line` struct was moved to
separate vectors. Inside of each `Line` remains only an index that
allows to access that data.
This helps with keeping the `Line` struct small and therefore reduces
memory usage in most cases.
Additionally, this improves performance because one big allocation (the
vectors) are faster than many small ones (many boxes inside of each
`Line`). Those vectors can be reused as well, reducing the amount of
(de-)allocations.
- Windows hidden file attribute determination would assume symbolic link
to be valid and would panic
- Check symbolic link's attributes if the link points to non-existing
file
- For dangling symlinks, errors should only be reported if
dereferencing options were passed and dereferencing was applicable to
the particular symlink
- With -i parameter, report '?' as the inode number for dangling
symlinks
When invoked via '[' name, last argument must be ']' or we bail out with
syntax error. Then the trailing ']' is simply disregarded and processing
happens like usual.
- The dd info page mentions a special fast-read framework if no conv=FLAG is
specified (see bs=N) which I left space for. As it turns out, this is performed already
so it does not need to be implemented.
- When we have finished reading from a temproary file, we can immediately
delete it.
- Use one single directory for all temporary files.
- Only create the temporary directory when needed.
- Also compress temporary files created by the merge step if requested.
1. Its better to bump u16 to usize than the other way round.
2. Highly unlikely to have a terminal with usize rows...makes making sense of the code easier.
Signed-off-by: Hanif Bin Ariffin <hanif.ariffin.4326@gmail.com>
- Adds print fn's
- Modifies internal fn's as needed to track read/write state
- Modifies status update thread to respect status level
- Adds signal handler for SIGUSR1 (print xfer stats)
* --check=silent and --check=quiet, which are equivalent to -C.
* --check=diagnose-first, which is the same as --check
We also allow -c=<value>, which confuses GNU sort.
`timeout` used to set the timeout to 0 when -k was not set. This
collided with the behavior of 0 timeouts, which disable the timeout.
When -k is not set the process should not be killed.
To prevent clap from parsing flags for the command to run as flags for
timeout, remove the "args" positional argument, but allow to pass flags
via the "command" positional arg.
As discussed here: https://github.com/uutils/coreutils/pull/2361
the group IDs returned for GNU's 'group' and GNU's 'id --groups'
starts with the effective group ID.
This implements a wrapper for `entris::get_groups()` which mimics
GNU's behaviour.
* add tests for `id`
* add tests for `groups`
* fix `id --groups --real` to no longer ignore `--real`
- Splits read fn into conv=sync and standard (consecutive)
versions.
- Fixes bug in previous read/fill where short reads would copy to wrong
position in output buffer.
- Fixes bug in unit tests. Empty source would pass (since no bytes
failed to match).
- Use `unicode_segmentation` and `unicode_width` to determine proper `break_line` position.
- Keep track of total_width as suggested by @tertsdiepraam.
- Add unittest for ZWJ unicode case
Related to #2319.
consistent
* add tests for each flag that takes NUM/SIZE arguments
* fix bug in tail where 'quiet' and 'verbose' flags did not override each other POSIX style
dst may or may not exist. In case it exists it might already be a symlink.
In neither case we should try to canonicalize dst, only its parent directory.
https://www.gnu.org/software/coreutils/manual/html_node/ln-invocation.html
> Relative symbolic links are generated based on their canonicalized
> **containing directory**, and canonicalized targets.
Port argument parsing from getopts to clap.
The only difference I have observed is that clap auto-generates -h and
-V short options for help and version, and there is no way (in clap 2.x)
to disable them.
Instead of using into_raw_fd(), which transfers ownership and
requires us to close the file descriptor manually,
use as_raw_fd(), which does not transfer ownership to us but drops the
file descriptor when the original file is dropped (in our case at the
end of the function).
We were reporting "no match" when sorting something like "0 ". This is
because we don't distinguish between 0 and invalid lines when sorting.
For debug output we have to get this information back.
GNU seq does not support -t, but always outputs a newline at the end.
Therefore, our default for -t should be \n.
Also removes support for escape sequences (interpreting a literal "\n"
as a newline). This is not what GNU seq is doing, and unexpected.
If we notice that we can represent all arguments as BigInts, take a
different code path. Just like GNU seq this means we can print an
infinite amount of numbers in this case.
When a single directory is passed to ls in recursive mode, uutils ls
won't print the directory name
======================
GNU ls:
z:
======================
======================
uutils ls:
======================
This commit fixes this minor inconsistency and adds corresponding test.
Closes#2254. We should only inherit global settings for keys when there
are absolutely no options attached to the key.
The default key (matching the whole line) is implicitly added only if no
keys are supplied.
Improved some error messages by including more context.
* expr: support arbitrary precision integers
Instead of i64s we now use BigInts for integer operations. This means
that no result or input can be out of range.
The representation of integer flags was changed from i64 to u8 to make
their intention clearer.
* expr: allow big numbers as arguments as well
Also adds some tests
* expr: use num-traits to check bigints for 0 and 1
* expr: remove obsolete refs
match ergonomics made these avoidable.
* formatting
Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
Reorganize the code in `truncate.rs` into three distinct functions
representing the three modes of operation of the `truncate` program. The
three modes are
- `truncate -r RFILE FILE`, which sets the length of `FILE` to match the
length of `RFILE`,
- `truncate -r RFILE -s NUM FILE`, which sets the length of `FILE`
relative to the given `RFILE`,
- `truncate -s NUM FILE`, which sets the length of `FILE` either
absolutely or relative to its curent length.
This organization of the code makes it more concise and easier to
follow.
Create a method that computes the final target size in bytes for the
file to truncate, given the reference file size and the parameter to the
`TruncateMode`.
Add a helper function to contain the code for parsing the size and the
modifier symbol, if any. This commit also changes the `TruncateMode`
enum so that the parameter for each "mode" is stored along with the
enumeration value. This is because the parameter has a different meaning
in each mode.
Remove "read" permissions from the `OpenOptions` when opening a new file
just to truncate it. We will never read from the file, only write to
it. (Specifically, we will only call `File::set_len()`.)
* sort: crash when failing to open an input file
Instead of ignoring files we fail to open, crash.
The error message does not exactly match gnu, but that would require
more effort.
* use split_whitespace instead of a manual implementation
* fix expected error on windows
* sort: update expected error message
* sort: disable support for thousand separators
In order to be compatible with GNU, we have to disable thousands
separators. GNU does not enable them for the C locale, either.
Once we add support for locales we can add this feature back.
* sort: delete unused fixtures
* sort: compare -0 and 0 equal
I must have misunderstood this when implementing, but GNU considers
-0, 0, and invalid numbers to be equal.
* sort: strip blanks before applying the char index
* sort: don't crash when key start is after key end
* sort: add "no match" for months at the first non-whitespace char
We should put the "^ no match for key" indicator at the first
non-whitespace character of a field.
* sort: improve support for e notation
* sort: use maches! macros
Add some abstractions to simplify the `rbuf_but_last_n_lines()`
function, which implements the "take all but the last `n` lines"
functionality of the `head` program. This commit adds
- `RingBuffer`, a fixed-size ring buffer,
- `ZLines`, an iterator over zero-terminated "lines",
- `TakeAllBut`, an iterator over all but the last `n` elements of an
iterator.
These three together make the implementation of
`rbuf_but_last_n_lines()` concise.
Reorganize the code in `truncate.rs` into three distinct functions
representing the three modes of operation of the `truncate` program. The
three modes are
- `truncate -r RFILE FILE`, which sets the length of `FILE` to match the
length of `RFILE`,
- `truncate -r RFILE -s NUM FILE`, which sets the length of `FILE`
relative to the given `RFILE`,
- `truncate -s NUM FILE`, which sets the length of `FILE` either
absolutely or relative to its curent length.
This organization of the code makes it more concise and easier to
follow.
Create a method that computes the final target size in bytes for the
file to truncate, given the reference file size and the parameter to the
`TruncateMode`.
Add a helper function to contain the code for parsing the size and the
modifier symbol, if any. This commit also changes the `TruncateMode`
enum so that the parameter for each "mode" is stored along with the
enumeration value. This is because the parameter has a different meaning
in each mode.
Remove "read" permissions from the `OpenOptions` when opening a new file
just to truncate it. We will never read from the file, only write to
it. (Specifically, we will only call `File::set_len()`.)
`sort` supports three ways to specify the sort mode: a long option
(e.g. --numeric-sort), a short option (e.g. -n) and the sort flag
(e.g. --sort=numeric).
This adds support for the sort flag.
Additionally, sort modes now conflict, which means that an error is
shown when multiple modes are passed, instead of silently picking a mode.
For consistency, I added the `random` sort mode to the `SortMode` enum,
instead of it being a bool flag.
Change the behavior of `wc` to print the counts for a file as soon as
it is computed, instead of waiting to compute the counts for all files
before writing any output to `stdout`. The new behavior matches the
behavior of GNU `wc`.
The old behavior looked like this (the word "hello" is entered on
`stdin`):
$ wc emptyfile.txt -
hello
0 0 0 emptyfile.txt
1 1 6
1 1 6 total
The new behavior looks like this:
$ wc emptyfile.txt -
0 0 0 emptyfile.txt
hello
1 1 6
1 1 6 total
Instead of overflowing when calculating the buffer size, use
saturating_{pow, mul}.
When failing to parse the buffer size, we now crash instead of silently
ignoring the error.
To make this work we make default sort a special case of external sort.
External sorting uses auxiliary files for intermediate chunks. However,
when we can keep our intermediate chunks in memory, we don't write them
to the file system at all. Only when we notice that we can't keep them
in memory they are written to the disk.
Additionally, we don't allocate buffers with the capacity of their
maximum size anymore. Instead, they start with a capacity of 8kb and are
grown only when needed.
This makes sorting smaller files about as fast as it was before
(I'm seeing a regression of ~3%), and allows us to seamlessly continue
with auxiliary files when needed.
For any commandline arguments, ls should print the argument as is (and
not truncate to just the file name)
For any other files it reaches (say through recursive exploration), ls
should print just the filename (as path is printed once when we enter
the directory)
A lot of tests depend on GNU's coreutils to be installed in order
to obtain reference values during testing.
In these cases testing is limited to `target_os = linux`.
This PR installs GNU's coreutils on "github actions" and adjusts the
tests for `who`, `stat` and `pinky` in order to be compatible with macOS.
* `brew install coreutils` (prefix is 'g', e.g. `gwho`, `gstat`, etc.
* switch paths for testing to something that's available on both OSs,
e.g. `/boot` -> `/bin`, etc.
* switch paths for testing to the macOS equivalent,
e.g. `/dev/pts/ptmx` -> `/dev/ptmx`, etc.
* exclude paths when no equivalent is available,
e.g. `/proc`, `/etc/fstab`, etc.
* refactor tests to make better use of the testing API
* fix a warning in utmpx.rs to print to stderr instead of stdout
* fix long_usage text in `who`
* fix minor output formatting in `stat`
* the `expected_result` function should be refactored
to reduce duplicate code
* more tests should be adjusted to not only run on `target_os = linux`
Fix a bug in which the incorrect character was being used to indicate
"round up to the nearest multiple" mode. The character was "*" but it
should be "%". This commit corrects that.
Change the error message for when the reference file (the `-r` argument)
is not found to match GNU coreutils. This commit also eliminates a
redundant call to `File::open`; the file need not be opened because the
size in bytes can be read from the result of `std::fs::metadata()`.
Change the interface provided by the `parse_size()` function to reduce
its responsibilities to just a single task: parsing a number of bytes
from a string of the form '123KB', etc. Previously, the function was
also responsible for deciding which mode truncate would operate in.
Furthermore, this commit simplifies the code for parsing the number and
unit to be less verbose and use less mutable state.
Finally, this commit adds some unit tests for the `parse_size()`
function.
Previous version would perform an amount of work proportional to `CHUNK_SIZE`,
so this wasn't a valid way to benchmark at multiple values of that constant.
The `TryInto` implementation for `&mut [T]` to `&mut [T; N]` relies on `const`
generics, and is available in (stable) Rust v1.51 and later.
Change the behavior of `head` to display an error for each problematic
file, instead of displaying an error message for the first problematic
file and terminating immediately at that point. This change now matches
the behavior of GNU `head`.
Before this commit, the first error caused the program to terminate
immediately:
$ head a b c
head: error: head: cannot open 'a' for reading: No such file or directory
After this commit:
$ head a b c
head: cannot open 'a' for reading: No such file or directory
head: cannot open 'b' for reading: No such file or directory
head: cannot open 'c' for reading: No such file or directory
Instead of using a BufReader and reading each line separately,
allocating a String for each one, we read to a chunk. Lines are
references to this chunk. This makes the allocator's job much easier
and yields performance improvements.
Chunks are read on a separate thread to further improve performance.
Fix a bug in which `head` failed to print headings for `stdin` inputs
when reading from multiple files, and fix another bug in which `head`
failed to print a blank line between the contents of a file and the
heading for the next file when reading multiple files. The output now
matches that of GNU `head`.
Fix two issues with the string formatting width for counts displayed
by `wc`.
First, the output was previously not using the default minimum width
(seven characters) when reading from `stdin`. This commit corrects
this behavior to match GNU `wc`. For example,
$ cat alice_in_wonderland.txt | wc
5 57 302
Second, if at least 10^7 bytes were read from `stdin` *after* reading
from a smaller regular file, then every output row would have width
8. This disagrees with GNU `wc`, in which only the `stdin` row and the
total row would have width 8. This commit corrects this behavior to
match GNU `wc`. For example,
$ printf "%.0s0" {1..10000000} | wc emptyfile.txt -
0 0 0 emptyfile.txt
0 1 10000000
0 1 10000000 total
Fixes#2186.
Change the error messages that get printed to `stderr` for compatibility
with GNU `wc` when an input is a directory and when an input does not
exist.
Fixes#2211.
This closes#2181.
`who --lookup` is failing with a runtime panic (double free).
Since `crate::dns-lookup` already includes a safe wrapper for `getaddrinfo`
I used this crate instead of further debugging the existing code in
utmpx::canon_host().
* It was neccessary to remove the version constraint for libc in uucore.
Refactor code from the `backwards_thru_file()` function into a new
`ReverseChunks` iterator, and use that iterator to simplify the
implementation of the `backwards_thru_file()` function. The
`ReverseChunks` iterator yields `Vec<u8>` objects, each of which
references bytes of a given file.
- add `==` as undocumented alias of `=`
- handle negated comparison of `=` as literal
- negation generally applies to only the first expression of a Boolean chain,
except when combining evaluation of two literal strings
Refactor common code out of two branches of the `unbounded_tail()`
function into a new `unbounded_tail_collect()` helper function, that
collects from an iterator into a `VecDeque` and keeps either the last
`n` elements or all but the first `n` elements.
This commit also adds a new struct, `RingBuffer`, in a new module,
`ringbuffer.rs`, to be responsible for keeping the last `n` elements
of an iterator.
When merging files we need to prioritize files that occur earlier in the
command line arguments with -m.
This also makes the extsort merge step (and thus extsort itself) stable again.
Refactor the counting code from the inner loop of the `wc` program
into the `WordCount::from_line()` associated function. This commit
also splits that function up into other helper functions that
encapsulate decoding characters and finding word boundaries from raw
bytes.
This commit also implements the `Sum` trait for the `WordCount`
struct, so that we can simply call `sum()` on an iterator that yields
`WordCount` instances.