Fix a bug when getting all but the first NUM lines or bytes of a file
via `tail -n +NUM <file>` or `tail -c +NUM <file>`. The bug only
existed when a file is given as an argument; it did not exist when the
input data came from stdin.
Support `-z` option when the input is not a seekable file. Previously,
the option was accepted by the argument parser, but it was being
ignored by the application logic.
This expands the error message that is printed if either input file has
an unsorted line. Both the program name (join) and the offending line
are printed out with the message to match the behaviour of the GNU
utility.
This commit replaces generic Results with UResults in some key
functions in numfmt. As a result of this, we can provide different
exit codes for different errors, which resolves ~70 failing test
cases in the GNU numfmt.pl test suite.
Fix two issues with the filename creation algorithm. First, this
corrects the behavior of the `-a` option. This commit ensures a
failure occurs when the number of chunks exceeds the number of
filenames representable with the specified fixed width:
$ printf "%0.sa" {1..11} | split -d -b 1 -a 1
split: output file suffixes exhausted
Second, this corrects the behavior of the default behavior when `-a`
is not specified on the command line. Previously, it was always
settings the filenames to have length 2 suffixes. This commit corrects
the behavior to follow the algorithm implied by GNU split, where the
filename lengths grow dynamically by two characters once the number of
chunks grows sufficiently large:
$ printf "%0.sa" {1..91} | ./target/debug/coreutils split -d -b 1 \
> && ls x* | tail
x81
x82
x83
x84
x85
x86
x87
x88
x89
x9000
* print error in the correct order by flushing the stdout buffer before printing an error
* print correct GNU error codes
* correct formatting for config.inode, and for dangling links
* correct padding for Format::Long
* remove colors after the -> link symbol as this doesn't match GNU
* correct the major, minor #s for char devices, and correct padding
* improve speed for all metadata intensive ops by not allocating metadata unless in a Sort mode
* new tests, have struggled with how to deal with stderr, stdout ordering in a test though
* tried to implement UIoError, but am still having issues matching the formatting of GNU
Co-authored-by: electricboogie <32370782+electricboogie@users.noreply.github.com>
Fix a bug in which a negative decimal input would not be displayed with
the correct width in the output. Before this commit, the output was
incorrectly
$ seq -w -.1 .1 .11
-0.1
0.0
0.1
After this commit, the output is correctly
$ seq -w -.1 .1 .11
-0.1
00.0
00.1
The code was failing to take into account that the input decimal "-.1"
needs to be displayed with a leading zero, like "-0.1".
Pad infinity and negative infinity values with spaces when using the
`-w` option to `seq`. This corrects the behavior of `seq` to match that
of the GNU version:
$ seq -w 1.000 inf inf | head -n 4
1.000
inf
inf
inf
Previously, it incorrectly padded with 0s instead of spaces.
* seq: use BigDecimal to represent floats
Use `BigDecimal` to represent arbitrary precision floats in order to
prevent numerical precision issues when iterating over a sequence of
numbers. This commit makes several changes at once to accomplish this
goal.
First, it creates a new struct, `PreciseNumber`, that is responsible for
storing not only the number itself but also the number of digits (both
integer and decimal) needed to display it. This information is collected
at the time of parsing the number, which lives in the new
`numberparse.rs` module.
Second, it uses the `BigDecimal` struct to store arbitrary precision
floating point numbers instead of the previous `f64` primitive
type. This protects against issues of numerical precision when
repeatedly accumulating a very small increment.
Third, since neither the `BigDecimal` nor `BigInt` types have a
representation of infinity, minus infinity, minus zero, or NaN, we add
the `ExtendedBigDecimal` and `ExtendedBigInt` enumerations which extend
the basic types with these concepts.
* fixup! seq: use BigDecimal to represent floats
* fixup! seq: use BigDecimal to represent floats
* fixup! seq: use BigDecimal to represent floats
* fixup! seq: use BigDecimal to represent floats
* fixup! seq: use BigDecimal to represent floats
GNU rm allows the `-r` flag to be specified multiple times, but
uutils/coreutils would previously exit with an error.
I encountered this while attempting to run `make clean` on the
Postgres source tree (github.com/postgres/postgres).
Updates #1663.
Replace two loops that print leading and trailing 0s when printing a
number in fixed-width mode with a single call to `write!()` with the
appropriate formatting parameters.
Ensure that the `print_seq_integers()` function is called when the first
number and the increment are integers, regardless of the type of the
last value specified.
Change the way `seq` computes the number of digits needed to print a
number so that it works for inputs given in scientific notation.
Specifically, this commit parses the input string to determine whether
it is an integer, a float in decimal notation, or a float in scientific
notation, and then computes the number of integral digits and the number
of fractional digits based on that. This also supports floating point
negative zero, expressed in both decimal and scientific notation.
This makes it no longer possible to pass multiple filenames, but every
other implementation I tried (GNU, busybox, FreeBSD, sbase) also
forbids that so I think it's for the best.
* chown: support '.' as 'user.group' separator (like ':')
It also manages the case:
user.name:group (valid too)
Should fix:
gnu/tests/chown/separator.sh
* Fixed some documentation of display_item_long that was missed in #2623
* Implemented coloring of `ls -l` symlink targets( after the arrow `->`).
* Documented display_file_name to some extent.
* Ran rustfmt as part of mitigating CI chain errors.
* Removed unused variables and code in test_ls_long_format as per #2623
specifically, as per
https://github.com/uutils/coreutils/pull/2623#pullrequestreview-742386707
* Added a thorough test for `ls -laR --color` symlink coloring implemented in this branch.
* renamed test files and dirs to shorter names and ran rustfmt.
* Changed the order with which files are expected to match the change in their name.
* Bettered some comments.
* Removed needless borrow. Fixed that one clippy warning.
* Moved the cfg not windows up to the function level
because this function is meant to only run in non-windows OS (with
groups and unames).
Fixes the unused variable warning in CI.
- prevent duplicate errors from both us and `walkdir` by instructing `walkdir'
to skip directories we failed to read metadata for.
- don't directly display `walkdir`'s errors, but format them ourselves to
match gnu's format
Fix a mix-up between ownership and mode. The latter (mode / file permissions)
can also be set on windows (which however only affects the read-only flag),
while there doesn't seem to be a straight-forward way to change file ownership
on windows.
Copy the acl as well when copying the mode. This is a non-default feature and can be
enabled with --features feat_acl, because it doesn't seem to work on CI.
It is only available for unix so far.
Copy the SELinux context if possible.
Makes the -o auto option construct a format at initialization, rather
than try to handle it as a special case when printing lines. Fixes bugs
when combined with -e, especially when combined with -a.
* Used .as_path() and .as_str() when required:
when the argument required is a Path and not a PathBuf,
or an str and not a Path, respectively.
* Changed display_items to take Vec<PathData>, which is passed, instead of [PathData]
* Added a pad_right function.
* Implemented column-formating to mimic the behavior of GNU coreutils's ls
Added returns in display_dir_entry_size that keep track of uname and
group lengths.
Renamed variables to make more sense.
Changed display_item_long to take all the lengths it needs to render
correctly.
Implemented owner, group, and author padding right to mimic GNU ls.
* Added a todo for future quality-of-life cache addition.
* Documented display_item_long, as a first step in documenting all functions.
* Revert "Used .as_path() and .as_str() when required:"
This reverts commit b88db6a817.
* Revert "Changed display_items to take Vec<PathData>, which is passed, instead of [PathData]"
This reverts commit 0c690dda8d.
* Ran cargo fmt to get rid of Style/format `fmt` testing error.
* Added a test for `ls -l` and `ls -lan` line formats.
* Changed uname -> username for cspell. Removed extra blank line for rustfmt.
Fix a bug in `seq` where the number of characters needed to print the
number was computed incorrectly in some cases. This commit changes the
computation of the width to be after parsing the number instead of
before, in order to accommodate inputs like `1e3`, which requires four
digits when printing the number, not three.
- Implement all of GNU's fiddly little details
- Don't assume Linux for error codes
- Accept badly-encoded filenames
- Report errors after the fact instead of checking ahead of time
- General cleanup
rmdir now passes GNU's tests.
Errors are now always shown with the corresponding filename.
Errors are no longer converted into warnings. Previously `wc < .`
would cause a loop.
Checking whether something is a directory is no longer done in
advance. This removes race conditions and the edge case where stdin is
a directory.
The custom error type is removed because io::Error is now enough.
Report errors properly instead of panicking.
Replace zero_copy by a simpler specialized private module.
Do not assume splices move all data at once.
Use the modern uutils machinery.
Remove the "latency" feature. The time it takes to prepare the buffer
is drowned out by the startup time anyway.
yes: Add tests
yes: Fix long input test on Windows
Fix a bug in `tac` where multi-character line separators would cause
incorrect behavior when there was overlap between candidate matches in
the input string. This commit adds a dependency on `memchr` in order to
use the `memchr::memmem::rfind_iter()` function to scan for
non-overlapping instances of the specified line separator characters,
scanning from right to left.
Fixes#2580.
* hashsum: support --check for var. length outputs
Add the ability for `hashsum --check` to work with algorithms with
variable output length. Previously, the program would terminate with an
error due to constructing an invalid regular expression.
* fixup! hashsum: support --check for var. length outputs
* tac: correct behavior of -b option
Correct the behavior of `tac -b` to match that of GNU coreutils
`tac`. Specifically, this changes `tac -b` to assume *leading* line
separators instead of the default *trailing* line separators.
Before this commit, the (incorrect) behavior was
$ printf "/abc/def" | tac -b -s "/"
def/abc/
After this commit, the behavior is
$ printf "/abc/def" | tac -b -s "/"
/def/abc
Fixes#2262.
* fixup! tac: correct behavior of -b option
* fixup! tac: correct behavior of -b option
Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
Hopefully will be feature parity with GNU `tr`.
Signed-off-by: Hanif Bin Ariffin <hanif.ariffin.4326@gmail.com>
Implemented a bit of new expansion module
Signed-off-by: Hanif Bin Ariffin <hanif.ariffin.4326@gmail.com>
Implemented delete operation
Signed-off-by: Hanif Bin Ariffin <hanif.ariffin.4326@gmail.com>
Partially implemented delete operation
Will go through translate next.
Signed-off-by: Hanif Bin Ariffin <hanif.ariffin.4326@gmail.com>
Fix formatting...
Signed-off-by: Hanif Bin Ariffin <hanif.ariffin.4326@gmail.com>
Implemented translation feature
Signed-off-by: Hanif Bin Ariffin <hanif.ariffin.4326@gmail.com>
Setting the current directory in tests affects other tests, even if the
change is reverted after, because tests are run in parallel.
This should fix the flaky cp tests.
Since some tests run multiple commands, we have to re-calculate the
expected result for every run.
This is because the expected results depend on file timestamps, but the
files are re-created for every new TestScenario.
If options::WIDTH is not given, we should try to use the terminal width.
If that is unavailable, we should fall back to the 'COLUMNS' environment variable.
If that is unavailable (or invalid), we should fall back to a default of 80.
The following test case read stdin instead of file:
```
echo abcdefg > file
cargo run -- od --format x1 file
```
This is because the -t/--format argument was able to absorb multiple
arguments after it. This has now been fixed, and a test case is added
to ensure it will not happen again.
When hitting Ctrl+C sort now deletes any temporary files. To make this easier I
created a new struct `TmpDirWrapper` that handles the creation of new temporary
files and the registration of the signal handler.
The ToDo list was updated to mark `chcon` as done.
Building and testing `chcon` requires enabling the `feat_selinux` feature. If `make` is used for building, then please specify `SELINUX_ENABLED=1` if building and testing on a system where SELinux is not enabled.
Since sort exits early due to the nonexistent file, it might no longer
be around when we try to send it the input.
This is "by design" and can be ignored.
Fix an issue where `basename` would print the empty path when provided
an input comprising only slashes (for example, "/" or "//" or
"///"). Now it correctly prints the path "/".
In such cases we have to create a temporary copy of the input file to prevent
overwriting the input with the output. This only affects merge sort, because it
is the only mode where we start writing to the output before having read all inputs.
Since lines that compare equal should be sorted together, we need to first
compare the lines (taking settings into account). Only if they do not compare
equal we should compare the hashes.
Add support for
* space-separated list of tab stops, like `--tabs="2 4 6"`,
* mixed comma- and space-separated lists, like `--tabs="2,4 6"`,
* the slash specifier in the last tab stop, like `--tabs=1,/5`,
* the plus specifier in the last tab stop, like `--tabs=1,+5`.
Change the behavior of `tac` when there are no line separators in the
input. Previously, it was appending an extra line separator; this commit
prevents that from happening. The input is now writted directly to
stdout.
Correct the error message produced by `tac` when trying to read from a
directory. Previously if the path 'a' referred to a directory, then
running `tac a` would produce the error message
dir: read error: Invalid argument
after this commit it produces
a: read error: Invalid argument
which matches GNU `tac`.
Handle additional edge cases arising from test(1)’s lack of reserved words.
For example, left parenthesis followed by an operator could indicate
either
* string comparison of a literal left parenthesis, e.g. `( = foo`
* parenthesized expression using an operator as a literal, e.g. `( = != foo )`
- Adds words to cspell exceptions
- Converts test macros to use Default trait.
- Converts parser to use Default trait.
- Adds Windows-friendly test files for block/unblock when nl is present
in test/spec file.
- Removes dd from feat_require_unix (keeps it in feat_common_core)
- Changes name of make_linux_iflags parameter from oflags to iflags.
- Removes roughed out SIGINFO impl.
- Renames plen -> target_len.
- Removes internal fn def for build_blocks and replaces with a call to
chunks from std.
- Renames apply_ct to the more descriptive apply_conversion
- Replaces manual swap steps in perform_swab with call to swap from std.
- Replaces manual solution with chunks where appropriate (in Read, and Write impl).
- Renames xfer -> transfer (in local variable names).
- Improves documentation for dd_filout/dd_stdout issue.
- Removes commented debug_assert statements.
- Modifies ProdUpdate to contain ReadStat and WriteStat rather than
copying their fields.
- Addresses verbose return in `Output<File>::new(...)`
- Resoves compiler warning when built as release when signal handler fails to
register.
- Derives _Default_ trait on ReadStat.
- Adds comments for truncated lines in block unblock tests
- Removes `as u8` in block unblock tests.
- Removes unecessary `#[inline]`
- Delegates multiplier string parsing to uucore::parse_size.
- Renames 'unfailed' -> 'succeeded' for clairity.
- Removes #dead_code warnings. No clippy warnings on my local machine.
- Reworks signal handler to better accomodate platform-specific signals.
- Removes explicit references to "if" and "of" in dd.rs.
- Removes explicit references to "bs", "ibs", "cbs" and "status" in
parseargs.rs.
- Removes `#[allow(deadcode)]` for OFlags, and IFlags.
- Removes spellchecker ignore from all dd files.
- Adds tests for 'traditional' and 'modern' CLI.
- Address Rust fmt issue
- ignores run-without-noatime test which fails on some build machines
- adds cspell:disable to all project files.
- adds .../dd/fixtures/cspell.json to ignore test fixtures.
- Adds cspell.json. Hopefully this will make you happy, spellchecker.
- Removes non-functional spellchecker-ignore tags
- Adds a sleep call to the no noatime test. Some systems were did not notice a changed
atime without the option present.
- Adds a test with a unicode filename.
- Addresses clippy lints and rustfmt issues.
- adds project header to multiple files
- updates spell check skip words
- removes linux only flags direct,noatime from mac_os build
- applies rustfmt to test_dd
- runs rustfmt on test_dd.rs
- eliminates compiler warnings
- adds many words to spellchecker ignore list
- adds sanity test for vexing conv=nocreat issue. Still WIP.
The `test_du_bytes` testcase for the `du --bytes` command is written to perform
a dynamic comparison on linux hosts, i.e. it compares the output of the command
to that of the hosts `du` from the GNU coreutils.
Previously the test was written such that it would *first* perform the dynamic
comparison, and *after that* continue to a static comparison which may fail for
specific hosts that have a filesystem different from what the test expects.
This patch excludes linux hosts from the static comparison to ensure the test
performs only the dynamic comparison.
This reimplements version_cmp, which is used in sort and ls to sort
according to versions.
However, it is not bug-for-bug identical with GNU's implementation.
I reported a bug with GNU here:
https://lists.gnu.org/archive/html/bug-coreutils/2021-06/msg00045.html
This implementation does not contain the bugs regarding the handling of
file extensions and null bytes.
`LANGUAGE=C` is not enough, `LC_ALL=C` is needed as the environment
variable that overrides all the other localization settings.
e.g.
```bash
$ LANGUAGE=C id foobar
id: ‘foobar’: no such user
$ LC_ALL=C id foobar
id: 'foobar': no such user
```
* replace `LANGUAGE` with `LC_ALL` as environment variable in the tests
* fix the the date string of affected uutils
* replace `‘` and `’` with `'`
Calling `cmd_keepenv("mv")` spawned the system `mv` instead of
the uutils `mv`. Also, `keepenv` isn't necessary because it doesn't
need to inherit environment variables.
We now actually check the stderr, because previously the result of
`ends_with` was not used, making the test pass even when it shouldn't.
I disabled the test on windows because `mkdir` does not support `-m` on
windows, making the test fail because there will be no permission error.
On FreeBSD there isn't a permission error either, and `mv` succeeds.
On some Windows machines this would otherwise cause `std::env::temp_dir`
to fall back to a path that is not writeable (C:\\Windows).
Since by default integration tests don't inherit env vars from the
parent, we have to override this in some cases.
When invoked via '[' name, last argument must be ']' or we bail out with
syntax error. Then the trailing ']' is simply disregarded and processing
happens like usual.
* --check=silent and --check=quiet, which are equivalent to -C.
* --check=diagnose-first, which is the same as --check
We also allow -c=<value>, which confuses GNU sort.
To prevent clap from parsing flags for the command to run as flags for
timeout, remove the "args" positional argument, but allow to pass flags
via the "command" positional arg.
As discussed here: https://github.com/uutils/coreutils/pull/2361
the group IDs returned for GNU's 'group' and GNU's 'id --groups'
starts with the effective group ID.
This implements a wrapper for `entris::get_groups()` which mimics
GNU's behaviour.
* add tests for `id`
* add tests for `groups`
* fix `id --groups --real` to no longer ignore `--real`