* sort: use unstable sort when possible
This results in a very minor performance (speed) improvement.
It does however result in a memory usage reduction, because unstable
sort does not allocate auxiliary memory. There's also an improvement in
overall CPU usage.
* add benchmarking instructions
* add user time
* fix typo
* sort: implement numeric string comparison
This implements -n and -h using a string comparison algorithm instead
of parsing each number to a f64 and comparing those.
This should result in a moderate performance increase and eliminate loss
of precision.
* cache parsed f64 numbers
For general numeric comparisons we have to parse numbers as f64,
as this behavior is explicitly documented by GNU coreutils.
We can however cache the parsed value to speed up comparisons.
* fix leading zeroes for negative numbers
* use more appropriate name for exponent
* improvements to the parse function
* move checks into main loop and fix thousands separator condition
* remove unneeded checks
* rustfmt
* du error output should match GNU
* Created a new error macro which allows the customization of the
"error:" string part
* Match the du output based on the type of error encountered. Can extend
to handling other errors I guess.
* Rustfmt updates
* Added non-windows test for du no permission output
* Various fixes and performance improvements
* fix a typo
Co-authored-by: Michael Debertol <michael.debertol@gmail.com>
* Fix month parse for months with leading whitespace
* Implement test for months whitespace fix
* Confirm human numeric works as expected with whitespace with a test
* Correct arg help value name for --parallel
* Fix SemVer non version lines/empty line sorting with a test
Co-authored-by: Sylvestre Ledru <sledru@mozilla.com>
Co-authored-by: Michael Debertol <michael.debertol@gmail.com>
* cat: Unrevert splice patch
* cat: Add fifo test
* cat: Add tests for error cases
* cat: Add tests for character devices
* wc: Make sure we handle short splice writes
* cat: Fix tests for 1.40.0 compiler
* cat: Run rustfmt on test_cat.rs
* Run 'cargo +1.40.0 update'
* sort: implement basic -k and -t support
This allows to specify keys after the -k flag and a custom field
separator using -t.
Support for options for specific keys is still missing, and the -b flag
is not passed down correctly.
* sort: implement support for key options
* remove unstable feature use
* don't pipe in input when we expect a failure
* only tokenize when needed, remove a clone()
* improve comments
* fix clippy lints
* re-add test
* buffer writes to stdout
* fix ignore_non_printing
and make the test fail in case it is broken :)
* move attribute to the right position
* add more tests
* add my name to the copyright section
* disallow dead code
* move a comment
* re-add a loc
* use smallvec for a perf improvement in the common case
* add BENCHMARKING.md
* add ignore_case to benchmarks
* Various fixes and performance improvements
* fix a typo
Co-authored-by: Michael Debertol <michael.debertol@gmail.com>
Co-authored-by: Sylvestre Ledru <sledru@mozilla.com>
Co-authored-by: Michael Debertol <michael.debertol@gmail.com>
* Replace outdated time 0.1 dependancy with latest version of chrono
I also noticed that times are being miscalculated on linux, so I fixed that.
* Add time test for issue #2042
* Cleanup use declarations
* Tie time test to `touch` feature
- if we compile with the right OS feature flag then we should have it,
even on Windows
Forgot to handle the case where no arguments were passed to the COMMAND.
Because ARGS can be empty, we need two separate cases for handling
options.values_of(options::ARGS)
Fixed some minor ownership issues in converting from the options to the
arguments to the timeout COMMAND.
Additionally, fixed a rustfmt issue in other files (fold/stdbuf.rs)
Changed from optparse to clap.
None of the logic within timeout has been changed, which could use some
refactoring, but that's beyond the scope of this commit.
refactor `is_wsl` to `is_wsl_1` and `is_wsl_2`
On my tests with wsl2 ubuntu2004 all tests pass without special cases
I moved wsl detection into uucore so that it can be shared instead of duplicated
I moved `parse_mode` into uucode as it seemed to fit there better and anyway requires libc feature
Treat tab chars as advancing to the next tab stop rather than having a fixed
8-column width.
Also treat tab as a whitespace split target only when splitting on word
boundaries.
The '-d' flag should create all ancestors (or components) of a
directory regardless of the presence of the "-D" flag.
From the man page:
-d, --directory
treat all arguments as directory names; create all components of the specified directories
With GNU:
$ install -v -d dir1/di2
install: creating directory 'dir1'
install: creating directory 'dir1/di2'
With this version:
$ ./target/release/install -v -d dir3/di4
install: dir3/di4: No such file or directory (os error 2)
install: dir3/di4: chmod failed with error No such file or directory (os error 2)
install: created directory 'dir3/di4'
Also, one of the unit tests misinterprets what a "component" is,
and hence was fixed.
Also don't run chmod when we just failed to create the directory.
Behaviour before this patch:
$ ./target/release/install -v -d dir1/dir2
install: dir1/dir2: Permission denied (os error 13)
install: dir1/dir2: chmod failed with error No such file or directory (os error 2)
install: created directory 'dir1/dir2'
* Issue #1622 port `du` to windows
* Attempt to support Rust 1.32
Old version was getting "attributes are not yet allowed on `if`
expressions" on Rust 1.32
* Less #[cfg]
* Less duplicate code.
I need the return and the semicolon after if otherwise the second #[cfg]
leads to unexpected token complilation error
* More accurate size on disk calculations for windows
* Expect the same output on windows as with WSL
* Better matches output from du on WSL
* In the absence of feedback I'm disabling these tests on Windows.
They require `ln`. Windows does not ship with this utility.
* Use the coreutils version of `ln` to test `du`
`fn ccmd` is courtesy of @Artoria2e5
* Look up inodes (file ids) on Windows
* One more #[cfg(windows)] to prevent unreachable statement warning on linux
* cat: Improve performance, especially on Linux
* cat: Don't use io::copy for splice fallback
On my MacBook Pro 2020, it is around 25% faster to not use io::copy.
* cat: Only fall back to generic copy if first splice fails
* cat: Don't double buffer stdout
* cat: Don't use experimental or-pattern syntax
* cat: Remove nix symbol use from non-Linux
* wc: Don't read() if we only need to count number of bytes
* Resolve a few code review comments
* Use write macros instead of print
* Fix wc tests in case only one thing is printed
* wc: Fix style
* wc: Use return value of first splice rather than second
* wc: Make main loop more readable
* wc: Don't unwrap on failed write to stdout
* wc: Increment error count when stats fail to print
* Re-add Cargo.lock
* Implemented --indicator-style flag on ls.
* Rust fmt
* Grouped indicator_style args.
* Added tests for sockets and pipes.
Needed to modify util.rs to add support for pipes (aka FIFOs).
* Updated util.rs to remove FIFO operations on Windows
* Fixed slight error in specifying (not(windows))
* Fixed style violations and added indicator_style test for non-unix systems
+ aligned 'tee' output with GNU tee when one of the files is '/dev/full'
+ don't stop tee when one of the outputs fails; just continue and return
error status from tee in the end
Co-authored-by: Ivan Rymarchyk <irymarchyk@arlo.com>
* mkfifo: general refactor, move to clap, add unimplemented flags
* chore: update Cargo.lock
* chore: delete unused variables, simplify multiple lines with crash
* test: add tests
* chore: revert the use of crash
* test: use even more invalid mod mode
* install: implement `-C` / `--compare`
GNU coreutils [1] checks the following: whether
- either file is nonexistent,
- there's a sticky bit or set[ug]id bit in play,
- either file isn't a regular file,
- the sizes of both files mismatch,
- the destination file's owner differs from intended, or
- the contents of both files mismatch.
[1] https://git.savannah.gnu.org/cgit/coreutils.git/tree/src/install.c?h=v8.32#n174
* Add test: non-regular files
* Forgot a #[test]
* Give up on non-regular file test
* `cargo fmt` install.rs
Previously this used `print` instead of `println`, and as a result the
prompt would never appear and the command would hang. The Rust docs
note this about print:
> Note that stdout is frequently line-buffered by default so it may be
> necessary to use io::stdout().flush() to ensure the output is emitted
> immediately.
Changing to `println` fixes the issue.
Fixes#1889.
Co-authored-by: Kevin Burke <kevin@burke.dev>
* feat: move unexpand to clap
* chore: allow muliple files
* test: add test fixture, test reading from a file
* test: fix typo on file name, add test for multiple inputs
* chore: use 'success()' instead of asserting
* chore: delete unused variables
* chore: use help instead of long_help, break long line
* fix: use settings to allow leading hyphen and trailing var arg
fixes: https://github.com/uutils/coreutils/issues/1873
* test: add test cases
* test: add more test cases with different order in hyphen values
* chore: add comment to explain why we need TrailingVarArg
- changed some error return codes to match GNU implementation
- changed warning/error messages to match GNU nohup
- replaced getopts dependency with clap
- added a test
This PR adds the options to customize what information is shown in long format regarding author, group & owner. Specifically it adds:
- `--author`: shows the author, which is always the same as the owner. GNU has this feature because GNU/Hurd supports a difference between author and owner, but I don't think Rust supports GNU/Hurd, so I just used the owner.
- `-G` & `--no-group`: hide the group information.
- `-o`: hide the group and use long format (equivalent to `-lG`).
- `-g`: hide the owner and use long format.
The `-o` and `-g` options have some interesting behaviour that I had to account for. Some examples:
- `-og` hides both group and owner.
- `-ol` still hides the group. Same behaviour with variations such as `-o --format=long`, `-gl`, `-g --format=long` and `-ogl`.
- They even retain some information when overridden by another format: `-oCl` (or `-o --format=vertical --format=long`) still hides the group.
My previous solution for handling the behaviour where `-l1` shows the long format did not fit with these additions, so I had to rewrite that as well.
The tests only cover the how many names (author, group and owner) are present in the output, so it can't distinguish between, for example, author & group and group & owner.
It was a draft PR, not ready for merging, and its premature inclusion
caused repeated issues, see 368f47381b & friends.
Close#1841.
This reverts commits 3743a3e1e7,
ce218e01b6, and
b7b0c76b8e.
* date: implement set date for unix and windows
Parsing the date string is not fully implemented yet, as in it relies
on the internals of chrono - things like "Mon, 14 Aug 2006 02:34:56 -0600"
do not work, nor does "2006-08-14 02:34:56" (no TZ / local time). This
is no different to using the "--date" option however, and will get fixed
when `parse_date` is a bit smarter.
Only supports unix and Windows platforms for now.
* factor::tests::recombines_factors: Minor refactor (skip useless bool)
* factor::tests: Check factorizations of powers of factored numbers
* factor::Factors: Add debug assertions to (Factor ^ Exponent)
* factor::tests: Drop obsoleted tests
`factor_correctly_recombines_prior_test_failures` was replaced with
`factor_2044854919485649` as this was the only test not subsumed.
* factor::tests::2044854919485649: Check the expected factorisation
Current implementation of the skip fields logic does not handle
multibyte code points correctly. It assumes each code point (`char`) is
one byte. If the skipped part of the input line has any multibyte code
points then this can cause fields not being skipped correctly (field
start index is calculated to be before it actually starts).
expand has one odd behavior that allows two format for tabstop
From expand --help
```
-t, --tabs=N have tabs N characters apart, not 8
-t, --tabs=LIST use comma separated list of tab positions
```
This patch use one `value_name("N, LIST")` for tabstop and
deal with above behavior in `parse_tabstop`.
Close#1795
* touch: use arggroup for sources
* tests/touch: add tests for multiple sources
* touch: turn macros into functions
* test/touch: fmt
* touch: constant for the sources ArgGroup
- added `-` as the default input, since `paste` reads stdin if no file
is provided
- `paste` also supports providing `-` multiple times
- added a test for it
* muted test not for windows and added windows temp file convention
* Update mktemp.rs
Revert windows mktmp template difference
Co-authored-by: Chad Brewbaker <chad@flyingdogsolutions.com>
When converting to SI or IEC, produce values that align with the conventions
used by GNU numfmt.
- values > 10 are represented without a decimal place, so 10000 becomes 10K
instead of 10.0K
- when truncating, take the ceiling of the value, so 100001 becomes 101K
- values < 10 are truncated to the highest tenth, so 1001 becomes 1.1K
closes#1726
Adjust header option handling to prohibit passing a value of 0 to align
with GNU numfmt. Also report header option parse errors as GNU does.
closes#1708
Align with GNU numfmt by trimming leading whitespace from supplied values.
If the user did not specify a padding, calculate an implied padding from
the leading whitespace and the value.
Also track closer to GNU numfmt’s error message format.
Previously if no --color argument was input, we would always print
colors in the output. This breaks `configure` scripts which run `ls`
and then compare the output against what they expect to see, since the
left side has ANSI escape sequences and the right side doesn't.
Instead, only print escape sequences if a TTY is present, or if
`--color=always` is specified.
Fixes#1638.
* factor: Confine knowledge of num_traits to numeric::traits
This should make it easier to deal with API changes of num_traits,
or eventually switching to an abstraction provided in the stdlib.
This little check, allows us to hide the files that
shouldn't be shown on the listing on Windows operating
systems.
Just like the "dot" in UNIX based operating systems
Windows uses its own file attributes to determine if a file
is hidden or not.
The lack of support for this option is normally an annoyance
for many users, this commit adds full support for this feature
For inputs that are valid base64 but that encode non-utf8 strings (like
garbage), base64 panicks when trying to unwrap the result from
String::from_utf8().
Instead of interpreting the byte stream as utf8, simply dump the raw
bytes to stdout.
Since the test assert that all io is valid utf8, this does not come with
a unit test. See run() in tests/common/utils.rs.
Eg.
"gD63hSj3ScS+wuOeGrubXlq35N1c5Lby/S+T7MNTjxo=" -> ">(Iľ^Z\/S"
- refactor internal version specifications to be ">=M.m.p" (where M.m.p is *already published*)
## [why]
Loosening internal version dependencies decreases the coupling between packages such
that packages can be published in a looser order. It allows the packages to be version
updated and published in tandem (ie, by using `cargo workspace ...`). Once published,
the internal versions can then be updated (again, to an *already published* package
version), as needed.
This avoids allocating on the heap when factoring most numbers,
without using much space on the stack.
This is ~3.5% faster than the previous commit, and ~8.3% faster than “master”.
The invariant is checked by a debug_assert!, and follows from the previous
commit, as `dec` and `factors` only ever contains coprime numbers:
- true at the start: factor = ∅ and dec = { n¹ } ;
- on every loop iteration, we pull out an element `f` from `dec` and either:
- discover it is prime, and add it to `factors` ;
- split it into a number of coprime factors, that get reinserted into `dec`;
the invariant is maintained, as all divisors of `f` are coprime with all
numbers in `dec` and `factors` (as `f` itself is coprime with them.
As we only add elements to `Decomposition` objects that are coprime with the
existing ones, they are distinct.
- change `const`=>`static` and remove unneeded help/version (supplied by default by `clap`)
- update of the ABOUT description
- move to alphabetical order (where reasonable)
- rename OPT_FILES => ARG_FILES
- change the order of the declarations
- add a trailing "." to ABOUT for consistency
- rename OPT_FILES => ARG_FILES
- move to alphabetical order for OPTIONs (where reasonable)
Co-authored-by: Roy Ivy III <rivy.dev@gmail.com>
If the output of sort is piped to another program that closes the file
descriptor, sort currently panics. The GNU coreutils is able to handle
this case.
Replacing panic with crash_if_err reports the closed pipe and exits
with a return code, which seems like the correct behavior. Tested on
my Mac and the panic disappears.
Add a test which pipes data to sort - it won't protect against this
specific regression, but it increases the test coverage, at least.
Fixes#1608.
* factor: Introduce a type alias for exponents
This way, we can easily replace u8 with a larger type when moving to support
larger integers.
* factor::Factors: Split off a Decomposition type
The new type can be used to represent in-progress factorisations,
which contain non-prime factors.
* factor::Decomposition: Use a flat vector representation
~18% faster than BTreeMap, and ~5% faster than “master”.
* factor::Factors: Use a RefCell rather than copy data when printing
~2.9% faster than the previous commit, ~11% faster than “master” overall.
* CI: Only run rustfmt in one environment
- This displays clippy warnings even when rustfmt fails.
- This avoids displaying 3 copies of the same rustfmt warning as Github
annotations.
- Avoids duplicated work.
* CI: Suppress warnings when building for the oldest toolchain version
We had cases of warnings emitted due to `rustc` bugs that were fixed
in non-obsolete versions.
* factor: Remove a workaround for warnings on obsolete rustc
* factor::numeric::gcd: Switch variable names to be more consistent
* factor::numeric::gcd: Improve comments
* factor::numeric::gcd: Extend loop invariant to v
* factor::Factors: Derive implementations of Eq and PartialEq
* factors::Factors: Implement quickcheck::Arbitrary
This generates uniformly-distributed integers with known factorisations,
and will enable further testing.
* factor: Test against random numbers with known factorisations
* factor::Factors::arbitrary: Simplify method signature
* factor::numeric::gcd: Rename test against the Euclidean algorithm
* factor::numeric::gcd: Add various property-based tests
* factor::numeric::modular_inverse: Rename test
* factor::numeric::modular_inverse: Add test on random values
* factor::numeric: Start refactoring into multiple submodules
No change to the module's interface, but it should make it much easier to
keep the tests right next to the code they are related to.
Moreover, build.rs' dependency is now limited to numeric::{modular_inverse,
traits}, meaning that the rest of it can use build-time generated tables etc.
* factor::numeric: Move gcd (and its test) to a submodule
* factor::numeric: Move Montgomery arithmetic to its own module
Finally hollowed-out numeric.rs
* factor: Move numeric.rs to numeric/mod.rs
* factor::numeric: Fix an erroneous lint on obsolete Rust versions
Ignoring a value of that type is a bug: they are only produced by
`miller_rabbin::test`, which is a pure, but expensive, function.
As such, an ignored value is always either a mistake, or an easy
optimisation opportunity (just remove the useless call to `test`).
Also add a property-based test against the Euclidean implementation.
numeric::gcd got ~50-65% faster, according to criterion. The effect on the
overall system is small, but later PRs will use a lot more GCD computations.
- `DoubleInt::Double` renamed to `DoubleWidth`
- `{as,from}_double()` renamed to `{as,from}_double_width()`.
This should hopefully clarify that this is not a “double precision”
floating-point type, but an integer type with a larger range (used
for storing intermediate results, typ. from a multiplication)
Montgomery only works for odd n, so attempting to construct an instance
for an even number results in a panic!
The most obvious solution is to special-case even numbers.
test() takes a modulus that is known to not be even or <2 (otherwise the
Montgomery value could not be constructed), so those checks can be hoisted
into is_prime() and out of the critical path.
Montgomery<_> only works for odd n, so attempting to construct an instance
for an even number results in a panic!
The most obvious solution is to special-case even numbers.