Commit graph

2734 commits

Author SHA1 Message Date
Dean Li
fe25b51a66
chmod: match GNU error
Related to #2260

Signed-off-by: Dean Li <deantvv@gmail.com>
2021-05-26 22:31:02 +08:00
Terts Diepraam
00dd8d29cb
Merge pull request #2213 from syukronrm/du-clap
du: replace getopts with clap
2021-05-26 16:15:36 +02:00
Sylvestre Ledru
c08eae8e9a
Merge pull request #2276 from ycd/clear-macros
uucore: delete unused macros
2021-05-26 13:45:30 +02:00
Syukron Rifail M
eda72b5208 du: replace getopts with clap 2021-05-26 11:23:05 +07:00
Matt Blessed
a8a1ec7faf cp: implement backup control with tests 2021-05-25 23:22:32 -04:00
Matt Blessed
7240b12895 uucore: implement backup control
Most of these changes were sourced from mv's existing backup control
implementation. A later commit will update the mv utility to use this
new shared backup control.
2021-05-25 19:44:09 -04:00
Yağız can Değirmenci
898d2eb489 chore: delete 'error:' prefix on show_error 2021-05-26 02:32:02 +03:00
Yağız can Değirmenci
c78a7937f8 chore: delete show_info macro and replace with show_error 2021-05-26 02:27:10 +03:00
Yağız can Değirmenci
a77e92cc96 chore: delete unused macros 2021-05-26 01:53:40 +03:00
Yağız can Değirmenci
e5e7ca8dc5 fix: simplify logic 2021-05-24 21:20:59 +03:00
Yağız can Değirmenci
991fcc548c fix: log error messages properly on permission errors 2021-05-24 21:07:45 +03:00
Terts Diepraam
da085eca98
Merge pull request #2259 from jfinkels/wc-compute-each-file-and-print
wc: print counts for each file as soon as computed
2021-05-24 17:55:10 +02:00
Michael Debertol
218f523e1b expr: make substr infallible
Instead of returning an Err it should return the "null string"
(in our case that's the empty string) when the offset or length
is invalid.
2021-05-23 22:22:34 +02:00
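The behavior described in 218f523e1b can be sketched roughly as follows. This is an illustration only, not the code from the commit: the function name `substr`, its signature, and the character-based, 1-based indexing are assumptions.

```rust
/// Hypothetical sketch of an infallible `substr`; not the actual uutils code.
/// `pos` is assumed to be 1-based, as in `expr substr STRING POS LENGTH`.
fn substr(s: &str, pos: usize, len: usize) -> String {
    // A zero offset or zero length is treated as invalid here, so the
    // "null string" (the empty string) is returned instead of an error.
    if pos == 0 || len == 0 {
        return String::new();
    }
    // `skip` and `take` yield fewer characters when the requested range
    // runs past the end of the input, so no error path is needed.
    s.chars().skip(pos - 1).take(len).collect()
}
```

With this sketch, `substr("hello", 2, 3)` gives `"ell"`, while an out-of-range offset such as `substr("hello", 9, 3)` gives the empty string.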
Sylvestre Ledru
7bf076505f
Merge branch 'master' into who_fix_runlevel 2021-05-23 09:32:37 +02:00
Sylvestre Ledru
b175534a97
Merge pull request #2264 from miDeb/sort-sort-flag
sort: support --sort flag and check for conflicts
2021-05-23 09:30:03 +02:00
Sylvestre Ledru
41bd025d00
Merge pull request #2209 from jfinkels/head-ring-buffer
head: add abstractions for "all but last n lines"
2021-05-23 09:28:17 +02:00
Jeffrey Finkelstein
bc9db289e8 head: add abstractions for "all but last n lines"
Add some abstractions to simplify the `rbuf_but_last_n_lines()`
function, which implements the "take all but the last `n` lines"
functionality of the `head` program. This commit adds

- `RingBuffer`, a fixed-size ring buffer,
- `ZLines`, an iterator over zero-terminated "lines",
- `TakeAllBut`, an iterator over all but the last `n` elements of an
  iterator.

These three together make the implementation of
`rbuf_but_last_n_lines()` concise.
2021-05-22 23:56:48 -04:00
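As a rough illustration of the "all but the last `n` elements" idea in bc9db289e8, an iterator adapter can hold back the `n` most recent items in a queue. The name `TakeAllBut` comes from the commit message; the fields and implementation below are assumptions, not the committed code.

```rust
use std::collections::VecDeque;

/// Illustrative adapter that yields every element except the last `n`.
/// The internals are a sketch, not the uutils implementation.
struct TakeAllBut<I: Iterator> {
    iter: I,
    // Holds the most recently read elements that have not been yielded yet.
    buf: VecDeque<I::Item>,
    n: usize,
}

impl<I: Iterator> TakeAllBut<I> {
    fn new(iter: I, n: usize) -> Self {
        TakeAllBut { iter, buf: VecDeque::new(), n }
    }
}

impl<I: Iterator> Iterator for TakeAllBut<I> {
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        loop {
            // When the inner iterator ends, the `n` buffered items are the
            // tail we want to drop, so iteration stops here.
            let item = self.iter.next()?;
            self.buf.push_back(item);
            // Only release the oldest item once more than `n` are buffered.
            if self.buf.len() > self.n {
                return self.buf.pop_front();
            }
        }
    }
}
```

For example, `TakeAllBut::new(1..=5, 2)` yields 1, 2, 3. Memory use is bounded by `n` items regardless of the input length.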
Jeffrey Finkelstein
1f1cd3d966 truncate: re-organize into one func for each mode
Reorganize the code in `truncate.rs` into three distinct functions
representing the three modes of operation of the `truncate` program. The
three modes are

- `truncate -r RFILE FILE`, which sets the length of `FILE` to match the
  length of `RFILE`,
- `truncate -r RFILE -s NUM FILE`, which sets the length of `FILE`
  relative to the given `RFILE`,
- `truncate -s NUM FILE`, which sets the length of `FILE` either
  absolutely or relative to its current length.

This organization of the code makes it more concise and easier to
follow.
2021-05-22 23:54:39 -04:00
Jeffrey Finkelstein
c6d4d0c07d truncate: create TruncateMode::to_size() method
Create a method that computes the final target size in bytes for the
file to truncate, given the reference file size and the parameter to the
`TruncateMode`.
2021-05-22 23:54:00 -04:00
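One way to picture the `TruncateMode::to_size()` method from c6d4d0c07d is an enum that carries its parameter and maps a reference size to a target size. The variant names and the handling of a zero multiple below are assumptions made for illustration; only the method name and the general idea come from the commit messages.

```rust
/// Hypothetical mode enum; the variant names are assumptions.
/// Each variant stores the parameter whose meaning depends on the mode.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TruncateMode {
    Absolute(u64),  // NUM
    Extend(u64),    // +NUM
    Reduce(u64),    // -NUM
    AtMost(u64),    // <NUM
    AtLeast(u64),   // >NUM
    RoundDown(u64), // /NUM
    RoundUp(u64),   // %NUM
}

impl TruncateMode {
    /// Compute the target size in bytes from the reference size `fsize`.
    fn to_size(self, fsize: u64) -> u64 {
        match self {
            TruncateMode::Absolute(n) => n,
            TruncateMode::Extend(n) => fsize.saturating_add(n),
            TruncateMode::Reduce(n) => fsize.saturating_sub(n),
            TruncateMode::AtMost(n) => fsize.min(n),
            TruncateMode::AtLeast(n) => fsize.max(n),
            TruncateMode::RoundDown(n) if n > 0 => fsize - fsize % n,
            TruncateMode::RoundUp(n) if n > 0 => {
                let rem = fsize % n;
                if rem == 0 { fsize } else { fsize.saturating_add(n - rem) }
            }
            // Assumption: a multiple of zero leaves the size unchanged.
            TruncateMode::RoundDown(_) | TruncateMode::RoundUp(_) => fsize,
        }
    }
}
```

Keeping the parameter inside the variant means callers never pass a size that is meaningless for the chosen mode.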
Jeffrey Finkelstein
544ae87575 truncate: add parse_mode_and_size() helper func
Add a helper function to contain the code for parsing the size and the
modifier symbol, if any. This commit also changes the `TruncateMode`
enum so that the parameter for each "mode" is stored along with the
enumeration value. This is because the parameter has a different meaning
in each mode.
2021-05-22 23:53:59 -04:00
Jeffrey Finkelstein
5eb2a5c3e1 truncate: remove read permissions from OpenOptions
Remove "read" permissions from the `OpenOptions` when opening a new file
just to truncate it. We will never read from the file, only write to
it. (Specifically, we will only call `File::set_len()`.)
2021-05-22 23:45:05 -04:00
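A minimal sketch of the idea in 5eb2a5c3e1: open the file for writing only, then call `File::set_len()`. The helper name `set_file_len` and the use of `create(true)` are assumptions for illustration.

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;

/// Hypothetical helper: resize `path` to `len` bytes without requesting
/// read access, since the file contents are never read.
fn set_file_len(path: &Path, len: u64) -> io::Result<()> {
    let file = OpenOptions::new()
        .write(true)
        // Assumption: a missing file is created, as `truncate` does by default.
        .create(true)
        .open(path)?;
    file.set_len(len)
}
```

Because read access is not requested, a file that is writable but not readable can presumably still be resized.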
Jan Scheer
44c033a013 who: exclude --runlevel from non Linux targets (fix #2239) 2021-05-23 03:05:15 +02:00
Sylvestre Ledru
4aaeede3d8 rustfmt the recent change 2021-05-23 00:13:53 +02:00
Michael Debertol
c1f67ed775 sort: support --sort flag and check for conflicts
`sort` supports three ways to specify the sort mode: a long option
(e.g. --numeric-sort), a short option (e.g. -n) and the sort flag
(e.g. --sort=numeric).
This adds support for the sort flag.

Additionally, sort modes now conflict, which means that an error is
shown when multiple modes are passed, instead of silently picking a mode.
For consistency, I added the `random` sort mode to the `SortMode` enum,
instead of it being a bool flag.
2021-05-22 23:10:41 +02:00
Sylvestre Ledru
726f271273
Merge pull request #2239 from devnexen/fbsd_who_build_fix
who freebsd build fix unsupported RUN_LVL option only for other platf…
2021-05-22 21:34:09 +02:00
Jeffrey Finkelstein
4521aa2659 wc: print counts for each file as soon as computed
Change the behavior of `wc` to print the counts for a file as soon as
it is computed, instead of waiting to compute the counts for all files
before writing any output to `stdout`. The new behavior matches the
behavior of GNU `wc`.

The old behavior looked like this (the word "hello" is entered on
`stdin`):

    $ wc emptyfile.txt -
    hello
          0       0       0 emptyfile.txt
          1       1       6
          1       1       6 total

The new behavior looks like this:

    $ wc emptyfile.txt -
          0       0       0 emptyfile.txt
    hello
          1       1       6
          1       1       6 total
2021-05-22 14:27:37 -04:00
Sylvestre Ledru
73fb426b2b
Merge pull request #2252 from jfinkels/realpath-simplify
realpath: use uucore::fs::canonicalize() to reduce code duplication
2021-05-22 19:10:59 +02:00
David Carlier
fcb079e20e who freebsd build fix unsupported RUN_LVL option only for other platforms. 2021-05-22 18:07:02 +01:00
Sylvestre Ledru
542deb8888
Merge pull request #2246 from miDeb/sort-automatic-extsort
sort: automatically fall back to extsort
2021-05-22 17:21:02 +02:00
Sylvestre Ledru
8055f26a73
Merge pull request #2228 from jfinkels/tail-obo-positive-bytes
tail: fix off-by-one issue for +NUM args
2021-05-22 17:18:55 +02:00
Jeffrey Finkelstein
4b5c3efe85 realpath: use uucore::fs::canonicalize()
Use the `uucore::fs::canonicalize()` function to simplify the
implementation of `realpath`.
2021-05-22 11:18:16 -04:00
Jeffrey Finkelstein
bee3b1237c uucore::fs: don't canonicalize last component
Change the behavior of `uucore::fs::canonicalize()` when `can_mode` is
`CanonicalizeMode::None` so that it does not attempt to resolve the
final component if it is a symbolic link. This matches the behavior of
the function for the non-final components of a path when `can_mode` is
`None`.
2021-05-22 11:18:16 -04:00
Sylvestre Ledru
66cfdb8644
Merge pull request #2143 from nbraud/factor/faster/table
factor::table: Implement a batched version w/ improved performance
2021-05-22 17:18:07 +02:00
Sylvestre Ledru
66dd6dbeff
Merge pull request #2244 from jfinkels/truncate-fix-round-up-character
truncate: fix character used to indicate round up
2021-05-22 14:03:59 +02:00
Michael Debertol
088443276a sort: improve handling of buffer size cmd arg
Instead of overflowing when calculating the buffer size, use
saturating_{pow, mul}.

When failing to parse the buffer size, we now crash instead of silently
ignoring the error.
2021-05-22 14:00:07 +02:00
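The saturating arithmetic mentioned in 088443276a might look something like the sketch below. The function name `buffer_size_bytes`, the set of unit letters, and the use of 1024 as the base are assumptions; the commit itself only states that `saturating_pow` and `saturating_mul` replace overflowing arithmetic.

```rust
/// Hypothetical helper: turn a count and a unit letter into a byte count,
/// clamping at `usize::MAX` instead of overflowing.
fn buffer_size_bytes(count: usize, unit: char) -> Option<usize> {
    let exponent: u32 = match unit.to_ascii_uppercase() {
        'B' => 0,
        'K' => 1,
        'M' => 2,
        'G' => 3,
        'T' => 4,
        // Unknown unit: report failure and let the caller decide how to
        // handle it (the commit says `sort` now exits on a parse failure).
        _ => return None,
    };
    let factor = 1024usize.saturating_pow(exponent);
    Some(count.saturating_mul(factor))
}
```

An absurdly large request then simply becomes `usize::MAX` rather than wrapping around to a small value.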
Sylvestre Ledru
4d3be19de3
Merge pull request #2240 from jhscheer/macos_test_coreutils
who/stat/pinky: adjust tests to be compatible with running on macOS
2021-05-22 12:39:05 +02:00
Sylvestre Ledru
424a99f0e6
Merge pull request #2193 from jfinkels/2186-min-width-stdin
wc: compute minimum width to format counts up front
2021-05-22 12:37:42 +02:00
Michael Debertol
e7da8058dc sort: automatically fall back to extsort
To make this work we make default sort a special case of external sort.

External sorting uses auxiliary files for intermediate chunks. However,
when we can keep our intermediate chunks in memory, we don't write them
to the file system at all. Only when we notice that we can't keep them
in memory are they written to disk.

Additionally, we don't allocate buffers with the capacity of their
maximum size anymore. Instead, they start with a capacity of 8kb and are
grown only when needed.

This makes sorting smaller files about as fast as it was before
(I'm seeing a regression of ~3%), and allows us to seamlessly continue
with auxiliary files when needed.
2021-05-21 23:09:46 +02:00
Anup Mahindre
414c92eed7 ls: Fix printing paths behavior
For any commandline arguments, ls should print the argument as is (and
not truncate to just the file name)
For any other files it reaches (say through recursive exploration), ls
should print just the filename (as path is printed once when we enter
the directory)
2021-05-21 22:22:28 +05:30
Jan Scheer
007e0a4e7f who/stat/pinky: adjust tests to be compatible with running on macOS
A lot of tests depend on GNU's coreutils to be installed in order
to obtain reference values during testing.
In these cases testing is limited to `target_os = linux`.
This PR installs GNU's coreutils on "github actions" and adjusts the
tests for `who`, `stat` and `pinky` in order to be compatible with macOS.

* `brew install coreutils` (prefix is 'g', e.g. `gwho`, `gstat`, etc.)
* switch paths for testing to something that's available on both OSs,
    e.g. `/boot` -> `/bin`, etc.
* switch paths for testing to the macOS equivalent,
    e.g. `/dev/pts/ptmx` -> `/dev/ptmx`, etc.
* exclude paths when no equivalent is available,
    e.g. `/proc`, `/etc/fstab`, etc.
* refactor tests to make better use of the testing API
* fix a warning in utmpx.rs to print to stderr instead of stdout
* fix long_usage text in `who`
* fix minor output formatting in `stat`

* the `expected_result` function should be refactored
    to reduce duplicate code
* more tests should be adjusted to not only run on `target_os = linux`
2021-05-21 11:55:20 +02:00
Sylvestre Ledru
df45b20dc1
Merge pull request #2243 from jfinkels/truncate-min-max
truncate: use min() and max() instead of if/else statements
2021-05-21 10:09:43 +02:00
Jeffrey Finkelstein
a23555e857 truncate: fix character used to indicate round up
Fix a bug in which the incorrect character was being used to indicate
"round up to the nearest multiple" mode. The character was "*" but it
should be "%". This commit corrects that.
2021-05-20 23:19:58 -04:00
Jeffrey Finkelstein
17b95246cd truncate: use min() and max() instead of if stmts 2021-05-20 21:24:43 -04:00
Jeffrey Finkelstein
fc29846b45 truncate: fix error message for file not found
Change the error message for when the reference file (the `-r` argument)
is not found to match GNU coreutils. This commit also eliminates a
redundant call to `File::open`; the file need not be opened because the
size in bytes can be read from the result of `std::fs::metadata()`.
2021-05-20 20:59:59 -04:00
Sylvestre Ledru
efb781f59a
Merge pull request #2221 from jfinkels/head-display-multiple-errors-2
head: display errors for each input file instead of terminating at the first error
2021-05-20 23:24:35 +02:00
Sylvestre Ledru
ca196a6dad
Merge pull request #2218 from miDeb/sort-chunks
sort: read files as chunks, off-thread
2021-05-20 23:24:02 +02:00
nicoo
a0a103b15e factor::table::chunked: Add test (equivalent to the single-number version) 2021-05-20 17:01:33 +02:00
nicoo
998b3c11d3 factor: Make random Factors instance generatable for tests 2021-05-20 17:00:49 +02:00
Jeffrey Finkelstein
63b496eaa8 truncate: refactor parse_size() function
Change the interface provided by the `parse_size()` function to reduce
its responsibilities to just a single task: parsing a number of bytes
from a string of the form '123KB', etc. Previously, the function was
also responsible for deciding which mode truncate would operate in.

Furthermore, this commit simplifies the code for parsing the number and
unit to be less verbose and use less mutable state.

Finally, this commit adds some unit tests for the `parse_size()`
function.
2021-05-19 23:07:11 -04:00
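A single-purpose `parse_size()` along the lines of 63b496eaa8 could be sketched as below. The return type, the error strings, and the exact set of accepted units are assumptions; the sketch follows the common convention that `KB` means 1000 and `K`/`KiB` mean 1024.

```rust
/// Hypothetical sketch: parse strings such as "123KB" into a byte count.
/// This does only one job and does not decide which truncate mode applies.
fn parse_size(s: &str) -> Result<u64, String> {
    // Split at the first character that is not an ASCII digit.
    let split = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
    let (digits, unit) = s.split_at(split);
    let number: u64 = digits
        .parse()
        .map_err(|_| format!("invalid number in '{}'", s))?;
    let factor: u64 = match unit {
        "" => 1,
        "K" | "KiB" => 1024,
        "KB" => 1000,
        "M" | "MiB" => 1024 * 1024,
        "MB" => 1000 * 1000,
        _ => return Err(format!("invalid unit in '{}'", s)),
    };
    number
        .checked_mul(factor)
        .ok_or_else(|| format!("size '{}' is too large", s))
}
```

Under these assumptions `parse_size("123KB")` returns `Ok(123000)` and `parse_size("2K")` returns `Ok(2048)`.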
Jan Scheer
8032c6d750 fix clippy warnings 2021-05-19 01:37:28 +02:00
Sylvestre Ledru
cacd078a49
Merge pull request #2227 from jfinkels/tail-iocopy-bounded-tail
tail: use std::io::copy() to write bytes to stdout
2021-05-18 20:42:59 +02:00
Jan Scheer
ce5b852a31 stat: remove unused/duplicate tests 2021-05-18 19:58:33 +02:00
Arijit Dey
1596c65dfd
Downgrade crossterm version 2021-05-18 22:29:59 +05:30
Arijit Dey
7a88df9fb4
Fix broken terminal in tests 2021-05-18 12:42:33 +05:30
Jeffrey Finkelstein
bc29645531 tail: fix off-by-one issue for +NUM args
Fix an off-by-one issue for `tail -c +NUM` and `tail -n +NUM` command
line options.
2021-05-17 19:45:42 -04:00
Jeffrey Finkelstein
fea1026669 tail: use std::io::copy() to write bytes to stdout 2021-05-17 18:15:39 -04:00
nicoo
00322b986b factor: Move benchmarks out-of-crate 2021-05-17 19:43:38 +02:00
nicoo
1cd001f529 factor::benches::table: Match BenchmarkId w/ criterion's conventions
See https://bheisler.github.io/criterion.rs/book/user_guide/comparing_functions.html
2021-05-17 19:43:38 +02:00
nicoo
7c649bc74e factor::benches: Add check against ASLR 2021-05-17 19:43:38 +02:00
nicoo
ddfcd2eb14 factor::benchmarking: Add wishlist / planned work 2021-05-17 19:43:38 +02:00
nicoo
1d75f09743 factor::benchmarking(doc): Add guidance on writing µbenches 2021-05-17 19:43:38 +02:00
nicoo
e9f8194266 factor::benchmarking(doc): Add guidance on running µbenches 2021-05-17 19:43:38 +02:00
nicoo
ae15bf16a8 factor::benches::table: Report throughput (in numbers/s) 2021-05-17 19:43:38 +02:00
nicoo
12efaa6add factor: Add BENCHMARKING.md 2021-05-17 19:43:38 +02:00
nicoo
7c287542c7 factor::table: Fixup microbenchmark
Previous version would perform an amount of work proportional to `CHUNK_SIZE`,
so this wasn't a valid way to benchmark at multiple values of that constant.

The `TryInto` implementation for `&mut [T]` to `&mut [T; N]` relies on `const`
generics, and is available in (stable) Rust v1.51 and later.
2021-05-17 19:43:38 +02:00
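The `TryInto` conversion from `&mut [T]` to `&mut [T; N]` mentioned in 7c287542c7 can be illustrated as follows. `CHUNK_SIZE` is named in the commit message, but its value here and both function names are made up for the example.

```rust
use std::convert::TryInto;

// Assumed value, chosen only for the example.
const CHUNK_SIZE: usize = 4;

/// Stand-in for per-chunk work that needs a fixed-size array.
fn double_chunk(chunk: &mut [u64; CHUNK_SIZE]) {
    for x in chunk.iter_mut() {
        *x *= 2;
    }
}

/// Process `data` in fixed-size chunks; a trailing partial chunk is left
/// untouched.
fn double_in_chunks(data: &mut [u64]) {
    for slice in data.chunks_exact_mut(CHUNK_SIZE) {
        // The slice-to-array conversion relies on const generics.
        let chunk: &mut [u64; CHUNK_SIZE] = slice
            .try_into()
            .expect("chunks_exact_mut yields slices of length CHUNK_SIZE");
        double_chunk(chunk);
    }
}
```

The conversion cannot fail here because `chunks_exact_mut` only produces slices of exactly `CHUNK_SIZE` elements.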
nicoo
1fd5f9da25 factor::table::factor_chunk: Turn loop inside-out
This keeps the traversal of `P_INVS_U64` (a large table) to a single pass
in-order, rather than `CHUNK_SIZE` passes.
2021-05-17 19:43:38 +02:00
nicoo
cd047425aa factor::table: Add chunked implementation and microbenchmarks
The factor_chunk implementation is a strawman, but getting it in place allows us
to set up the microbenchmarking etc.
2021-05-17 19:43:38 +02:00
nicoo
c68c83c6dd factor::table: Take mutable refs
This will be easier to adapt to working with multiple numbers to process at once.
2021-05-17 19:43:38 +02:00
Jeffrey Finkelstein
eeef8290df head: display errors for each input file
Change the behavior of `head` to display an error for each problematic
file, instead of displaying an error message for the first problematic
file and terminating immediately at that point. This change now matches
the behavior of GNU `head`.

Before this commit, the first error caused the program to terminate
immediately:

    $ head a b c
    head: error: head: cannot open 'a' for reading: No such file or directory

After this commit:

    $ head a b c
    head: cannot open 'a' for reading: No such file or directory
    head: cannot open 'b' for reading: No such file or directory
    head: cannot open 'c' for reading: No such file or directory
2021-05-17 08:19:47 -04:00
Michael Debertol
fcd48813e0 sort: read files as chunks, off-thread
Instead of using a BufReader and reading each line separately,
allocating a String for each one, we read to a chunk. Lines are
references to this chunk. This makes the allocator's job much easier
and yields performance improvements.

Chunks are read on a separate thread to further improve performance.
2021-05-16 21:13:37 +02:00
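The core idea of fcd48813e0, lines as references into one chunk rather than one `String` per line, can be shown in a simplified single-threaded form. The function names are hypothetical, the whole input is read as one chunk, and the off-thread reading from the commit is omitted.

```rust
use std::io::{self, Read};

/// Read the entire input into a single buffer (one allocation that grows),
/// instead of allocating a `String` for every line.
fn read_chunk<R: Read>(mut input: R) -> io::Result<String> {
    let mut chunk = String::new();
    input.read_to_string(&mut chunk)?;
    Ok(chunk)
}

/// Lines are `&str` slices borrowing from the chunk; sorting moves only
/// the references, not the line contents.
fn sorted_lines(chunk: &str) -> Vec<&str> {
    let mut lines: Vec<&str> = chunk.lines().collect();
    lines.sort_unstable();
    lines
}
```

The borrow checker ensures the chunk outlives every line reference taken from it.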
Arijit Dey
c930509095
Fix clippy warning 2021-05-16 22:30:46 +05:30
Arijit Dey
22ba21d8ab
Fix bug with terminal getting weird 2021-05-16 22:26:54 +05:30
Jeffrey Finkelstein
659bf58a4c head: print headings when reading multiple files
Fix a bug in which `head` failed to print headings for `stdin` inputs
when reading from multiple files, and fix another bug in which `head`
failed to print a blank line between the contents of a file and the
heading for the next file when reading multiple files. The output now
matches that of GNU `head`.
2021-05-16 12:03:10 -04:00
Jeffrey Finkelstein
733d347fa8 head: simplify rbuf_n_bytes() in head.rs
Simplify the code in `rbuf_n_bytes()` to use existing abstractions
provided by the standard library.
2021-05-15 23:04:01 -04:00
Jeffrey Finkelstein
97a49c7c95 wc: compute min width to format counts up front
Fix two issues with the string formatting width for counts displayed
by `wc`.

First, the output was previously not using the default minimum width
(seven characters) when reading from `stdin`. This commit corrects
this behavior to match GNU `wc`. For example,

    $ cat alice_in_wonderland.txt | wc
          5      57     302

Second, if at least 10^7 bytes were read from `stdin` *after* reading
from a smaller regular file, then every output row would have width
8. This disagrees with GNU `wc`, in which only the `stdin` row and the
total row would have width 8. This commit corrects this behavior to
match GNU `wc`. For example,

    $ printf "%.0s0" {1..10000000} | wc emptyfile.txt -
          0       0       0 emptyfile.txt
          0       1 10000000
          0       1 10000000 total

Fixes #2186.
2021-05-15 21:41:47 -04:00
Sylvestre Ledru
620a5a5df6
Merge pull request #2210 from jhscheer/dns_lookup
who: fix `--lookup`
2021-05-15 21:18:12 +02:00
Jeffrey Finkelstein
e8d911d9d5 wc: correct some error messages for invalid inputs
Change the error messages that get printed to `stderr` for compatibility
with GNU `wc` when an input is a directory and when an input does not
exist.

Fixes #2211.
2021-05-15 10:35:21 -04:00
Jan Scheer
a4fc2b5106 who: fix --lookup
This closes #2181.

`who --lookup` is failing with a runtime panic (double free).
Since `crate::dns-lookup` already includes a safe wrapper for `getaddrinfo`
I used this crate instead of further debugging the existing code in
utmpx::canon_host().

* It was necessary to remove the version constraint for libc in uucore.
2021-05-13 22:16:15 +02:00
Jeffrey Finkelstein
2e621759b2 tail: refactor code into ReverseChunks iterator
Refactor code from the `backwards_thru_file()` function into a new
`ReverseChunks` iterator, and use that iterator to simplify the
implementation of the `backwards_thru_file()` function. The
`ReverseChunks` iterator yields `Vec<u8>` objects, each of which
references bytes of a given file.
2021-05-12 18:43:58 -04:00
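An iterator in the spirit of the `ReverseChunks` type from 2e621759b2 might be sketched like this. The struct fields, the constructor, and the choice to end iteration silently on I/O errors are assumptions; the sketch works on any `Read + Seek` source, not only files.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Illustrative iterator yielding blocks of a seekable source, last block
/// first. Not the uutils implementation.
struct ReverseChunks<'a, R: Read + Seek> {
    source: &'a mut R,
    // Number of bytes from the start that have not been yielded yet.
    remaining: u64,
    block_size: u64,
}

impl<'a, R: Read + Seek> ReverseChunks<'a, R> {
    fn new(source: &'a mut R, block_size: u64) -> io::Result<Self> {
        let remaining = source.seek(SeekFrom::End(0))?;
        // Guard against a zero block size, which would never make progress.
        Ok(ReverseChunks { source, remaining, block_size: block_size.max(1) })
    }
}

impl<'a, R: Read + Seek> Iterator for ReverseChunks<'a, R> {
    type Item = Vec<u8>;

    fn next(&mut self) -> Option<Vec<u8>> {
        if self.remaining == 0 {
            return None;
        }
        let len = self.block_size.min(self.remaining);
        let start = self.remaining - len;
        // Simplification: an I/O error ends the iteration.
        self.source.seek(SeekFrom::Start(start)).ok()?;
        let mut buf = vec![0u8; len as usize];
        self.source.read_exact(&mut buf).ok()?;
        self.remaining = start;
        Some(buf)
    }
}
```

Over the bytes `abcdefgh` with a block size of 3, this yields `fgh`, `cde`, then `ab`; only the final (leftmost) block can be shorter than the block size.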
Jeffrey Finkelstein
3114fd77be tail: use &mut File instead of mut file: &File 2021-05-12 18:43:35 -04:00
Sylvestre Ledru
2178edf628
Merge pull request #2207 from jhscheer/issue_2204
date: fix format literal for nanoseconds
2021-05-12 13:14:23 +02:00
Jan Scheer
12a43d6eb3 date: fix format literal for nanoseconds 2021-05-12 10:21:24 +02:00
Sylvestre Ledru
a5f8ca60b5
Merge pull request #2199 from jhscheer/refactor_fsext
df/stat: refactor - reduce duplicate code
2021-05-12 08:41:16 +02:00
Sylvestre Ledru
6635301f32
Merge pull request #2194 from miDeb/sort-stable-merge
sort: make merging stable
2021-05-12 08:38:48 +02:00
Sylvestre Ledru
57ae202037
Merge pull request #2195 from nthery/wc_dash
wc: emit '-' in output when set on command-line
2021-05-12 08:37:55 +02:00
Sylvestre Ledru
8f24ec9414
Merge pull request #2198 from jfinkels/tail-refactor
tail: simplify unbounded_tail() function
2021-05-12 08:35:45 +02:00
Sylvestre Ledru
68a3488cdc
Merge pull request #2202 from drocco007/test-negated-boolean
test: improve handling of inverted Boolean expressions
2021-05-12 08:34:41 +02:00
Jan Scheer
8200d399e8 date: fix format for nanoseconds 2021-05-11 23:03:59 +02:00
Daniel Rocco
2ec4bee350 test: improve handling of inverted Boolean expressions
- add `==` as undocumented alias of `=`

- handle negated comparison of `=` as literal

- negation generally applies to only the first expression of a Boolean chain,
  except when combining evaluation of two literal strings
2021-05-10 22:48:40 -04:00
Jan Scheer
381f8dafc6 df/uucore: refactor - move duplicate code to uucore/fsext.rs 2021-05-10 23:37:01 +02:00
Sylvestre Ledru
ed42652803
Merge pull request #2200 from jhscheer/fix_clippy
fix clippy warnings
2021-05-10 16:13:27 +02:00
Jan Scheer
4ac75898c3 fix clippy warnings 2021-05-10 15:48:32 +02:00
Jan Scheer
203ee463c7 stat/uucore: refactor - move fsext.rs to uucore 2021-05-10 10:46:00 +02:00
Jeffrey Finkelstein
0cc779c733 tail: simplify unbounded_tail() function
Refactor common code out of two branches of the `unbounded_tail()`
function into a new `unbounded_tail_collect()` helper function that
collects from an iterator into a `VecDeque` and keeps either the last
`n` elements or all but the first `n` elements.

This commit also adds a new struct, `RingBuffer`, in a new module,
`ringbuffer.rs`, to be responsible for keeping the last `n` elements
of an iterator.
2021-05-09 23:47:13 -04:00
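A ring buffer that keeps the last `n` elements of an iterator, as described in 0cc779c733, can be sketched on top of `VecDeque`. The type name `RingBuffer` comes from the commit message; the field and method names below are assumptions.

```rust
use std::collections::VecDeque;

/// Illustrative fixed-capacity buffer that retains only the most recent
/// `size` elements pushed into it.
struct RingBuffer<T> {
    data: VecDeque<T>,
    size: usize,
}

impl<T> RingBuffer<T> {
    fn new(size: usize) -> Self {
        RingBuffer { data: VecDeque::new(), size }
    }

    fn push_back(&mut self, value: T) {
        if self.size == 0 {
            return; // Nothing can be retained.
        }
        // Evict the oldest element once the buffer is full.
        if self.data.len() == self.size {
            self.data.pop_front();
        }
        self.data.push_back(value);
    }

    /// Hypothetical convenience constructor: drain `iter`, keeping its tail.
    fn collect_last(iter: impl Iterator<Item = T>, size: usize) -> Self {
        let mut rb = RingBuffer::new(size);
        for value in iter {
            rb.push_back(value);
        }
        rb
    }
}
```

For example, collecting `1..=10` with a size of 3 leaves 8, 9, 10 in the buffer.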
Gilad Naaman
8747800697 Switched 'arch' to use clap instead of getopts 2021-05-09 21:53:03 +03:00
Sylvestre Ledru
7c51fb4946
Merge pull request #2165 from miDeb/sort-optimize-line
sort: optimize the line struct
2021-05-09 18:41:39 +02:00
Nicolas Thery
112b042769 wc: emit '-' in output when set on command-line
When stdin is explicitly specified on the command-line with '-', emit it
in the output stats to match GNU wc output.

Fixes #2188.
2021-05-09 15:47:05 +02:00
Michael Debertol
e0ebf907a4 sort: make merging stable
When merging files we need to prioritize files that occur earlier in the
command line arguments with -m.

This also makes the extsort merge step (and thus extsort itself) stable again.
2021-05-09 11:43:38 +02:00
Sylvestre Ledru
d43af35147
Merge pull request #2145 from tertsdiepraam/ls/device_information
`ls`: implement device symbol and id
2021-05-09 00:50:35 +02:00
Terts Diepraam
f6e5f86fe7 Merge branch 'master' into ls/device_information 2021-05-08 23:21:44 +02:00
Michael Debertol
d686f7e48f sort: improve comments 2021-05-08 22:31:53 +02:00
Sylvestre Ledru
01a702c6fd
Merge branch 'master' into issue2167 2021-05-08 20:26:21 +02:00
Michael Debertol
1afeb55881 Merge branch 'master' of https://github.com/uutils/coreutils into sort-optimize-line 2021-05-08 15:47:19 +02:00
Samuel Ainsworth
2ff9cc6570 Typo in comment 2021-05-08 14:25:21 +02:00
Samuel Ainsworth
bacad8ed93 Use u128 instead of usize for large numbers, and consistency across architectures 2021-05-08 14:25:21 +02:00
Samuel Ainsworth
7c1395366e Fix split's handling of non-UTF-8 files 2021-05-08 14:25:21 +02:00
Samuel Ainsworth
a9ac7af9e1 Simplify parsing of --bytes for the split command 2021-05-08 14:25:21 +02:00
Jeffrey Finkelstein
ba8f4ea670 wc: move counting code into WordCount::from_line()
Refactor the counting code from the inner loop of the `wc` program
into the `WordCount::from_line()` associated function. This commit
also splits that function up into other helper functions that
encapsulate decoding characters and finding word boundaries from raw
bytes.

This commit also implements the `Sum` trait for the `WordCount`
struct, so that we can simply call `sum()` on an iterator that yields
`WordCount` instances.
2021-05-08 14:24:07 +02:00
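Implementing `Sum` for a count struct, as ba8f4ea670 describes for `WordCount`, might look like the sketch below. The fields and the ASCII-only word splitting are simplifying assumptions; the real `from_line()` also decodes characters, which is left out here.

```rust
use std::iter::Sum;
use std::ops::Add;

/// Simplified stand-in for the `WordCount` struct; the fields are assumed.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct WordCount {
    bytes: usize,
    lines: usize,
    words: usize,
}

impl WordCount {
    /// Count one line of raw bytes (words split on ASCII whitespace only).
    fn from_line(line: &[u8]) -> WordCount {
        WordCount {
            bytes: line.len(),
            lines: if line.ends_with(b"\n") { 1 } else { 0 },
            words: line
                .split(|b| b.is_ascii_whitespace())
                .filter(|w| !w.is_empty())
                .count(),
        }
    }
}

impl Add for WordCount {
    type Output = Self;

    fn add(self, other: Self) -> Self {
        WordCount {
            bytes: self.bytes + other.bytes,
            lines: self.lines + other.lines,
            words: self.words + other.words,
        }
    }
}

impl Sum for WordCount {
    fn sum<I: Iterator<Item = Self>>(iter: I) -> Self {
        // Start from all-zero counts and add each item in turn.
        iter.fold(WordCount::default(), |acc, wc| acc + wc)
    }
}
```

With `Sum` in place, totals can be written as `lines.map(WordCount::from_line).sum()` instead of a hand-written accumulation loop.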
Jeffrey Finkelstein
50f4941d49 wc: refactor WordCount into its own module
Move the `WordCount` struct and its implementations into the
`wordcount.rs` module.
2021-05-08 14:24:07 +02:00
Jeffrey Finkelstein
ee43655bdb fixup! wc: rm leading space when printing multiple counts 2021-05-08 13:11:09 +02:00
Jeffrey Finkelstein
525f71bada wc: rm leading space when printing multiple counts
Remove the leading space from the output of `wc` when printing two or
more types of counts.

Fixes #2173.
2021-05-08 13:11:09 +02:00
Jan Scheer
a885376583 uucore: refactor - reduce duplicate code related to fs::display_permissions
This is a refactor to reduce duplicate code, it affects chmod/ls/stat.
* merge `stat/src/fsext::pretty_access` into `uucore/src/lib/features/fs::display_permissions_unix`
* move tests for `fs::display_permissions` from `test_stat::test_access` to `uucore/src/lib/features/fs::test_display_permissions`
* adjust `uu_chmod`, `uu_ls` and `uu_stat` to use `uucore::fs::display_permissions`
2021-05-08 11:52:41 +02:00
Michael Debertol
38effc93b3 sort: use FileMerger for extsort merge step
FileMerger is much more efficient than the previous algorithm,
which looped over all elements every time to determine the next element.

FileMerger uses a BinaryHeap, which should bring the complexity for
the merge step down from O(n²) to O(n log n).
2021-05-08 11:51:32 +02:00
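The heap-based merge step described in 38effc93b3 can be illustrated with a small k-way merge over in-memory inputs. This is a sketch of the general technique, not the `FileMerger` code: the function name is made up, and breaking ties by input index is an added assumption.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Merge already-sorted inputs into one sorted output using a min-heap
/// that holds at most one pending line per input.
fn merge_sorted(inputs: Vec<Vec<String>>) -> Vec<String> {
    let mut iters: Vec<_> = inputs.into_iter().map(|v| v.into_iter()).collect();
    let mut heap = BinaryHeap::new();

    // Seed the heap with the first line of each input. `Reverse` turns the
    // max-heap into a min-heap; the index breaks ties in input order.
    for (idx, it) in iters.iter_mut().enumerate() {
        if let Some(line) = it.next() {
            heap.push(Reverse((line, idx)));
        }
    }

    let mut out = Vec::new();
    while let Some(Reverse((line, idx))) = heap.pop() {
        out.push(line);
        // Refill from the input the popped line came from.
        if let Some(next) = iters[idx].next() {
            heap.push(Reverse((next, idx)));
        }
    }
    out
}
```

Each output line costs one pop and at most one push on a heap whose size is the number of inputs, instead of a scan over every input's current element.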
Michael Debertol
64c1f16421 sort: allow some functions to be called with OsStr 2021-05-08 11:51:32 +02:00
Terts Diepraam
3b6c7bc9e9 Fix mistakes with merging 2021-05-08 00:50:36 +02:00
Michael Debertol
8c9faa16b9 sort: improve memory usage for extsort 2021-05-07 21:51:31 +02:00
Michael Debertol
c38373946a sort: optimize the Line struct 2021-05-07 21:51:25 +02:00
Terts Diepraam
6834d0256e Merge branch 'master' into ls/device_information 2021-05-07 18:56:44 +02:00
Arijit Dey
d2ab0dcded
Make a nice error when file does not exist 2021-05-06 22:12:15 +05:30
Idan Attias
34b9809223 logname: fix test & style warning 2021-05-06 14:19:47 +02:00
Idan Attias
41eb930292 logname: align profile 2021-05-06 14:19:47 +02:00
Idan Attias
b24b9d501b logname: replace getopts with clap 2021-05-06 14:19:47 +02:00
jaggededgedjustice
a2658250fc
Fix fmt crashing on subtracting unsigned numbers (#2178) 2021-05-05 23:12:17 +02:00
Anup Mahindre
7d2b051866
Implement Total size feature (#2170)
* ls: Implement total size feature

- Implement total size reporting that was missing
- Fix minor formatting / readability nits

* tests: Add tests for ls total sizes feature

* ls: Fix MSRV build errors due to unsupported attributes for if blocks

* ls: Add windows support for total sizes feature

- Add windows support (defaults to file size as block sizes related
information is not available on windows)
- Renamed some functions
2021-05-05 23:03:25 +02:00
rethab
231bb7be93
Migrate mknod to clap, closes #2051 (#2056)
* mknod: add tests for fifo

* mknod: add test for character device
2021-05-05 22:59:40 +02:00
Sylvestre Ledru
f83316f36e
Merge pull request #2156 from miDeb/sort-no-json-extsort
sort: don't rely on serde-json for extsort
2021-05-05 22:33:18 +02:00
Sylvestre Ledru
482e340e11
Merge branch 'master' into implement-more 2021-05-04 13:35:38 +02:00
Sylvestre Ledru
1edf4064f3
Merge pull request #2162 from bashi8128/basename-clap
basename: move from getopts to clap
2021-05-04 10:59:19 +02:00
Sylvestre Ledru
3f5dda66f4
Merge pull request #2138 from jhscheer/who2clap
who: move from getopts to clap (#2124)
2021-05-04 10:58:52 +02:00
Sylvestre Ledru
e3b7a8bd22
Merge pull request #2166 from jfinkels/wc-word-countable-lines
wc: add lines() method for iterating over lines
2021-05-04 09:53:08 +02:00
Jan Scheer
56761ba584 stat: implement support for macos 2021-05-03 22:30:56 +02:00
David CARLIER
224c8b3f94 df output update (non inode mode) proposal specific for mac. on this platform, capacity column is also displayed. 2021-05-03 15:49:55 +01:00
bashi8128
5a4bb610ff basename: rename variable names
Rename variable names to be more explicit ones
2021-05-03 23:32:01 +09:00
bashi8128
74802f9f0f basename: improve error messages
Remove duplicated utility name from error messages
2021-05-03 23:26:46 +09:00
Jeffrey Finkelstein
0a3e2216d7 wc: add lines() method for iterating over lines
Add the `WordCountable::lines()` method that returns an iterator over
lines of a file-like object. This mirrors the
`std::io::BufRead::lines()` method, with some minor differences due to
the particular use case of `wc`.

This commit also creates a new module, `countable.rs`, to contain the
`WordCountable` trait and the new `Lines` struct returned by `lines()`.
2021-05-02 16:32:38 -04:00
Sylvestre Ledru
6c04d0d21e
Merge pull request #2155 from nthery/kill_clap
kill: migrate to clap
2021-05-02 18:45:29 +02:00
Michael Debertol
e99f157e6a Merge branch 'master' of https://github.com/uutils/coreutils into sort-no-json-extsort 2021-05-02 18:08:15 +02:00
Sylvestre Ledru
9b7e7bbbc6
Merge pull request #2144 from miDeb/sort-no-transforms
sort: add some custom string comparisons
2021-05-02 18:04:27 +02:00
Michael Debertol
dc5bd9f0be improve memory usage estimation 2021-05-02 17:27:44 +02:00
Sylvestre Ledru
f8ec4a554c
Merge pull request #2161 from tertsdiepraam/ls/sort_order_and_subdirectory_listing
`ls`: C sort order and fix subdirectory listing
2021-05-02 17:21:56 +02:00
Jan Scheer
acd30526a2 tr: fix clippy warning 2021-05-02 13:53:11 +02:00
Jan Scheer
000bd73edc tr: fix merge conflict 2021-05-02 12:39:25 +02:00
Jan Scheer
8739139a7f Merge branch 'master' into issue2147 2021-05-02 12:35:19 +02:00
Nicolas Thery
1dccbfd21e kill: migrate to clap
Fixes #2122.
2021-05-02 12:31:41 +02:00
Jan Scheer
34c22dc3ad tr: fix complement if set2 is range 2021-05-02 12:15:16 +02:00
Sylvestre Ledru
9554710ab5 cat: the function 'unistd::write' doesn't need a mutable reference 2021-05-02 10:31:28 +02:00
Sylvestre Ledru
09178360d8 date: unneeded 'return' statement 2021-05-02 10:30:28 +02:00
Terts Diepraam
eb3206737b ls: give '.' a file_type 2021-05-02 10:20:14 +02:00
bashi8128
47a5dd0f97 basename: move from getopts to clap (#2117)
Use clap for argument parsing instead of getopts
Also, make the following changes

* Use `executable!()` macro to output the name of utility

* Add another usage to help message
2021-05-02 17:08:14 +09:00
Terts Diepraam
361408cbe5 ls: remove case-insensitivity and leading period of name sort 2021-05-02 10:04:11 +02:00
Terts Diepraam
28c7800f73 ls: fix subdirectory name 2021-05-02 10:03:01 +02:00
Sylvestre Ledru
108f9928ef cp: fix 'variable does not need to be mutable' 2021-05-02 09:39:09 +02:00
Sylvestre Ledru
e723b8db43 factor: unneeded statement 2021-05-02 09:35:59 +02:00
Sylvestre Ledru
5e82b195bd ls: remove redundant import 2021-05-02 09:35:00 +02:00
Sylvestre Ledru
2d0f4daf5b
Merge pull request #2152 from deantvv/link-clap
link: replace getopts with clap
2021-05-02 09:33:11 +02:00
Dean Li
f5c7d9bd80 link: replace getopts with clap 2021-05-02 10:40:48 +08:00
Daniel Rocco
3c126bad72 test: implement parenthesized expressions, additional tests
- Replace the parser with a recursive descent implementation that handles
  parentheses and produces a stack of operations in postfix order.

  Parsing now operates directly on OsStrings passed by the uucore framework.

- Replace the dispatch mechanism with a stack machine operating on the
  symbol stack produced by the parser.

- Add tests for parenthesized expressions.

- Begin testing character encoding handling.
2021-05-01 22:40:47 -04:00
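The parser and stack machine described in 3c126bad72 can be illustrated on a heavily reduced grammar. Everything below is a hypothetical sketch: it handles only `!`, `-a`, `-o`, parentheses, and string literals (true when non-empty), and it works on `&str` tokens rather than the `OsString` values the commit mentions.

```rust
#[derive(Debug, PartialEq)]
enum Symbol {
    Literal(String),
    Not,
    And,
    Or,
}

/// Recursive descent parser that emits symbols in postfix order.
struct Parser<'a> {
    tokens: &'a [&'a str],
    pos: usize,
    stack: Vec<Symbol>,
}

impl<'a> Parser<'a> {
    fn peek(&self) -> Option<&'a str> {
        self.tokens.get(self.pos).copied()
    }

    // expr := term ( "-o" term )*
    fn expr(&mut self) -> Result<(), String> {
        self.term()?;
        while self.peek() == Some("-o") {
            self.pos += 1;
            self.term()?;
            self.stack.push(Symbol::Or);
        }
        Ok(())
    }

    // term := factor ( "-a" factor )*
    fn term(&mut self) -> Result<(), String> {
        self.factor()?;
        while self.peek() == Some("-a") {
            self.pos += 1;
            self.factor()?;
            self.stack.push(Symbol::And);
        }
        Ok(())
    }

    // factor := "!" factor | "(" expr ")" | literal
    fn factor(&mut self) -> Result<(), String> {
        match self.peek() {
            Some("!") => {
                self.pos += 1;
                self.factor()?;
                self.stack.push(Symbol::Not);
                Ok(())
            }
            Some("(") => {
                self.pos += 1;
                self.expr()?;
                if self.peek() == Some(")") {
                    self.pos += 1;
                    Ok(())
                } else {
                    Err("expected ')'".to_string())
                }
            }
            None | Some(")") | Some("-a") | Some("-o") => {
                Err("expected an operand".to_string())
            }
            Some(tok) => {
                self.pos += 1;
                self.stack.push(Symbol::Literal(tok.to_string()));
                Ok(())
            }
        }
    }
}

fn parse(tokens: &[&str]) -> Result<Vec<Symbol>, String> {
    let mut p = Parser { tokens, pos: 0, stack: Vec::new() };
    p.expr()?;
    if p.pos != tokens.len() {
        return Err("unexpected trailing argument".to_string());
    }
    Ok(p.stack)
}

/// Stack machine: evaluate postfix symbols; `None` means malformed input.
fn eval(symbols: &[Symbol]) -> Option<bool> {
    let mut stack: Vec<bool> = Vec::new();
    for sym in symbols {
        let value = match sym {
            Symbol::Literal(s) => !s.is_empty(),
            Symbol::Not => !stack.pop()?,
            Symbol::And => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                a && b
            }
            Symbol::Or => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                a || b
            }
        };
        stack.push(value);
    }
    if stack.len() == 1 { stack.pop() } else { None }
}
```

Separating parsing from evaluation means parentheses only affect the order in which symbols are emitted; the evaluator never needs to know about them.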
Sylvestre Ledru
7e07438b38
Merge pull request #2151 from jfinkels/2141-translate-and-squeeze
tr: implement translate and squeeze (-s) mode
2021-05-01 23:27:43 +02:00
Michael Debertol
484558e37d
Update src/uu/sort/BENCHMARKING.md
Co-authored-by: Sylvestre Ledru <sledru@mozilla.com>
2021-05-01 21:38:36 +02:00
Michael Debertol
b21a309c3f add a benchmarking example 2021-05-01 21:29:18 +02:00
Michael Debertol
83554f4475 add benchmarking instructions 2021-05-01 21:16:29 +02:00
Ricardo Iglesias
193ad56c2a Removed clippy warnings. 2021-05-01 11:36:46 -07:00
Ricardo Iglesias
f307de22d0 base64: Refactor argument parsing
Moved most of the argument parsing logic to `base32/base_common.rs` to
allow for significant code reuse.
2021-05-01 11:36:46 -07:00
Ricardo Iglesias
05b20c32a9 base64: Moved argument parsing to clap.
Moved argument parsing to clap and added tests to cover using "-" as
stdin, passing in too many file arguments, and updated the "wrap" error
message in the tests.
2021-05-01 11:36:46 -07:00
Jeffrey Finkelstein
5674d09327 fixup! tr: implement translate and squeeze (-s) mode 2021-05-01 13:01:55 -04:00
Jan Scheer
83eb704415 Merge branch 'master' into issue2147 2021-05-01 18:52:35 +02:00
Jan Scheer
117e84eed3 tr: implement complement separately from delete or squeeze (#2147) 2021-05-01 18:46:13 +02:00
Sylvestre Ledru
bffcb431b5
Merge pull request #2148 from jhscheer/pinky2clap
pinky: move from getopts to clap (#2123)
2021-05-01 17:49:10 +02:00
Sylvestre Ledru
34bf7cc5ea
Merge pull request #2150 from jhscheer/fix_clap_short
tr/dirname: fix clap short_alias
2021-05-01 17:39:15 +02:00
Michael Debertol
be0c924c95 Merge branch 'master' of https://github.com/uutils/coreutils into sort-no-json-extsort 2021-05-01 17:29:03 +02:00
Michael Debertol
01d178cf17 sort: don't rely on serde-json for extsort
It is much faster to just write the lines to disk, separated by \n
(or \0 if zero-terminated is enabled), instead of serializing to json.

external_sort now knows of the Line struct instead of interacting with
it using the ExternallySortable trait. Similarly, it now uses the
crash_if_err! macro to handle errors, instead of bubbling them up.

Some functions were changed from taking &[Line] as the input to taking
an Iterator<Item = Line>. This removes the need to collect to a Vec
when not necessary.
2021-05-01 17:20:56 +02:00
Nicolas Thery
70ab0d01d2 kill: change default signal
The default signal is SIGTERM, not SIGKILL.
2021-05-01 16:47:42 +02:00
Sylvestre Ledru
d2913f8080 rustfmt the recent change 2021-05-01 13:12:10 +02:00
Sylvestre Ledru
59ea28628b printf: remove useless declaration 2021-05-01 13:11:41 +02:00
Jeffrey Finkelstein
0f3bc23739 tr: implement translate and squeeze (-s) mode
Add translate and squeeze mode to the `tr` program. For example:

    $ printf xx | tr -s x y
    y

Fixes #2141.
2021-04-30 18:17:05 -04:00
Jan Scheer
798a033311 pinky: move from getopts to clap (#2123) 2021-04-30 20:57:38 +02:00
Jan Scheer
45dd9d4e96 tr/dirname: fix clap short_alias 2021-04-30 20:19:43 +02:00
Terts Diepraam
d300895d28 ls: add birth time for windows and attempt to fix test 2021-04-29 22:23:04 +02:00
Terts Diepraam
d624827913 ls: fix windows and add more file types 2021-04-29 18:44:46 +02:00
Terts Diepraam
c69afa00ff ls: implement device symbol and id 2021-04-29 18:25:34 +02:00
Michael Debertol
fecbf3dc85 sort: remove an unneeded clone() 2021-04-29 18:05:55 +02:00
Michael Debertol
a4813c2646 sort: actually use the f64 cache
This was probably reverted accidentally.
2021-04-29 18:05:43 +02:00
Michael Debertol
9f45431bf0 sort: add some custom string comparisons
This removes the need to allocate a new string for each line when used
with -f, -d or -i. Instead, a custom string comparison algorithm takes
care of these cases.

The resulting performance improvement is about 20% per flag (i.e. there
is a 60% improvement when combining all three flags)

As a side-effect, the size of the Line struct was reduced from 96 to 80
bytes, reducing the overhead for each line.
2021-04-29 18:05:14 +02:00
Arijit Dey
2593b3f2e1
Rewrite the cli usage function
Add crossterm as dependency

Complete the paging portion

Fixed tests

cp: extract linux COW logic into function

cp: add --reflink support for macOS

Fixes #1773

Fix error in Cargo.lock

Quit automatically if not much output is left

Remove unnecessary redox and windows specific code

Handle line wrapping

Put everything according to uutils coding standards

Add support for multiple files

Fix failing test

Use the args argument to get cli arguments

Fix bug where text is repeated multiple times during printing

Add a little prompt

Add a top file prompt for multiple files

Change println in loops to stdout.write and setup terminal only once

Fix bug where all lines were printed in a single row

Remove useless file and fix failing test

Fix another test
2021-04-29 20:23:35 +05:30
nicoo
b89978a4c9
factor: Add annotations for coz, the causal profiler (#2142)
* factor: Add annotations for coz, the causal profiler

* Update Cargo.lock

Generated with `nix-shell -p rustup --run 'cargo +1.40.0 update'`
2021-04-29 15:56:56 +02:00
Jan Scheer
512d206f1e who: move from getopts to clap 2.33.3 (#2124) 2021-04-29 00:11:21 +02:00
Jan Scheer
6f16cafe88 who: move from getopts to clap (#2124) 2021-04-28 22:58:28 +02:00
Rein F
a60fd07bc3
ls: improvements on time handling (#1986)
* ls: added creation time

* ls: Added most time features

Missing support for posix-,Format+, translating via locales. Also required more tests

* ls: rustfmt

* ls: Additional changes and fixes

Fixed the argument order, fixed a wrong iso format.

* ls: additional tests for styles

* ls: perfected arg parsing on time styles

* fix birthtime test

* ls: Use 'stdout_str' in new tests

* ls: Disabled birthtime test for windows

* ls: removed indoc as a dependency

* ls: birthtime test, sync first created file

* ls: birthtime test, add comment explaining sync

* Removed ruby testfile birth_test.rb

This accidentally got committed in a merge
2021-04-28 20:54:27 +02:00
Sylvestre Ledru
167520067c
Merge pull request #2111 from cbjadwani/cut_optimizations
cut: optimizations
2021-04-28 20:40:28 +02:00
Chirag Jadwani
25f99097cc cut: add BENCHMARKING.md
and minor refactoring
2021-04-28 23:28:26 +05:30
Sylvestre Ledru
a37e3181a2
Merge pull request #2130 from electricboogie/master
sort: implement --buffer-size and --temporary-directory (external sort)
2021-04-28 09:21:14 +02:00
Sylvestre Ledru
33139817a2
Merge pull request #2136 from jaggededgedjustice/allow-truncate-size-and-reference
Allow truncate to take --size and --reference
2021-04-27 22:43:25 +02:00
electricboogie
ec19bb72d5 Modified to remove 2 unnecessary consts now that we use std::env::temp_dir 2021-04-27 15:39:20 -05:00
Sylvestre Ledru
30cf6ec235
Merge pull request #2131 from ricardoaiglesias/base32-clap
Base32 clap
2021-04-27 09:20:45 +02:00
Ricardo Iglesias
ae0cabc60a Moved argument parsing to uumain. 2021-04-26 20:15:11 -07:00
Sylvestre Ledru
7a3b44d972
Merge pull request #2133 from tertsdiepraam/ls/fix_color_grid_alignment
`ls`: fix grid alignment with `--color`
2021-04-26 22:51:21 +02:00
Sylvestre Ledru
ece5e14b0d
fix a typo 2021-04-26 22:51:02 +02:00
James Robson
a7037b1ca9 Allow truncate to take --size and --reference 2021-04-26 18:39:32 +01:00
Terts Diepraam
35838dc8a9 ls: document hyperfine script 2021-04-26 18:36:15 +02:00
Terts Diepraam
4023e40174 ls: further reduce OsStr -> String conversions 2021-04-26 18:03:56 +02:00
Ricardo Iglesias
11d0565f0e base32: Moved clap argument parsing to base32.rs
Now, all base_common.rs has is the handle_input function.
2021-04-26 08:22:41 -07:00
Ricardo Iglesias
d56462a4b3 base32: Fixed style violations. Added tests
Tests now cover using "-" as standard input and reading from a file.
2021-04-26 08:00:55 -07:00
electricboogie
f3ed5a100f Possible fix to Windows issues, ext_sort bool setting 2021-04-26 08:54:40 -05:00
Terts Diepraam
c69b72c840 ls: forgot to commit Cargo.{toml, lock} 2021-04-26 15:04:55 +02:00
Terts Diepraam
58fd61b3e8 ls: fix grid alignment for unicode 2021-04-26 15:00:39 +02:00
Terts Diepraam
cfc11b47a5 ls: fix grid alignment with --color 2021-04-26 14:41:41 +02:00
Terts Diepraam
e4c0069493 ls: remove path strip 2021-04-26 09:53:13 +02:00
Terts Diepraam
322478d9a2 ls: document flamegraph 2021-04-26 09:37:47 +02:00
Sylvestre Ledru
7dcc8c2960
Merge pull request #1968 from alstolten/feat2
ls: Implements extension sorting
2021-04-26 09:03:06 +02:00
Ricardo Iglesias
99c13f202e Merge branch 'master' of github.com:uutils/coreutils into base32-clap 2021-04-25 22:36:26 -07:00
Ricardo Iglesias
5578ba6eed base32: move from getopts to clap
Note, I needed to change the error messages in one of the tests because
getopt and clap have different error messages when not providing a
default value
2021-04-25 22:24:55 -07:00
electricboogie
c01c6a7d78 Ran rustfmt 2021-04-25 22:41:11 -05:00
electricboogie
6654519c7d Specify a default tempdir for Windows 2021-04-25 22:39:17 -05:00
electricboogie
e5c19734c8 Change Default Buffer to usize::MAX 2021-04-25 21:38:22 -05:00
electricboogie
1a407c2328 Set a dynamic minimum buffer size 2021-04-25 21:17:56 -05:00
electricboogie
8e258075f6 Potential fix to tests on Windows 2021-04-25 19:21:19 -05:00
electricboogie
fc899ffe7a Implement a minimum readback buffer 2021-04-25 19:07:24 -05:00
electricboogie
32222c1ee7 Remove unneeded condition for use of NumCache 2021-04-25 17:52:20 -05:00
electricboogie
0f707cdb25 Adjust max buffer size for read back as well 2021-04-25 16:33:12 -05:00
Christopher Regali
368e984fac
Change unchecked unwrapping to unwrap_or_default for Args-trait (#1845) (#1852)
* Change unchecked unwrapping to unwrap_or_default for argument parsing (resolving #1845)

* Added unit-testing for the collect_str function on invalid utf8 OsStrs

* Added a warning-message for identification purpose to the collect_str method.

* - Add removal of wrongly encoded empty strings to basename
- Add testing of broken encoding to basename
- Changed UCommand to use collect_str in args method to allow for integration testing of that method
- Change UCommand to use unwrap_or_default in arg method to match the behaviour of collect_str

* Trying out a new pattern for convert_str for getting a feeling of how the API feels with more control

* Adding convenience API for compact calls

* Add new API to everywhere, fix test for basename

* Added unit-testing for the conversion options

* Added unit-testing for the conversion options for windows

* fixed compilation and some merge hiccups

* Remove windows tests in order to make merge request build

* Fix formatting to match rustfmt for the merged file

* Improve documentation of the collect_str method and the unit-tests

* Fix compilation problems with test

Co-authored-by: Christopher Regali <chris.vdop@gmail.com>
Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
2021-04-25 23:28:42 +02:00
electricboogie
6f82cd4f15 Fix errors for usize on 32bit platforms 2021-04-25 16:27:36 -05:00
electricboogie
dbdac22262 Add back unstable sort 2021-04-25 15:48:20 -05:00
electricboogie
5fb7014c2b Add a BufWriter for writes out to temp files 2021-04-25 15:42:36 -05:00
electricboogie
733949b2e7 Add dynamic buffer adjustment, fix test comment 2021-04-25 15:13:27 -05:00
Ricardo Iglesias
c3d7358df6
ls: ignore leading period when sorting by name (#2112)
* ls: ignore leading period when sorting by name

ls now behaves like GNU ls with respect to sorting files by ignoring
leading periods when sorting by name.

Added tests to ensure "touch a .a b .b ; ls" returns ".a  a  .b  b"

* Replaced clone/collect calls.
2021-04-25 21:08:05 +02:00
Alessandro Stoltenberg
43f3f7e01c feat2: Rebased on current master and incorporated changes done to the filetype handling. 2021-04-25 20:13:42 +02:00
electricboogie
2f37b85426 unwrap_or_else can be an unwrap_or 2021-04-25 12:58:04 -05:00
Alessandro Stoltenberg
9c221148a8 ls: Extension sorting, use file_stem() instead of to_string_lossy() 2021-04-25 19:45:59 +02:00
Alessandro Stoltenberg
bbcca3eefd ls: Implements https://github.com/uutils/coreutils/issues/1880 extension sorting. 2021-04-25 19:45:59 +02:00
electricboogie
f0a473f40e Fix tests 2021-04-25 12:38:43 -05:00
electricboogie
094d9a9e47 Fix bug in human_numeric convert 2021-04-25 12:27:11 -05:00
electricboogie
4c395146dd Merge branch 'master' of https://github.com/uutils/coreutils 2021-04-25 10:11:27 -05:00
electricboogie
26fc8e57c7 Fix NumCache and Serde JSON conflict by disabling NumCache during extsort general numeric compares 2021-04-25 10:03:29 -05:00
Sylvestre Ledru
e667cc2641
Merge pull request #2115 from tertsdiepraam/ls/reduce_write_calls
`ls`: reduce write syscalls & cleanup
2021-04-25 11:52:51 +02:00
Sylvestre Ledru
c19e191360
Merge pull request #2113 from siebenHeaven/ls-optimize-sort
ls: Use sort_by_cached_key
2021-04-25 11:13:23 +02:00
Terts Diepraam
fc6c7a279e ls: clean up imports 2021-04-25 10:46:51 +02:00
Anup Mahindre
7e06316ece ls: Use sort_by_cached_key 2021-04-25 13:37:07 +05:30
Sylvestre Ledru
441763b73d
Merge pull request #2059 from cbjadwani/master
uniq: avoid building list of duplicate lines
2021-04-25 09:48:48 +02:00
Sylvestre Ledru
d3775ea0e8
Merge pull request #2110 from nthery/cp_reflink_macos
cp: add --reflink support to macos, fixes #1773
2021-04-25 09:28:14 +02:00
electricboogie
2b8a6e98ee Working ExtSort 2021-04-25 00:20:56 -05:00
Terts Diepraam
e995eea579 ls: general cleanup 2021-04-25 00:23:14 +02:00
Terts Diepraam
ce04f8a759 ls: use bufwriter to write stdout 2021-04-24 23:46:19 +02:00
Nicolas Thery
4bf33e98a8 cp: add --reflink support for macOS
Fixes #1773
2021-04-24 19:26:15 +02:00
Nicolas Thery
b8e23c20c2 cp: extract linux COW logic into function 2021-04-24 19:22:12 +02:00
Chirag Jadwani
2c1459cbfc cut: optimizations
* Use buffered stdout to reduce write sys calls.

This simple change yielded the biggest performance gain.

* Use `for_byte_record_with_terminator` from the `bstr` crate.

This is to minimize the per line copying needed by
`BufReader::read_until`. The `cut_fields` and `cut_fields_delimiter`
functions used `read_until` to iterate over lines. That required copying
each input line to the line buffer. With
`for_byte_record_with_terminator` copying is minimized as it calls our
closure with a reference to BufReader's buffer most of the time.  It
needs to copy (internally) only to process any incomplete lines at the
end of the buffer.

* Re-write `Searcher` to use `memchr`.

Switch from the naive implementation to one that uses `memchr`.

* Rewrite `cut_bytes` almost entirely.

This was already well optimized. The performance gain in this case is
not from avoiding copying. In fact, it needed zero copying whereas new
implementation introduces some copying similar to `cut_fields` described
above. But the occasional copying cost is more than offset by the use
of the very fast `memchr` inside `for_byte_record_with_terminator`.
This change also simplifies the code significantly. Removed the `buffer`
module.
2021-04-24 22:29:48 +05:30
Sylvestre Ledru
2f17bfc14c
Merge pull request #2106 from miDeb/sort-debug
sort: implement --debug
2021-04-24 18:46:58 +02:00
Sylvestre Ledru
c9b0378ca3
Merge pull request #2104 from tertsdiepraam/ls/skip_metadata
`ls`: skip reading metadata
2021-04-24 18:13:53 +02:00
Sylvestre Ledru
d7e8a03237
Merge pull request #2097 from miDeb/sort-disable-dictionary-mode
sort: disallow certain flags with -d and -i
2021-04-24 14:58:32 +02:00
Sylvestre Ledru
b41951614b
Merge branch 'master' into sort-disable-dictionary-mode 2021-04-24 13:56:39 +02:00
Terts Diepraam
1328d18878 ls: remove outdated comment 2021-04-24 13:19:50 +02:00
Michael Debertol
5dcfb51110 flip default for debug to the effective default 2021-04-24 10:52:40 +02:00
Terts Diepraam
728f0bd61d ls: remove redundant parentheses 2021-04-24 10:47:36 +02:00
Terts Diepraam
ce8c58b93e Merge branch 'master' into ls/skip_metadata 2021-04-24 10:45:43 +02:00
Sylvestre Ledru
8ccc6ade61
Merge branch 'master' into split-wsl-detection 2021-04-24 10:24:13 +02:00
Sylvestre Ledru
9517395839
Merge pull request #2088 from nthery/cp_reflink_never
cp: add support for --reflink=never
2021-04-24 10:07:41 +02:00
Sylvestre Ledru
fb6394554e
Merge pull request #2096 from tertsdiepraam/ls/fix_backslash_escape
ls: improve code cov
2021-04-24 10:05:32 +02:00
Sylvestre Ledru
513ff4e45f
Merge branch 'master' into sort-disable-dictionary-mode 2021-04-24 10:04:23 +02:00
Sylvestre Ledru
b96f7dbaea
Merge pull request #2087 from pedrohjordao/printf-clap-opts
Changes parameter parsing to clap
2021-04-24 10:02:05 +02:00
Sylvestre Ledru
372d08c341
Merge pull request #2098 from miDeb/sort-trailing-separator
sort: fix tokenization for trailing separators
2021-04-24 10:00:20 +02:00
Sylvestre Ledru
a9fa4adddf
Merge pull request #2102 from jaggededgedjustice/fix-tail-sleep-interval
tail --sleep-interval takes a value
2021-04-24 09:59:03 +02:00
Sylvestre Ledru
b10837f180
Merge pull request #2103 from jhscheer/refactor_tests
refactor tests (#1982)
2021-04-24 09:58:20 +02:00
Sylvestre Ledru
46b95fb8bd
Merge pull request #2099 from tertsdiepraam/ls/cross_platform_colors
ls: cross-platform colors
2021-04-24 09:56:46 +02:00
Michael Debertol
e6f6b109a5 sort: implement --debug
This adds a --debug flag, which, when activated, will draw lines below
the characters that are actually used for comparisons.

This is not a complete implementation of --debug. It should, quoting the man page
for GNU sort: "annotate the part of the line used to sort, and warn
about questionable usage to stderr". Warning about "questionable usage"
is not part of this patch.

This change required some adjustments to be able to get the range that
is actually used for comparisons. Most notably, general numeric comparisons
were rewritten, fixing some bugs along the way.

Testing is mostly done by adding fixtures for the expected debug output of
existing tests.
2021-04-23 22:36:15 +02:00
James Robson
b68ecf1269 Allow space in truncate --size 2021-04-23 16:36:46 +01:00
Terts Diepraam
eccb86c9ed ls: fix -a test 2021-04-23 08:26:20 +02:00
Jan Scheer
646c6cacbc refactor tests (#1982) 2021-04-23 02:28:46 +02:00
Terts Diepraam
3874a24457 ls: add once_cell to Cargo.toml 2021-04-23 00:35:45 +02:00
Terts Diepraam
a114f855f0 ls: revert to_ascii_lowercase 2021-04-22 23:43:00 +02:00
Terts Diepraam
e241f3ad69 ls: skip reading metadata 2021-04-22 22:45:24 +02:00
James Robson
3678777539 tail --sleep-interval takes a value 2021-04-22 16:10:08 +01:00
Terts Diepraam
ea10647a62 Merge remote-tracking branch 'upstream/master' into ls/fix_backslash_escape 2021-04-22 14:23:35 +02:00
Terts Diepraam
b9f4964a96 ls: bring up to date with recent changes 2021-04-22 11:39:08 +02:00
Terts Diepraam
cd1514bd57 Merge branch 'master' into ls/cross_platform_colors 2021-04-22 11:30:26 +02:00
Terts Diepraam
4e4c3aba00 ls: don't color symlink target 2021-04-22 11:16:33 +02:00
Anup Mahindre
8554cdf35b
Optimize recursive ls (#2083)
* ls: Remove allocations by eliminating collect/clones

* ls: Introduce PathData structure

- PathData will hold Path related metadata / strings that are required
frequently in subsequent functions
- All data is precomputed and cached and subsequent functions just
use cached data

* ls: Cache more data related to paths

- Cache filename and sort by filename instead of full path
- Cache uid->usr and gid->grp mappings
https://github.com/uutils/coreutils/pull/2099/files
* ls: Add BENCHMARKING.md

* ls: Document PathData structure

* tests/ls: Add testcase for error paths with width option

* ls: Fix unused import warning

cached will be only used for unix currently as current use of
caching gid/uid mappings is only relevant on unix

* ls: Suggest checking syscall count in BENCHMARKING.md

* ls: Remove mentions of sort in BENCHMARKING.md

* ls: Remove dependency on cached

Implement caching using HashMap and lazy_static

* ls: Fix MSRV error related to map_or

Rust 1.40 did not support map_or for result types
2021-04-22 09:19:17 +02:00
Terts Diepraam
1d7e206d72 ls: fix mac build 2021-04-21 20:04:52 +02:00
Michael Debertol
8a05148d7b sort: fix tokenization for trailing separators
Trailing separators were included at the end of the last token, but they
should not be.

This changes tokenize_with_separator as suggested by @cbjadwani.
2021-04-21 19:07:03 +02:00
Terts Diepraam
3fc8d2e422 ls: make compatible with Rust 1.40 again 2021-04-21 18:05:10 +02:00
Terts Diepraam
ff39538375 ls: further refactor --color and classification 2021-04-21 18:00:43 +02:00
Michael Debertol
8b906b9547 remove feature use stabilized in 1.51 2021-04-21 18:00:01 +02:00
Michael Debertol
4a305b32c6 sort: disallow certain flags with -d and -i
GNU sort disallows these combinations, presumably because they are
likely not what the user really wants.

Ignoring characters would cause things to be put together that aren't
together in the input. For example, -dn would cause "0.12" or "0,12" to
be parsed as "12" which is highly unexpected and confusing.
2021-04-21 17:49:40 +02:00
Terts Diepraam
34a824af71 ls: use lscolors crate 2021-04-21 17:35:02 +02:00
Terts Diepraam
29b5b6b276 ls: fix unit tests to match last change 2021-04-21 13:03:31 +02:00
Terts Diepraam
f34c992932 ls: always quote backslash in shell style 2021-04-21 12:45:21 +02:00
Árni Dagur
387227087f
cat: Put splice code in separate file, handle more failures (#2067)
* cat: Refactor splice code, handle more failures

* cat: Add tests for stdout redirected to files
2021-04-21 12:21:31 +02:00
Terts Diepraam
fd54614130 Merge branch 'master' into ls/fix_backslash_escape 2021-04-21 12:06:54 +02:00
Terts Diepraam
f84f23ddfe tests/ls: add coverage for special shell character after escaped char 2021-04-21 11:22:10 +02:00
Terts Diepraam
795d89f11d ls: don't escape backslash in shell style quoting 2021-04-21 11:08:40 +02:00
electricboogie
25021f31eb Incorporate overhead of Line struct 2021-04-19 21:24:52 -05:00
Sivachandran
0ea35f3fbc
Implement install create leading components(-D) option (#2092)
* Implement install's create leading components(-D) option

* Format changes

* Add install test to check fail on long dir name
2021-04-19 22:03:13 +02:00
electricboogie
b8d667c383 Clippy lints, more work on ext_sorter leads to 2 failing tests 2021-04-19 10:57:53 -05:00
Pedro Jordão
158ae35da5 Commented out code removal 2021-04-19 14:21:49 +01:00
Chirag Jadwani
3bb99e7047 uniq: avoid building list of duplicate lines
This reduces memory usage by only storing two lines of the input file at
a time. The current implementation first builds a list of all duplicate
lines ('group') and then decides which lines of the group should be
printed.
2021-04-19 17:02:59 +05:30
Jan Scheer
049f21a199
du: fix tests on linux (#2066) (#2090) 2021-04-19 10:45:51 +02:00
electricboogie
e7bcd59558 Remove a clone 2021-04-18 18:22:30 -05:00
electricboogie
fcebdbb7a7 Cleanup comment 2021-04-18 17:51:44 -05:00
electricboogie
5efd67b5e2 License cleanup 2021-04-18 17:44:45 -05:00
electricboogie
72858dda42 Ran rustfmt 2021-04-18 17:40:59 -05:00
electricboogie
258325491f Make human_numeric_convert a method 2021-04-18 17:39:42 -05:00
electricboogie
8072e2092a Cleanup loop, run rustfmt 2021-04-18 16:33:18 -05:00
electricboogie
deb94cef7a Cleanup 2021-04-18 15:52:48 -05:00
electricboogie
559f4e81f6 More license cleanup 2021-04-18 15:47:05 -05:00
electricboogie
fb19522ca0 Bring back non-external sort as default 2021-04-18 15:39:20 -05:00
electricboogie
e841bb6a24 More license cleanup 2021-04-18 15:20:16 -05:00
electricboogie
9170e7a511 Modify NOTICE 2021-04-18 15:15:12 -05:00
electricboogie
298e269531 Remove unused code 2021-04-18 15:08:42 -05:00
electricboogie
0151f30c4e Change directory structure 2021-04-18 15:04:25 -05:00
electricboogie
e3e1ee30eb Add additional notices 2021-04-18 14:37:16 -05:00
electricboogie
0275a43c5b Make modifications clearer per Apache license 2021-04-18 14:05:27 -05:00
electricboogie
42da444f40 Remove unused deps 2021-04-18 13:49:11 -05:00
electricboogie
5bb66b26dd Merge branch 'master' of https://github.com/uutils/coreutils 2021-04-18 13:45:33 -05:00
electricboogie
dad7761be9 Add test 2021-04-18 13:43:41 -05:00
electricboogie
da94e35044 Cleanup, removed unused code, add copyright 2021-04-18 13:02:50 -05:00
electricboogie
d7b7ce52bc Vendored ext_sorter, removed unstable, created a byte buffer sized vector instead of a numbered capacity vector 2021-04-18 11:54:18 -05:00
Nicolas Thery
f36832c392 cp: add support for --reflink=never
- Passing `never` to `--reflink` does not raise an error anymore.
- Remove `Options::reflink` flag as it was redundant with
  `reflink_mode`.
- Add basic tests for this option.  Does not check that a copy-on-write
  rather than a regular copy was made.
2021-04-18 18:51:59 +02:00
Sylvestre Ledru
d3f71810df
Merge pull request #2063 from jhscheer/iss2060
chown: fix #2060
2021-04-18 09:50:23 +02:00
electricboogie
a73d108dd8 Merge branch 'master' of https://github.com/uutils/coreutils 2021-04-17 23:25:02 -05:00
electricboogie
4c8d62c2be More cleanup 2021-04-17 23:24:32 -05:00
electricboogie
3a1e92fdd2 More cleanup 2021-04-17 22:39:05 -05:00
electricboogie
7a8767e359 Cleanup 2021-04-17 22:34:03 -05:00
electricboogie
65e9c7b1b5 Sorta working ExtSort - concat struct elements 2021-04-17 21:30:03 -05:00
Jan Scheer
df2dcc5b99 chown: fix parse_spec() for colon (#2060) 2021-04-18 00:11:59 +02:00
Michael Debertol
519b9d34a6
sort: use unstable sort when possible (#2076)
* sort: use unstable sort when possible

This results in a very minor performance (speed) improvement.
It does however result in a memory usage reduction, because unstable
sort does not allocate auxiliary memory. There's also an improvement in
overall CPU usage.

* add benchmarking instructions

* add user time

* fix typo
2021-04-17 22:40:13 +02:00
Pedro Jordão
01fef70143 Changes parameter parsing to clap
- Uses clap to parse parameters
- Removes "allow" directives where they are not necessary
- Removes unused variables
2021-04-17 20:42:05 +01:00
electricboogie
acfe0681d4 Merge branch 'master' of https://github.com/uutils/coreutils 2021-04-17 11:54:45 -05:00
Michael Debertol
4bbbe3a3f2
sort: implement numeric string comparison (#2070)
* sort: implement numeric string comparison

This implements -n and -h using a string comparison algorithm instead
of parsing each number to a f64 and comparing those.

This should result in a moderate performance increase and eliminate loss
of precision.

* cache parsed f64 numbers

For general numeric comparisons we have to parse numbers as f64,
as this behavior is explicitly documented by GNU coreutils.
We can however cache the parsed value to speed up comparisons.

* fix leading zeroes for negative numbers

* use more appropriate name for exponent

* improvements to the parse function

* move checks into main loop and fix thousands separator condition

* remove unneeded checks

* rustfmt
2021-04-17 13:49:35 +02:00
Sylvestre Ledru
481d1ee659
Merge pull request #2077 from tertsdiepraam/ls/dereference-command-line
ls: dereference command line
2021-04-17 13:31:52 +02:00
Sylvestre Ledru
c5b43c0994 rustfmt the recent change 2021-04-17 13:21:30 +02:00
Sylvestre Ledru
eec389fa94
Merge branch 'master' into ls/dereference-command-line 2021-04-17 10:30:42 +02:00
Andrew Rowson
d0c7e8c09e
du error output should match GNU (#1776)
* du error output should match GNU

* Created a new error macro which allows the customization of the
  "error:" string part
* Match the du output based on the type of error encountered. Can extend
  to handling other errors I guess.

* Rustfmt updates

* Added non-windows test for du no permission output
2021-04-17 10:26:52 +02:00
Sylvestre Ledru
fc057b816b
Merge branch 'master' into split-wsl-detection 2021-04-17 10:22:54 +02:00
Aleksandar Janicijevic
fe207640e2
touch: dealing with DST in touch -m -t (#2073) 2021-04-17 10:08:10 +02:00
electricboogie
a76d452f75
Sort: More small fixes (#2065)
* Various fixes and performance improvements

* fix a typo

Co-authored-by: Michael Debertol <michael.debertol@gmail.com>

* Fix month parse for months with leading whitespace

* Implement test for months whitespace fix

* Confirm human numeric works as expected with whitespace with a test

* Correct arg help value name for --parallel

* Fix SemVer non version lines/empty line sorting with a test

Co-authored-by: Sylvestre Ledru <sledru@mozilla.com>
Co-authored-by: Michael Debertol <michael.debertol@gmail.com>
2021-04-17 10:06:19 +02:00
Terts Diepraam
2c130ae7c0 ls: take -l into account with dereference-command-line 2021-04-14 14:42:14 +02:00
Terts Diepraam
5c28ac1b0d ls: dereference command line 2021-04-14 14:12:00 +02:00
electricboogie
c49f93c9af Pseudo working extsort 2021-04-12 18:05:37 -05:00
electricboogie
e6c195a675 ExtSort 2021-04-12 14:24:22 -05:00
Reto Hablützel
a4253d1254 apply more clippy suggestions from nightly 2021-04-12 20:07:10 +02:00
Reto Hablützel
07e9c5896c ignore strip_suffix until minimum rust version is 1.45 2021-04-12 19:53:47 +02:00
Reto Hablützel
d219b6e705 strip_suffix is not available with rust 1.40 2021-04-12 19:50:23 +02:00
Reto Hablützel
d67560c37a fix clippy for unix 2021-04-11 16:34:19 +02:00
Reto Hablützel
b465c34eef fix ls 2021-04-11 16:16:38 +02:00
Reto Hablützel
75a76613e4 fix clippy in cp 2021-04-11 16:09:18 +02:00
Reto Hablützel
97d12d6e3c fix trivial warnings without features 2021-04-11 16:05:25 +02:00
electricboogie
c6021e10c2 Fix SemVer non version lines/empty line sorting with a test 2021-04-10 15:27:16 -05:00
Árni Dagur
eb4971e6f4
cat: Unrevert splice patch (#2020)
* cat: Unrevert splice patch

* cat: Add fifo test

* cat: Add tests for error cases

* cat: Add tests for character devices

* wc: Make sure we handle short splice writes

* cat: Fix tests for 1.40.0 compiler

* cat: Run rustfmt on test_cat.rs

* Run 'cargo +1.40.0 update'
2021-04-10 22:19:53 +02:00
electricboogie
7133273725 Correct arg help value name for --parallel 2021-04-10 14:13:49 -05:00
electricboogie
3c1c76444b Merge branch 'master' of https://github.com/uutils/coreutils 2021-04-10 13:26:56 -05:00
Michael Debertol
69f4410a8a
sort: dedup using compare_by (#2064)
compare_by is the function used for sorting, we should use it for dedup
as well.
2021-04-10 19:49:10 +02:00
electricboogie
2d9f15d12c Fix month parse for months with leading whitespace 2021-04-10 12:02:02 -05:00