Commit graph

945 commits

Author SHA1 Message Date
Jeffrey Finkelstein
fea1026669 tail: use std::io::copy() to write bytes to stdout 2021-05-17 18:15:39 -04:00
nicoo
00322b986b factor: Move benchmarks out-of-crate 2021-05-17 19:43:38 +02:00
nicoo
1cd001f529 factor::benches::table: Match BenchmarkId w/ criterion's conventions
See https://bheisler.github.io/criterion.rs/book/user_guide/comparing_functions.html
2021-05-17 19:43:38 +02:00
nicoo
7c649bc74e factor::benches: Add check against ASLR 2021-05-17 19:43:38 +02:00
nicoo
ddfcd2eb14 factor::benchmarking: Add wishlist / planned work 2021-05-17 19:43:38 +02:00
nicoo
1d75f09743 factor::benchmarking(doc): Add guidance on writing µbenches 2021-05-17 19:43:38 +02:00
nicoo
e9f8194266 factor::benchmarking(doc): Add guidance on running µbenches 2021-05-17 19:43:38 +02:00
nicoo
ae15bf16a8 factor::benches::table: Report throughput (in numbers/s) 2021-05-17 19:43:38 +02:00
nicoo
12efaa6add factor: Add BENCHMARKING.md 2021-05-17 19:43:38 +02:00
nicoo
7c287542c7 factor::table: Fixup microbenchmark
Previous version would perform an amount of work proportional to `CHUNK_SIZE`,
so this wasn't a valid way to benchmark at multiple values of that constant.

The `TryInto` implementation for `&mut [T]` to `&mut [T; N]` relies on `const`
generics, and is available in (stable) Rust v1.51 and later.
2021-05-17 19:43:38 +02:00
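The slice-to-array conversion mentioned in this commit can be sketched as follows. This is a minimal illustration, not the code from `factor`: `CHUNK_SIZE` here is an arbitrary value, and `double_chunk` and `double_in_chunks` are hypothetical names.

```rust
use std::convert::TryInto;

/// Illustrative chunk size (an assumption, not the constant used in `factor`).
const CHUNK_SIZE: usize = 4;

/// Hypothetical worker that needs a fixed-size array, so the compiler
/// knows the chunk length statically.
fn double_chunk(chunk: &mut [u64; CHUNK_SIZE]) {
    for x in chunk.iter_mut() {
        *x *= 2;
    }
}

/// Processes `data` in whole chunks and returns the number of leftover
/// elements that did not fill a chunk (those are left untouched).
fn double_in_chunks(data: &mut [u64]) -> usize {
    let mut chunks = data.chunks_exact_mut(CHUNK_SIZE);
    for slice in &mut chunks {
        // `&mut [u64]` -> `&mut [u64; CHUNK_SIZE]`. The conversion only
        // fails on a length mismatch, which `chunks_exact_mut` rules out.
        let array: &mut [u64; CHUNK_SIZE] =
            slice.try_into().expect("chunk has exactly CHUNK_SIZE elements");
        double_chunk(array);
    }
    chunks.into_remainder().len()
}
```

Because the amount of work per call to `double_chunk` is fixed by the array type, a benchmark can vary the constant without the per-element cost silently changing.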
nicoo
1fd5f9da25 factor::table::factor_chunk: Turn loop inside-out
This keeps the traversal of `P_INVS_U64` (a large table) to a single pass
in-order, rather than `CHUNK_SIZE` passes.
2021-05-17 19:43:38 +02:00
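The loop interchange described in this commit can be illustrated with a simplified sketch. It uses plain trial division over a tiny table rather than the modular-inverse table `P_INVS_U64`; `SMALL_PRIMES` and `strip_small_factors` are hypothetical names chosen for the example.

```rust
/// Tiny stand-in for a large prime table (an assumption for illustration).
const SMALL_PRIMES: [u64; 6] = [2, 3, 5, 7, 11, 13];

/// Divides the small primes out of every number in `chunk`, returning the
/// factors found for each number in the same order.
fn strip_small_factors(chunk: &mut [u64]) -> Vec<Vec<u64>> {
    let mut factors: Vec<Vec<u64>> = vec![Vec::new(); chunk.len()];
    // Outer loop over the table: each entry is visited once, in order.
    for &p in SMALL_PRIMES.iter() {
        // Inner loop over the (small) chunk of numbers being factored.
        for (n, found) in chunk.iter_mut().zip(factors.iter_mut()) {
            while *n != 0 && *n % p == 0 {
                *n /= p;
                found.push(p);
            }
        }
    }
    factors
}
```

With the loops the other way around, the table would be traversed once per number in the chunk; here the table is walked once and the chunk, which is small, is the part revisited.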
nicoo
cd047425aa factor::table: Add chunked implementation and microbenchmarks
The factor_chunk implementation is a strawman, but getting it in place allows us
to set up the microbenchmarking etc.
2021-05-17 19:43:38 +02:00
nicoo
c68c83c6dd factor::table: Take mutable refs
This will be easier to adapt to working with multiple numbers to process at once.
2021-05-17 19:43:38 +02:00
Jeffrey Finkelstein
eeef8290df head: display errors for each input file
Change the behavior of `head` to display an error for each problematic
file, instead of displaying an error message for the first problematic
file and terminating immediately at that point. This change now matches
the behavior of GNU `head`.

Before this commit, the first error caused the program to terminate
immediately:

    $ head a b c
    head: error: head: cannot open 'a' for reading: No such file or directory

After this commit:

    $ head a b c
    head: cannot open 'a' for reading: No such file or directory
    head: cannot open 'b' for reading: No such file or directory
    head: cannot open 'c' for reading: No such file or directory
2021-05-17 08:19:47 -04:00
Michael Debertol
fcd48813e0 sort: read files as chunks, off-thread
Instead of using a BufReader and reading each line separately,
allocating a String for each one, we read to a chunk. Lines are
references to this chunk. This makes the allocator's job much easier
and yields performance improvements.

Chunks are read on a separate thread to further improve performance.
2021-05-16 21:13:37 +02:00
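A rough sketch of the idea in this commit, under simplifying assumptions: the whole input is read as a single chunk, and the function names `read_chunk_off_thread` and `sorted_lines` are hypothetical. The actual `sort` implementation splits input into multiple chunks and is considerably more involved.

```rust
use std::io::{self, Read};
use std::sync::mpsc;
use std::thread;

/// Reads the entire input into one `String` on a separate thread and
/// hands the result back through a channel.
fn read_chunk_off_thread<R>(mut input: R) -> mpsc::Receiver<io::Result<String>>
where
    R: Read + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let mut chunk = String::new();
        let result = match input.read_to_string(&mut chunk) {
            Ok(_) => Ok(chunk),
            Err(e) => Err(e),
        };
        // The receiver may have been dropped; nothing useful to do then.
        let _ = tx.send(result);
    });
    rx
}

/// Borrows lines out of the chunk instead of allocating one `String`
/// per line, then sorts the borrowed slices.
fn sorted_lines(chunk: &str) -> Vec<&str> {
    let mut lines: Vec<&str> = chunk.lines().collect();
    lines.sort();
    lines
}
```

The only per-line allocation left is the `Vec<&str>` of slices; the line contents all live in the one chunk buffer.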
Jeffrey Finkelstein
659bf58a4c head: print headings when reading multiple files
Fix a bug in which `head` failed to print headings for `stdin` inputs
when reading from multiple files, and fix another bug in which `head`
failed to print a blank line between the contents of a file and the
heading for the next file when reading multiple files. The output now
matches that of GNU `head`.
2021-05-16 12:03:10 -04:00
Jeffrey Finkelstein
733d347fa8 head: simplify rbuf_n_bytes() in head.rs
Simplify the code in `rbuf_n_bytes()` to use existing abstractions
provided by the standard library.
2021-05-15 23:04:01 -04:00
Jeffrey Finkelstein
97a49c7c95 wc: compute min width to format counts up front
Fix two issues with the string formatting width for counts displayed
by `wc`.

First, the output was previously not using the default minimum width
(seven characters) when reading from `stdin`. This commit corrects
this behavior to match GNU `wc`. For example,

    $ cat alice_in_wonderland.txt | wc
          5      57     302

Second, if at least 10^7 bytes were read from `stdin` *after* reading
from a smaller regular file, then every output row would have width
8. This disagrees with GNU `wc`, in which only the `stdin` row and the
total row would have width 8. This commit corrects this behavior to
match GNU `wc`. For example,

    $ printf "%.0s0" {1..10000000} | wc emptyfile.txt -
      0       0       0 emptyfile.txt
      0       1 10000000
      0       1 10000000 total

Fixes #2186.
2021-05-15 21:41:47 -04:00
Sylvestre Ledru
620a5a5df6
Merge pull request #2210 from jhscheer/dns_lookup
who: fix `--lookup`
2021-05-15 21:18:12 +02:00
Jeffrey Finkelstein
e8d911d9d5 wc: correct some error messages for invalid inputs
Change the error messages that get printed to `stderr` for compatibility
with GNU `wc` when an input is a directory and when an input does not
exist.

Fixes #2211.
2021-05-15 10:35:21 -04:00
Jan Scheer
a4fc2b5106 who: fix --lookup
This closes #2181.

`who --lookup` is failing with a runtime panic (double free).
Since `crate::dns-lookup` already includes a safe wrapper for `getaddrinfo`
I used this crate instead of further debugging the existing code in
utmpx::canon_host().

* It was necessary to remove the version constraint for libc in uucore.
2021-05-13 22:16:15 +02:00
Jeffrey Finkelstein
2e621759b2 tail: refactor code into ReverseChunks iterator
Refactor code from the `backwards_thru_file()` function into a new
`ReverseChunks` iterator, and use that iterator to simplify the
implementation of the `backwards_thru_file()` function. The
`ReverseChunks` iterator yields `Vec<u8>` objects, each of which
references bytes of a given file.
2021-05-12 18:43:58 -04:00
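An iterator of this kind might look roughly like the sketch below. It is not the code from `tail`: the field names, the `chunk_size` parameter, and the choice to yield `io::Result<Vec<u8>>` (rather than plain `Vec<u8>` as the commit message describes) are assumptions made to keep the example self-contained.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Yields fixed-size chunks of a seekable reader, last chunk first.
struct ReverseChunks<R> {
    reader: R,
    /// Number of bytes before the current position still to be yielded.
    remaining: u64,
    chunk_size: u64,
}

impl<R: Read + Seek> ReverseChunks<R> {
    fn new(mut reader: R, chunk_size: u64) -> io::Result<Self> {
        assert!(chunk_size > 0, "chunk_size must be positive");
        // Seeking to the end reports the total length in bytes.
        let remaining = reader.seek(SeekFrom::End(0))?;
        Ok(ReverseChunks { reader, remaining, chunk_size })
    }
}

impl<R: Read + Seek> Iterator for ReverseChunks<R> {
    type Item = io::Result<Vec<u8>>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            return None;
        }
        // The chunk nearest the start of the file may be shorter.
        let len = self.chunk_size.min(self.remaining);
        let start = self.remaining - len;
        let mut buf = vec![0u8; len as usize];
        let result = self
            .reader
            .seek(SeekFrom::Start(start))
            .and_then(|_| self.reader.read_exact(&mut buf));
        match result {
            Ok(()) => {
                self.remaining = start;
                Some(Ok(buf))
            }
            Err(e) => {
                // Stop iterating after the first I/O error.
                self.remaining = 0;
                Some(Err(e))
            }
        }
    }
}
```

A caller looking for the last `n` lines can scan each yielded chunk from its end and stop as soon as enough newlines have been seen, without reading the rest of the file.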
Jeffrey Finkelstein
3114fd77be tail: use &mut File instead of mut file: &File 2021-05-12 18:43:35 -04:00
Sylvestre Ledru
2178edf628
Merge pull request #2207 from jhscheer/issue_2204
date: fix format literal for nanoseconds
2021-05-12 13:14:23 +02:00
Jan Scheer
12a43d6eb3 date: fix format literal for nanoseconds 2021-05-12 10:21:24 +02:00
Sylvestre Ledru
a5f8ca60b5
Merge pull request #2199 from jhscheer/refactor_fsext
df/stat: refactor - reduce duplicate code
2021-05-12 08:41:16 +02:00
Sylvestre Ledru
6635301f32
Merge pull request #2194 from miDeb/sort-stable-merge
sort: make merging stable
2021-05-12 08:38:48 +02:00
Sylvestre Ledru
57ae202037
Merge pull request #2195 from nthery/wc_dash
wc: emit '-' in output when set on command-line
2021-05-12 08:37:55 +02:00
Sylvestre Ledru
8f24ec9414
Merge pull request #2198 from jfinkels/tail-refactor
tail: simplify unbounded_tail() function
2021-05-12 08:35:45 +02:00
Sylvestre Ledru
68a3488cdc
Merge pull request #2202 from drocco007/test-negated-boolean
test: improve handling of inverted Boolean expressions
2021-05-12 08:34:41 +02:00
Jan Scheer
8200d399e8 date: fix format for nanoseconds 2021-05-11 23:03:59 +02:00
Daniel Rocco
2ec4bee350 test: improve handling of inverted Boolean expressions
- add `==` as undocumented alias of `=`

- handle negated comparison of `=` as literal

- negation generally applies to only the first expression of a Boolean chain,
  except when combining evaluation of two literal strings
2021-05-10 22:48:40 -04:00
Jan Scheer
381f8dafc6 df/uucore: refactor - move duplicate code to uucore/fsext.rs 2021-05-10 23:37:01 +02:00
Sylvestre Ledru
ed42652803
Merge pull request #2200 from jhscheer/fix_clippy
fix clippy warnings
2021-05-10 16:13:27 +02:00
Jan Scheer
4ac75898c3 fix clippy warnings 2021-05-10 15:48:32 +02:00
Jan Scheer
203ee463c7 stat/uucore: refactor - move fsext.rs to uucore 2021-05-10 10:46:00 +02:00
Jeffrey Finkelstein
0cc779c733 tail: simplify unbounded_tail() function
Refactor common code out of two branches of the `unbounded_tail()`
function into a new `unbounded_tail_collect()` helper function, that
collects from an iterator into a `VecDeque` and keeps either the last
`n` elements or all but the first `n` elements.

This commit also adds a new struct, `RingBuffer`, in a new module,
`ringbuffer.rs`, to be responsible for keeping the last `n` elements
of an iterator.
2021-05-09 23:47:13 -04:00
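The two behaviours described in this commit (keep the last `n` elements, or keep all but the first `n`) can be sketched as below. The struct layout and the helper names `keep_last` and `skip_first` are assumptions for illustration and may differ from `ringbuffer.rs`.

```rust
use std::collections::VecDeque;

/// Fixed-capacity buffer that retains only the most recent `size` items.
struct RingBuffer<T> {
    data: VecDeque<T>,
    size: usize,
}

impl<T> RingBuffer<T> {
    fn new(size: usize) -> Self {
        RingBuffer { data: VecDeque::new(), size }
    }

    /// Appends `value`, evicting the oldest item once the buffer is full.
    fn push_back(&mut self, value: T) {
        if self.size == 0 {
            return;
        }
        if self.data.len() == self.size {
            self.data.pop_front();
        }
        self.data.push_back(value);
    }
}

/// Keeps the last `n` items of `iter` using bounded memory.
fn keep_last<T>(iter: impl Iterator<Item = T>, n: usize) -> VecDeque<T> {
    let mut ring = RingBuffer::new(n);
    for item in iter {
        ring.push_back(item);
    }
    ring.data
}

/// Keeps everything except the first `n` items of `iter`.
fn skip_first<T>(iter: impl Iterator<Item = T>, n: usize) -> VecDeque<T> {
    iter.skip(n).collect()
}
```

For example, `keep_last(1..=10, 3)` retains 8, 9 and 10, never holding more than three items at a time.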
Gilad Naaman
8747800697 Switched 'arch' to use clap instead of getopts 2021-05-09 21:53:03 +03:00
Sylvestre Ledru
7c51fb4946
Merge pull request #2165 from miDeb/sort-optimize-line
sort: optimize the line struct
2021-05-09 18:41:39 +02:00
Nicolas Thery
112b042769 wc: emit '-' in output when set on command-line
When stdin is explicitly specified on the command-line with '-', emit it
in the output stats to match GNU wc output.

Fixes #2188.
2021-05-09 15:47:05 +02:00
Michael Debertol
e0ebf907a4 sort: make merging stable
When merging files we need to prioritize files that occur earlier in the
command line arguments with -m.

This also makes the extsort merge step (and thus extsort itself) stable again.
2021-05-09 11:43:38 +02:00
Sylvestre Ledru
d43af35147
Merge pull request #2145 from tertsdiepraam/ls/device_information
`ls`: implement device symbol and id
2021-05-09 00:50:35 +02:00
Terts Diepraam
f6e5f86fe7 Merge branch 'master' into ls/device_information 2021-05-08 23:21:44 +02:00
Michael Debertol
d686f7e48f sort: improve comments 2021-05-08 22:31:53 +02:00
Sylvestre Ledru
01a702c6fd
Merge branch 'master' into issue2167 2021-05-08 20:26:21 +02:00
Michael Debertol
1afeb55881 Merge branch 'master' of https://github.com/uutils/coreutils into sort-optimize-line 2021-05-08 15:47:19 +02:00
Samuel Ainsworth
2ff9cc6570 Typo in comment 2021-05-08 14:25:21 +02:00
Samuel Ainsworth
bacad8ed93 Use u128 instead of usize for large numbers, and consistency across architectures 2021-05-08 14:25:21 +02:00
Samuel Ainsworth
7c1395366e Fix split's handling of non-UTF-8 files 2021-05-08 14:25:21 +02:00
Samuel Ainsworth
a9ac7af9e1 Simplify parsing of --bytes for the split command 2021-05-08 14:25:21 +02:00
Jeffrey Finkelstein
ba8f4ea670 wc: move counting code into WordCount::from_line()
Refactor the counting code from the inner loop of the `wc` program
into the `WordCount::from_line()` associated function. This commit
also splits that function up into other helper functions that
encapsulate decoding characters and finding word boundaries from raw
bytes.

This commit also implements the `Sum` trait for the `WordCount`
struct, so that we can simply call `sum()` on an iterator that yields
`WordCount` instances.
2021-05-08 14:24:07 +02:00
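Implementing `Sum` for a counter struct could look like the following sketch. The field set and the ASCII-only word splitting in `from_line` are simplifying assumptions; the real `wc` also decodes characters and handles word boundaries more carefully.

```rust
use std::iter::Sum;
use std::ops::Add;

/// Simplified counter (the fields are assumptions for this example).
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
struct WordCount {
    bytes: usize,
    lines: usize,
    words: usize,
}

impl WordCount {
    /// Counts bytes, newline characters, and whitespace-separated words
    /// in one raw line. Only ASCII whitespace is treated as a separator.
    fn from_line(line: &[u8]) -> Self {
        WordCount {
            bytes: line.len(),
            lines: line.iter().filter(|&&b| b == b'\n').count(),
            words: line
                .split(|b| b.is_ascii_whitespace())
                .filter(|w| !w.is_empty())
                .count(),
        }
    }
}

impl Add for WordCount {
    type Output = Self;

    fn add(self, other: Self) -> Self {
        WordCount {
            bytes: self.bytes + other.bytes,
            lines: self.lines + other.lines,
            words: self.words + other.words,
        }
    }
}

impl Sum for WordCount {
    /// Folds an iterator of counts into one total, starting from zero.
    fn sum<I: Iterator<Item = Self>>(iter: I) -> Self {
        iter.fold(WordCount::default(), |acc, wc| acc + wc)
    }
}
```

With this in place, a total is just `lines.iter().map(|l| WordCount::from_line(l)).sum::<WordCount>()`, with no explicit accumulator in the calling loop.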
Jeffrey Finkelstein
50f4941d49 wc: refactor WordCount into its own module
Move the `WordCount` struct and its implementations into the
`wordcount.rs`.
2021-05-08 14:24:07 +02:00
Jeffrey Finkelstein
ee43655bdb fixup! wc: rm leading space when printing multiple counts 2021-05-08 13:11:09 +02:00
Jeffrey Finkelstein
525f71bada wc: rm leading space when printing multiple counts
Remove the leading space from the output of `wc` when printing two or
more types of counts.

Fixes #2173.
2021-05-08 13:11:09 +02:00
Jan Scheer
a885376583 uucore: refactor - reduce duplicate code related to fs::display_permissions
This is a refactor to reduce duplicate code; it affects chmod/ls/stat.
* merge `stat/src/fsext::pretty_access` into `uucore/src/lib/features/fs::display_permissions_unix`
* move tests for `fs::display_permissions` from `test_stat::test_access` to `uucore/src/lib/features/fs::test_display_permissions`
* adjust `uu_chmod`, `uu_ls` and `uu_stat` to use `uucore::fs::display_permissions`
2021-05-08 11:52:41 +02:00
Michael Debertol
38effc93b3 sort: use FileMerger for extsort merge step
FileMerger is much more efficient than the previous algorithm,
which looped over all elements every time to determine the next element.

FileMerger uses a BinaryHeap, which should bring the complexity for
the merge step down from O(n²) to O(n log n).
2021-05-08 11:51:32 +02:00
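A heap-based merge of several sorted inputs can be sketched as below. This is a generic in-memory illustration, not `FileMerger` itself: `merge_sorted` is a hypothetical name, and the inputs are plain vectors rather than files.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Merges already-sorted inputs into one sorted output.
/// The heap holds at most one pending item per input, so each push or pop
/// costs O(log k) for k inputs, instead of scanning every input to find
/// the next element.
fn merge_sorted<T: Ord>(inputs: Vec<Vec<T>>) -> Vec<T> {
    let mut iters: Vec<_> = inputs.into_iter().map(|v| v.into_iter()).collect();
    // `BinaryHeap` is a max-heap; `Reverse` turns it into a min-heap.
    // Pairing each item with its input index breaks ties in favour of
    // the input that appears earlier.
    let mut heap = BinaryHeap::new();
    for (idx, it) in iters.iter_mut().enumerate() {
        if let Some(item) = it.next() {
            heap.push(Reverse((item, idx)));
        }
    }
    let mut out = Vec::new();
    while let Some(Reverse((item, idx))) = heap.pop() {
        out.push(item);
        // Refill the heap from the input the popped item came from.
        if let Some(next) = iters[idx].next() {
            heap.push(Reverse((next, idx)));
        }
    }
    out
}
```

Including the input index in the heap key is one way to get the earlier-input-wins tie-breaking that the "sort: make merging stable" commit in this log describes.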
Michael Debertol
64c1f16421 sort: allow some functions to be called with OsStr 2021-05-08 11:51:32 +02:00
Terts Diepraam
3b6c7bc9e9 Fix mistakes with merging 2021-05-08 00:50:36 +02:00
Michael Debertol
8c9faa16b9 sort: improve memory usage for extsort 2021-05-07 21:51:31 +02:00
Michael Debertol
c38373946a sort: optimize the Line struct 2021-05-07 21:51:25 +02:00
Terts Diepraam
6834d0256e Merge branch 'master' into ls/device_information 2021-05-07 18:56:44 +02:00
Idan Attias
34b9809223 logname: fix test & style warning 2021-05-06 14:19:47 +02:00
Idan Attias
41eb930292 logname: align profile 2021-05-06 14:19:47 +02:00
Idan Attias
b24b9d501b logname: replace getopts with clap 2021-05-06 14:19:47 +02:00
jaggededgedjustice
a2658250fc
Fix fmt crashing on subtracting unsigned numbers (#2178) 2021-05-05 23:12:17 +02:00
Anup Mahindre
7d2b051866
Implement Total size feature (#2170)
* ls: Implement total size feature

- Implement total size reporting that was missing
- Fix minor formatting / readability nits

* tests: Add tests for ls total sizes feature

* ls: Fix MSRV build errors due to unsupported attributes for if blocks

* ls: Add windows support for total sizes feature

- Add windows support (defaults to file size as block sizes related
information is not available on windows)
- Renamed some functions
2021-05-05 23:03:25 +02:00
rethab
231bb7be93
Migrate mknod to clap, closes #2051 (#2056)
* mknod: add tests for fifo

* mknod: add test for character device
2021-05-05 22:59:40 +02:00
Sylvestre Ledru
f83316f36e
Merge pull request #2156 from miDeb/sort-no-json-extsort
sort: don't rely on serde-json for extsort
2021-05-05 22:33:18 +02:00
Sylvestre Ledru
1edf4064f3
Merge pull request #2162 from bashi8128/basename-clap
basename: move from getopts to clap
2021-05-04 10:59:19 +02:00
Sylvestre Ledru
3f5dda66f4
Merge pull request #2138 from jhscheer/who2clap
who: move from getopts to clap (#2124)
2021-05-04 10:58:52 +02:00
Sylvestre Ledru
e3b7a8bd22
Merge pull request #2166 from jfinkels/wc-word-countable-lines
wc: add lines() method for iterating over lines
2021-05-04 09:53:08 +02:00
Jan Scheer
56761ba584 stat: implement support for macos 2021-05-03 22:30:56 +02:00
David CARLIER
224c8b3f94 df output update (non inode mode) proposal specific for mac. on this platform, capacity column is also displayed. 2021-05-03 15:49:55 +01:00
bashi8128
5a4bb610ff basename: rename variable names
Rename variables to be more explicit
2021-05-03 23:32:01 +09:00
bashi8128
74802f9f0f basename: improve error messages
Remove duplicated utility name from error messages
2021-05-03 23:26:46 +09:00
Jeffrey Finkelstein
0a3e2216d7 wc: add lines() method for iterating over lines
Add the `WordCountable::lines()` method that returns an iterator over
lines of a file-like object. This mirrors the
`std::io::BufRead::lines()` method, with some minor differences due to
the particular use case of `wc`.

This commit also creates a new module, `countable.rs`, to contain the
`WordCountable` trait and the new `Lines` struct returned by `lines()`.
2021-05-02 16:32:38 -04:00
Sylvestre Ledru
6c04d0d21e
Merge pull request #2155 from nthery/kill_clap
kill: migrate to clap
2021-05-02 18:45:29 +02:00
Michael Debertol
e99f157e6a Merge branch 'master' of https://github.com/uutils/coreutils into sort-no-json-extsort 2021-05-02 18:08:15 +02:00
Sylvestre Ledru
9b7e7bbbc6
Merge pull request #2144 from miDeb/sort-no-transforms
sort: add some custom string comparisons
2021-05-02 18:04:27 +02:00
Michael Debertol
dc5bd9f0be improve memory usage estimation 2021-05-02 17:27:44 +02:00
Sylvestre Ledru
f8ec4a554c
Merge pull request #2161 from tertsdiepraam/ls/sort_order_and_subdirectory_listing
`ls`: C sort order and fix subdirectory listing
2021-05-02 17:21:56 +02:00
Jan Scheer
acd30526a2 tr: fix clippy warning 2021-05-02 13:53:11 +02:00
Jan Scheer
000bd73edc tr: fix merge conflict 2021-05-02 12:39:25 +02:00
Jan Scheer
8739139a7f Merge branch 'master' into issue2147 2021-05-02 12:35:19 +02:00
Nicolas Thery
1dccbfd21e kill: migrate to clap
Fixes #2122.
2021-05-02 12:31:41 +02:00
Jan Scheer
34c22dc3ad tr: fix complement if set2 is range 2021-05-02 12:15:16 +02:00
Sylvestre Ledru
9554710ab5 cat: the function 'unistd::write' doesn't need a mutable reference 2021-05-02 10:31:28 +02:00
Sylvestre Ledru
09178360d8 date: unneeded 'return' statement 2021-05-02 10:30:28 +02:00
Terts Diepraam
eb3206737b ls: give '.' a file_type 2021-05-02 10:20:14 +02:00
bashi8128
47a5dd0f97 basename: move from getopts to clap (#2117)
Use clap for argument parsing instead of getopts
Also, make the following changes

* Use `executable!()` macro to output the name of utility

* Add another usage to help message
2021-05-02 17:08:14 +09:00
Terts Diepraam
361408cbe5 ls: remove case-insensitivity and leading period of name sort 2021-05-02 10:04:11 +02:00
Terts Diepraam
28c7800f73 ls: fix subdirectory name 2021-05-02 10:03:01 +02:00
Sylvestre Ledru
108f9928ef cp: fix 'variable does not need to be mutable' 2021-05-02 09:39:09 +02:00
Sylvestre Ledru
e723b8db43 factor: unneeded statement 2021-05-02 09:35:59 +02:00
Sylvestre Ledru
5e82b195bd ls: remove redundant import 2021-05-02 09:35:00 +02:00
Sylvestre Ledru
2d0f4daf5b
Merge pull request #2152 from deantvv/link-clap
link: replace getopts with clap
2021-05-02 09:33:11 +02:00
Dean Li
f5c7d9bd80 link: replace getopts with clap 2021-05-02 10:40:48 +08:00
Daniel Rocco
3c126bad72 test: implement parenthesized expressions, additional tests
- Replace the parser with a recursive descent implementation that handles
  parentheses and produces a stack of operations in postfix order.

  Parsing now operates directly on OsStrings passed by the uucore framework.

- Replace the dispatch mechanism with a stack machine operating on the
  symbol stack produced by the parser.

- Add tests for parenthesized expressions.

- Begin testing character encoding handling.
2021-05-01 22:40:47 -04:00
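The parser-plus-stack-machine design described in this commit can be sketched in miniature. The grammar below is a deliberately reduced assumption (string literals, `!`, `-a`, `-o`, and parentheses only), it works on `&str` rather than `OsString`, and all type and function names are hypothetical.

```rust
/// Symbols emitted by the parser, in postfix order.
#[derive(Debug, Clone, PartialEq)]
enum Symbol {
    Literal(String),
    Not,
    And,
    Or,
}

struct Parser<'a> {
    tokens: &'a [&'a str],
    pos: usize,
    out: Vec<Symbol>,
}

impl<'a> Parser<'a> {
    fn peek(&self) -> Option<&'a str> {
        self.tokens.get(self.pos).copied()
    }

    fn next(&mut self) -> Option<&'a str> {
        let tok = self.peek();
        if tok.is_some() {
            self.pos += 1;
        }
        tok
    }

    /// expr := term ( "-o" term )*
    fn expr(&mut self) -> Result<(), String> {
        self.term()?;
        while self.peek() == Some("-o") {
            self.pos += 1;
            self.term()?;
            self.out.push(Symbol::Or);
        }
        Ok(())
    }

    /// term := factor ( "-a" factor )*
    fn term(&mut self) -> Result<(), String> {
        self.factor()?;
        while self.peek() == Some("-a") {
            self.pos += 1;
            self.factor()?;
            self.out.push(Symbol::And);
        }
        Ok(())
    }

    /// factor := "!" factor | "(" expr ")" | literal
    fn factor(&mut self) -> Result<(), String> {
        match self.next() {
            Some("!") => {
                self.factor()?;
                self.out.push(Symbol::Not);
                Ok(())
            }
            Some("(") => {
                self.expr()?;
                match self.next() {
                    Some(")") => Ok(()),
                    _ => Err("expected ')'".to_string()),
                }
            }
            Some(lit) => {
                self.out.push(Symbol::Literal(lit.to_string()));
                Ok(())
            }
            None => Err("missing argument".to_string()),
        }
    }
}

/// Parses the tokens into a postfix symbol list.
fn parse(tokens: &[&str]) -> Result<Vec<Symbol>, String> {
    let mut parser = Parser { tokens, pos: 0, out: Vec::new() };
    parser.expr()?;
    if parser.pos != tokens.len() {
        return Err("extra argument".to_string());
    }
    Ok(parser.out)
}

/// Stack machine: a literal is true when non-empty; operators pop their
/// operands and push the result. Returns `None` on a malformed list.
fn eval(symbols: &[Symbol]) -> Option<bool> {
    let mut stack: Vec<bool> = Vec::new();
    for sym in symbols {
        let value = match sym {
            Symbol::Literal(text) => !text.is_empty(),
            Symbol::Not => !stack.pop()?,
            Symbol::And => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                a && b
            }
            Symbol::Or => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                a || b
            }
        };
        stack.push(value);
    }
    if stack.len() == 1 { stack.pop() } else { None }
}
```

Separating parsing from evaluation means the parentheses disappear at parse time: the evaluator only ever sees a flat postfix list, so it needs no recursion and no knowledge of precedence.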
Sylvestre Ledru
7e07438b38
Merge pull request #2151 from jfinkels/2141-translate-and-squeeze
tr: implement translate and squeeze (-s) mode
2021-05-01 23:27:43 +02:00
Michael Debertol
484558e37d
Update src/uu/sort/BENCHMARKING.md
Co-authored-by: Sylvestre Ledru <sledru@mozilla.com>
2021-05-01 21:38:36 +02:00