Commit graph

2208 commits

Author SHA1 Message Date
electricboogie
094d9a9e47 Fix bug in human_numeric convert 2021-04-25 12:27:11 -05:00
electricboogie
4c395146dd Merge branch 'master' of https://github.com/uutils/coreutils 2021-04-25 10:11:27 -05:00
electricboogie
26fc8e57c7 Fix NumCache and Serde JSON conflict by disabling NumCache during extsort general numeric compares 2021-04-25 10:03:29 -05:00
Sylvestre Ledru
e667cc2641
Merge pull request #2115 from tertsdiepraam/ls/reduce_write_calls
`ls`: reduce write syscalls & cleanup
2021-04-25 11:52:51 +02:00
Sylvestre Ledru
c19e191360
Merge pull request #2113 from siebenHeaven/ls-optimize-sort
ls: Use sort_by_cached_key
2021-04-25 11:13:23 +02:00
Terts Diepraam
fc6c7a279e ls: clean up imports 2021-04-25 10:46:51 +02:00
Anup Mahindre
7e06316ece ls: Use sort_by_cached_key 2021-04-25 13:37:07 +05:30
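A minimal sketch of what `sort_by_cached_key` offers over `sort_by_key`, assuming the sort key is costly to compute. The lowercase key and the `sort_names` helper are hypothetical; the commit does not say which key `ls` actually uses.

```rust
/// Sorts names case-insensitively.
/// `sort_names` and the lowercase key are illustrative assumptions.
fn sort_names(names: &mut [String]) {
    // sort_by_cached_key calls the key function once per element and
    // reuses the result, whereas sort_by_key may recompute the key on
    // every comparison. That matters when the key allocates, as here.
    names.sort_by_cached_key(|name| name.to_lowercase());
}
```

The cached variant allocates a temporary key buffer, so it tends to pay off only when computing the key is noticeably more expensive than comparing it.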
Sylvestre Ledru
441763b73d
Merge pull request #2059 from cbjadwani/master
uniq: avoid building list of duplicate lines
2021-04-25 09:48:48 +02:00
Sylvestre Ledru
d3775ea0e8
Merge pull request #2110 from nthery/cp_reflink_macos
cp: add --reflink support to macos, fixes #1773
2021-04-25 09:28:14 +02:00
electricboogie
2b8a6e98ee Working ExtSort 2021-04-25 00:20:56 -05:00
Terts Diepraam
e995eea579 ls: general cleanup 2021-04-25 00:23:14 +02:00
Terts Diepraam
ce04f8a759 ls: use bufwriter to write stdout 2021-04-24 23:46:19 +02:00
Nicolas Thery
4bf33e98a8 cp: add --reflink support for macOS
Fixes #1773
2021-04-24 19:26:15 +02:00
Nicolas Thery
b8e23c20c2 cp: extract linux COW logic into function 2021-04-24 19:22:12 +02:00
Chirag Jadwani
2c1459cbfc cut: optimizations
* Use buffered stdout to reduce write sys calls.

This simple change yielded the biggest performance gain.

* Use `for_byte_record_with_terminator` from the `bstr` crate.

This is to minimize the per line copying needed by
`BufReader::read_until`. The `cut_fields` and `cut_fields_delimiter`
functions used `read_until` to iterate over lines. That required copying
each input line to the line buffer. With
`for_byte_record_with_terminator` copying is minimized as it calls our
closure with a reference to BufReader's buffer most of the time.  It
needs to copy (internally) only to process any incomplete lines at the
end of the buffer.

* Re-write `Searcher` to use `memchr`.

Switch from the naive implementation to one that uses `memchr`.

* Rewrite `cut_bytes` almost entirely.

This was already well optimized. The performance gain in this case is
not from avoiding copying. In fact, it needed zero copying whereas the new
implementation introduces some copying similar to `cut_fields` described
above. But the occasional copying cost is more than offset by the use
of the very fast `memchr` inside `for_byte_record_with_terminator`.
This change also simplifies the code significantly. Removed the `buffer`
module.
2021-04-24 22:29:48 +05:30
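The first bullet of the commit above (buffered stdout) can be sketched with the standard library alone. This is an illustration of the general technique, not the code from `cut`; `write_lines` is a hypothetical helper.

```rust
use std::io::{self, BufWriter, Write};

/// Writes each line plus a newline to `out`.
/// `write_lines` is a hypothetical helper, generic over `Write` so it
/// works with stdout as well as an in-memory buffer.
fn write_lines<W: Write>(out: W, lines: &[&str]) -> io::Result<()> {
    // BufWriter gathers many small writes in memory and forwards them
    // to the underlying writer in larger chunks, which means fewer
    // write syscalls when `out` is stdout.
    let mut out = BufWriter::new(out);
    for line in lines {
        out.write_all(line.as_bytes())?;
        out.write_all(b"\n")?;
    }
    // Flush explicitly so a failed final write is reported; errors
    // raised while dropping a BufWriter are silently ignored.
    out.flush()
}
```

For real output one would pass `io::stdout().lock()`, which also avoids re-acquiring the stdout lock on every write.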
Sylvestre Ledru
2f17bfc14c
Merge pull request #2106 from miDeb/sort-debug
sort: implement --debug
2021-04-24 18:46:58 +02:00
Sylvestre Ledru
c9b0378ca3
Merge pull request #2104 from tertsdiepraam/ls/skip_metadata
`ls`: skip reading metadata
2021-04-24 18:13:53 +02:00
Sylvestre Ledru
d7e8a03237
Merge pull request #2097 from miDeb/sort-disable-dictionary-mode
sort: disallow certain flags with -d and -i
2021-04-24 14:58:32 +02:00
Sylvestre Ledru
b41951614b
Merge branch 'master' into sort-disable-dictionary-mode 2021-04-24 13:56:39 +02:00
Terts Diepraam
1328d18878 ls: remove outdated comment 2021-04-24 13:19:50 +02:00
Michael Debertol
5dcfb51110 flip default for debug to the effective default 2021-04-24 10:52:40 +02:00
Terts Diepraam
728f0bd61d ls: remove redundant parentheses 2021-04-24 10:47:36 +02:00
Terts Diepraam
ce8c58b93e Merge branch 'master' into ls/skip_metadata 2021-04-24 10:45:43 +02:00
Sylvestre Ledru
8ccc6ade61
Merge branch 'master' into split-wsl-detection 2021-04-24 10:24:13 +02:00
Sylvestre Ledru
9517395839
Merge pull request #2088 from nthery/cp_reflink_never
cp: add support for --reflink=never
2021-04-24 10:07:41 +02:00
Sylvestre Ledru
fb6394554e
Merge pull request #2096 from tertsdiepraam/ls/fix_backslash_escape
ls: improve code cov
2021-04-24 10:05:32 +02:00
Sylvestre Ledru
513ff4e45f
Merge branch 'master' into sort-disable-dictionary-mode 2021-04-24 10:04:23 +02:00
Sylvestre Ledru
b96f7dbaea
Merge pull request #2087 from pedrohjordao/printf-clap-opts
Changes parameter parsing to clap
2021-04-24 10:02:05 +02:00
Sylvestre Ledru
372d08c341
Merge pull request #2098 from miDeb/sort-trailing-separator
sort: fix tokenization for trailing separators
2021-04-24 10:00:20 +02:00
Sylvestre Ledru
a9fa4adddf
Merge pull request #2102 from jaggededgedjustice/fix-tail-sleep-interval
tail --sleep-interval takes a value
2021-04-24 09:59:03 +02:00
Sylvestre Ledru
b10837f180
Merge pull request #2103 from jhscheer/refactor_tests
refactor tests (#1982)
2021-04-24 09:58:20 +02:00
Sylvestre Ledru
46b95fb8bd
Merge pull request #2099 from tertsdiepraam/ls/cross_platform_colors
ls: cross-platform colors
2021-04-24 09:56:46 +02:00
Michael Debertol
e6f6b109a5 sort: implement --debug
This adds a --debug flag, which, when activated, will draw lines below
the characters that are actually used for comparisons.

This is not a complete implementation of --debug. It should, quoting the man page
for GNU sort: "annotate the part of the line used to sort, and warn
about questionable usage to stderr". Warning about "questionable usage"
is not part of this patch.

This change required some adjustments to be able to get the range that
is actually used for comparisons. Most notably, general numeric comparisons
were rewritten, fixing some bugs along the way.

Testing is mostly done by adding fixtures for the expected debug output of
existing tests.
2021-04-23 22:36:15 +02:00
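A rough sketch of the idea behind the commit above: drawing a line below the characters used for a comparison. The `annotate` function and its exact output layout are assumptions for illustration; the real `--debug` output format may differ.

```rust
use std::ops::Range;

/// Returns `line` followed by a second line that underlines the byte
/// range `used`. Hypothetical helper; `used` must lie on character
/// boundaries, otherwise the slicing below panics.
fn annotate(line: &str, used: Range<usize>) -> String {
    // Pad by the number of characters before the range, not bytes,
    // so multi-byte characters do not shift the marker.
    let pad = " ".repeat(line[..used.start].chars().count());
    let marks = "_".repeat(line[used.clone()].chars().count());
    format!("{}\n{}{}", line, pad, marks)
}
```

Calling `annotate("abc 42 x", 4..6)` puts two underscores beneath `42`. Character counts only approximate display width, so wide characters would still misalign.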
James Robson
b68ecf1269 Allow space in truncate --size 2021-04-23 16:36:46 +01:00
Terts Diepraam
eccb86c9ed ls: fix -a test 2021-04-23 08:26:20 +02:00
Jan Scheer
646c6cacbc refactor tests (#1982) 2021-04-23 02:28:46 +02:00
Terts Diepraam
3874a24457 ls: add once_cell to Cargo.toml 2021-04-23 00:35:45 +02:00
Terts Diepraam
a114f855f0 ls: revert to_ascii_lowercase 2021-04-22 23:43:00 +02:00
Terts Diepraam
e241f3ad69 ls: skip reading metadata 2021-04-22 22:45:24 +02:00
James Robson
3678777539 tail --sleep-interval takes a value 2021-04-22 16:10:08 +01:00
Terts Diepraam
ea10647a62 Merge remote-tracking branch 'upstream/master' into ls/fix_backslash_escape 2021-04-22 14:23:35 +02:00
Terts Diepraam
b9f4964a96 ls: bring up to date with recent changes 2021-04-22 11:39:08 +02:00
Terts Diepraam
cd1514bd57 Merge branch 'master' into ls/cross_platform_colors 2021-04-22 11:30:26 +02:00
Terts Diepraam
4e4c3aba00 ls: don't color symlink target 2021-04-22 11:16:33 +02:00
Anup Mahindre
8554cdf35b
Optimize recursive ls (#2083)
* ls: Remove allocations by eliminating collect/clones

* ls: Introduce PathData structure

- PathData will hold Path related metadata / strings that are required
frequently in subsequent functions
- All data is precomputed and cached and subsequent functions just
use cached data

* ls: Cache more data related to paths

- Cache filename and sort by filename instead of full path
- Cache uid->usr and gid->grp mappings
https://github.com/uutils/coreutils/pull/2099/files
* ls: Add BENCHMARKING.md

* ls: Document PathData structure

* tests/ls: Add testcase for error paths with width option

* ls: Fix unused import warning

cached will be only used for unix currently as current use of
caching gid/uid mappings is only relevant on unix

* ls: Suggest checking syscall count in BENCHMARKING.md

* ls: Remove mentions of sort in BENCHMARKING.md

* ls: Remove dependency on cached

Implement caching using HashMap and lazy_static

* ls: Fix MSRV error related to map_or

Rust 1.40 did not support map_or for result types
2021-04-22 09:19:17 +02:00
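The uid->usr and gid->grp caching mentioned in the commit above can be sketched with a plain `HashMap`. The commit says the real code uses `HashMap` together with `lazy_static`; this version keeps the cache in a struct instead so it needs only the standard library. `NameCache` and the injected `lookup` function are hypothetical.

```rust
use std::collections::HashMap;

/// Memoizes an id -> name lookup. `NameCache` is an illustrative
/// assumption; `lookup` stands in for the costly system query.
struct NameCache<F: Fn(u32) -> String> {
    lookup: F,
    cache: HashMap<u32, String>,
}

impl<F: Fn(u32) -> String> NameCache<F> {
    fn new(lookup: F) -> Self {
        NameCache { lookup, cache: HashMap::new() }
    }

    /// Returns the name for `id`, calling `lookup` only on a cache miss.
    fn get(&mut self, id: u32) -> &str {
        let lookup = &self.lookup;
        self.cache.entry(id).or_insert_with(|| lookup(id)).as_str()
    }
}
```

In a recursive listing most files share a handful of owners, so nearly all lookups after the first few would be served from the map.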
Terts Diepraam
1d7e206d72 ls: fix mac build 2021-04-21 20:04:52 +02:00
Michael Debertol
8a05148d7b sort: fix tokenization for trailing separators
Trailing separators were included at the end of the last token, but they
should not be.

This changes tokenize_with_separator as suggested by @cbjadwani.
2021-04-21 19:07:03 +02:00
Terts Diepraam
3fc8d2e422 ls: make compatible with Rust 1.40 again 2021-04-21 18:05:10 +02:00
Terts Diepraam
ff39538375 ls: further refactor --color and classification 2021-04-21 18:00:43 +02:00
Michael Debertol
8b906b9547 remove feature use stabilized in 1.51 2021-04-21 18:00:01 +02:00
Michael Debertol
4a305b32c6 sort: disallow certain flags with -d and -i
GNU sort disallows these combinations, presumably because they are
likely not what the user really wants.

Ignoring characters would cause things to be put together that aren't
together in the input. For example, -dn would cause "0.12" or "0,12" to
be parsed as "12" which is highly unexpected and confusing.
2021-04-21 17:49:40 +02:00
Terts Diepraam
34a824af71 ls: use lscolors crate 2021-04-21 17:35:02 +02:00
Terts Diepraam
29b5b6b276 ls: fix unit tests to match last change 2021-04-21 13:03:31 +02:00
Terts Diepraam
f34c992932 ls: always quote backslash in shell style 2021-04-21 12:45:21 +02:00
Árni Dagur
387227087f
cat: Put splice code in separate file, handle more failures (#2067)
* cat: Refactor splice code, handle more failures

* cat: Add tests for stdout redirected to files
2021-04-21 12:21:31 +02:00
Terts Diepraam
fd54614130 Merge branch 'master' into ls/fix_backslash_escape 2021-04-21 12:06:54 +02:00
Terts Diepraam
f84f23ddfe tests/ls: add coverage for special shell character after escaped char 2021-04-21 11:22:10 +02:00
Terts Diepraam
795d89f11d ls: don't escape backslash in shell style quoting 2021-04-21 11:08:40 +02:00
electricboogie
25021f31eb Incorporate overhead of Line struct 2021-04-19 21:24:52 -05:00
Sivachandran
0ea35f3fbc
Implement install create leading components(-D) option (#2092)
* Implement install's create leading components(-D) option

* Format changes

* Add install test to check fail on long dir name
2021-04-19 22:03:13 +02:00
electricboogie
b8d667c383 Clippy lints, more work on ext_sorter leads to 2 failing tests 2021-04-19 10:57:53 -05:00
Pedro Jordão
158ae35da5 Commented out code removal 2021-04-19 14:21:49 +01:00
Chirag Jadwani
3bb99e7047 uniq: avoid building list of duplicate lines
This reduces memory usage by only storing two lines of the input file at
a time. The current implementation first builds a list of all duplicate
lines ('group') and then decides which lines of the group should be
printed.
2021-04-19 17:02:59 +05:30
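A minimal sketch of the streaming approach described in the commit above, assuming only the default behaviour (print the first line of each run of equal lines). The `uniq` function is hypothetical and ignores every command-line option.

```rust
use std::io::{self, BufRead, Write};

/// Copies `input` to `out`, dropping lines equal to the line before.
/// Only the previous and the current line are held in memory.
fn uniq<R: BufRead, W: Write>(input: R, mut out: W) -> io::Result<()> {
    let mut prev: Option<String> = None;
    for line in input.lines() {
        let line = line?;
        // A line is printed when it starts a new run of equal lines.
        if prev.as_deref() != Some(line.as_str()) {
            writeln!(out, "{}", line)?;
            prev = Some(line);
        }
    }
    Ok(())
}
```

Memory use is bounded by the longest line rather than by the size of the largest group of duplicates.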
Jan Scheer
049f21a199
du: fix tests on linux (#2066) (#2090) 2021-04-19 10:45:51 +02:00
electricboogie
e7bcd59558 Remove a clone 2021-04-18 18:22:30 -05:00
electricboogie
fcebdbb7a7 Cleanup comment 2021-04-18 17:51:44 -05:00
electricboogie
5efd67b5e2 License cleanup 2021-04-18 17:44:45 -05:00
electricboogie
72858dda42 Ran rustfmt 2021-04-18 17:40:59 -05:00
electricboogie
258325491f Make human_numeric_convert a method 2021-04-18 17:39:42 -05:00
electricboogie
8072e2092a Cleanup loop, run rustfmt 2021-04-18 16:33:18 -05:00
electricboogie
deb94cef7a Cleanup 2021-04-18 15:52:48 -05:00
electricboogie
559f4e81f6 More license cleanup 2021-04-18 15:47:05 -05:00
electricboogie
fb19522ca0 Bring back non-external sort as default 2021-04-18 15:39:20 -05:00
electricboogie
e841bb6a24 More license cleanup 2021-04-18 15:20:16 -05:00
electricboogie
9170e7a511 Modify NOTICE 2021-04-18 15:15:12 -05:00
electricboogie
298e269531 Remove unused code 2021-04-18 15:08:42 -05:00
electricboogie
0151f30c4e Change directory structure 2021-04-18 15:04:25 -05:00
electricboogie
e3e1ee30eb Add additional notices 2021-04-18 14:37:16 -05:00
electricboogie
0275a43c5b Make modifications clearer per Apache license 2021-04-18 14:05:27 -05:00
electricboogie
42da444f40 Remove unused deps 2021-04-18 13:49:11 -05:00
electricboogie
5bb66b26dd Merge branch 'master' of https://github.com/uutils/coreutils 2021-04-18 13:45:33 -05:00
electricboogie
dad7761be9 Add test 2021-04-18 13:43:41 -05:00
electricboogie
da94e35044 Cleanup, removed unused code, add copyright 2021-04-18 13:02:50 -05:00
electricboogie
d7b7ce52bc Vendored ext_sorter, removed unstable, created a byte buffer sized vector instead of a numbered capacity vector 2021-04-18 11:54:18 -05:00
Nicolas Thery
f36832c392 cp: add support for --reflink=never
- Passing `never` to `--reflink` does not raise an error anymore.
- Remove `Options::reflink` flag as it was redundant with
  `reflink_mode`.
- Add basic tests for this option.  Does not check that a copy-on-write
  rather than a regular copy was made.
2021-04-18 18:51:59 +02:00
Sylvestre Ledru
d3f71810df
Merge pull request #2063 from jhscheer/iss2060
chown: fix #2060
2021-04-18 09:50:23 +02:00
electricboogie
a73d108dd8 Merge branch 'master' of https://github.com/uutils/coreutils 2021-04-17 23:25:02 -05:00
electricboogie
4c8d62c2be More cleanup 2021-04-17 23:24:32 -05:00
electricboogie
3a1e92fdd2 More cleanup 2021-04-17 22:39:05 -05:00
electricboogie
7a8767e359 Cleanup 2021-04-17 22:34:03 -05:00
electricboogie
65e9c7b1b5 Sorta working ExtSort - concat struct elements 2021-04-17 21:30:03 -05:00
Jan Scheer
df2dcc5b99 chown: fix parse_spec() for colon (#2060) 2021-04-18 00:11:59 +02:00
Michael Debertol
519b9d34a6
sort: use unstable sort when possible (#2076)
* sort: use unstable sort when possible

This results in a very minor performance (speed) improvement.
It does however result in a memory usage reduction, because unstable
sort does not allocate auxiliary memory. There's also an improvement in
overall CPU usage.

* add benchmarking instructions

* add user time

* fix typo
2021-04-17 22:40:13 +02:00
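A small sketch of choosing between the two standard-library sorts. The `stable` flag and `sort_lines` helper are assumptions; the commit does not spell out when an unstable sort counts as "possible".

```rust
/// Sorts `lines`, using the in-place unstable sort unless the caller
/// needs equal elements to keep their input order.
/// `sort_lines` and `stable` are hypothetical names.
fn sort_lines(lines: &mut [&str], stable: bool) {
    if stable {
        // Stable merge sort: preserves the order of equal elements but
        // allocates auxiliary memory.
        lines.sort();
    } else {
        // Unstable sort: in place, no allocation; equal elements may
        // be reordered.
        lines.sort_unstable();
    }
}
```

When the comparison covers the whole line, equal elements are indistinguishable, so reordering them cannot change the output.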
Pedro Jordão
01fef70143 Changes parameter parsing to clap
- Uses clap to parse parameters
- Removes "allow" directives where they are not necessary
- Removes unused variables
2021-04-17 20:42:05 +01:00
electricboogie
acfe0681d4 Merge branch 'master' of https://github.com/uutils/coreutils 2021-04-17 11:54:45 -05:00
Michael Debertol
4bbbe3a3f2
sort: implement numeric string comparison (#2070)
* sort: implement numeric string comparison

This implements -n and -h using a string comparison algorithm instead
of parsing each number to a f64 and comparing those.

This should result in a moderate performance increase and eliminate loss
of precision.

* cache parsed f64 numbers

For general numeric comparisons we have to parse numbers as f64,
as this behavior is explicitly documented by GNU coreutils.
We can however cache the parsed value to speed up comparisons.

* fix leading zeroes for negative numbers

* use more appropriate name for exponent

* improvements to the parse function

* move checks into main loop and fix thousands separator condition

* remove unneeded checks

* rustfmt
2021-04-17 13:49:35 +02:00
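The idea of comparing numbers as strings rather than parsing them to `f64` can be sketched for a deliberately narrow case: non-negative integers written with ASCII digits only. Signs, decimal points, thousands separators and the unit suffixes of `-h` are all left out; `cmp_unsigned_decimal` is a hypothetical name, not the function from the commit.

```rust
use std::cmp::Ordering;

/// Compares two non-negative decimal integers given as ASCII digit
/// strings, without converting them to a numeric type.
/// Simplified illustration; inputs are assumed to contain only digits.
fn cmp_unsigned_decimal(a: &str, b: &str) -> Ordering {
    // Leading zeroes carry no value, so drop them first.
    let a = a.trim_start_matches('0');
    let b = b.trim_start_matches('0');
    // More remaining digits means a larger number; for equal lengths,
    // byte-wise comparison of ASCII digits matches numeric order.
    a.len().cmp(&b.len()).then_with(|| a.cmp(b))
}
```

Because no value is ever converted, `9007199254740993` and `9007199254740992` compare correctly even though both round to the same `f64`.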
Sylvestre Ledru
481d1ee659
Merge pull request #2077 from tertsdiepraam/ls/dereference-command-line
ls: dereference command line
2021-04-17 13:31:52 +02:00
Sylvestre Ledru
c5b43c0994 rustfmt the recent change 2021-04-17 13:21:30 +02:00
Sylvestre Ledru
eec389fa94
Merge branch 'master' into ls/dereference-command-line 2021-04-17 10:30:42 +02:00
Andrew Rowson
d0c7e8c09e
du error output should match GNU (#1776)
* du error output should match GNU

* Created a new error macro which allows the customization of the
  "error:" string part
* Match the du output based on the type of error encountered. Can extend
  to handling other errors I guess.

* Rustfmt updates

* Added non-windows test for du no permission output
2021-04-17 10:26:52 +02:00