Commit graph

72 commits

Author SHA1 Message Date
zhitkoff
4fd598e4d5 split: tests 2023-09-06 13:20:58 -04:00
zhitkoff
e378454a26 split: formatting 2023-09-06 13:15:35 -04:00
zhitkoff
d8a16a2351 split: tests 2023-09-06 12:42:49 -04:00
zhitkoff
a0a9ee6491 split: fixing tests for parse_size_max() 2023-09-05 18:42:16 -04:00
Yury Zhytkou
e0b000a3bc
Merge branch 'main' into split-gnu-test-fail.sh 2023-09-05 17:22:31 -04:00
Daniel Hofstetter
8920ac0123 split: fix clippy warning in test 2023-09-04 07:26:23 +02:00
zhitkoff
e597189be7 split: fixed windows test for invalid unicode args 2023-08-31 20:48:44 -04:00
zhitkoff
d2812cbbc3 split: disable windows test for invalid UTF8 2023-08-31 16:04:44 -04:00
zhitkoff
5bfe9b19ef split: avoid using collect_lossy + test for invalid UTF8 arguments 2023-08-31 14:46:56 -04:00
zhitkoff
6f37b4b4cf split: hyphenated values + tests 2023-08-30 19:29:57 -04:00
zhitkoff
7f905a3b8d split: edge case for obs lines within combined shorts + test 2023-08-29 16:35:00 -04:00
zhitkoff
15c7170d20 split: fix for GNU Tests regression + tests 2023-08-29 15:49:47 -04:00
zhitkoff
fa8d18b826 split: refactor obsolete lines 2023-08-28 18:24:21 -04:00
Yury Zhytkou
1eae064e5c
split: better handle numeric and hex suffixes, short and long, with and without values (#5198)
* split: better handle numeric and hex suffixes, short and long, with and without values
Fixes #5171

* refactoring with overrides_with_all() in args definitions

* fixed comments

* updated help on suffixes to match GNU

* comments

* refactor to remove value_parser()

* split: refactor suffix processing + updated tests

* split: minor formatting
2023-08-28 10:09:52 +02:00
zhitkoff
84d96f9d02 split: refactor for more common use case 2023-08-26 11:11:46 -04:00
Terts Diepraam
c3f9e19a3b all: normalize license notice in all *.rs files 2023-08-24 12:21:09 +02:00
John Shin
631b9892f6 split: loop over chars and remove char_from_digit function 2023-08-11 18:36:08 -07:00
Sylvestre Ledru
d033db3573 split: reject some invalid values
Matches what is done in tests/split/fail.sh
(still doesn't work)
2023-07-03 22:57:37 +02:00
Daniel Hofstetter
7a888da409 tests: remove all "extern crate" statements 2023-04-10 08:31:31 +02:00
Daniel Hofstetter
6988eb7ec6 tests: expand wildcard imports 2023-03-20 15:32:35 +01:00
Daniel Hofstetter
f6b646e4e5 clippy: fix warnings introduced with Rust 1.67.0 2023-01-27 17:37:56 +01:00
Joining7943
1fadeb43b2 tests/util: Do not trim stderr in CmdResult::stderr_is. Add method stderr_trimmed_is.
Fix tests assert whitespace instead of trimming it. Disable some tests in `test_tr` because `tr`
produces too many newlines.
2023-01-22 14:56:19 +01:00
Niyaz Nigmatullin
fdd6a05259 chore: run cargo +nightly clippy --fix 2022-11-16 11:09:44 +02:00
Jeffrey Finkelstein
7dc96697c9 split: implement round-robin arg to --number
Implement distributing lines of a file in a round-robin manner to a
specified number of chunks. For example,

    $ (seq 1 10 | split -n r/3) && head -v xa[abc]
    ==> xaa <==
    1
    4
    7
    10

    ==> xab <==
    2
    5
    8

    ==> xac <==
    3
    6
    9
2022-10-22 23:15:55 -04:00
Sylvestre Ledru
969f821830
Merge pull request #4009 from andrewbaptist/fix_eof_linebytes
Match GNU semantics for missing EOF
2022-10-17 16:23:20 +02:00
Sylvestre Ledru
6e14dea73b Fix some clippy warnings
Fixed with `cargo clippy --features unix  --fix`
and manually
2022-10-13 09:07:22 +02:00
Andrew Baptist
4922d34177 Match GNU semantics for missing EOF
While the rust coreutils semantics were arguably more correct,
they were different than the gnu split semantics when handling a
file without a trailing EOF. This patch addresses that difference
and allows passing one more GNU test suite.
2022-10-07 17:50:26 -04:00
Andrew Baptist
49e1cc6c71 Add support for starting suffix numbers
This commit now allows split to pass split/numeric.sh
2022-10-05 09:52:20 -04:00
Terts Diepraam
9177cb7b24 all: add tests for usage error exit code 2022-09-10 20:59:42 +02:00
Jeffrey Finkelstein
8458bf1387 Clippy fixes in multiple crates 2022-08-23 18:30:43 -04:00
Owen Anderson
9fad6fde35
Fix a bug in split where chunking would be skipped when the chunk size (#3800)
* Fix a bug in split where chunking would be skipped when the chunk size
happened to be an exact divisor of the buffer size used to read the
input stream.

The issue here was that file was being split byte-wise in chunks of 1G.
The input stream was being read in chunks of 8KB, which evenly divides
the chunk size. Because the check to allocate the next output chunk was
done at the bottom of the loop previously, it would never occur because
the current input chunk was fully consumed at that point. By moving the
check to the top of the loop (but still late enough that we know we have
bytes to write) we resolve this issue.

This scenario is unfortunately hard to write a test for, since we don't
explicitly control the input chunk size.

Fixes https://github.com/uutils/coreutils/issues/3790
2022-08-16 11:02:52 +02:00
Andrew Baptist
f2cfc15a70 split: Don't overwrite files
Check that a file exists by calling create_new and changing the
interface of instantiate_current_writer to return a Result rather
than calling unwrap.
2022-07-21 12:06:13 -04:00
Jeffrey Finkelstein
b79ff6b4fd split: avoid writing final empty chunk with -C
Fix a bug in which a final empty file was written when using `split
--line-bytes` mode.
2022-03-20 09:30:58 -04:00
Jeffrey Finkelstein
95f58fbf3c split: handle no final newline with --line-bytes
Fix a panic due to out-of-bounds indexing when using `split
--line-bytes` with an input that had no trailing newline.
2022-03-19 23:50:02 -04:00
Jeffrey Finkelstein
0a226524a6 split: elide all chunks when input file is empty
Fix a bug in the behavior of `split -e -n NUM` when the input file is
empty. Previously, it would panic due to overflow when subtracting 1
from 0. After this change, it will terminate successfully and produce
no output chunks.
2022-03-19 14:32:28 -04:00
Sylvestre Ledru
9796e01df6
Revert "split: implement round-robin arg to --number" 2022-03-18 14:45:29 +01:00
Jeffrey Finkelstein
18bfd1ac68 split: implement round-robin arg to --number
Implement distributing lines of a file in a round-robin manner to a
specified number of chunks. For example,

    $ (seq 1 10 | split -n r/3) && head -v xa[abc]
    ==> xaa <==
    1
    4
    7
    10

    ==> xab <==
    2
    5
    8

    ==> xac <==
    3
    6
    9
2022-03-15 18:22:44 -04:00
Jeffrey Finkelstein
77d92883c7 split: implement --line-bytes option
Implement the `--line-bytes` option to `split`. In this mode, the
program tries to write as many lines of the input as possible to each
chunk of output without exceeding a specified byte limit. The new
`LineBytesChunkWriter` struct represents this functionality.
2022-03-10 22:51:49 -05:00
Sylvestre Ledru
f3bd1f3020 Add onehundredlines in the spell ignore 2022-03-05 10:27:51 +01:00
Jeffrey Finkelstein
ee36dea1a9 split: implement outputting kth chunk of file
Implement `-n l/k/N` option, where the `k`th chunk of the input file
is written to stdout. For example,

    $ seq -w 0 99 > f; split -n l/3/10 f
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
2022-03-05 10:27:51 +01:00
Sylvestre Ledru
346cfa060b
Merge pull request #2980 from jfinkels/split-lines-2
split: add support for "-n l/NUM" option to split
2022-03-01 10:13:44 +01:00
Jeffrey Finkelstein
dbbee573ab split: add support for "-n l/NUM" option to split
Add support for `split -n l/NUM`. Previously, `split` only supported
`-n NUM`, which splits a file into `NUM` chunks by byte. The `-n
l/NUM` strategy splits a file into `NUM` chunks without splitting
lines across chunks.
2022-02-22 18:44:08 -05:00
Omer Tuchfeld
0ce22f3a08 Improve coverage / error messages from parse_size PR
https://github.com/uutils/coreutils/pull/3084 (2a333ab391) had some
missing coverage and was merged before I had a chance to fix it.

This PR adds some coverage / improved error messages that were missing
from that previous PR.
2022-02-22 22:09:45 +01:00
Omer Tuchfeld
fa60898354 Adjust 32-bit tests for tail,split,truncate,head 2022-02-22 13:49:20 +01:00
Jeffrey Finkelstein
6718d97f97 split: add support for -e argument
Add the `-e` flag, which indicates whether to elide (that is, remove)
empty files that would have been created by the `-n` option.

The `-n` command-line argument gives a specific number of chunks into
which the input files will be split. If the number of chunks is
greater than the number of bytes, then empty files will be created for
the excess chunks. But if `-e` is given, then empty files will not be
created.

For example, contrast

    $ printf 'a\n' > f && split -e -n 3 f && cat xaa xab xac
    a
    cat: xac: No such file or directory

with

    $ printf 'a\n' > f && split -n 3 f && cat xaa xab xac
    a
2022-02-17 19:03:51 -05:00
Terts Diepraam
e1a611374a
Merge pull request #2981 from jfinkels/split-hex-numbers
split: add support for -x option (hex suffixes)
2022-02-17 23:20:58 +01:00
DevSabb
63fa3c81ed fix failure in test_split 2022-02-14 20:41:58 -05:00
DevSabb
6d6371741a
include io-blksize parameter (#3064)
* include io-blksize parameter

* format changes for including io-blksize

Co-authored-by: DevSabb <devsabb@local>
Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
2022-02-14 19:47:18 +01:00
Jeffrey Finkelstein
a4955b4e06 split: add support for -x option (hex suffixes)
Add support for the `-x` command-line option to `split`. This option
causes `split` to produce filenames with hexadecimal suffixes instead
of the default alphabetic suffixes.
2022-02-13 11:18:37 -05:00
Sylvestre Ledru
6b6d5ee7db
Merge pull request #2827 from jfinkels/split-std-io-copy
split: use std::io::copy() with new writer implementation to improve maintainability and speed
2022-02-12 11:33:12 +01:00