Commit graph

150 commits

Author SHA1 Message Date
Jeffrey Finkelstein
7dc96697c9 split: implement round-robin arg to --number
Implement distributing lines of a file in a round-robin manner to a
specified number of chunks. For example,

    $ (seq 1 10 | split -n r/3) && head -v xa[abc]
    ==> xaa <==
    1
    4
    7
    10

    ==> xab <==
    2
    5
    8

    ==> xac <==
    3
    6
    9
2022-10-22 23:15:55 -04:00
Niyaz Nigmatullin
505865cce8 split: fix ---io for clap 4 2022-10-18 00:21:27 +03:00
Sylvestre Ledru
969f821830
Merge pull request #4009 from andrewbaptist/fix_eof_linebytes
Match GNU semantics for missing EOF
2022-10-17 16:23:20 +02:00
Terts Diepraam
261b18d6f3 split: update to clap 4 2022-10-13 17:50:40 +02:00
Sylvestre Ledru
6e14dea73b Fix some clippy warnings
Fixed with `cargo clippy --features unix  --fix`
and manually
2022-10-13 09:07:22 +02:00
Terts Diepraam
f15c4f2d3e Version 0.0.16 2022-10-11 23:03:39 +02:00
Andrew Baptist
4922d34177 Match GNU semantics for missing EOF
While the rust coreutils semantics were arguably more correct,
they were different than the gnu split semantics when handling a
file without a trailing EOF. This patch addresses that difference
and allows passing one more GNU test suite.
2022-10-07 17:50:26 -04:00
Andrew Baptist
49e1cc6c71 Add support for starting suffix numbers
This commit now allows split to pass split/numeric.sh
2022-10-05 09:52:20 -04:00
Daniel Hofstetter
9e8daf92dd Replace deprecated value_of() with get_one() 2022-09-26 16:42:42 +02:00
Terts Diepraam
975a1d170d change remaining usage codes of 2 to 1 for GNU compat 2022-09-10 20:24:24 +02:00
Daniel Hofstetter
c6e313372e Replace deprecated occurrences_of() 2022-08-23 16:31:32 +02:00
Terts Diepraam
105b0db251 rmdir, split: add fs feature to uucore dep 2022-08-20 15:13:18 +02:00
Terts Diepraam
15180249fc Version 0.0.15 2022-08-20 13:13:22 +02:00
Owen Anderson
9fad6fde35
Fix a bug in split where chunking would be skipped when the chunk size (#3800)
* Fix a bug in split where chunking would be skipped when the chunk size
happened to be an exact divisor of the buffer size used to read the
input stream.

The issue here was that file was being split byte-wise in chunks of 1G.
The input stream was being read in chunks of 8KB, which evenly divides
the chunk size. Because the check to allocate the next output chunk was
done at the bottom of the loop previously, it would never occur because
the current input chunk was fully consumed at that point. By moving the
check to the top of the loop (but still late enough that we know we have
bytes to write) we resolve this issue.

This scenario is unfortunately hard to write a test for, since we don't
explicitly control the input chunk size.

Fixes https://github.com/uutils/coreutils/issues/3790
2022-08-16 11:02:52 +02:00
Daniel Hofstetter
7c3116330e Replace deprecated is_present() with contains_id() 2022-08-02 15:21:39 +02:00
Daniel Hofstetter
fc4544c42b bump clap from 3.1.18 to 3.2.15 2022-07-29 14:05:02 +02:00
Andrew Baptist
f2cfc15a70 split: Don't overwrite files
Check that a file exists by calling create_new and changing the
interface of instantiate_current_writer to return a Result rather
than calling unwrap.
2022-07-21 12:06:13 -04:00
Daniel Hofstetter
2261051239 split: set names for arg values 2022-05-30 09:10:18 +02:00
Terts Diepraam
eae07adfb1
Version 0.0.14 (#3553)
Version 0.0.14
2022-05-22 19:57:19 +02:00
Terts Diepraam
0acfa07d77 all: add value hints 2022-05-13 16:15:50 +02:00
Terts Diepraam
c6c936f529 all: remove explicit imports of TryFrom and TryInto
This is enabled by the changing the edition from 2018 to 2021
2022-04-05 10:39:31 +02:00
Terts Diepraam
af9f718936 Change edition to 2021 2022-04-05 10:39:31 +02:00
Terts Diepraam
b7809bd889 version 0.0.13 2022-04-02 11:04:27 +02:00
Jeffrey Finkelstein
e357d2650c clippy fixes from nightly rust 2022-03-22 21:44:33 -04:00
Sylvestre Ledru
04b219bdef
Merge pull request #3229 from uutils/dependabot/cargo/clap-3.1.6
build(deps): bump clap from 3.0.10 to 3.1.6
2022-03-20 17:44:33 +01:00
Jeffrey Finkelstein
b79ff6b4fd split: avoid writing final empty chunk with -C
Fix a bug in which a final empty file was written when using `split
--line-bytes` mode.
2022-03-20 09:30:58 -04:00
Jeffrey Finkelstein
95f58fbf3c split: handle no final newline with --line-bytes
Fix a panic due to out-of-bounds indexing when using `split
--line-bytes` with an input that had no trailing newline.
2022-03-19 23:50:02 -04:00
Sylvestre Ledru
1d17400b80
Merge pull request #3274 from jfinkels/split-elide-empty-files
split: elide all chunks when input file is empty
2022-03-19 22:52:57 +01:00
Jeffrey Finkelstein
0a226524a6 split: elide all chunks when input file is empty
Fix a bug in the behavior of `split -e -n NUM` when the input file is
empty. Previously, it would panic due to overflow when subtracting 1
from 0. After this change, it will terminate successfully and produce
no output chunks.
2022-03-19 14:32:28 -04:00
Jeffrey Finkelstein
6d2eff9c27 split: catch and handle broken pipe errors
Catch `BrokenPipe` errors and silently ignore them so that `split`
terminates successfully on a broken pipe. This matches the behavior of
GNU `split`.
2022-03-19 12:11:03 -04:00
Sylvestre Ledru
3c88fb460b
Merge branch 'main' into dependabot/cargo/clap-3.1.6 2022-03-19 09:26:05 +01:00
Sylvestre Ledru
9796e01df6
Revert "split: implement round-robin arg to --number" 2022-03-18 14:45:29 +01:00
Terts Diepraam
20212be4c8 fix clippy errors related to clap upgrade from 3.0.10 to 3.1.6 2022-03-17 22:46:56 +01:00
dependabot[bot]
59440d35c0
build(deps): bump clap from 3.0.10 to 3.1.6
Bumps [clap](https://github.com/clap-rs/clap) from 3.0.10 to 3.1.6.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.0.10...v3.1.6)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-17 13:06:29 +00:00
Jeffrey Finkelstein
18bfd1ac68 split: implement round-robin arg to --number
Implement distributing lines of a file in a round-robin manner to a
specified number of chunks. For example,

    $ (seq 1 10 | split -n r/3) && head -v xa[abc]
    ==> xaa <==
    1
    4
    7
    10

    ==> xab <==
    2
    5
    8

    ==> xac <==
    3
    6
    9
2022-03-15 18:22:44 -04:00
Sylvestre Ledru
bfd1e14137
Merge pull request #3204 from jfinkels/split-line-bytes
split: implement --line-bytes option
2022-03-12 09:45:07 +01:00
Jeffrey Finkelstein
77d92883c7 split: implement --line-bytes option
Implement the `--line-bytes` option to `split`. In this mode, the
program tries to write as many lines of the input as possible to each
chunk of output without exceeding a specified byte limit. The new
`LineBytesChunkWriter` struct represents this functionality.
2022-03-10 22:51:49 -05:00
Jeffrey Finkelstein
b42168e9dc Clippy fixes in multiple crates 2022-03-10 22:31:21 -05:00
Sylvestre Ledru
54a10e955a Update of the cargo.lock url to point to the right branch 2022-03-06 22:13:17 +01:00
Davide Cavalca
19af43222b Include license text in all published crates 2022-03-05 21:21:46 +01:00
Jeffrey Finkelstein
ee36dea1a9 split: implement outputting kth chunk of file
Implement `-n l/k/N` option, where the `k`th chunk of the input file
is written to stdout. For example,

    $ seq -w 0 99 > f; split -n l/3/10 f
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
2022-03-05 10:27:51 +01:00
Sylvestre Ledru
346cfa060b
Merge pull request #2980 from jfinkels/split-lines-2
split: add support for "-n l/NUM" option to split
2022-03-01 10:13:44 +01:00
Jeffrey Finkelstein
dbbee573ab split: add support for "-n l/NUM" option to split
Add support for `split -n l/NUM`. Previously, `split` only supported
`-n NUM`, which splits a file into `NUM` chunks by byte. The `-n
l/NUM` strategy splits a file into `NUM` chunks without splitting
lines across chunks.
2022-02-22 18:44:08 -05:00
Jeffrey Finkelstein
92d461247e split: extend Strategy::Number to add NumberType
Make the `Strategy::Number` enumeration value more general by
replacing the number parameter with a `NumberType` enum parameter.
This allows a future commit to update `split` to support the various
sub-strategies for the `-n`. (This commit does not add support for the
other sub-strategies.)
2022-02-22 18:41:29 -05:00
Omer Tuchfeld
0ce22f3a08 Improve coverage / error messages from parse_size PR
https://github.com/uutils/coreutils/pull/3084 (2a333ab391) had some
missing coverage and was merged before I had a chance to fix it.

This PR adds some coverage / improved error messages that were missing
from that previous PR.
2022-02-22 22:09:45 +01:00
Gilad Naaman
159a1dc1db Fix type-error when calling parse_size from split 2022-02-22 13:49:20 +01:00
Terts Diepraam
53070141c1
all: add format_usage function (#3139)
This should correct the usage strings in both the `--help` and user documentation. Previously, sometimes the name of the utils did not show up correctly.
2022-02-21 17:14:03 +01:00
Terts Diepraam
938c5acbbe
Merge pull request #3146 from ndd7xv/split-suffix-check
split: error when num. of chunks is greater than num. of possible filenames
2022-02-20 17:12:05 +01:00
Jeffrey Finkelstein
6718d97f97 split: add support for -e argument
Add the `-e` flag, which indicates whether to elide (that is, remove)
empty files that would have been created by the `-n` option.

The `-n` command-line argument gives a specific number of chunks into
which the input files will be split. If the number of chunks is
greater than the number of bytes, then empty files will be created for
the excess chunks. But if `-e` is given, then empty files will not be
created.

For example, contrast

    $ printf 'a\n' > f && split -e -n 3 f && cat xaa xab xac
    a
    cat: xac: No such file or directory

with

    $ printf 'a\n' > f && split -n 3 f && cat xaa xab xac
    a
2022-02-17 19:03:51 -05:00
Terts Diepraam
e1a611374a
Merge pull request #2981 from jfinkels/split-hex-numbers
split: add support for -x option (hex suffixes)
2022-02-17 23:20:58 +01:00