Commit graph

130 commits

Author SHA1 Message Date
Terts Diepraam
c6c936f529 all: remove explicit imports of TryFrom and TryInto
This is enabled by changing the edition from 2018 to 2021.
2022-04-05 10:39:31 +02:00
Terts Diepraam
af9f718936 Change edition to 2021 2022-04-05 10:39:31 +02:00
Terts Diepraam
b7809bd889 version 0.0.13 2022-04-02 11:04:27 +02:00
Jeffrey Finkelstein
e357d2650c clippy fixes from nightly rust 2022-03-22 21:44:33 -04:00
Sylvestre Ledru
04b219bdef
Merge pull request #3229 from uutils/dependabot/cargo/clap-3.1.6
build(deps): bump clap from 3.0.10 to 3.1.6
2022-03-20 17:44:33 +01:00
Jeffrey Finkelstein
b79ff6b4fd split: avoid writing final empty chunk with -C
Fix a bug in which a final empty file was written when using `split
--line-bytes` mode.
2022-03-20 09:30:58 -04:00
Jeffrey Finkelstein
95f58fbf3c split: handle no final newline with --line-bytes
Fix a panic due to out-of-bounds indexing when using `split
--line-bytes` with an input that had no trailing newline.
2022-03-19 23:50:02 -04:00
Sylvestre Ledru
1d17400b80
Merge pull request #3274 from jfinkels/split-elide-empty-files
split: elide all chunks when input file is empty
2022-03-19 22:52:57 +01:00
Jeffrey Finkelstein
0a226524a6 split: elide all chunks when input file is empty
Fix a bug in the behavior of `split -e -n NUM` when the input file is
empty. Previously, it would panic due to overflow when subtracting 1
from 0. After this change, it will terminate successfully and produce
no output chunks.
2022-03-19 14:32:28 -04:00
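As a rough illustration of the overflow described in 0a226524a6: subtracting 1 from an unsigned 0 panics in debug builds, and a checked subtraction avoids this. The helper name `last_chunk_index` is hypothetical and is not taken from the `split` source.

```rust
/// Hypothetical helper: index of the last chunk, or `None` when there are
/// no chunks at all. Plain `num_chunks - 1` would overflow for 0, so
/// `checked_sub` is used to turn that case into `None`.
pub fn last_chunk_index(num_chunks: u64) -> Option<u64> {
    num_chunks.checked_sub(1)
}
```

A caller can treat `None` as "nothing to write" and return successfully.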
Jeffrey Finkelstein
6d2eff9c27 split: catch and handle broken pipe errors
Catch `BrokenPipe` errors and silently ignore them so that `split`
terminates successfully on a broken pipe. This matches the behavior of
GNU `split`.
2022-03-19 12:11:03 -04:00
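One way to get the behavior described in 6d2eff9c27 is to filter the I/O result before it is reported. This is only a sketch: `ignore_broken_pipe` is a hypothetical name, and the actual commit may handle the error at a different place.

```rust
use std::io::{self, ErrorKind};

/// Hypothetical helper: treat a broken pipe as success and pass every
/// other result (success or a different error) through unchanged.
pub fn ignore_broken_pipe(result: io::Result<()>) -> io::Result<()> {
    match result {
        Err(e) if e.kind() == ErrorKind::BrokenPipe => Ok(()),
        other => other,
    }
}
```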
Sylvestre Ledru
3c88fb460b
Merge branch 'main' into dependabot/cargo/clap-3.1.6 2022-03-19 09:26:05 +01:00
Sylvestre Ledru
9796e01df6
Revert "split: implement round-robin arg to --number" 2022-03-18 14:45:29 +01:00
Terts Diepraam
20212be4c8 fix clippy errors related to clap upgrade from 3.0.10 to 3.1.6 2022-03-17 22:46:56 +01:00
dependabot[bot]
59440d35c0
build(deps): bump clap from 3.0.10 to 3.1.6
Bumps [clap](https://github.com/clap-rs/clap) from 3.0.10 to 3.1.6.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.0.10...v3.1.6)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-03-17 13:06:29 +00:00
Jeffrey Finkelstein
18bfd1ac68 split: implement round-robin arg to --number
Implement distributing lines of a file in a round-robin manner to a
specified number of chunks. For example,

    $ (seq 1 10 | split -n r/3) && head -v xa[abc]
    ==> xaa <==
    1
    4
    7
    10

    ==> xab <==
    2
    5
    8

    ==> xac <==
    3
    6
    9
2022-03-15 18:22:44 -04:00
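The round-robin distribution in 18bfd1ac68 can be sketched in memory as follows. The function name `round_robin` is an assumption, and the sketch collects chunks into strings instead of streaming to files as a real implementation would.

```rust
/// Distribute the lines of `input` across `n` chunks in round-robin order:
/// line 0 goes to chunk 0, line 1 to chunk 1, and so on, wrapping around.
/// Each line keeps its trailing newline.
pub fn round_robin(input: &str, n: usize) -> Vec<String> {
    let mut chunks = vec![String::new(); n];
    if n == 0 {
        return chunks;
    }
    for (i, line) in input.split_inclusive('\n').enumerate() {
        chunks[i % n].push_str(line);
    }
    chunks
}
```

For the numbers 1 through 10 and `n = 3`, this produces the same three groups as the `xaa`, `xab`, and `xac` files in the commit message.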
Sylvestre Ledru
bfd1e14137
Merge pull request #3204 from jfinkels/split-line-bytes
split: implement --line-bytes option
2022-03-12 09:45:07 +01:00
Jeffrey Finkelstein
77d92883c7 split: implement --line-bytes option
Implement the `--line-bytes` option to `split`. In this mode, the
program tries to write as many lines of the input as possible to each
chunk of output without exceeding a specified byte limit. The new
`LineBytesChunkWriter` struct represents this functionality.
2022-03-10 22:51:49 -05:00
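A minimal in-memory sketch of the packing rule described in 77d92883c7: fill each chunk with whole lines until the next line would exceed the byte limit. This is not the `LineBytesChunkWriter` implementation. The name `line_bytes_chunks` is hypothetical, and cutting an over-long line into limit-sized pieces is an assumption that may differ from what uutils or GNU `split` does in edge cases.

```rust
/// Pack whole lines into chunks of at most `limit` bytes.
/// Assumption: a single line longer than `limit` is cut into pieces.
pub fn line_bytes_chunks(input: &[u8], limit: usize) -> Vec<Vec<u8>> {
    assert!(limit > 0, "limit must be positive");
    let mut chunks: Vec<Vec<u8>> = Vec::new();
    let mut current: Vec<u8> = Vec::new();
    for line in input.split_inclusive(|&b| b == b'\n') {
        let mut rest = line;
        // Close the current chunk if this line does not fit in what is left.
        if !current.is_empty() && current.len() + rest.len() > limit {
            chunks.push(std::mem::take(&mut current));
        }
        // Only reached with an empty `current`: cut an over-long line.
        while rest.len() > limit {
            let (head, tail) = rest.split_at(limit);
            chunks.push(head.to_vec());
            rest = tail;
        }
        current.extend_from_slice(rest);
    }
    // The final chunk is written only if it holds data, so no empty
    // trailing chunk is produced and a missing final newline is harmless.
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```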
Jeffrey Finkelstein
b42168e9dc Clippy fixes in multiple crates 2022-03-10 22:31:21 -05:00
Sylvestre Ledru
54a10e955a Update of the cargo.lock url to point to the right branch 2022-03-06 22:13:17 +01:00
Davide Cavalca
19af43222b Include license text in all published crates 2022-03-05 21:21:46 +01:00
Jeffrey Finkelstein
ee36dea1a9 split: implement outputting kth chunk of file
Implement `-n l/k/N` option, where the `k`th chunk of the input file
is written to stdout. For example,

    $ seq -w 0 99 > f; split -n l/3/10 f
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
2022-03-05 10:27:51 +01:00
Sylvestre Ledru
346cfa060b
Merge pull request #2980 from jfinkels/split-lines-2
split: add support for "-n l/NUM" option to split
2022-03-01 10:13:44 +01:00
Jeffrey Finkelstein
dbbee573ab split: add support for "-n l/NUM" option to split
Add support for `split -n l/NUM`. Previously, `split` only supported
`-n NUM`, which splits a file into `NUM` chunks by byte. The `-n
l/NUM` strategy splits a file into `NUM` chunks without splitting
lines across chunks.
2022-02-22 18:44:08 -05:00
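One plausible rule for `-n l/NUM` is to divide the input into `NUM` byte ranges and assign each line to the range in which it starts, so that no line is split. The sketch below follows that rule. It is an assumption about the strategy, not a copy of the uutils code, and the names `split_lines_into` and `kth_chunk` are hypothetical. `kth_chunk` mirrors the `-n l/k/N` form added in ee36dea1a9.

```rust
/// Split `input` into `n` chunks without breaking lines. A line goes to
/// the chunk whose byte range contains the line's first byte.
pub fn split_lines_into(input: &[u8], n: usize) -> Vec<Vec<u8>> {
    assert!(n > 0, "number of chunks must be positive");
    // Use at least 1 so the division below is well defined for tiny inputs.
    let chunk_size = std::cmp::max(input.len() / n, 1);
    let mut chunks: Vec<Vec<u8>> = vec![Vec::new(); n];
    let mut offset = 0;
    for line in input.split_inclusive(|&b| b == b'\n') {
        let index = std::cmp::min(offset / chunk_size, n - 1);
        chunks[index].extend_from_slice(line);
        offset += line.len();
    }
    chunks
}

/// Return only the `k`th chunk (1-based, as in `l/k/N`).
pub fn kth_chunk(input: &[u8], k: usize, n: usize) -> Vec<u8> {
    assert!(k >= 1 && k <= n, "k must be in 1..=n");
    let mut chunks = split_lines_into(input, n);
    chunks.swap_remove(k - 1)
}
```

With the 100 two-digit lines from `seq -w 0 99`, `kth_chunk(input, 3, 10)` yields the lines 20 through 29, matching the example in ee36dea1a9.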
Jeffrey Finkelstein
92d461247e split: extend Strategy::Number to add NumberType
Make the `Strategy::Number` enumeration value more general by
replacing the number parameter with a `NumberType` enum parameter.
This allows a future commit to update `split` to support the various
sub-strategies for the `-n` option. (This commit does not add support for the
other sub-strategies.)
2022-02-22 18:41:29 -05:00
Omer Tuchfeld
0ce22f3a08 Improve coverage / error messages from parse_size PR
https://github.com/uutils/coreutils/pull/3084 (2a333ab391) had some
missing coverage and was merged before I had a chance to fix it.

This PR adds some coverage / improved error messages that were missing
from that previous PR.
2022-02-22 22:09:45 +01:00
Gilad Naaman
159a1dc1db Fix type-error when calling parse_size from split 2022-02-22 13:49:20 +01:00
Terts Diepraam
53070141c1
all: add format_usage function (#3139)
This should correct the usage strings in both the `--help` output and the user documentation. Previously, the name of the util sometimes did not show up correctly.
2022-02-21 17:14:03 +01:00
Terts Diepraam
938c5acbbe
Merge pull request #3146 from ndd7xv/split-suffix-check
split: error when num. of chunks is greater than num. of possible filenames
2022-02-20 17:12:05 +01:00
Jeffrey Finkelstein
6718d97f97 split: add support for -e argument
Add the `-e` flag, which indicates whether to elide (that is, remove)
empty files that would have been created by the `-n` option.

The `-n` command-line argument gives a specific number of chunks into
which the input files will be split. If the number of chunks is
greater than the number of bytes, then empty files will be created for
the excess chunks. But if `-e` is given, then empty files will not be
created.

For example, contrast

    $ printf 'a\n' > f && split -e -n 3 f && cat xaa xab xac
    a
    cat: xac: No such file or directory

with

    $ printf 'a\n' > f && split -n 3 f && cat xaa xab xac
    a
2022-02-17 19:03:51 -05:00
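The effect of `-e` on `-n NUM` can be sketched with in-memory chunks. The chunk size rule used here (at least one byte per chunk, remainder in the last chunk) is an assumption chosen to reproduce the example in 6718d97f97, and `split_bytes` is a hypothetical name.

```rust
/// Split `input` into `n` chunks by byte. When `elide_empty` is true,
/// chunks that would be empty are skipped instead of being produced.
pub fn split_bytes(input: &[u8], n: usize, elide_empty: bool) -> Vec<Vec<u8>> {
    assert!(n > 0, "number of chunks must be positive");
    let chunk_size = std::cmp::max(input.len() / n, 1);
    let mut chunks = Vec::new();
    for i in 0..n {
        let start = std::cmp::min(i * chunk_size, input.len());
        // The last chunk takes whatever remains.
        let end = if i + 1 == n {
            input.len()
        } else {
            std::cmp::min(start + chunk_size, input.len())
        };
        let chunk = &input[start..end];
        if elide_empty && chunk.is_empty() {
            continue;
        }
        chunks.push(chunk.to_vec());
    }
    chunks
}
```

For the two-byte input `a\n` and `n = 3`, the sketch yields three chunks without `-e` (the last one empty) and two chunks with it. An empty input with `-e` yields no chunks at all.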
Terts Diepraam
e1a611374a
Merge pull request #2981 from jfinkels/split-hex-numbers
split: add support for -x option (hex suffixes)
2022-02-17 23:20:58 +01:00
ndd7xv
494d709e0f split: small tweaks to wording
Change the `SuffixType` enum variants to have better names and make the hex suffix help consistent with the numeric suffix help.
2022-02-17 01:04:26 -05:00
ndd7xv
6c3fc7b214 split: throw error when # chunks > # filenames from suffix length 2022-02-16 23:53:56 -05:00
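The check in 6c3fc7b214 comes down to comparing the requested chunk count with the number of distinct suffixes, which is the radix raised to the suffix length (for example 26 for alphabetic suffixes). A sketch, with a hypothetical function name and a made-up error message:

```rust
/// Return an error if `num_chunks` exceeds the number of filenames that a
/// suffix of `suffix_length` digits in the given `radix` can express.
pub fn check_chunk_count(num_chunks: u64, radix: u64, suffix_length: u32) -> Result<(), String> {
    // `checked_pow` yields `None` on overflow; in that case the capacity is
    // larger than any `u64` chunk count, so the request is accepted.
    match radix.checked_pow(suffix_length) {
        Some(capacity) if num_chunks > capacity => Err(format!(
            "{} chunks requested, but only {} filenames are possible",
            num_chunks, capacity
        )),
        _ => Ok(()),
    }
}
```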
Jeffrey Finkelstein
4470430c89 split: add support for -x option (hex suffixes)
Add support for the `-x` command-line option to `split`. This option
causes `split` to produce filenames with hexadecimal suffixes instead
of the default alphabetic suffixes.
2022-02-16 23:53:56 -05:00
Jeffrey Finkelstein
891c5d1ffa split: add SuffixType::NumericHexadecimal
Add a `NumericHexadecimal` member to the `SuffixType` enum so that a
future commit can add support for hexadecimal filename suffixes to the
`split` program.
2022-02-16 23:53:56 -05:00
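A sketch of what a three-member `SuffixType` might look like. The variant names come from the commit messages above, but the `radix` and `digit` methods are assumptions added for illustration and are not taken from the `split` source.

```rust
/// The kinds of filename suffix named in the commit messages.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SuffixType {
    Alphabetic,
    NumericDecimal,
    NumericHexadecimal,
}

impl SuffixType {
    /// Hypothetical: number of distinct symbols per suffix position.
    pub fn radix(self) -> u8 {
        match self {
            SuffixType::Alphabetic => 26,
            SuffixType::NumericDecimal => 10,
            SuffixType::NumericHexadecimal => 16,
        }
    }

    /// Hypothetical: the character for a digit value, or `None` if the
    /// value is out of range for this suffix type.
    pub fn digit(self, value: u8) -> Option<char> {
        if value >= self.radix() {
            return None;
        }
        match self {
            SuffixType::Alphabetic => Some((b'a' + value) as char),
            _ => std::char::from_digit(u32::from(value), u32::from(self.radix())),
        }
    }
}
```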
Jeffrey Finkelstein
aa4c5aea50 split: refactor to add SuffixType enum
Refactor the code to use a `SuffixType` enumeration with two members,
`Alphabetic` and `NumericDecimal`, representing the two currently
supported ways of producing filename suffixes. This prepares the code
to more easily support other formats, like numeric hexadecimal.
2022-02-16 23:53:56 -05:00
DevSabb
6d6371741a
include io-blksize parameter (#3064)
* include io-blksize parameter

* format changes for including io-blksize

Co-authored-by: DevSabb <devsabb@local>
Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
2022-02-14 19:47:18 +01:00
Jeffrey Finkelstein
a4955b4e06 split: add support for -x option (hex suffixes)
Add support for the `-x` command-line option to `split`. This option
causes `split` to produce filenames with hexadecimal suffixes instead
of the default alphabetic suffixes.
2022-02-13 11:18:37 -05:00
Jeffrey Finkelstein
494dc7ec57 split: add SuffixType::NumericHexadecimal
Add a `NumericHexadecimal` member to the `SuffixType` enum so that a
future commit can add support for hexadecimal filename suffixes to the
`split` program.
2022-02-13 11:18:37 -05:00
Jeffrey Finkelstein
7fbd805713 split: refactor to add SuffixType enum
Refactor the code to use a `SuffixType` enumeration with two members,
`Alphabetic` and `NumericDecimal`, representing the two currently
supported ways of producing filename suffixes. This prepares the code
to more easily support other formats, like numeric hexadecimal.
2022-02-13 11:18:37 -05:00
Sylvestre Ledru
6b6d5ee7db
Merge pull request #2827 from jfinkels/split-std-io-copy
split: use std::io::copy() with new writer implementation to improve maintainability and speed
2022-02-12 11:33:12 +01:00
Jeffrey Finkelstein
2f65b29866 split: error when --additional-suffix contains /
Make `split` terminate with a usage error when the
`--additional-suffix` argument contains a directory separator
character.
2022-02-10 19:33:33 -05:00
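The validation in 2f65b29866 could look roughly like this. The function name and the error text are hypothetical. `std::path::is_separator` is the standard library check for the current platform's path separators.

```rust
/// Reject an additional suffix that contains a directory separator.
pub fn validate_additional_suffix(suffix: &str) -> Result<(), String> {
    if suffix.chars().any(std::path::is_separator) {
        Err(format!(
            "invalid suffix '{}': contains a directory separator",
            suffix
        ))
    } else {
        Ok(())
    }
}
```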
Jeffrey Finkelstein
b37718de10 split: add BENCHMARKING.md documentation file 2022-02-08 22:58:00 -05:00
Jeffrey Finkelstein
70ca1f45ea split: remove unused ByteSplitter and LineSplitter 2022-02-08 22:58:00 -05:00
Jeffrey Finkelstein
1d7e1b8732 split: use ByteChunkWriter and LineChunkWriter
Replace `ByteSplitter` and `LineSplitter` with `ByteChunkWriter` and
`LineChunkWriter` respectively. This results in a more maintainable
design and an increase in the speed of splitting by lines.
2022-02-08 22:57:57 -05:00
Jeffrey Finkelstein
b31d63eaa9 split: add ByteChunkWriter and LineChunkWriter
Add the `ByteChunkWriter` and `LineChunkWriter` structs and
implementations, but don't use them yet. These structs offer an
alternative approach to writing chunks of output (contrasted with
`ByteSplitter` and `LineSplitter`). The main difference is that
control of which underlying file is being written is inside the writer
instead of outside.
2022-02-08 22:53:56 -05:00
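The design idea in b31d63eaa9, a writer that decides internally which output it is currently filling, can be sketched with in-memory buffers standing in for files. `ChunkingWriter` is a hypothetical name, and the real `ByteChunkWriter` writes to files and differs in detail. Because the type implements `std::io::Write`, it can be driven by `std::io::copy`, as the pull request title for 6b6d5ee7db suggests.

```rust
use std::io::{self, Write};

/// Hypothetical writer that starts a new in-memory chunk every
/// `chunk_size` bytes. The caller just writes; chunk switching happens
/// inside `write`.
pub struct ChunkingWriter {
    chunk_size: usize,
    chunks: Vec<Vec<u8>>,
}

impl ChunkingWriter {
    pub fn new(chunk_size: usize) -> Self {
        assert!(chunk_size > 0, "chunk size must be positive");
        ChunkingWriter { chunk_size, chunks: Vec::new() }
    }

    pub fn into_chunks(self) -> Vec<Vec<u8>> {
        self.chunks
    }
}

impl Write for ChunkingWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        if buf.is_empty() {
            return Ok(0);
        }
        // Open a new chunk if there is none yet or the last one is full.
        let need_new = match self.chunks.last() {
            Some(chunk) => chunk.len() >= self.chunk_size,
            None => true,
        };
        if need_new {
            self.chunks.push(Vec::new());
        }
        let current = self.chunks.last_mut().expect("a chunk exists at this point");
        // Accept only what fits; callers such as `io::copy` retry the rest.
        let n = std::cmp::min(self.chunk_size - current.len(), buf.len());
        current.extend_from_slice(&buf[..n]);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

Returning a short write count when a chunk fills up is what lets generic callers stay unaware of chunk boundaries.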
Jeffrey Finkelstein
8fa6797255 split: add structure to errors that can be created
Add some structure to errors that can be created during parsing of
settings from command-line options. This commit creates
`StrategyError` and `SettingsError` enumerations to represent the
various parsing and other errors that can arise when transforming
`ArgMatches` into `Settings`.
2022-02-06 20:09:29 -05:00
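A rough sketch of structured parsing errors in the spirit of 8fa6797255. Only the enum names `StrategyError` and `SettingsError` come from the commit message. The variants, the messages, and `parse_suffix_length` are hypothetical and chosen just to show the shape.

```rust
use std::fmt;

/// Errors from choosing a splitting strategy (variants are hypothetical).
#[derive(Debug, PartialEq)]
pub enum StrategyError {
    Lines(String),
    Bytes(String),
}

/// Errors from building the settings (variants are hypothetical).
#[derive(Debug, PartialEq)]
pub enum SettingsError {
    Strategy(StrategyError),
    SuffixLength(String),
}

impl fmt::Display for SettingsError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            SettingsError::Strategy(StrategyError::Lines(s)) => {
                write!(f, "invalid number of lines: {}", s)
            }
            SettingsError::Strategy(StrategyError::Bytes(s)) => {
                write!(f, "invalid number of bytes: {}", s)
            }
            SettingsError::SuffixLength(s) => write!(f, "invalid suffix length: {}", s),
        }
    }
}

/// Hypothetical parser that reports failures as a typed error instead of
/// a bare string.
pub fn parse_suffix_length(arg: &str) -> Result<usize, SettingsError> {
    arg.parse::<usize>()
        .map_err(|_| SettingsError::SuffixLength(arg.to_string()))
}
```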
Jeffrey Finkelstein
e5361a8c11 split: correct error message on invalid arg. to -a
Correct the error message displayed on an invalid parameter to the
`--suffix-length` or `-a` command-line option.
2022-02-06 20:09:29 -05:00
Daniel Eades
4f8d1c5fcf add additional lints 2022-01-31 20:40:47 +01:00
Terts Diepraam
7b3cfcf708
Merge pull request #2868 from jfinkels/split-filename-iterator
split: use iterator to produce filenames
2022-01-30 22:37:37 +01:00
Jeffrey Finkelstein
a5b435da58 split: use iterator to produce filenames
Replace the `FilenameFactory` with `FilenameIterator` and calls to
`FilenameFactory::make()` with calls to `FilenameIterator::next()`. We
did not need the full generality of being able to produce the
filename for an arbitrary chunk index. Instead we need only iterate
over filenames one after another. This allows for a less
mathematically dense algorithm that is easier to understand and
maintain. Furthermore, it can be connected to some familiar concepts
from the representation of numbers as a sequence of digits.

This does not change the behavior of the `split` program, just the
implementation of how filenames are produced.

Co-authored-by: Terts Diepraam <terts.diepraam@gmail.com>
2022-01-30 11:18:58 -05:00
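The "sequence of digits" idea in a5b435da58 can be sketched as an odometer: keep the suffix as a list of digits and increment it with carry on each call to `next()`. The sketch below handles only fixed-length alphabetic suffixes. `AlphabeticNames` is a hypothetical name, not the `FilenameIterator` from the commit.

```rust
/// Hypothetical iterator over names such as `xaa`, `xab`, ..., `xzz`.
pub struct AlphabeticNames {
    prefix: String,
    digits: Vec<u8>, // each in 0..26, most significant first
    exhausted: bool,
}

impl AlphabeticNames {
    pub fn new(prefix: &str, suffix_length: usize) -> Self {
        AlphabeticNames {
            prefix: prefix.to_string(),
            digits: vec![0; suffix_length],
            exhausted: suffix_length == 0,
        }
    }
}

impl Iterator for AlphabeticNames {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        if self.exhausted {
            return None;
        }
        let suffix: String = self.digits.iter().map(|&d| (b'a' + d) as char).collect();
        let name = format!("{}{}", self.prefix, suffix);

        // Increment like an odometer, starting at the last digit.
        let mut carried_out = true;
        for d in self.digits.iter_mut().rev() {
            if *d + 1 < 26 {
                *d += 1;
                carried_out = false;
                break;
            }
            *d = 0;
        }
        // A carry out of the first digit means every name has been used.
        if carried_out {
            self.exhausted = true;
        }
        Some(name)
    }
}
```

Compared with computing the name for an arbitrary chunk index, this needs no division or exponentiation, only a carry loop.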